
How to Fix Duplicate and Missing Featured Images After a WordPress Migration (WP-CLI + MU Plugin)

After migrating a WordPress blog, duplicate hero images and missing thumbnails are a nightmare. This WP-CLI command fixes both problems safely — with dry-run mode, batch processing, and smart URL matching.

Milan Pavlák
11 min read

Migrating a WordPress blog with hundreds — or thousands — of posts is never just a copy-paste job. Even when the content arrives cleanly, one problem surfaces almost every time: featured images are a mess.

I ran into this recently on a large blog migration. The old theme was built years ago, back when featured images were an afterthought. Editors worked with what they had:

  • On some posts, they set a proper Featured Image via the post meta.
  • On others — because the old theme never displayed thumbnails — they just dropped the image as the first element inside the post content.

Both approaches worked fine on the old site. On the new theme? Complete chaos.


The Two Problems This Creates

1. Duplicate hero images

If a post has both a Featured Image and the same image as the first <img> tag in the content, modern themes render both. You end up with the same photo displayed twice at the top of the post — once by the theme's hero block, once by the content renderer.

2. Broken archive cards and SEO thumbnails

If a post has no Featured Image (because the editor relied on inline placement), your blog archive, social share previews, SEO plugins like Yoast SEO or Rank Math, and related post modules all come up empty. No thumbnail. Just a grey placeholder or, worse, nothing.

The fix sounds simple: go through every post, set the featured image if it's missing, and remove the duplicate if it already exists. In practice, doing this manually through wp-admin on 3,000 posts is not realistic.


Why Bulk Plugins Don't Cut It Here

There are a handful of WordPress plugins that handle adjacent problems — Auto Post Thumbnail, Bulk Featured Image, Regenerate Thumbnails — but none of them solve this specific combination cleanly.

The gaps are real:

  • No dry-run mode — you're running blind
  • No match-only logic — they can accidentally remove legitimate inline images
  • No batch offset — they time out on large datasets
  • They run through wp-admin, which means PHP timeouts on big sites
  • They can't skip plugins and themes during execution, which causes hangs

For a migration job, you need something you can run in a terminal, inspect first, and roll back if needed. That's WP-CLI.


The Solution: A WP-CLI Command as an MU Plugin

The command below handles both problems with a clear set of rules:

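Case A (featured image already exists): With --remove-first-image and --remove-only-if-matches set, the first <img> in the content is removed only if it matches the featured image URL. No match? The post is left alone. Without --remove-only-if-matches, the first image is removed whether or not it matches, so keep that flag on unless you are sure.

Case B (featured image is missing): Find the first <img> in the content, sideload it into the Media Library if it is not already there, and set it as the Featured Image. With --remove-first-image, the image is then removed from the content to prevent future duplication.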
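The two cases boil down to a small decision table. The sketch below is a standalone illustration, not part of the plugin: the function name fim_decide_action and the returned labels are made up for this example, and it assumes both --remove-first-image and --remove-only-if-matches are passed.

```php
<?php
// Hypothetical helper (not in the plugin): maps a post's state to the action
// described by Case A and Case B, assuming --remove-first-image and
// --remove-only-if-matches are both set.
function fim_decide_action(bool $has_first_img, bool $has_featured, bool $urls_match): string
{
    if (!$has_first_img) {
        // Nothing to promote or deduplicate.
        return 'skip_no_image';
    }
    if ($has_featured) {
        // Case A: remove the inline image only when it duplicates the featured one.
        return $urls_match ? 'remove_duplicate' : 'leave_alone';
    }
    // Case B: promote the inline image, then remove it from the content.
    return 'promote_and_remove';
}
```

Writing the rules down this way makes it easy to check that no branch removes an image that is not either a duplicate or freshly promoted.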

The code lives in an MU plugin (must-use plugin), meaning it loads automatically without activation — even when you run WP-CLI with --skip-plugins --skip-themes.


Installation

Create a new file at:

wp-content/mu-plugins/featured-image-migrator.php

Then paste the full code below.


Full Code

<?php
/**
 * Plugin Name: Featured Image Migrator (WP-CLI)
 * Description: WP-CLI command to deduplicate first content image vs featured image,
 *              and backfill missing featured images from first content image.
 * Author: Milan Pavlak
 * Version: 1.0.0
 */

if (!defined('ABSPATH')) {
    exit;
}

if (!defined('WP_CLI') || !WP_CLI) {
    return;
}

class Featured_Image_Migrator_Command {

    private $stats = [
        'set_featured'                   => 0,
        'removed_first_img'              => 0,
        'skipped_removal_not_matching'   => 0,
        'no_first_img'                   => 0,
        'sideload_failed'                => 0,
    ];

    /**
     * Deduplicate and backfill featured images across posts.
     *
     * ## OPTIONS
     *
     * [--ids=<ids>]
     * : Comma-separated list of post IDs to process.
     *
     * [--limit=<n>]
     * : Number of posts to process (default: 200).
     *
     * [--offset=<n>]
     * : Offset for pagination (default: 0).
     *
     * [--dry-run]
     * : Preview changes without writing anything.
     *
     * [--remove-first-image]
     * : Remove the first image from content: after promoting it to featured, or when a featured image already exists.
     *
     * [--remove-only-if-matches]
     * : When featured exists, only remove the first content image if it matches the featured URL. Without this flag it is removed even when it differs.
     *
     * [--status=<status>]
     * : Post statuses to include (default: publish). Example: publish,draft
     */
    public function __invoke($args, $assoc_args) {
        $dry_run              = !empty($assoc_args['dry-run']);
        $remove_first_image   = !empty($assoc_args['remove-first-image']);
        $remove_only_if_matches = !empty($assoc_args['remove-only-if-matches']);

        $limit  = isset($assoc_args['limit'])  ? max(1, (int)$assoc_args['limit'])  : 200;
        $offset = isset($assoc_args['offset']) ? max(0, (int)$assoc_args['offset']) : 0;

        $status   = isset($assoc_args['status']) ? trim((string)$assoc_args['status']) : 'publish';
        $statuses = array_filter(array_map('trim', explode(',', $status)));

        $ids = [];
        if (!empty($assoc_args['ids'])) {
            $ids = array_filter(array_map('intval', explode(',', (string)$assoc_args['ids'])));
        }

        $q_args = [
            'post_type'      => 'post',
            'post_status'    => $statuses ?: ['publish'],
            'fields'         => 'ids',
            'orderby'        => 'ID',
            'order'          => 'ASC',
            'posts_per_page' => $limit,
            'offset'         => $offset,
            'no_found_rows'  => true,
        ];

        if (!empty($ids)) {
            $q_args['post__in']      = $ids;
            $q_args['posts_per_page'] = count($ids);
            $q_args['offset']        = 0;
            $q_args['orderby']       = 'post__in';
        }

        $query    = new WP_Query($q_args);
        $post_ids = $query->posts;

        WP_CLI::log("Processing " . count($post_ids) . " posts. dry-run=" . ($dry_run ? 'yes' : 'no'));

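        // WP-CLI runs without a logged-in user unless --user is passed, so
        // wp_update_post() would run post content through KSES and could strip
        // iframes or scripts. Disable those filters for real runs.
        if (!$dry_run) {
            kses_remove_filters();
        }

        foreach ($post_ids as $post_id) {
            $this->process_post((int)$post_id, $dry_run, $remove_first_image, $remove_only_if_matches);
        }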

        WP_CLI::log("---- STATS ----");
        foreach ($this->stats as $k => $v) {
            WP_CLI::log("$k: $v");
        }

        WP_CLI::success("Done.");
    }

    private function process_post(int $post_id, bool $dry_run, bool $remove_first_image, bool $remove_only_if_matches): void {
        $post = get_post($post_id);
        if (!$post || empty($post->post_content)) {
            $this->stats['no_first_img']++;
            return;
        }

        $content = (string)$post->post_content;
        $first   = $this->extract_first_image($content);

        if (!$first) {
            $this->stats['no_first_img']++;
            return;
        }

        $first_src     = $first['src'];
        $first_snippet = $first['snippet'];
        $featured_id   = get_post_thumbnail_id($post_id);

        // Case A: Featured image already exists
        if ($featured_id) {
            if (!$remove_first_image) {
                return;
            }

            $featured_url = wp_get_attachment_url($featured_id);
            if (!$featured_url) {
                return;
            }

            if ($remove_only_if_matches) {
                if (!$this->urls_match_strict($first_src, $featured_url)) {
                    $this->stats['skipped_removal_not_matching']++;
                    WP_CLI::log("Post {$post_id}: featured exists but first image does NOT match — skipping.");
                    return;
                }
            }

            if ($dry_run) {
                WP_CLI::log("Post {$post_id}: [DRY RUN] would remove first image (src={$first_src})");
                $this->stats['removed_first_img']++;
                return;
            }

            $new_content = $this->remove_first_image_snippet($content, $first_snippet);
            if ($new_content !== $content) {
                wp_update_post(['ID' => $post_id, 'post_content' => $new_content]);
                $this->stats['removed_first_img']++;
            }
            return;
        }

        // Case B: No featured image — promote first content image
        $attachment_id = $this->ensure_attachment_from_url($first_src, $post_id, $dry_run);
        if (!$attachment_id) {
            $this->stats['sideload_failed']++;
            WP_CLI::log("Post {$post_id}: sideload failed (src={$first_src})");
            return;
        }

        if ($dry_run) {
            WP_CLI::log("Post {$post_id}: [DRY RUN] would set featured => attachment {$attachment_id}");
            if ($remove_first_image) {
                WP_CLI::log("Post {$post_id}: [DRY RUN] would remove first image from content.");
            }
            $this->stats['set_featured']++;
            if ($remove_first_image) $this->stats['removed_first_img']++;
            return;
        }

        set_post_thumbnail($post_id, $attachment_id);
        $this->stats['set_featured']++;

        if ($remove_first_image) {
            $new_content = $this->remove_first_image_snippet($content, $first_snippet);
            if ($new_content !== $content) {
                wp_update_post(['ID' => $post_id, 'post_content' => $new_content]);
                $this->stats['removed_first_img']++;
            }
        }
    }

    /**
     * Extract the first <img> tag and its surrounding <p> wrapper if present.
     */
    private function extract_first_image(string $html): ?array {
        if (!preg_match('/<img\b[^>]*\bsrc\s*=\s*(["\'])(.*?)\1[^>]*>/i', $html, $m, PREG_OFFSET_CAPTURE)) {
            return null;
        }

        $img_tag = $m[0][0];
        $img_pos = $m[0][1];
        $src     = $m[2][0];
        $snippet = $img_tag;

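        // Widen the snippet to the enclosing <p>...</p> only when that paragraph
        // contains nothing but this one image (optionally wrapped in a link),
        // so surrounding text and unrelated tags such as <pre> are never removed.
        $before = substr($html, 0, $img_pos);
        if (preg_match_all('/<p(?:\s[^>]*)?>/i', $before, $pm, PREG_OFFSET_CAPTURE)) {
            $last_p      = end($pm[0]);
            $p_open_pos  = $last_p[1];
            $p_close_pos = stripos($html, '</p>', $p_open_pos);
            if ($p_close_pos !== false && $p_close_pos > $img_pos) {
                $p_block   = substr($html, $p_open_pos, $p_close_pos + 4 - $p_open_pos);
                $remaining = trim(str_replace(['&nbsp;', "\xC2\xA0"], '', strip_tags($p_block)));
                if ($remaining === '' && substr_count(strtolower($p_block), '<img') === 1) {
                    $snippet = $p_block;
                }
            }
        }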

        return ['src' => $src, 'snippet' => $snippet];
    }

    private function remove_first_image_snippet(string $content, string $snippet): string {
        $pos = strpos($content, $snippet);
        if ($pos === false) {
            $content2 = preg_replace('/<img\b[^>]*\bsrc\s*=\s*(["\'])(.*?)\1[^>]*>/i', '', $content, 1);
            return is_string($content2) ? $this->cleanup_html($content2) : $content;
        }

        $new = substr($content, 0, $pos) . substr($content, $pos + strlen($snippet));
        return $this->cleanup_html($new);
    }

    private function cleanup_html(string $html): string {
        $html = preg_replace('/<p>\s*(?:&nbsp;|\xC2\xA0)?\s*<\/p>/i', '', $html);
        return trim($html);
    }

    /**
     * Normalize and compare two image URLs.
     * Strips query strings and WordPress size suffixes (-300x200) before comparing.
     */
    private function urls_match_strict(string $a, string $b): bool {
        return $this->normalize_img_url($a) === $this->normalize_img_url($b);
    }

    private function normalize_img_url(string $url): string {
        $url = preg_replace('/\?.*$/', '', $url);
        $url = preg_replace('/-\d+x\d+(?=\.\w+$)/', '', $url);
        return strtolower($url);
    }

    /**
     * Return an attachment ID for a URL.
     * Tries to find it in the Media Library first. Sideloads only if necessary.
     */
    private function ensure_attachment_from_url(string $url, int $post_id, bool $dry_run): int {
        $url = trim($url);
        if ($url === '') return 0;

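        $maybe_id = attachment_url_to_postid($url);
        if (!$maybe_id) {
            // Content often points at a resized copy (photo-768x512.jpg), which
            // attachment_url_to_postid() usually cannot resolve; retry with the original file name.
            $maybe_id = attachment_url_to_postid((string)preg_replace('/-\d+x\d+(?=\.\w+$)/', '', $url));
        }
        if ($maybe_id) return (int)$maybe_id;

        // Dry run: nothing is downloaded, so return a placeholder ID.
        if ($dry_run) return 1;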

        if (!function_exists('media_sideload_image')) {
            require_once ABSPATH . 'wp-admin/includes/media.php';
            require_once ABSPATH . 'wp-admin/includes/file.php';
            require_once ABSPATH . 'wp-admin/includes/image.php';
        }

        $result = media_sideload_image($url, $post_id, null, 'id');
        if (is_wp_error($result)) return 0;

        $id = (int)$result;
        return $id > 0 ? $id : 0;
    }
}

WP_CLI::add_command('featured-migrator', 'Featured_Image_Migrator_Command');

Usage

Step 1 — Always dry-run first

Test on a handful of specific posts before touching anything:

php /home/wp-cli.phar --path=/path/to/wp --skip-plugins --skip-themes \
  featured-migrator --ids=101,202,303 --dry-run --remove-first-image --remove-only-if-matches

Read the output carefully. Confirm the logic matches your expectations.

Step 2 — Run for real on those test posts

php /home/wp-cli.phar --path=/path/to/wp --skip-plugins --skip-themes \
  featured-migrator --ids=101,202,303 --remove-first-image --remove-only-if-matches

Step 3 — Process the full site in batches

On shared hosting, avoid processing thousands of posts in a single run. Use --limit and --offset to paginate:

# Batch 1
php /home/wp-cli.phar --path=/path/to/wp --skip-plugins --skip-themes \
  featured-migrator --limit=200 --offset=0 --remove-first-image --remove-only-if-matches

# Batch 2
php /home/wp-cli.phar --path=/path/to/wp --skip-plugins --skip-themes \
  featured-migrator --limit=200 --offset=200 --remove-first-image --remove-only-if-matches

# Continue incrementing --offset by 200 until done
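
If you would rather not track offsets by hand, the list of batches can be generated. This is a minimal sketch: fim_batch_offsets is a hypothetical helper, and the total of 3000 is a placeholder you would replace with your real post count (something like wp post list --post_type=post --post_status=publish --format=count should give it to you).

```php
<?php
// Hypothetical helper: list the --offset values needed to cover $total posts
// in batches of $limit.
function fim_batch_offsets(int $total, int $limit = 200): array
{
    $offsets = [];
    if ($limit <= 0) {
        return $offsets; // guard against an endless loop
    }
    for ($offset = 0; $offset < $total; $offset += $limit) {
        $offsets[] = $offset;
    }
    return $offsets;
}

// Print one command line per batch (3000 is a placeholder post count).
foreach (fim_batch_offsets(3000) as $offset) {
    echo "featured-migrator --limit=200 --offset={$offset} --remove-first-image --remove-only-if-matches\n";
}
```

Because the command orders posts by ID and never changes their status, the offsets stay stable between runs, so an interrupted batch can simply be repeated.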

At the end of each run, the command prints a stats summary:

---- STATS ----
set_featured: 47
removed_first_img: 112
skipped_removal_not_matching: 8
no_first_img: 33
sideload_failed: 2

How the Smart Matching Works

One of the more subtle problems with "remove first image" scripts is false positives. A post might have a Featured Image set and a completely different, legitimate image as the first element in the body — an infographic, a diagram, a chart. You don't want that removed.

The --remove-only-if-matches flag solves this. Before any removal, the command:

  1. Grabs the Featured Image URL from the Media Library
  2. Normalizes both URLs — strips query strings, removes WordPress size suffixes like -768x512
  3. Compares them case-insensitively

Only if they match does the first content image get removed. Otherwise, the post is left untouched and counted under skipped_removal_not_matching so you can audit it.
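
To see the normalization in isolation, here is a standalone copy of the same two regular expressions that you can run outside WordPress. The function name fim_normalize_img_url and the example.com URLs are illustrative only.

```php
<?php
// Standalone copy of the normalization idea, for experimenting outside WordPress.
function fim_normalize_img_url(string $url): string
{
    $url = (string) preg_replace('/\?.*$/', '', $url);              // drop the query string
    $url = (string) preg_replace('/-\d+x\d+(?=\.\w+$)/', '', $url); // drop a -768x512 style suffix
    return strtolower($url);
}

// Example URLs (made up for illustration).
$content_src  = 'https://example.com/wp-content/uploads/2019/05/Office-768x512.jpg?ver=3';
$featured_url = 'https://example.com/wp-content/uploads/2019/05/office.jpg';
$other_src    = 'https://example.com/wp-content/uploads/2019/05/chart-2019.png';

// A resized copy with a cache-busting query string still matches the original.
$same = fim_normalize_img_url($content_src) === fim_normalize_img_url($featured_url);
// A different file does not, even though its name contains digits after a hyphen.
$different = fim_normalize_img_url($other_src) === fim_normalize_img_url($featured_url);
```

Note that the comparison still includes the scheme and host, so an inline image that points at the old domain will not match a featured image served from the new one. Run a search-replace on the URLs first if that applies to your migration.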


Common Issues and Fixes

WP-CLI hangs or times out

This usually means a plugin or theme is firing heavy logic during boot — external API calls, large queries, slow init hooks.

Fix: always run with --skip-plugins --skip-themes. The MU plugin placement ensures your command still loads regardless.

Sideload failures

The sideload_failed stat covers:

  • Broken image URLs left over from the old host
  • 403/404 responses (hotlink protection on the old domain)
  • Non-image URLs accidentally in src attributes
  • Files that were deleted before or during migration

For a large site, consider extending the command with a --csv-report flag to export a list of failed post IDs and URLs for manual review.
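
A rough sketch of what that report could look like is below. The command does not have a --csv-report flag today; fim_write_failures_csv and the row layout are assumptions for illustration, and the example writes to an in-memory stream rather than a real file.

```php
<?php
// Hypothetical report writer: one CSV row per failed sideload.
// $handle is any writable stream, for example fopen('failed.csv', 'w').
function fim_write_failures_csv($handle, array $failures): int
{
    fputcsv($handle, ['post_id', 'src'], ',', '"', '\\');
    $rows = 0;
    foreach ($failures as $failure) {
        fputcsv($handle, [$failure['post_id'], $failure['src']], ',', '"', '\\');
        $rows++;
    }
    return $rows;
}

// Example data (made up): in the command these rows would be collected
// wherever sideload_failed is incremented.
$failures = [
    ['post_id' => 101, 'src' => 'https://old.example.com/uploads/missing.jpg'],
    ['post_id' => 202, 'src' => 'https://old.example.com/uploads/a,b.png'],
];

$handle  = fopen('php://memory', 'w+');
$written = fim_write_failures_csv($handle, $failures);
rewind($handle);
$csv = stream_get_contents($handle);
fclose($handle);
```

Using fputcsv instead of joining strings by hand means URLs that contain commas or quotes still end up in a single column.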

Gutenberg blocks

The current regex targets classic <img> tags in raw post_content. If your migrated posts contain Gutenberg block markup (<!-- wp:image -->), the logic needs to be adapted to parse block attributes instead. This is a worthwhile extension for any site that was already using the block editor before migration.
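
As a starting point for that adaptation, here is a minimal sketch that reads the attachment ID from the first image block's comment delimiter. It is plain PHP so it can be tried outside WordPress; fim_first_image_block_id is a hypothetical name, and inside WordPress you would probably reach for the core parse_blocks() function rather than a regex.

```php
<?php
// Sketch: return the "id" attribute of the first wp:image block, or null
// when there is no image block or the block carries no ID.
function fim_first_image_block_id(string $content): ?int
{
    if (!preg_match('/<!--\s+wp:image(?:\s+(\{.*?\}))?\s+-->/s', $content, $m)) {
        return null;
    }
    // Block attributes are stored as JSON inside the opening comment.
    $attrs = json_decode($m[1] ?? '', true);
    return is_array($attrs) && isset($attrs['id']) ? (int) $attrs['id'] : null;
}

// Example block markup (made up for illustration).
$block_content = '<!-- wp:image {"id":123,"sizeSlug":"large"} -->' . "\n"
    . '<figure class="wp-block-image size-large"><img src="https://example.com/a.jpg" alt="" class="wp-image-123"/></figure>' . "\n"
    . '<!-- /wp:image -->';

$block_id = fim_first_image_block_id($block_content);
```

When the block carries an ID, you can skip URL matching and sideloading entirely and compare that ID with the featured image ID directly. Removal would then need to take out the whole block, from the opening comment to the closing one, not just the <img> tag.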


Why an MU Plugin and Not a Standalone Script

You could technically write this as a standalone PHP file and require it via --require. The MU plugin approach is cleaner for a few reasons:

  • It's always available — no need to remember a file path argument on every run
  • It loads before plugins and themes, avoiding conflicts
  • It can be version-controlled alongside the rest of your site
  • After the migration is complete, you simply delete the file

Wrapping Up

A WordPress migration doesn't end when the content arrives in the database. The work of making a decade of inconsistent editorial habits behave correctly under a new theme is where the real time goes — and featured images are one of the most visible places that breaks.

This command gives you a safe, auditable, batch-capable way to fix both sides of the problem in one pass. Dry-run first, review the stats, process in batches, and your archive cards and post headers will be consistent across the entire blog.

The code is production-tested and intentionally conservative — it skips rather than guesses when something looks ambiguous.

If you're working on a WordPress migration and hit edge cases this doesn't cover, feel free to reach out.
