• I needed to import a Blogger blog and its images into WordPress. The Cache Images plugin was the closest thing to a solution, but it only imports images referenced by <img> tags, not the larger images that the <a> tags wrapped around those <img> tags usually link to.

    I’m not sure how well this will work, but here’s a patch, in unified diff format against the downloaded v2.0 plugin, that also imports the linked images:

    Index: cache-images/cache-images.php
    ===================================================================
    RCS file: /htdocs/wp-content/plugins/cache-images/cache-images.php,v
    retrieving revision 1.1.1.1
    diff -u -w -r1.1.1.1 cache-images.php
    --- cache-images/cache-images.php       11 Oct 2010 13:01:38 -0000      1.1.1.1
    +++ cache-images/cache-images.php       11 Oct 2010 13:14:29 -0000
    @@ -90,13 +90,13 @@
    
     <?php if ('2' == $_POST['step']) : ?>
     <?php
    -$posts = $wpdb->get_results("SELECT post_content FROM $wpdb->posts WHERE post_content LIKE ('%<img%')");
    +$posts = $wpdb->get_results("SELECT post_content FROM $wpdb->posts WHERE post_content REGEXP ('<(img|a) ')");
    
     if ( !$posts )
            die(__("No posts with images were found.", "cache-images") );
    
     foreach ($posts as $post) :
    -       preg_match_all('|<img.*?src=[\'"](.*?)[\'"].*?>|i', $post->post_content, $matches);
    +       preg_match_all('#<(?:a .*?href|img .*?src)=[\'"](.*?\.(?:jpg|png|gif))[\'"].*?>#i', $post->post_content, $matches);
    
            foreach ($matches[1] as $url) :
                    $url = parse_url($url);
    @@ -212,12 +212,12 @@
            $domain = $_POST["domain"];
    
            if ($action == "getlist") {
    -               $postid_list = $wpdb->get_results("SELECT DISTINCT ID FROM $wpdb->posts WHERE post_content LIKE ('%<img%') AND post_content LIKE ('%$domain%')");
    +               $postid_list = $wpdb->get_results("SELECT DISTINCT ID FROM $wpdb->posts WHERE post_content REGEXP ('<(img|a) ') AND post_content LIKE ('%$domain%')");
    
                    foreach ( $postid_list as $v ) {
                            $postid = $v->ID;
                            $post = $wpdb->get_results("SELECT post_content FROM $wpdb->posts WHERE ID = '$postid'");
    -                       preg_match_all('|<img.*?src=[\'"](.*?)[\'"].*?>|i', $post[0]->post_content, $matches);
    +                       preg_match_all('#<(?:a .*?href|img .*?src)=[\'"](.*?\.(?:jpg|png|gif))[\'"].*?>#i', $post[0]->post_content, $matches);
                            foreach ( $matches[1] as $url ) :
                                    if ( strstr( $url, get_option('siteurl') . '/' . get_option('upload_path') ) || !strstr( $url, $domain) || (($res) && in_multi_array($url, $res)))
                                            continue; // Already local
    @@ -239,7 +239,7 @@
                            $my_body = wp_remote_retrieve_body($response);
    
                            if (strpos($my_body, 'src')) {
    -                               preg_match_all('|<img.*?src=[\'"](.*?)[\'"].*?>|i', $my_body, $matches);
    +                               preg_match_all('#<(?:a .*?href|img .*?src)=[\'"](.*?\.(?:jpg|png|gif))[\'"].*?>#i', $my_body, $matches);
                                    foreach ( $matches[1] as $url ) :
                                            $spisak = $url;
                                    endforeach;
    @@ -252,7 +252,7 @@
                    $upload = media_sideload_image($url, $postid);
    
                    if ( !is_wp_error($upload) ) {
    -                       preg_match_all('|<img.*?src=[\'"](.*?)[\'"].*?>|i', $upload, $locals);
    +                       preg_match_all('#<(?:a .*?href|img .*?src)=[\'"](.*?\.(?:jpg|png|gif))[\'"].*?>#i', $upload, $locals);
                            foreach ( $locals[1] as $newurl ) :
                                    $wpdb->query("UPDATE $wpdb->posts SET post_content = REPLACE(post_content, '$orig_url', '$newurl');");
                            endforeach;

    Patch is copyright Ronny Adsetts and comes with no warranty. The plugin doesn’t appear to have any licence information but I assume it’s GPL compatible.

    It works for me but YMMV.

    Regards,
    Ronny

    http://wordpress.org/extend/plugins/cache-images/
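As a rough illustration of what the patch changes, the sketch below runs the plugin's original pattern and the patched pattern (both copied from the diff above) against a single linked thumbnail. The sample markup and the example.com URLs are made up for the demonstration and are not taken from a real Blogger export.

```php
<?php
// Hypothetical sample: a thumbnail wrapped in a link to the full-size image.
$html = '<a href="http://example.com/s1600/pic.jpg">'
      . '<img src="http://example.com/s400/pic.jpg" /></a>';

// Pattern used by the v2.0 plugin (the "-" lines in the diff).
$original = '|<img.*?src=[\'"](.*?)[\'"].*?>|i';
// Pattern introduced by the patch (the "+" lines in the diff).
$patched = '#<(?:a .*?href|img .*?src)=[\'"](.*?\.(?:jpg|png|gif))[\'"].*?>#i';

preg_match_all($original, $html, $original_matches);
preg_match_all($patched, $html, $patched_matches);

// Only the thumbnail URL from the <img> tag.
print_r($original_matches[1]);
// The full-size URL from the <a> tag, followed by the thumbnail URL.
print_r($patched_matches[1]);
```

Note that the patched pattern only accepts URLs ending in .jpg, .png or .gif, so an image served as .jpeg, for example, would be skipped.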

Viewing 4 replies - 1 through 4 (of 4 total)
  • Thread Starter Ronny Adsetts

    (@ronnyadsetts)

    The preg_match_all() regex in the patch above is not quite right: the .*? parts can run past the end of a tag, so it sometimes captures too much. I’ll post a better patch once I’ve finished testing, but replacing the 4 regexes with #<(?:a[^>]+href|img[^>]+src)=[\'"]([^>]+\.(?:jpg|png|gif))[\'"][^>]+>#i should do it for now.
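The over-matching can be reproduced with a small sketch. The two patterns are copied from this thread; the sample markup, the example.com URLs and the third pattern ($tail_optional) are assumptions added for illustration and are not part of the patch.

```php
<?php
// Hypothetical sample: an ordinary page link, then a linked thumbnail.
$html = '<a href="http://example.com/page.html">text</a> '
      . '<a href="http://example.com/s1600/pic.jpg" title="pic">'
      . '<img src="http://example.com/s400/pic.jpg" alt="pic" /></a>';

// Pattern from the original patch: .*? may cross tag boundaries.
$lazy = '#<(?:a .*?href|img .*?src)=[\'"](.*?\.(?:jpg|png|gif))[\'"].*?>#i';
// Replacement pattern from this reply: [^>]+ cannot leave the current tag.
$bounded = '#<(?:a[^>]+href|img[^>]+src)=[\'"]([^>]+\.(?:jpg|png|gif))[\'"][^>]+>#i';

preg_match_all($lazy, $html, $lazy_matches);
preg_match_all($bounded, $html, $bounded_matches);

// The first capture starts at page.html and runs on into the next <a> tag.
print_r($lazy_matches[1]);
// Two clean URLs: the s1600 link target and the s400 thumbnail.
print_r($bounded_matches[1]);

// Caveat: the trailing [^>]+ needs at least one character between the
// closing quote and ">", so a tag that ends right after the URL is skipped.
$bare = '<a href="http://example.com/s1600/last.jpg">full size</a>';
$bare_count = preg_match_all($bounded, $bare, $bare_matches);

// Assumed variant (not from the thread): [^>]* makes that tail optional.
$tail_optional = '#<(?:a[^>]+href|img[^>]+src)=[\'"]([^>]+\.(?:jpg|png|gif))[\'"][^>]*>#i';
$variant_count = preg_match_all($tail_optional, $bare, $variant_matches);

var_dump($bare_count, $variant_count);
```

Whether the trailing [^>]+ matters depends on the markup being imported: if the link tags always carry another attribute after href, the replacement pattern as posted finds them.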

    Plugin Author Milan Dinić

    (@dimadin)

    Hey, Ronny, thank you for your work! I will need to rewrite the plugin to enable this option, and also to enable automatic caching.

    Since it needs some more work, I can’t release an update immediately, but I will in a reasonable time.

    Thread Starter Ronny Adsetts

    (@ronnyadsetts)

    Hey dimadin,

    Glad it’s appreciated and sorry I couldn’t spend more time on it. If you need testers for a new version of the plugin, feel free to let me know.

    Ronny

    Plugin Author Milan Dinić

    (@dimadin)

    Ronny, try the development version; I added an option to cache linked images. Please test it to see if there are any bugs.

  • The topic ‘[Plugin: Cache Images] [patch] Importing linked images from Blogger’ is closed to new replies.