WordPress.org

Blogger Importer
[resolved] Skipped images, some errors etc (22 posts)

  1. Lyk
    Member
    Posted 4 months ago #

    Hello!
    Here are my questions/points:

    1a) I also have 4 comments skipped and again I do not know which ones. The only related entries in the log are some lines like:

    PHP Notice: Undefined index: fragment in /var/www/vhosts/fashionhasit.com/httpdocs/wordpress/wp-content/plugins/blogger-importer/comment-entry.php on line 23

    Other than that, there are also some entries

    Error saving blogger status

    followed by a printout of some array with what seems to be the current status.

    1b) Some images are skipped, but I cannot see which ones. I have enabled debug output in the log files.
    Again some lines like:

    PHP Notice: The data could not be converted to UTF-8. You MUST have either the iconv or mbstring extension installed. Upgrading to PHP 5.x (which includes iconv) is highly recommended. in /path/wp-includes/class-simplepie.php on line 1446
    [] PHP Fatal error: Maximum execution time of 60 seconds exceeded in /path/wp-includes/class-wp-image-editor-gd.php on line 176
    [date] Error saving blogger status
    [date] array (....

    Checking with phpinfo() I can see that the hosting has php v5.3.28 with iconv enabled.

    2) I have to reload the page and hit the import button quite a few times, since it stops every 50-100 images. The strange thing however is the following:
    When it stops, the topmost image in the media library (I guess the last one imported) has no thumbnail icon! If I open the picture I can see it normally, except for one thing: the sidebar that shows filename, type, size and dimensions is NOT showing the dimensions, i.e. the line "Dimensions: smth × smth" is missing. However, the image is really there with the expected dimensions. Refreshing the page etc. does not make any difference.

    3) What defines the max execution time I can use? What values are suggested for the globals in the .php file when there are many pictures?

    Background: Importing a blog of decent size (500 posts, 5,000+ images). In the .php file of the plugin I increased the max image size to 1600.

    Thank you!

  2. Workshopshed
    Member
    Plugin Author

    Posted 4 months ago #

    Hi Lyk,
    I've dropped the "Error saving blogger status" code from the beta version as it's not really helping diagnose anything. I believe it occurs when you try to save the same data that's already in the option.
    I recently discovered that blogger is sending comments that are not linked to any posts, i.e. the post was deleted. I've also suppressed that message in the latest beta. I'll try to set up some data like that in my test blog so it's correctly handled; it would be good to log failures too.
    I have seen the "The data could not be converted to UTF-8" error on my own data but I've yet to track down why it occurs. However I do agree that the message about iconv is misleading.
    I think the error "PHP Fatal error: Maximum execution time of 60 seconds exceeded in /path/wp-includes/class-wp-image-editor-gd.php on line 176" is related to the fact that you need to keep pressing the button. I also think it is related to your large images, as line 176 points to the function "imagecopyresampled"; it might also be that you have a lot of images per post. It should be possible to implement a better timeout mechanism for image processing. In the meantime I would suggest editing blogger-importer-blog.php. Find the function "process_images"; on line 529 you should have $batchsize = 20;
    Reduce this to a smaller value and that should reduce the occurrence of your timeout issues. It is possible to increase the timeout, but I don't think that is a good approach and it might just give you more issues.

  3. Lyk
    Member
    Posted 4 months ago #

    Hello and thanks for your answer!

    Yes, the images are large, but in most cases there are 5-7 per post.
    I will try changing the batchsize to 5 and see how it works. (I will also decrease the max image size.)

    Would it also help to change any of these?

    const MAX_RESULTS = 25; // How many records per GData query (int)
    const MAX_EXECUTION_TIME = 20;// How many seconds to let the script run (int)

    What worries me most is what I mentioned in 2: the dimensions not showing (and the missing thumbnail in the media library). Why might that be? It is strange because it could indicate that something is not imported correctly or is corrupted.

    Thanks again!

    PS: will the beta version be released soon for general use?

    PS2: How does the image import mechanism work? I am wondering if it appears to WordPress like a normal upload... Basically, I am wondering if other plugins that do something during the upload process (e.g. send images to S3 or resize them etc.) will work. Any insight on that?

  4. Workshopshed
    Member
    Plugin Author

    Posted 4 months ago #

    Those 2 settings currently only affect the Posts and Comments imports. My thought is that MAX_EXECUTION_TIME should be extended to cover images too (or perhaps the importer could read the system setting?).

    Those images that don't have a thumbnail are probably corrupt; the error indicates that processing stopped halfway through resizing the images.
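
    As a rough illustration of the "read the system setting" idea, here is a minimal sketch. It is not the plugin's code: time_budget_seconds and process_with_budget are hypothetical names, and the 20-second fallback and 0.8 safety factor are assumptions. Only ini_get('max_execution_time') and microtime() are standard PHP.

```php
<?php
// Hypothetical helper: derive a working budget from PHP's own limit.
// ini_get('max_execution_time') yields "0" when there is no limit (e.g. on the CLI),
// so an assumed fallback is used in that case.
function time_budget_seconds(int $fallback = 20, float $safety = 0.8): float
{
    $limit = (int) ini_get('max_execution_time');
    return $limit > 0 ? $limit * $safety : (float) $fallback;
}

// Hypothetical loop: handle items until the budget is spent, then return how
// many were handled so that the caller can resume from that point later.
function process_with_budget(array $items, callable $handle, float $budget): int
{
    $start = microtime(true);
    $done = 0;
    foreach ($items as $item) {
        if (microtime(true) - $start >= $budget) {
            break; // out of time: stop cleanly before starting another item
        }
        $handle($item);
        $done++;
    }
    return $done;
}
```

    The check only runs between items, so a single very slow resize could still run past the limit; the sketch just avoids starting new work when time is nearly up.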

    The way the import mechanism works is that it downloads all of the posts. It then runs through those in batches looking for images of the form:

    <a href="xxx"><img src="yyy"></a>
    or
    <img src="yyy">

    For each image

    1. Downloads the biggest image size.
    2. Uses the big image to re-generate the smaller sizes based on your WordPress settings; it is this step that times out in your case.
    3. Creates an "attachment" record in the posts table.
    4. Updates the HTML in the post to point at the new images.
    5. If this is the first image, sets it as the "featured" image.

    This is very similar to the way the WordPress editor works.
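
    To make the two image forms above concrete, here is a minimal sketch of how such a scan could look. It is not the importer's actual code: find_post_images is a hypothetical name and the regular expression is an assumption that only covers simple, quoted attributes.

```php
<?php
// Hypothetical helper: list the images in a post's HTML, together with the
// link target when the image is directly wrapped in <a href="...">...</a>.
function find_post_images(string $html): array
{
    $pattern = '~(?:<a\b[^>]*\shref=["\']([^"\']*)["\'][^>]*>\s*)?'
             . '<img\b[^>]*\ssrc=["\']([^"\']*)["\'][^>]*>~i';
    preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

    $images = [];
    foreach ($matches as $m) {
        $images[] = [
            'href' => $m[1] !== '' ? $m[1] : null, // null when the image is not linked
            'src'  => $m[2],
        ];
    }
    return $images;
}
```

    A regular expression like this is fragile on unusual markup (unquoted attributes, text between the link and the image), so treat it as an illustration of the matching step only.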

    If you want to mention specific plugins, I will see how those interact.

    You can download the beta from here
    http://wordpress.org/plugins/blogger-importer/developers/

    I plan to follow the WordPress core team mentality and release smaller fixes more often. It would be nice to get these image and comment issues fixed before the next release.

  5. Lyk
    Member
    Posted 4 months ago #

    Hello, again :)

    So, right now I am importing with 1024 image size and $batchsize 4 and it seems much better. Already half way through with not a single stop.

    The smaller batchsize seems to help, unless the improvement comes from the smaller image size. Now I use 1024, but before I was using 1600.

    The strange thing is that the "corrupted" images seem quite fine, except for the things I mentioned. That is, I can see them in full size and they appear in the posts. But anyway, this only happens at each stop (where I had to refresh and click import again). So if there are no stops, all the images are fine.

    As for the specific plugin I am thinking of this one
    http://wordpress.org/plugins/amazon-s3-and-cloudfront/
    so that I can have the images only in S3 and not on the hosting shared server.

    Thank you for your time again, much appreciated :)

  6. Lyk
    Member
    Posted 4 months ago #

    So the importation finished without any stops this time. This is really good since I didn't have to click every few mins & there must be no corrupted images.

    However, I noticed a strange issue:

    For some of the posts, lower resolution images are used e.g.
    a 300px × 200px image is used in an <img> that looks like:

    <img src="http://url/image_name-300x200.jpg" height="426" width="640">

    Of course this results in a visibly ugly image in the post.
    The bigger resolution image exists and in fact it is even used as the link when one clicks the crappy image:

    <a href="url/image_name.jpg"> <img src="http://url/image_name-300x200.jpg" height="426" width="640"> </a>

    I guess this has to do with Step 2 that you mention above:

    Use big image to re-generate the smaller sizes based on your wordpress settings

    Which settings is this based on? The ones I see in wp-admin/options-media.php?

  7. Workshopshed
    Member
    Plugin Author

    Posted 4 months ago #

    Looking at the S3 plugin you mentioned, it will upload the files to S3 but I'm not sure that it's going to handle the re-mapping of the URLs correctly.

    Regarding the size issue you mention, the importer is designed to standardise all of the image sizes. The find and replace for the images assumes that the small image is always going to be the "medium" size as defined by the media settings in WordPress. The image processing is not clever enough to read the height and width settings on the tag and pick the right size to swap in. It would be good to pick the right size or, alternatively, strip out the height and width settings.
    I don't believe that the blogger post editor adds those settings, so they've probably been manually added.

  8. Lyk
    Member
    Posted 4 months ago #

    Hm I guess I will have to try that. But thanks!

    As for the size I am not sure that this is the issue. And here is why:
    * In the same post in blogger there are 2 pictures 1066 x 1600
    * They are both <img width="426" height="640" ...
    * In the wp they are both:

    <img width="426" border="0" height="640" src="url/pic.jpg"></img>

    However, one of them is actually using pic-199x300.jpg as source for the img.

    By the way, the href for the problematic one in blogger is something like https://images-blogger-opensocial.googleusercontent.com/gadgets/proxy?url=http.
    I do not know if that makes any difference but let's notice some things:
    1) the problem also appears with other images where the blogger href is fine.
    2) the problematic href does not work even in blogger, i.e. if I click it the image is "recognized" as .txt. (If I open the file, it really is the image though. The MIME type or extension seems to be wrong for some reason.)
    3) Even though the img tag points to the thumbnail of the image, the problematic href is preserved in WP too. I guess this is normal since it is not recognized as an image.

    But anyway, this seems like a separate problem and, as I mentioned, the resolution issue happened with other images too, which have correct hrefs.

    Thank you again :)

  9. Lyk
    Member
    Posted 4 months ago #

    Any suggestion on how to overcome this?

    One more piece of feedback: after the importation finished successfully I applied a theme (non-free) and things went really messy. Things like: on the first page, the 3rd post was inside the 2nd and the 4th inside the 3rd, and then a mess.

    Not sure if this is a problem with the theme since I had no issues with that theme before and I am also using it.

    Could it be some extra/missing tags inside the post or...?

  10. Lyk
    Member
    Posted 4 months ago #

    Did some more digging and it seems that it just happens that some of the last posts are "corrupted" (I guess having unclosed tags or something?) and this explains the problem I mentioned in the post above.

    I am saying this because after deleting some of these posts everything seems fine.

    Will try to check manually and be more specific.

  11. Workshopshed
    Member
    Plugin Author

    Posted 4 months ago #

    Have a look at the posts in the visual editor to see if they are correctly laid out, or paste the content into an HTML validator.

    Regarding the images, I've seen blogger link to an HTML file as an image before but not a txt; that's just weird and I'm not sure it's sensible to try and handle that case.

    In the short term I think your best option for the resize is to look for one of those find and replace scripts or plugins to make the appropriate changes to the image tags. As mentioned, the image processing works on the assumption that you are changing your images to be the "medium" size and hence replaces the SRC with that file. To be consistent I suppose it should strip out the height and width attributes, but that does not really help you if you want the large size. To make it detect the sizes and do something more sophisticated with them would be a challenge.
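
    For illustration, a find and replace of that kind could look like the minimal sketch below. strip_img_dimensions is a hypothetical name, not part of the importer or of WordPress, and the patterns are assumptions that only cover ordinary img tags.

```php
<?php
// Hypothetical helper: drop width=... and height=... from <img> tags only,
// leaving every other tag and attribute untouched.
function strip_img_dimensions(string $html): string
{
    $result = preg_replace_callback(
        '~<img\b[^>]*>~i',
        function (array $m): string {
            $attr = '~\s+(?:width|height)\s*=\s*(?:"[^"]*"|\'[^\']*\'|[^\s>]+)~i';
            return preg_replace($attr, '', $m[0]) ?? $m[0];
        },
        $html
    );
    return $result ?? $html; // fall back to the input if the regex engine fails
}
```

    With the attributes gone, the browser shows the medium file at its natural size instead of stretching it; it does not bring back the large version.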

  12. Lyk
    Member
    Posted 4 months ago #

    For the posts breaking the theme: yes, this is what I will try to do. I will report back with the findings in case something is useful.

    For the images: yes, I guess that is more of a blogger problem and not really related to the plugin.

    So if I change the medium size in Settings->Media to something bigger it should work?

    Right now I am trying the dev version to see how it behaves.

    Thanks for your answer!

  13. Workshopshed
    Member
    Plugin Author

    Posted 4 months ago #

    Yes, changing the medium size to a larger size should result in the larger pictures. You will likely need to delete all the posts and images and re-import for that to work.

    The current fixes in the dev release are mostly in the area of comments.

  14. Lyk
    Member
    Posted 4 months ago #

    Ok, will try that too.
    Yes, I have imported quite a few times and I clean everything before a new import.

    Small note: with the dev version, a max image size of 1600px in the plugin's options (.php file) and a batch size of 2, it still stopped after approx. 300-400 images.
    However, the last imported images seem to be fine and appear correctly in the media library.

    PS: I now continue the importation with batch size of 1. I guess it does not matter that I changed the batch size in the same importation session (but of course after it had stopped). Is there any other way to make it stop less often?

  15. Workshopshed
    Member
    Plugin Author

    Posted 4 months ago #

    Assuming it's the same issue with the timeouts. You could try the following:

    in function process_images()

    Change

    foreach ($loadedposts as $importedpost)
    {
        $importedcontent = $importedpost->post_content;

    to

    foreach ($loadedposts as $importedpost)
    {
        set_time_limit(120); // restart PHP's execution timer for each post
        $importedcontent = $importedpost->post_content;

    This will increase the timeout from 60 seconds per batch to 120 seconds per post. That should mean you can change the batch setting back to, say, 20 and hence process more images per batch.

  16. Lyk
    Member
    Posted 4 months ago #

    It is not a huge problem since every image is fine, but will try it since (I guess) it does not have any side-effects.

    One more question: How can I cancel the "Set the first image as featured" step? I have to admit that I have not checked the code, but I thought I would ask first. The reason I want to skip it is that most posts start with the image, and the result is the same image twice at the beginning of each post.

    Thanks again

  17. Lyk
    Member
    Posted 4 months ago #

    Update: I narrowed it down to 3 recent posts that were causing issues, and in the end it seems that the mess was caused by some inline styles like:
    margin-left: auto; margin-right: auto;

    I guess these were there in the blogger blog before the importation and they were just transferred. Since there are just 3 (hopefully there are no others I did not notice) I will correct them by hand if I do not find a relevant plug-in.

    However, it might be a good idea to provide an option during the importation to strip any inline styling etc. from the content. Not sure how easy that is though.

  18. Lyk
    Member
    Posted 4 months ago #

    Update:

    I ended up importing with the following settings:

    • MAX_IMAGE_SIZE 1600
    • batch size for images 1
    • set_time_limit( 180 );
    • Size in Settings->Media: 700 for medium, 1600 for large
    • Using the beta plugin

    There were quite a few stops but without any corruption. Also the problem I mentioned above did not occur this time(!)
    For now everything seems to be ok

  19. Workshopshed
    Member
    Plugin Author

    Posted 4 months ago #

    Hi Lyk,
    To disable the setting of the featured image, find the following lines in the blogger-importer-blog.php file

    if ($imgcount == 0)
    {
        set_post_thumbnail($post_id, $att_id);
    }

    and disable them as follows

    //if ($imgcount == 0)
    //{
    //    set_post_thumbnail($post_id, $att_id);
    //}
  20. Lyk
    Member
    Posted 3 months ago #

    Thank you very much :)

    I have one last question. As you know, every first picture is set as the featured image.
    I noticed the following:
    Say I have 5.img as the first image, which is 1600px × 1066px (512 KB in size). The featured image is an image named 5-1000x600.jpg, which of course is 1000x600 (566 KB in size). Why is that the case?

    Does this have to do with the theme I am using, WordPress itself or the importation from blogger?

    Both images appear scaled down, which is fine, but the smaller one has a bigger file size, which is strange. Is this a result of WordPress's downscaling?

  21. Workshopshed
    Member
    Plugin Author

    Posted 3 months ago #

    I'm not sure why the smaller resolution has resulted in a larger file, but I do know that JPG has some clever compression settings so it might be something to do with that.
    It is the WordPress core that does the resizing; the importer just mimics the same process.

    My settings are:
    Thumbnail 150 x 150
    Medium 300 x 300
    Large 1024 x 1024

    When I upload a file called "Workshop.jpg" which is 800x533 it creates two smaller files Workshop-300x199.jpg and Workshop-150x150.jpg. So in my case the Workshop-300x199.jpg would be the one attached to the post.
    Does that help explain what is going on?
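
    The arithmetic behind a name like Workshop-300x199.jpg can be sketched as below. This is an illustration only, not the WordPress core code: fit_within is a hypothetical name, and truncating the fractional pixel is an assumption chosen because it reproduces the file names quoted in this thread. The 150 x 150 thumbnail is a crop, which this sketch does not cover.

```php
<?php
// Hypothetical helper: scale (w, h) to fit inside (maxW, maxH), keeping the
// aspect ratio and never enlarging. Fractions are truncated (an assumption).
function fit_within(int $w, int $h, int $maxW, int $maxH): array
{
    if ($w <= $maxW && $h <= $maxH) {
        return [$w, $h]; // already small enough, nothing to resize
    }
    if ($w * $maxH >= $h * $maxW) {
        // Width is the limiting side.
        return [$maxW, max(1, intdiv($h * $maxW, $w))];
    }
    // Height is the limiting side.
    return [max(1, intdiv($w * $maxH, $h)), $maxH];
}

// Medium is 300 x 300 in the settings above.
print_r(fit_within(800, 533, 300, 300));   // the 800x533 Workshop.jpg example
print_r(fit_within(1066, 1600, 300, 300)); // the 1066x1600 portrait image from earlier
```

    The 800x533 upload is smaller than the 1024 x 1024 Large setting, which fits with only two smaller files being created for it.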

  22. Lyk
    Member
    Posted 3 months ago #

    Hello and sorry for the late reply :/

    Yes, I think I understood how it works. As for the size issue, who knows... :P
