WordPress.org

[Resolved] Switched Blogger to WP, now Google can't find old Posts

  • I am 61 and rather un-tech-savvy to say the least. Here goes.

    I switched from Blogger to WP last week. Now I’m worried it was a huge mistake. I’ve got the site up, and it looks fine, at http://www.spirituality-and-religion.com . Here’s the problem. I’ve been blogging for years and have hundreds of old posts. People google topics and my posts often come up, and I was getting lots of traffic on Blogger. Now I have none. If I google, for instance, “Plato Ship Cort” (Cort is my name) one of the top listings is an old post of mine called “Plato’s Parable of the Ship”. But when I click the listing, I get a “PAGE NOT FOUND” notice. So no one finds my info anymore!

    Bluehost Tech support says it’s a Google issue and they can’t help me. They did suggest I go to Google Webmaster Tools and “Request a Reconsideration”. I did this, though (1) the site says that a ‘reconsideration’ is about getting them to let you back online after they’ve punished you for breaking their guidelines (I didn’t, that’s not the issue), and (2) anyway they say it can take weeks before they act on my request. Meanwhile, my site is essentially useless!

    However. When I got to the Webmaster Tools page they had a Message waiting for me! Here’s what it says (I’ve copied their entire message in bold):

    http://www.spirituality-and-religion.com/: Googlebot can’t access your site
    August 16, 2012

    Over the last 24 hours, Googlebot encountered 59 errors while attempting to access your robots.txt. To ensure that we didn’t crawl any pages listed in that file, we postponed our crawl. Your site’s overall robots.txt error rate is 100.0%.

    You can see more details about these errors in Webmaster Tools.

    Recommended action
    If the site error rate is 100%:

    Using a web browser, attempt to access http://www.spirituality-and-religion.com//robots.txt. If you are able to access it from your browser, then your site may be configured to deny access to googlebot. Check the configuration of your firewall and site to ensure that you are not denying access to googlebot.
    If your robots.txt is a static page, verify that your web service has proper permissions to access the file.
    If your robots.txt is dynamically generated, verify that the scripts that generate the robots.txt are properly configured and have permission to run. Check the logs for your website to see if your scripts are failing, and if so attempt to diagnose the cause of the failure.

    So I went to http://www.spirituality-and-religion.com//robots.txt, and this is what it says:

    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-includes/

    Please understand, I haven’t a clue what they’re talking about, and can’t believe I can’t call a tech number for WP support or Google support, and that Bluehost can’t help me after all the money I’ve sent them for 3 years of hosting.

    I have a bit more info. A page on google tells me this:

    Blocked URLs

    If your site has content you don’t want Google or other search engines to access, use a robots.txt file to specify how search engines should crawl your site’s content.

    Check to see that your robots.txt is working as expected. (Any changes you make to the robots.txt content below will not be saved.)
    robots.txt file: http://www.spirituality-and-religion.com/robots.txt
    Blocked URLs: 2,442
    Downloaded: Aug 13, 2012
    Status: 200 (Success)

    http://www.spirituality-and-religion.com/robots.txt content – edit to test changes
    User-agent: *
    Disallow: /~spiriuj3/wp-admin/
    Disallow: /~spiriuj3/wp-includes/

    URLs Specify the URLs and user-agents to test against.
    http://www.spirituality-and-religion.com/

    I don’t know why this robots.txt thing is there. I don’t want to block Google from finding my posts. I’m thinking that maybe WP put this there to block people from finding my EDITING info, but somehow it has blocked everything!

    Can anyone help me??????

Viewing 15 replies - 1 through 15 (of 25 total)
  • Moving from one blog program to another probably did put the links to your posts in a new place, so it’s good you’re having Google re-scan your site.
    For that other error, make sure you have this chosen: Dashboard >> Settings >> Privacy >> Allow search engines to index this site
    If that is not the problem, look at all your Plugins to see if any of them have a Privacy setting – or you could just turn your Plugins off and see if that fixes it.

    Thanks,
    Maybe the eventual google scan will fix it. My general settings do allow search engines to index the site. I just have a few plugins. How/where do I determine if they have a Privacy setting?

    It depends on the plugin; each one can be different. You can either look around in its settings, or check the official website for each plugin for help on it.

    Your robots.txt is just fine 🙂
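
For anyone who wants to double-check that, here is a minimal sketch in Python using the standard library's urllib.robotparser. It feeds in the three robots.txt lines quoted above and asks whether Googlebot may fetch a post URL. It only checks the rules themselves; it does not test whether the server actually delivers the file to Googlebot, which is what the error-rate message is about. The names ROBOTS_TXT, SITE and is_allowed are made up for this illustration.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content quoted earlier in this thread.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""

SITE = "http://www.spirituality-and-religion.com"


def is_allowed(url, agent="Googlebot"):
    """Return True if the rules above let `agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


post_ok = is_allowed(SITE + "/2011/10/platos-parable-of-the-ship/")
admin_ok = is_allowed(SITE + "/wp-admin/")
print("post allowed:", post_ok)
print("wp-admin allowed:", admin_ok)
```

If the sketch is right, only the admin area is off limits and ordinary posts stay open to crawlers.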

    If I google, for instance, “Plato Ship Cort” (Cort is my name) one of the top listings is an old post of mine called “Plato’s Parable of the Ship”. But when I click the listing, I get a “PAGE NOT FOUND” notice. So no one finds my info anymore!

    Okay. That’s something we can fix!

    That URL is http://www.spirituality-and-religion.com/2011/10/platos-parable-of-ship.html

    And it should be …

    http://www.spirituality-and-religion.com/2011/10/platos-parable-of-the-ship/

    The easiest thing to do is go into that post on WP and edit the slug so it’s platos-parable-of-ship (notice the missing “the”)

    Then put this rule above the #Begin WordPress call in your .htaccess:

    RewriteRule (.+)\.html$ /$1/ [L,R]

    This will take all the .html URLs and redirect them to the same-named WP ones.
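
To see what that rule does to an old Blogger address, here is a small Python sketch that applies the same pattern and substitution to a path. It is only a simulation of the matching step, not of Apache itself, and it assumes the rule sees the path without its leading slash, as is usual inside .htaccess. The function name rewrite is made up for this illustration.

```python
import re

# Same pattern as in the RewriteRule: anything ending in ".html".
PATTERN = re.compile(r"(.+)\.html$")


def rewrite(path):
    """Return the redirect target for `path`, or None if the rule does not match."""
    match = PATTERN.search(path)
    if match is None:
        return None
    # Same substitution as "/$1/": keep the captured part, add slashes.
    return "/" + match.group(1) + "/"


old_path = "2011/10/platos-parable-of-ship.html"
print(rewrite(old_path))
print(rewrite("2011/10/platos-parable-of-the-ship/"))
```

The redirect target keeps the old slug word for word, which is why the slug on the WP side has to match the Blogger one.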

    I really appreciate this. It sounds like the solution. I just need a bit more help if you can. I really don’t know what this all means:

    put this RewriteRule (.+)\.html$ /$1/ [L,R] above the #Begin WordPress call in your .htaccess

    Could you tell me where to go, how to get there, and (well, maybe I can figure out what to do once I’m there, but…) how to do it?

    You have to edit the file .htaccess on your server.

    Do you know how to edit the files on your server?

    Not a clue I’m afraid. My host is Bluehost, and I did find my cPanel which includes a section called “Files” and includes these icons:

    Site Backup & Restore
    File Manager
    Legacy File Manager
    Web Disk
    Disk Space Usage
    File Count
    FTP Accounts
    FTP Session Control
    Website Movers
    Unlimited FTP

    Do I go somewhere in here? (I don’t see anything about an htfile, though I don’t know what an htfile is) (I clicked the one called File Manager, but didn’t see any htfile, and nothing in there was clickable anyway)

    I did it! I actually did it (I think). I found my .htaccess file, I found #Begin WordPress , and there was a space above it where I pasted in RewriteRule (.+)\.html$ /$1/ [L,R] . Then I pressed ‘save’ and it said ‘Success!’

    Does this take time to take effect? ’Cause I searched for my Plato blog post, and it still says “Page not found”

    My website had this same problem. I found it easiest to re-submit my sitemap to Google. Here are the steps that Google suggests:

    Submitting your Google Site’s sitemap to Google Webmaster Tools

    On your Webmaster Tools home page, select your site.
    In the left sidebar, click Site configuration and then Sitemaps.
    Click the Add/Test Sitemap button in the top right.
    Enter /system/feeds/sitemap into the text box that appears.
    Click Submit Sitemap.

    Ok. Done this too. Here’s hoping.
    Thanks.
    Any other suggestions? I’ll try all possibilities.

    Well, it’s close to 11 pm on Tuesday, 8/21, and I just received the following email from Google, which seems to suggest that all my efforts so far have failed. PLEASE keep trying to help me find a solution. I really appreciate your help. This message might as well be in Swahili or Martian for all it means to me.

    Over the last 24 hours, Googlebot encountered 359 errors while attempting to access your robots.txt. To ensure that we didn’t crawl any pages listed in that file, we postponed our crawl. Your site’s overall robots.txt error rate is 47.3%. You can see more details about these errors in Webmaster Tools.

    Recommended action
    If the site error rate is 100%:

    Using a web browser, attempt to access http://www.spirituality-and-religion.com//robots.txt. If you are able to access it from your browser, then your site may be configured to deny access to googlebot. Check the configuration of your firewall and site to ensure that you are not denying access to googlebot.
    If your robots.txt is a static page, verify that your web service has proper permissions to access the file.
    If your robots.txt is dynamically generated, verify that the scripts that generate the robots.txt are properly configured and have permission to run. Check the logs for your website to see if your scripts are failing, and if so attempt to diagnose the cause of the failure.

    If the site error rate is less than 100%:

    Using Webmaster Tools, find a day with a high error rate and examine the logs for your web server for that day. Look for errors accessing robots.txt in the logs for that day and fix the causes of those errors.
    The most likely explanation is that your site is overloaded. Contact your hosting provider and discuss reconfiguring your web server or adding more resources to your website.

    After you think you’ve fixed the problem, use Fetch as Google to fetch http://www.spirituality-and-religion.com//robots.txt to verify that Googlebot can properly access your site.

    Actually, I did just notice that a few days ago their message said my error rate was 100%. Now it’s only 47.3%. So maybe I AM making progress.

    Nitun

    @nitunlanjewar

    Hello Cort,
    As you imported content from blogger.com to a WordPress blog, there are small differences in the post slug URLs (English articles such as ‘the’, ‘a’, ‘an’, etc.), as Ipstenu (Mika Epstein) mentioned above. WordPress creates a post slug from the post title, while blogger.com skips those English articles.
    Hence, to fix such post slug issues in bulk for all the imported posts, you just need to run the fix.php file mentioned in this Blogger to WordPress DIY tutorial under the section “Fixing permalinks for imported post“.
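
To illustrate the slug difference described above, here is a rough Python sketch. It is not the real slug code of either WordPress or blogger.com, and it is not the fix.php script from the tutorial; it is only a simplified guess that lowercases the title, drops apostrophes, and optionally skips the articles. The names slug and ARTICLES are made up for this illustration.

```python
import re

# English articles that, per this thread, the Blogger slug leaves out.
ARTICLES = {"a", "an", "the"}


def slug(title, drop_articles=False):
    """Build a rough slug: lowercase words joined by hyphens."""
    # Remove straight and curly apostrophes so "Plato's" becomes "platos".
    text = title.lower().replace("'", "").replace("\u2019", "")
    words = re.findall(r"[a-z0-9]+", text)
    if drop_articles:
        words = [w for w in words if w not in ARTICLES]
    return "-".join(words)


title = "Plato\u2019s Parable of the Ship"
wp_style = slug(title)
blogger_style = slug(title, drop_articles=True)
print(wp_style)
print(blogger_style)
```

The two results differ only by the missing article, the same difference as between the two URLs Ipstenu posted.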

    Another thing: I found that a few other links are not redirecting properly:
    1. http://www.spirituality-and-religion.com/2011_08_01_archive.html
    2. http://www.spirituality-and-religion.com/search/label/crucifixion
    3. http://www.spirituality-and-religion.com/feeds/posts/default (very important)

    I suggest adding some .htaccess rules (from the same DIY tutorial) that will redirect the old Blogger archives and labels to the corresponding WordPress archives and categories.

    Hope this will help you.
    Thanks,
    –Nitun

    Good grief. It looks like this will do it. But I can’t make heads or tails of this tutorial. Maybe after a night’s sleep.

    Nitun

    @nitunlanjewar

    If you used the default WordPress importer tool to import your blogger.com content, then that DIY tutorial will definitely help you.
