robots.txt redux (6 posts)

  1. jbeaumont1
    Member
    Posted 1 year ago #

    OK, OK, I know this has been asked a million times but I must be missing something.

    I have a WordPress site, and when I check Google Webmaster Tools it states "Severe health issues are found on your site". When I select the site, go to Crawl, and then open the robots.txt Tester, it states "Latest version seen on 7/13/14, 1:58 PM OK (200) 26 bytes". It reports that the contents of my robots.txt file are:

    User-agent: *
    Disallow: /

    This is not true.

    First, I have XML Sitemap Generator for WordPress 4.0.7 as my plugin and "Add sitemap URL to the virtual robots.txt file." is NOT checked. I have my own robots.txt file in my Home directory that contains the following:

    User-agent: *
    Disallow: /~kinnicum/wp-admin/
    Disallow: /~kinnicum/wp-includes/

    So, what gives? I've notified Google about my sitemap as generated by the plugin, so why, several days later, does it still think my site has a robots.txt file that blocks search engines from indexing it? I thought that if I had an actual robots.txt file, the virtual robots.txt file that WP creates would be ignored. Where do I find the code for the virtual robots.txt file?
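
    To make the difference between the two rule sets concrete, here is a minimal sketch using Python's standard `urllib.robotparser`. Both rule sets are copied from this post; the helper name `is_allowed` and the choice of "Googlebot" as the user agent are illustrative assumptions, not anything Google or WordPress provides.

```python
from urllib.robotparser import RobotFileParser

# The rules Google Webmaster Tools reported seeing (copied from the post).
REPORTED_RULES = """\
User-agent: *
Disallow: /
"""

# The rules in the hand-written robots.txt file (copied from the post).
INTENDED_RULES = """\
User-agent: *
Disallow: /~kinnicum/wp-admin/
Disallow: /~kinnicum/wp-includes/
"""


def is_allowed(rules: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text.

    `is_allowed` is a hypothetical helper for this illustration only.
    """
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Print the verdict of each rule set for a few sample paths.
    for path in ("/", "/~kinnicum/", "/~kinnicum/wp-admin/"):
        print(path, is_allowed(REPORTED_RULES, path), is_allowed(INTENDED_RULES, path))
```

    Under the reported rules every path is refused, while the hand-written rules refuse only the two WordPress directories. With Unix line endings the reported rule set is also exactly 26 bytes, which matches the size shown in the robots.txt Tester.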

  2. ClaytonJames
    Member
    Posted 1 year ago #

    I can't seem to find a robots.txt file on your site. Just the XML sitemap index. ~kinnicum/robots.txt throws a 404.

    There is one located here, however: http://173.xxx.xx.xx/robots.txt

    ...and it reads as follows:

    User-agent: *
    Disallow: /
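
    The reason the file at the server root matters can be sketched as follows: crawlers request robots.txt from the root of the host, not from the directory a page lives in, so a file under a user directory such as ~kinnicum is never consulted. The function name `robots_txt_url` is a hypothetical helper, and the masked IP address is reproduced exactly as it appears in this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would consult for `page_url`.

    Hypothetical helper: it keeps the scheme and host of the page and
    replaces the path with /robots.txt, dropping any query or fragment.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    # The temporary URL from the thread resolves to the host-level file.
    print(robots_txt_url("http://173.xxx.xx.xx/~kinnicum/"))
    # The site's own domain resolves to a different robots.txt location.
    print(robots_txt_url("http://www.kinnicumfishandgame.org/some/page"))
```

    Any page served under the temporary IP address is therefore governed by the host-level file quoted above, regardless of what sits in the ~kinnicum directory.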

  3. jbeaumont1
    Member
    Posted 1 year ago #

    Um, OK, how the heck do I get there? We use justhost.com to host the site and using File Manager I cannot seem to navigate to the actual IP address. There is a robots.txt in the home directory and it doesn't match the one you found.

  4. ClaytonJames
    Member
    Posted 1 year ago #

    I don't have a clue.

    My first thought was: why does justhost have you using a numeric IP address rather than an FQDN in your URL? My second: why would there still be a tilde in your URL pointing to your user directory? A temporary URL, possibly?

    You might need to ask your host's support group about the robots.txt file and what exactly might be happening with your configuration.

    [edit] ..assuming of course, I've guessed your site location correctly!!

  5. jbeaumont1
    Member
    Posted 1 year ago #

    Well, that is the odd thing. We had a pretty crappy site at:

    kinnicumfishandgame.org

    And then I found WordPress and created an entirely new and better looking site, and when developing it, WordPress devised http://173.xxx.xx.xx/~kinnicum. I followed the WordPress documentation for changing this once the website was deployed, but it did not work, so I ended up finding some WordPress documentation that walked me through moving it from the WordPress directory on my site to the root. So I did, and ended up simply redirecting http://www.kinnicumfishandgame.org to http://173.xxx.xx.xx/~kinnicum.

    Ironically, I am an IT person but this website stuff is fairly new to me and it is confusing as hell. :-p

  6. jbeaumont1
    Member
    Posted 1 year ago #

    Never mind, I think I figured it out. I went into Settings > General and changed both the WordPress Address (URL) and the Site Address (URL) from the temp IP address to http://www.kinnicumfishandgame.org. This was the original URL of the old site before I created the WordPress site. I moved the WordPress site from its WordPress subdirectory to our root directory a while ago and, as I said, was using the temp IP and redirecting. Obviously I needed to change the settings to the original URL as well as remove the redirect.

    I removed the sites I had registered at Google Webmaster and Bing Webmaster, added the site http://www.kinnicumfishandgame.org to both, and re-verified. Now it is looking better: Google Webmaster and Bing Webmaster both report that there is no robots.txt file (and there is not, which is as it should be), and both are able to index the site.

    Yay!
