robots.txt redux

jbeaumont1
(@jbeaumont1)

9 years, 8 months ago

OK, OK, I know this has been asked a million times but I must be missing something.

I have a WordPress site and when I check Google Webmaster Tools it states “Severe health issues are found on your site”. When I select the site, go to Crawl, and then the robots.txt Tester, it states “Latest version seen on 7/13/14, 1:58 PM OK (200) 26 bytes”. It states that the contents of my robots.txt file are:

User-agent: *
Disallow: /

This is not true.

First, I have XML Sitemap Generator for WordPress 4.0.7 as my plugin and “Add sitemap URL to the virtual robots.txt file.” is NOT checked. I have my own robots.txt file in my Home directory that contains the following:

User-agent: *
Disallow: /~kinnicum/wp-admin/
Disallow: /~kinnicum/wp-includes/

So, what gives? I’ve notified Google about my sitemap as generated by the plugin, so why has it been several days and it still thinks my site contains a robots.txt file that is blocking search engines from indexing my site? I thought that if I had an actual robots.txt file, the virtual robots.txt file that WP creates would be ignored. Where do I locate the code for the virtual robots.txt file?
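For anyone comparing the two rule sets above, here is a minimal sketch using Python's standard urllib.robotparser. It only shows how differently the two files treat the same paths. The helper name `allowed` and the "Googlebot" agent string are illustrative choices, and the paths are the ones quoted in this post.

```python
from urllib.robotparser import RobotFileParser


def allowed(robots_txt: str, path: str, agent: str = "Googlebot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


# The rules Google Webmaster Tools reports seeing.
REPORTED = "User-agent: *\nDisallow: /\n"

# The rules in the file in the home directory.
HOME_DIR = (
    "User-agent: *\n"
    "Disallow: /~kinnicum/wp-admin/\n"
    "Disallow: /~kinnicum/wp-includes/\n"
)

if __name__ == "__main__":
    for path in ("/~kinnicum/", "/~kinnicum/wp-admin/"):
        print(path, allowed(REPORTED, path), allowed(HOME_DIR, path))
```

The reported file blocks every path, while the home-directory file blocks only the two WordPress directories, so the two files cannot be the same one.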

Viewing 5 replies - 1 through 5 (of 5 total)

Clayton James
(@claytonjames)

9 years, 8 months ago

I can’t seem to find a robots.txt file on your site. Just the XML sitemap index. ~kinnicum/robots.txt throws a 404.

There is one located here, however: http://173.xxx.xx.xx/robots.txt

...and it reads as follows:

User-agent: *
Disallow: /
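This is probably why the two files differ: crawlers look for robots.txt only at the root of the host, never inside a subdirectory such as ~kinnicum. A rough sketch of that lookup in Python follows. The function name `robots_url` is made up for illustration, and the URLs are the ones from this thread.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt URL a crawler would consult for `page_url`.

    Only the scheme and host are kept; the page's own path is discarded.
    """
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url("http://173.xxx.xx.xx/~kinnicum/"))
    print(robots_url("http://www.kinnicumfishandgame.org/"))
```

So for a site served from http://173.xxx.xx.xx/~kinnicum, the file that counts is the one at the root of the IP address, which on shared hosting is likely outside the account's own directory.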

Thread Starter jbeaumont1
(@jbeaumont1)

9 years, 8 months ago

Um, OK, how the heck do I get there? We use justhost.com to host the site and using File Manager I cannot seem to navigate to the actual IP address. There is a robots.txt in the home directory and it doesn’t match the one you found.

Clayton James
(@claytonjames)

9 years, 8 months ago

I don’t have a clue.

My first thought was why does justhost have you using a numeric IP address rather than an FQDN in your URL, and my second would be why would there still be a tilde in your URL pointing to your user directory? A temporary URL possibly?

You might need to ask your host’s support group about the robots.txt file and what exactly might be happening with your configuration.

[edit] ..assuming of course, I’ve guessed your site location correctly!!

Thread Starter jbeaumont1
(@jbeaumont1)

9 years, 8 months ago

Well, that is the odd thing. We had a pretty crappy site at:

kinnicumfishandgame.org

And then I found WordPress and created an entirely new and better looking site. While I was developing it, WordPress used the address http://173.xxx.xx.xx/~kinnicum. I followed the WordPress documentation for changing this once the website was deployed, but it did not work, so I ended up finding some other WordPress documentation that guided me through moving the site from the WordPress directory to the root. So I did, and ended up simply redirecting http://www.kinnicumfishandgame.org to http://173.xxx.xx.xx/~kinnicum.

Ironically, I am an IT person but this website stuff is fairly new to me and it is confusing as hell. :-p

Thread Starter jbeaumont1
(@jbeaumont1)

9 years, 8 months ago

Never mind, I think I figured it out. I went into Settings > General and changed both the WordPress Address (URL) and the Site Address (URL) from the temp IP address to http://www.kinnicumfishandgame.org. This was the original URL for the old site before I created the WordPress site. I moved the WordPress site from its WordPress subdirectory to our root directory a while ago and, as I said, was using the temp IP and redirecting. Obviously I needed to change both settings to the original URL as well as remove the redirect.

I removed the sites I had registered at Google Webmaster and Bing Webmaster, added the site http://www.kinnicumfishandgame.org to both, and re-verified. Now it is looking better: Google Webmaster and Bing Webmaster are both reporting that there is no robots.txt file (and there is not, as it should be), and they are both able to index the site.

Yay!
