WordPress.org

Forums

Robots.txt (15 posts)

  1. offshorewindwire
    Member
    Posted 4 years ago #

    A robots.txt file is preventing Googlebot from crawling my site. I changed the privacy settings, but I cannot find a robots.txt file anywhere in my file manager, and it's still blocking...

    How can I get rid of this thing? Any help would be very much appreciated.

  2. esmi
    Forum Moderator
    Posted 4 years ago #

    How do you know that it's a robots.txt file? Are you using a plugin like the Google XML Sitemaps Generator which creates a virtual robots.txt file?

    Can you post a link to your site?

  3. offshorewindwire
    Member
    Posted 4 years ago #

    I solved the problem shortly after you posted your reply. But thank you for responding.

  4. leslienorth
    Member
    Posted 4 years ago #

    I am having the same problem. Can someone help?

    http://thetravelmonster.com/robots.txt

  5. esmi
    Forum Moderator
    Posted 4 years ago #

    Error 500 - internal server error. Do you actually have a robots.txt file?

  6. Inspired2Write
    Member
    Posted 4 years ago #

    Feeling dumb over this issue. I've been trying to access my robots.txt file for some time so I can add rules to it, but I can't seem to find it. I know it's located in my root directory because I can view it in my browser when I put in the URL, but why can't I find the file over FTP to add anything to it? I've read other threads from people who have had the same issue, but none of them reveal where to find the file. It seems crazy that I can view all the other files in my root, but not robots.txt.

    I see that the sitemap generator page in my admin dashboard states this:

    The virtual robots.txt generated by WordPress is used. A real robots.txt file must NOT exist in the blog directory!

    As mentioned above, my browser shows it's located in my root at http://www.myurl.com/robots.txt and it shows the following:

    User-agent: *
    Disallow:

    I'm using the XML Sitemap Generator, but the only files I see in the root from it are sitemap.xml and the gzipped sitemap.xml.gz. My URL for the sitemap is this:
    http://www.myurl.com/sitemap.xml

    Google finds both my robots.txt and my sitemap, but again, I want to add some rules to the robots.txt to improve my site. The actual URL I'm referring to is linkandblog (dot com). Thank you.

  7. Inspired2Write
    Member
    Posted 4 years ago #

    Anyone? Still trying to figure out how to access my robots.txt file so I can optimize it. Really hoping to get this accomplished. Thanks.

  8. Saildude
    Member
    Posted 4 years ago #

    If the file is "virtual", it is only generated when someone requests it; it is not physically on your server.

    Have you double checked your Dashboard >> Privacy >> "My blog is visible to anyone" ??

    One of my sites is a test site and I have that set to "block search engines" - there is no physical robots.txt file, but requesting /robots.txt still returns a "Disallow: /" file generated on the fly by WordPress.

    If that is not the problem, look at all your plugins to see if any of them have a privacy setting - or you could just turn your plugins off and see if the robots.txt file goes away when you request it again.
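To illustrate the point about the "virtual" file: WordPress fires a 'do_robotstxt' action inside its robots.txt handler before printing the default rules, so a plugin or theme can echo extra lines into the virtual robots.txt without any physical file existing. A minimal sketch, with add_action()/do_action() stubbed so it runs outside WordPress (the hook name is the real one from WordPress core; the stubs, callback name, and example bot are illustrative only):

```php
<?php
// Stub versions of WordPress's hook functions so this sketch runs
// standalone; on a real site you would use WordPress's own.
$GLOBALS['wp_hooks'] = array();

function add_action( $tag, $cb ) {
    $GLOBALS['wp_hooks'][ $tag ][] = $cb;
}

function do_action( $tag ) {
    if ( isset( $GLOBALS['wp_hooks'][ $tag ] ) ) {
        foreach ( $GLOBALS['wp_hooks'][ $tag ] as $cb ) {
            $cb();
        }
    }
}

// The callback you would hook from a theme's functions.php on a real
// site: it echoes extra rules into the virtual robots.txt output.
function my_extra_robots_rules() {
    echo "User-agent: BadBot\n";
    echo "Disallow: /\n\n";
}
add_action( 'do_robotstxt', 'my_extra_robots_rules' );

// Simplified version of what core's do_robots() then prints:
// fire the hook, then print the default public-blog rules.
ob_start();
do_action( 'do_robotstxt' );
echo "User-agent: *\n";
echo "Disallow:\n";
$robots_txt = ob_get_clean();
echo $robots_txt;
```

The hooked callback's output lands above the default block, so the extra rules ride along with whatever WordPress generates, with no file to find over FTP.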

  9. Inspired2Write
    Member
    Posted 4 years ago #

    If the file is "virtual" it is only generated when someone visits, it is not physically on your site.

    Yes, I believe it is 'virtual', which is why I can't find it. But how, then, would I be able to optimize the robots.txt file? I wanted to eliminate problems like the duplicate title tags that show up in my Google dashboard. I was going to install the All in One SEO Pack, but my research led me to a thread here indicating it won't work with the way my host has their server configured.

    Have you double checked your Dashboard >> Privacy >> "My blog is visible to anyone" ??

    Yes, it is visible. I don't necessarily want the robots.txt file to go away; I want to be able to optimize it for Googlebot, and also, if possible, use it to block more of the bad bots.

  10. Inspired2Write
    Member
    Posted 4 years ago #

    Would I then have to delete the virtual robots.txt file to create one that can be optimized? If so, where do I find the code to delete it? In the sitemap plugin?

  11. Inspired2Write
    Member
    Posted 4 years ago #

    Okay, trying one more time here...

    I have a 'virtual' robots.txt file (generated by the Google Sitemaps plugin), but I don't see any way to modify the robots.txt since it's virtual.

    Can I therefore use this plugin to create the robots.txt changes I need, without it conflicting with the existing 'virtual' robots.txt file?

  12. Saildude
    Member
    Posted 4 years ago #

    One of the stock troubleshooting tricks is to disable ALL your plugins and see if the robots.txt file goes away or can be changed from the stock dashboard. (WordPress itself generates a virtual robots.txt file; it only disallows everything when "block search engines" is selected in Privacy.)

    If just disabling does not work, rename the plugins directory and try the default theme to make sure it is not a theme interaction. Some plugins can hang around even when disabled, hence renaming the directory.

    Then enable the plugins one at a time until you see the bad robots.txt file show up.

  13. steeephen
    Member
    Posted 4 years ago #

    @ Inspired2Write,

    I have the same issue as you - I'm using Google XML Sitemaps but wish to edit my robots.txt file.

    I noticed the XML sitemap plugin has the following checkbox:

    Add sitemap URL to the virtual robots.txt file.

    The virtual robots.txt generated by WordPress is used. A real robots.txt file must NOT exist in the blog directory!

    I think the line 'a real robots.txt file must NOT exist...' only means it must not exist if you want the plugin to add the sitemap URL to the virtual robots.txt, as the plugin has to write to that file.

    I also think that if you copy the contents of the virtual file and paste them into a real robots.txt file, you should be able to upload and edit it as you wish (as long as you untick the checkbox mentioned above).

    I don't think there is any way to use the plugin to actually edit the robots.txt file.

    I have to weigh up two choices:

    • either keep the virtual robots file, which I know works and has the correct sitemap URL, but have the search engines pick up admin pages that I don't want indexed
    • or create and upload my own robots file and cross my fingers that it does not clash with the virtual one generated by the Google XML Sitemaps plugin / WordPress.

    I don't actually know if this will work and, like you, I've been searching for the answer - but I'm going to create my own file this weekend and I'll post back here to say whether it was successful (or not).

    At least your current robots.txt file isn't blocking anything.
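For reference, a real robots.txt along the lines described above, copying the virtual file's contents and adding rules for the admin pages plus the sitemap URL, might look like this (the site URL is a placeholder and the paths are only examples; adjust to your own site):

```
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/

Sitemap: http://www.example.com/sitemap.xml
```

With the plugin's "add sitemap URL to the virtual robots.txt" checkbox unticked, a static file like this is what search engines would see instead of the generated one.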

    One more thing - you mentioned the All in One SEO Pack. I use it on all my sites as it's an extremely useful plugin and does not clash with any other plugin I've got installed.

    S.

  14. zonerdck
    Member
    Posted 4 years ago #

    For those of you using the Google XML Sitemaps generator: you can add to the robots.txt output by adding some code in the plugin's sitemap-core.php file.

    Around line 3151 you will see this line of code

    echo "\nSitemap: " . $smUrl . "\n";

    This adds the sitemap to the robots.txt file. What I did was add code to allow all robots for now; once completed, your code should look like this:

    echo  "\nUser-agent: *";
    echo  "\nAllow: /\n";
    echo  "\nSitemap: " . $smUrl . "\n";

    You can add as much content as you want from there.
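To sanity-check that edit, here is roughly what those three echo lines emit, runnable on its own ($smUrl is set to a placeholder URL here, not the value the plugin computes):

```php
<?php
// Roughly what the edited lines in sitemap-core.php print.
// $smUrl is a placeholder; in the plugin it holds the real sitemap URL.
$smUrl = 'http://www.example.com/sitemap.xml';

ob_start();
echo "\nUser-agent: *";
echo "\nAllow: /\n";
echo "\nSitemap: " . $smUrl . "\n";
$added = ob_get_clean();
echo $added;
```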

  15. zonerdck
    Member
    Posted 4 years ago #

    Update to my last post: the actual robots.txt output comes from the functions.php file in the /wp-includes/ folder.

    Around line 1714 you will see this code...

    function do_robots() {
    	header( 'Content-Type: text/plain; charset=utf-8' );
    
    	do_action( 'do_robotstxt' );
    
    	if ( '0' == get_option( 'blog_public' ) ) {
    		echo "User-agent: *\n";
    		echo "Disallow: /\n";
    	} else {
    		echo "User-agent: *\n";
    		echo "Disallow:\n";
    	}
    }

    For my final working robots.txt file I used the following, but feel free to change the directories to suit your needs. (Keep in mind that edits to core files like this one will be overwritten the next time WordPress is updated.)

    function do_robots() {
    	header( 'Content-Type: text/plain; charset=utf-8' );
    
    	do_action( 'do_robotstxt' );
    
    	if ( '0' == get_option( 'blog_public' ) ) {
    		// Privacy set to block search engines: disallow everything.
    		echo "User-agent: *\n";
    		echo "Disallow: /\n";
    	} else {
    		// Public blog: allow crawling, but keep bots out of these directories.
    		echo "User-agent: *\n";
    		echo "Disallow: /wp-admin\n";
    		echo "Disallow: /wp-includes\n";
    		echo "Disallow: /wp-content\n";
    		echo "Disallow: /stylesheets\n";
    		echo "Disallow: /_db_backups\n";
    		echo "Disallow: /cgi\n";
    		echo "Disallow: /store\n";
    	}
    }

Topic Closed

This topic has been closed to new replies.
