WordPress.org Forums

Weird robots.txt 404 problem on multisite (13 posts)

  1. martin_62
    Member
    Posted 2 years ago #

    This is really weird. On my multisite installation (subfolders) I can see some of the blogs' robots.txt files, but on some of them I'm getting a 404 error. I've literally tried everything short of killing this machine!

    Here it works: http://liweitari.org/robots.txt
    Here it doesn't: http://swissgirldesigns.com/robots.txt

    In both cases the domains are mapped to a subfolder of http://www-f.site.ch

    I'm using the robots-plugin: "MS Robots.txt Manager"

    With exhausted greetings.
    Martin

  2. How did you activate the plugin? Network wide?

  3. martin_62
    Member
    Posted 2 years ago #

    Yes, this plugin can only be activated network wide, from the Network admin as Super Admin. It's specifically designed for WP Multisite.

  4. There's a bug in 3.4 that actually lets you activate network-only plugins per site, which is why I asked.

    Which domain mapping plugin are you using?

  5. martin_62
    Member
    Posted 2 years ago #

    It's
    WordPress MU Domain Mapping, Version: 0.5.4.2

    - Martin

  6. martin_62
    Member
    Posted 2 years ago #

    BTW, I switched all site-wide activated plugins off, and also all plugins on swissgirldesigns.com - no difference. robots.txt produces a 404 error, while other blogs on the same system are OK.

  7. http://halfelf.org/robots.txt works for me (mapped domain off tech.ipstenu.org) so I have a feeling it's the MS Robots.txt Manager plugin :/

  8. martin_62
    Member
    Posted 2 years ago #

    I also switched off MS Robots.txt Manager - no difference. While it was switched off, I saved the privacy and permalink settings again.

    I also tried adding another domain mapping - no difference. In the blogs where it doesn't work, I can do whatever I want and robots.txt still gets a 404 error.

    In the blogs where it is working, I can fiddle around and it keeps working.

    swissgirldesigns.com has 446 posts
    k.brunner.beoblog.com just 1 post

    It's not working on either of them.
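    (Editor's aside for anyone debugging the 404 itself: WordPress does not ship a physical robots.txt file; it answers the request through a per-site rewrite rule, roughly `robots\.txt$ -> index.php?robots=1`. If a site's stored rewrite rules are stale, the request falls through and WordPress returns a 404, which is why re-saving Permalinks (regenerating the rules) is a common fix. A minimal Python sketch of that dispatch logic - simplified and illustrative only, not WordPress's actual PHP:)

    ```python
    import re

    # Simplified stand-in for a site's stored rewrite rules (WordPress keeps
    # these in the 'rewrite_rules' option). Core adds the robots.txt rule.
    rewrite_rules = {
        r"robots\.txt$": "index.php?robots=1",
    }

    def dispatch(path, rules):
        """Return the internal query for a request path, or None (-> 404)."""
        for pattern, query in rules.items():
            if re.match(pattern, path):
                return query
        return None  # no rule matched: the request 404s

    # A site with fresh rules answers the virtual robots.txt request...
    print(dispatch("robots.txt", rewrite_rules))  # index.php?robots=1

    # ...while a site whose stored rules are stale (rule missing) 404s.
    print(dispatch("robots.txt", {}))             # None
    ```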

  9. tdmarsol
    Member
    Posted 2 years ago #

    Hi Everyone,

    The same thing is happening on my multisite install of WP 3.4.1. The robots.txt file looks fine by itself:

    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-includes/

    The file contents change when I switch between the allow and do-not-allow options. With do-not-allow, the file contents look like this:

    User-agent: *
    Disallow: /

    When I switch it back, it looks fine again. I am using the XML sitemap generator in the Yoast WordPress SEO plugin, and those sitemaps are formed correctly.

    I have turned the Yoast WordPress SEO plugin off, and the contents still look fine - no change from above. Google still says that robots.txt is preventing it from crawling the site.

    I also tried network activation of Yoast SEO vs. individual site activation; all of those robots.txt files look identical, and Google continues to say it is being prevented from crawling the site because of the robots.txt settings.

    Using Domain Mapping Plugin 0.5.4.

    I also switched the theme to TwentyEleven, and no change in either the content of Robots.txt or Google's response.

    Any thoughts?

    Thank you,

    -Tim
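    (Editor's aside: the two outputs Tim pastes are exactly what core's virtual robots.txt handler emits depending on the Settings >> Privacy value, which is stored as the `blog_public` option. A rough Python rendering of that logic - the real code is `do_robots()` in PHP, and this sketch simplifies it:)

    ```python
    def virtual_robots_txt(blog_public):
        """Mimic WordPress's do_robots() output (WP 3.x era, simplified)."""
        lines = ["User-agent: *"]
        if not blog_public:
            # Settings >> Privacy: "Do not allow search engines to index"
            lines.append("Disallow: /")
        else:
            # Site is public: only block internal paths
            lines.append("Disallow: /wp-admin/")
            lines.append("Disallow: /wp-includes/")
        return "\n".join(lines)

    print(virtual_robots_txt(True))
    # User-agent: *
    # Disallow: /wp-admin/
    # Disallow: /wp-includes/

    print(virtual_robots_txt(False))
    # User-agent: *
    # Disallow: /
    ```

    So a robots.txt that flips between these two bodies when the Privacy setting is toggled is expected behavior, not a bug.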

  10. Tim, I have no idea what you mean when you say you switch 'it' between allow and not allow. Is that in the plugin or WP proper?

    This is abnormal behavior on both parts, which means it's your plugins, guys. Start with turning ALL plugins off. Yes. All.

  11. tdmarsol
    Member
    Posted 2 years ago #

    Hi Mika,

    Thank you for replying - by switching it, I meant changing Settings >> Privacy between Allow Search Engines to Index and Do Not Allow Search Engines to Index.

    Also, this is a strange one--because of the intermittent nature of it.

    I renamed the plugin directory of the multisite installation to disable all plugins, and the behavior was the same. Google still receives a Do Not Index order (or an error that it is misinterpreting as a Do Not Index order), even when the actual file says something else.

    I also used Tools >> Export, exported all content to an XML file, and created a new account on the Multisite. Then I imported all of the content and changed the domain mapping to the new site - and the robots.txt behavior was the same.

    Right now, I have exported the WP Multisite database as SQL and am comparing it against the database of a site that is producing a valid robots.txt file.

    I'll keep you posted.

    Thank you,

    -Tim

  12. tdmarsol
    Member
    Posted 2 years ago #

    Hey All,

    Update - the problem I was having was actually on the Google side of things, though that still does not explain the robots.txt issue.

    Basically, I added the site to Google's Webmaster Tools as sitename.net and deleted the previous entry of http://www.sitename.net - having changed all of the settings in WordPress ahead of time so the non-www version is the primary domain . . . and all the Google-based errors went away.

    I know that doesn't help us determine anything--though it was a resolution, of sorts, for Google not indexing the website.

    Thank you,

    -Tim

  13. "Thank you for replying--by switching it, I meant changing the Settings >> Privacy from Allow Search Engines to Index and Do Not Allow Search Engines to Index."

    Oh. Okay, that's what's supposed to happen :D robots.txt is how the search-engine Privacy setting gets communicated to Google.

Topic Closed

This topic has been closed to new replies.
