WordPress.org

Suggestion : wordpress should include a preformatted robots.txt

  • Hello, I’m making an improvement suggestion for WordPress, hoping to help: we really need a preformatted robots.txt file!

    This suggestion comes from personal experience. I was horrified when I asked Google to search for content on my blog (“keywords” site:myblog.net). The results were just HORRIBLE.

    Google returned:
    – internal search results from my blog’s own search engine
    – archive pages (myblog.net/year/month/)
    – paginated old-post pages whose contents had shifted as new posts piled up (myblog.net/page/238/)
    – paginated category pages, stale for the same reason (myblog.net/category/page/238)
    – paginated tag pages, stale for the same reason (myblog.net/tags/page/238)
    – “sort posts by wp post-ratings” archive pages

    The one valid result, a direct link to my post, was ranked faaaar below.

    How can you expect visitors to come back a second time, still hoping to find what they’re searching for?

    It was then that I realized that, cool as it may be, WordPress is NOT Google-friendly out of the box: it lets Google index tons of discouraging, useless pages instead of just letting it index direct links to individual posts.

    YES: I was foolish to be fooled, and I should have double-checked. I don’t want to put the blame on WordPress when I’m the one at fault.
    But I strongly believe the vast majority of WordPress bloggers will make the same mistake I did.

    So, just in case, well, here I am making a suggestion, that wordpress.org should include a preformatted robots.txt file to fix that problem.

    Here’s my fixed robots.txt, which this time tells crawlers to skip everything except direct post links:
    User-agent: *
    Disallow: /cgi-bin
    Disallow: /wp-admin
    Disallow: /wp-includes
    Disallow: /wp-content/plugins
    Disallow: /wp-content/cache
    Disallow: /wp-content/themes
    Disallow: /trackback
    Disallow: /feed
    Disallow: /comments
    Disallow: /category/
    Disallow: /category/*/*
    Disallow: /page/*/*
    Disallow: /tag/*
    Disallow: /tag/
    Disallow: */trackback
    Disallow: */feed
    Disallow: */comments
    Disallow: /*?*
    Disallow: /*?
    Disallow: /*sortby*
    Disallow: /*sortby

    User-agent: Googlebot-Image
    Disallow:
    Allow: /*

    This robots.txt must only be used if your blog uses fancy permalinks; otherwise direct post links of the form ?p=1234 would be caught by the /*? rules and wouldn’t be indexed 😉
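
    Rules like these are easy to get subtly wrong, so it can help to test them against a few URLs before uploading the file. The snippet below is only a rough sketch, not how any crawler actually implements matching: it assumes Google-style wildcard semantics (“*” matches any run of characters, a trailing “$” anchors the end, and each pattern is matched from the start of the path). The helper names rule_to_regex and is_blocked and the sample paths are made up for illustration; the rule list is copied from the “User-agent: *” group above.

```python
import re

# Disallow patterns copied from the "User-agent: *" group in the post.
DISALLOW = [
    "/cgi-bin", "/wp-admin", "/wp-includes",
    "/wp-content/plugins", "/wp-content/cache", "/wp-content/themes",
    "/trackback", "/feed", "/comments",
    "/category/", "/category/*/*", "/page/*/*", "/tag/*", "/tag/",
    "*/trackback", "*/feed", "*/comments",
    "/*?*", "/*?", "/*sortby*", "/*sortby",
]


def rule_to_regex(rule):
    """Translate one robots.txt path pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters and a
    trailing '$' anchors the end of the path.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape the literal pieces and join them with '.*' for each '*'.
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(path, disallow=DISALLOW):
    """Return True if any Disallow pattern matches from the start of path.

    The group has no Allow lines, so a single match is enough to block.
    """
    return any(rule_to_regex(rule).match(path) for rule in disallow)


if __name__ == "__main__":
    # Hypothetical sample paths, chosen only to exercise the rules.
    for path in ["/2010/05/my-post/", "/?p=1234", "/category/news/",
                 "/page/238/", "/page/238", "/2010/05/my-post/feed/"]:
        print(path, "blocked" if is_blocked(path) else "allowed")
```

    Under those assumptions, a fancy-permalink post such as /2010/05/my-post/ stays crawlable while /?p=1234 is blocked, which is exactly the caveat above. It also shows a gap: /page/238/ is blocked, but /page/238 without the trailing slash is not.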

    That’s my suggestion.
    Good day everyone! 🙂

Viewing 1 replies (of 1 total)
  • So, just in case, well, here I am making a suggestion, that wordpress.org should include a preformatted robots.txt file to fix that problem.

    Eeeeeh, I see what you mean, but I don’t think it’s possible to do it in a way that will work for everyone.

    The one that comes with WP (and yes, it’s there, it’s a secret) is basic and simple and just says allow or deny. Past that, it’s really up to you. Personally, I would want search engines to index my tags and cats.

    Hiding the wp-* folders, though, yes. Should do that.

  • The topic ‘Suggestion : wordpress should include a preformatted robots.txt’ is closed to new replies.