WordPress.org

Suggestion : wordpress should include a preformatted robots.txt (2 posts)

  1. sabinou
    Member
    Posted 3 years ago #

    Hello, I'm making an improvement suggestion for WordPress, hoping to help: we really need a preformatted robots.txt file!

    This suggestion comes from personal experience. I was horrified when I asked Google to search for content on my blog ("keywords" site:myblog.net). The results were just HORRIBLE.

    Google returned:
    - internal search results from my blog's own search engine
    - archive pages (myblog.net/year/month/)
    - paginated old-posts pages that were no longer valid because newer posts had piled up since (myblog.net/page/238/)
    - paginated category pages, stale for the same reason (myblog.net/category/page/238)
    - paginated tag pages, stale for the same reason (myblog.net/tags/page/238)
    - "sort posts by wp post-ratings" archive pages

    The one valid result, a direct link to my post, was ranked faaaar below.

    How can you expect visitors to come back a second time, still hoping to find what they're searching for?

    It was then that I realized that, cool as it may be, WordPress is NOT Google-friendly out of the box: it lets Google index tons of discouraging, useless pages instead of just letting it index direct links to individual posts.

    YES: I was foolish to be fooled, and I should have double-checked. I don't want to put the blame on WordPress when I'm the one at fault.
    But I strongly believe a vast majority of WordPress bloggers will make the same mistake I did.

    So, just in case, well, here I am making a suggestion, that wordpress.org should include a preformatted robots.txt file to fix that problem.

    Here's my fixed robots.txt, this time forcing Google to index only direct post links:
    User-agent: *
    Disallow: /cgi-bin
    Disallow: /wp-admin
    Disallow: /wp-includes
    Disallow: /wp-content/plugins
    Disallow: /wp-content/cache
    Disallow: /wp-content/themes
    Disallow: /trackback
    Disallow: /feed
    Disallow: /comments
    Disallow: /category/
    Disallow: /category/*/*
    Disallow: /page/*/*
    Disallow: /tag/*
    Disallow: /tag/
    Disallow: */trackback
    Disallow: */feed
    Disallow: */comments
    Disallow: /*?*
    Disallow: /*?
    Disallow: /*sortby*
    Disallow: /*sortby

    User-agent: Googlebot-Image
    Disallow:
    Allow: /*

    This robots.txt should only be used if your blog uses fancy permalinks. Otherwise, direct post links of the form ?p=1234 would match the "Disallow: /*?" rules and wouldn't be indexed ;)
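
    To see why the query-string rules matter, here is a minimal Python sketch that checks a few paths against a subset of the Disallow rules above. It assumes Google-style wildcard matching ("*" matches any run of characters, rules match from the start of the path) and ignores Allow rules and rule precedence, so treat it as an illustration and not as how any real crawler is implemented. The helper names and the sample post path /2010/05/my-post/ are made up for the example.

```python
import re

# Subset of the Disallow rules from the "User-agent: *" group above.
DISALLOW_RULES = [
    "/wp-admin",
    "/category/",
    "/page/*/*",
    "/tag/",
    "*/feed",
    "/*?",
    "/*sortby",
]


def rule_to_regex(rule):
    """Turn a wildcard rule into a regex (hypothetical helper).

    Assumption: '*' matches any run of characters and a trailing '$'
    anchors the end of the path; everything else is literal.
    """
    anchored_end = rule.endswith("$")
    if anchored_end:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored_end else ""))


def is_blocked(path, rules=DISALLOW_RULES):
    """Return True if any Disallow rule matches the start of the path."""
    # re.match anchors at the beginning of the string, giving prefix semantics.
    return any(rule_to_regex(rule).match(path) for rule in rules)


if __name__ == "__main__":
    for path in ["/2010/05/my-post/", "/?p=1234", "/page/238/", "/tag/foo"]:
        print(path, "blocked" if is_blocked(path) else "allowed")
```

    Under these assumptions the fancy-permalink post path is allowed, while /?p=1234, /page/238/ and /tag/foo are all blocked, which is exactly why the file is unsafe on a blog that still uses default ?p= links.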

    That's my suggestion.
    Good day everyone! :)

  2. "So, just in case, well, here I am making a suggestion, that wordpress.org should include a preformatted robots.txt file to fix that problem."

    Eeeeeh, I see what you mean, but I don't think it's possible to do it in a way that will work for everyone.

    The one that comes with WP (and yes, it's there, it's a secret) is basic and simple and just says allow or deny. Past that, it's really up to you. Personally, I would want search engines to index my tags and categories.

    Hiding the wp-* folders, though, yes. Should do that.

Topic Closed