Google: how to avoid RSS links (10 posts)

  1. Michael_
    Posted 10 years ago #

    While checking my referrers, I noticed a Google-related issue:
    Visitors who enter a search term in Google usually get two results for my site:
    1.) The link to the appropriate post
    2.) Below this link, a second link is listed, and it links to the RSS feed of the post.

    I don't want Google to list my RSS feeds at all. How can I achieve this? A feed link is of no use to someone who has just entered a search term in Google.

    Thanks in advance,

  2. Bodhipaksa
    Posted 10 years ago #

    Don't know much about RSS (and should arguably shut the frak up on that basis) but to keep Google out of a directory, all you have to do is disallow that directory in a robots.txt file. You can get plenty of info on how to do that on -- where else? -- Google. Presumably your RSS feed comes from a particular directory [Edit: like /feed/]?

  3. Michael_
    Posted 10 years ago #

    - blog.com/date/post-title
    - blog.com/date/post-title/feed/

    That's what Google displays very often: first the original link and, slightly indented below it, the link to its feed.

    This feed link should simply not be listed, but I don't see how robots.txt could do that. I am using Google Sitemaps now, so I assume that will reduce the listing of feed links somewhat. But I have seen this very often in Google search results for other WordPress blogs as well.


  4. Mark (podz)
    Support Maven
    Posted 10 years ago #

  5. Michael_
    Posted 10 years ago #

    Thanks, podz. That might help, but it does not solve the initial issue. Google (or any other search engine) should not index */feed/ at all.

    Another idea: My robots.txt looks like this:
    User-agent: *
    Disallow: /wp-admin/
    Disallow: /wp-content/
    Disallow: /wp-includes/

    Is it valid for robots.txt to add a line like
    Disallow: */feed/
    I think that would solve this issue.

  6. Michael_
    Posted 10 years ago #

    OK, I just searched the web a bit: wildcards can't be used in robots.txt Disallow lines.
    However, we could use rel="nofollow" in the links. I will add this now for both the feed and the trackback URLs in my theme.

  7. chaaban
    Posted 10 years ago #

    I tried nofollow; it didn't give great results ...

    Thanks, podz, I'll try the post. What about the Disallow line, has anyone tried it?

  8. Michael_
    Posted 10 years ago #

    Why didn't nofollow give you "great results"? As far as I know, Google does not follow links with the nofollow attribute, so it should solve this issue.

    As for * in disallow:
    See robotstxt.org/wc/exclusion-admin.html, they state:

    Note also that regular expression are not supported in either the User-agent or Disallow lines. The '*' in the User-agent field is a special value meaning "any robot". Specifically, you cannot have lines like "Disallow: /tmp/*" or "Disallow: *.gif".
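
To illustrate the rule quoted above, here is a minimal Python sketch of literal prefix matching, which is how a crawler that sticks to the original standard would read a Disallow value. The function name blocked_by_prefix and the sample paths are hypothetical; the rules are the ones from the robots.txt posted earlier in the thread plus the proposed */feed/ line.

```python
def blocked_by_prefix(path, disallow_rules):
    """Return True if path starts with any non-empty Disallow value.

    This is plain prefix matching: a '*' inside a rule is treated as a
    literal character, not as a wildcard.
    """
    return any(rule != "" and path.startswith(rule) for rule in disallow_rules)


# Rules from the thread, plus the proposed wildcard-style line.
rules = ["/wp-admin/", "/wp-content/", "/wp-includes/", "*/feed/"]

print(blocked_by_prefix("/wp-admin/options.php", rules))   # True
print(blocked_by_prefix("/date/post-title/feed/", rules))  # False
```

Under this reading the */feed/ line blocks nothing on a normal site, because no URL path begins with a literal asterisk.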

  9. Alex Mills (Viper007Bond)
    Posted 10 years ago #

    While Disallow: */feed/ may not meet the standards, Google actually follows it.

    Google has a robots.txt tester at http://www.google.com/webmasters/sitemaps/. I tested various rules, and Disallow: */feed/ does indeed block Google's access to my post feeds.

    Blocked by line 5: Disallow: */feed/ Detected as a directory; specific files may have different restrictions

    However, it also blocks my main feed, so this is what I settled on:

    Disallow: /archives/*/feed/

    Change it to match your permalink structure.
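
The difference between the two rules can be sketched in Python, assuming that Google treats "*" in a Disallow value as "any run of characters" and still matches from the start of the path. This only approximates the behaviour reported by the tester above and is not Google's actual implementation; the function name google_style_match and the sample paths are hypothetical.

```python
import re


def google_style_match(pattern, path):
    """Return True if path matches a Disallow pattern with '*' wildcards.

    Each '*' matches any run of characters (including none); everything
    else is literal. Matching is anchored at the start of the path only.
    """
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.match(regex, path) is not None


main_feed = "/feed/"
post_feed = "/archives/2006/post-title/feed/"

# */feed/ catches both the main feed and the per-post feeds.
print(google_style_match("*/feed/", main_feed))            # True
print(google_style_match("*/feed/", post_feed))            # True

# /archives/*/feed/ leaves the main feed alone.
print(google_style_match("/archives/*/feed/", main_feed))  # False
print(google_style_match("/archives/*/feed/", post_feed))  # True
```

The narrower rule works because it requires the path to start with /archives/, which the main feed URL does not.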

  10. Michael_
    Posted 10 years ago #

    Great, thanks for the info. However, I hope this won't break the rest of robots.txt for other search engines.

Topic Closed

This topic has been closed to new replies.
