Don’t know much about RSS (and should arguably shut the frak up on that basis) but to keep Google out of a directory all you have to do is disallow that directory in a robots.txt file. You can get plenty of info on how to do that on (where else?) Google. Presumably your RSS feed comes from a particular directory [Edit: like /feed/]?
Example:
– blog.com/date/post-title
– blog.com/date/post-title/feed/
That’s what Google is displaying very, very often: first the original link, and below it, slightly indented, the link to its feed.
This feed link simply should not be listed, but I see no way to achieve that with robots.txt. I am using Google Sitemaps now, so I assume that will cut down on the listing of feed links somewhat. But I have seen this very often in Google search results for other WordPress blogs as well.
Thanks,
Michael
Thanks, podz. That might help, but it does not solve the initial issue. Google (or any other search engine) should not index */feed/ at all.
Another idea: My robots.txt looks like this:
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-content/
Disallow: /wp-includes/
Is it valid in robots.txt to add a line like
Disallow: */feed/
?
I think that would solve this issue.
OK, I just searched the web a bit: wildcards can’t be used in robots.txt “Disallow” lines.
However, we could use rel="nofollow" in the links. I will add this now for both the feed and the trackback URLs in my theme.
I tried nofollow; it didn’t give great results …
Thanks, podz, I’ll try the post. What about the Disallow, has anyone tried it?
chaaban:
Why didn’t nofollow give you “great results”? As far as I know, Google does not follow links with the nofollow attribute, so it should solve this issue.
As for * in disallow:
See robotstxt.org/wc/exclusion-admin.html, they state:
Note also that regular expression are not supported in either the User-agent or Disallow lines. The ‘*’ in the User-agent field is a special value meaning “any robot”. Specifically, you cannot have lines like “Disallow: /tmp/*” or “Disallow: *.gif”.
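To make the quoted rule concrete, here is a minimal sketch of plain prefix matching, assuming a crawler that treats every Disallow value as a literal path prefix, as the original standard describes. The function name is_disallowed and the sample paths are made up for illustration; real crawlers may behave differently.

```python
def is_disallowed(path, disallow_rules):
    """Return True if any rule is a literal prefix of the path."""
    for rule in disallow_rules:
        # An empty Disallow value means "nothing is disallowed", so skip it.
        if rule and path.startswith(rule):
            return True
    return False


# Rules from the robots.txt above, plus the wildcard attempt.
rules = ["/wp-admin/", "/wp-content/", "/wp-includes/", "*/feed/"]

# Hypothetical paths, for illustration only.
print(is_disallowed("/wp-admin/options.php", rules))   # True: literal prefix matches
print(is_disallowed("/date/post-title/feed/", rules))  # False: '*' is just a character here
```

Under this reading the line with the asterisk is simply a prefix that no path starts with, so it blocks nothing.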
While Disallow: */feed/
may not meet the standards, Google actually follows it.
Google has a robots.txt tester at http://www.google.com/webmasters/sitemaps/. I tested various rules there, and Disallow: */feed/
does indeed disallow Google access to my post feeds.
Blocked by line 5: Disallow: */feed/ Detected as a directory; specific files may have different restrictions
However, it also blocks my main feed, so this is what I settled on:
Disallow: /archives/*/feed/
Change it to match your permalink structure.
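For anyone who wants to check a rule against their own permalinks before uploading it, here is a rough sketch of wildcard matching in the style described above: '*' stands for any run of characters and the rest of the rule is matched literally from the start of the path. This is an approximation for experimenting, not Google's actual implementation, and the function name google_style_match and the sample paths are assumptions.

```python
import re


def google_style_match(rule, path):
    """Approximate wildcard matching: '*' matches any run of characters."""
    # Escape the literal pieces and join them with '.*' in place of each '*'.
    pattern = ".*".join(re.escape(part) for part in rule.split("*"))
    # re.match anchors at the start of the path, so this is still a prefix match.
    return re.match(pattern, path) is not None


# Hypothetical paths: the main feed, a post feed, and the post itself.
paths = [
    "/feed/",
    "/archives/2006/post-title/feed/",
    "/archives/2006/post-title/",
]

for rule in ("*/feed/", "/archives/*/feed/"):
    blocked = [p for p in paths if google_style_match(rule, p)]
    print(rule, "blocks", blocked)
```

With these sample paths the first rule catches both the main feed and the post feed, while the second catches only the post feed, which matches what the tester reported.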
Great, thanks for the info. However, I hope this won’t break the rest of robots.txt for other search engines.