I run a site based on WordPress (WP). All my content is 100% unique, and I generally add 1-2 new posts per day. I am having some issues with duplicate content, not because I have the same content multiple times but because the WP template serves the same content on multiple pages, for example under /tag or /category.
I have my posts and pages which are http://www.mysite.com/page_or_post_name
But I then have:
When I noticed that WP was causing the same post or page to be indexed multiple times, I went to Google Webmaster Tools, selected all the duplicate content pages (everything indexed under /tag, /category, etc.) and marked them for removal.
I then went to my robots.txt file and added in:
I thought that, having instructed Google to remove all the duplicated content via Webmaster Tools (and having blocked it from indexing locations like categories in the future), I should end up with a clean list of indexed pages, i.e. if I have 30 pages and posts, I have 30 items indexed.
Having just checked back (site:<my url>), it does not seem to have worked very well. The query returns 9 pages of results, so about 90 entries, while I currently have about 40 pages/posts.
The first 4 pages (about 40 indexed pages) are correct: they are all http://www.mywebsite.com/page_OR_postname, which is perfect. Google then says some results were omitted and offers a link to view them. Once clicked, it shows another 5 pages (about 50 results) such as:
http://mywebsite.com/blind-dating/feed/ - A description for this result is not available because of this site's robots.txt – learn more
So my robots.txt file seems to be blocking paths such as /feed in this example, which can result in duplicate content being indexed, but is Google indexing them anyway and just not fully displaying them?
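To put numbers on the split between real posts/pages and the extra URLs, a list of result URLs copied from the site: query could be sorted with something like the script below. This is only a rough sketch: the segment names (tag, category, feed, trackback) are taken from the paths mentioned in this post, while the function names and the sample URLs are made up for illustration and would need adjusting for a real site.

```python
from collections import Counter
from urllib.parse import urlparse

# Path segments assumed to mark archive/feed style URLs. This set is an
# assumption based on the paths mentioned in this post, not a complete list.
DUPLICATE_SEGMENTS = {"tag", "category", "feed", "trackback"}


def classify_url(url):
    """Return the first duplicate-style path segment in the URL, or 'content'."""
    segments = [s for s in urlparse(url).path.split("/") if s]
    for segment in segments:
        if segment in DUPLICATE_SEGMENTS:
            return segment
    return "content"


def summarize(urls):
    """Count how many URLs fall into each bucket."""
    return Counter(classify_url(u) for u in urls)


if __name__ == "__main__":
    # Hypothetical sample; replace with URLs copied from the site: results.
    sample = [
        "http://mywebsite.com/blind-dating/",
        "http://mywebsite.com/blind-dating/feed/",
        "http://mywebsite.com/tag/dating/",
        "http://mywebsite.com/category/advice/",
    ]
    for kind, count in summarize(sample).items():
        print(kind, count)
```

Note that a post whose slug happens to be one of those words (for example a post literally named "tag") would be miscounted, so the output is a rough guide rather than an exact audit.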
My site is growing and starting to become very popular and I don't want this to become a real mess in the near future. I am adding about 2 new posts per day so this could become a problem.
WP clearly offers some great functionality, and I am not sure whether I am harming my SEO efforts or improving them. For example, I have blocked my /feeds and /trackbacks. Is this right? It may reduce duplicate content and the chance of being penalised, but am I harming my SEO by not allowing access to the feeds, trackbacks, etc.?
I have done some Googling on WP and duplicate content and found a lot of info, but much of it is old and out of date, or contradicts what I have read on other sites.
All I really want is to make sure I get the most from my site. I don't want to block useful features, but I don't want to suffer from duplicate content either.
Any help or advice on getting this cleaned up would be much appreciated.