Scraping? People taking posts
I’m really not sure what it’s called — perhaps scraping — when another site takes your posts word for word and displays them on their blog with ads.
Do I have the right term for it? How do I stop it? I’ve gotten many trackbacks where people have my entire post on their site, and it looks spammy.
Thanks for any help you can give to point me in the right direction.
If you ever find a plugin that prevents these guys from scraping content, please let me know.
They are using your RSS feeds – and you can add something stating your copyright or other messages to the feeds.
There were several plugins dealing with similar issues; you may want to search for them.
The problem with that (I already have them) is that the scraping is usually done by bots, and they never pay any attention to copyright notices. It would be nice to somehow have the original content rewritten into something that they wouldn’t want on their site: vulgar and intense language, insults aimed at their site, and so on :D
The problem with that notion is how do you tell the difference between a bad guy pulling your feed and a good guy pulling your feed? I mean, you wouldn’t want your normal users to suddenly get a bunch of cursing and such.
The short answer is that it is not possible. If you publish a site of any sort, then unscrupulous people are going to steal your content. Just a fact of life.
The usual answer to this problem is to make your feed content always contain a link back to your original post. Then the bot posts the link back to you as well, so any readers know where it came from.
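As a rough illustration of the link-back idea, here is a minimal Python sketch that appends an attribution link to a feed item's HTML before it is published. On a real blog this would normally be handled by a plugin, as mentioned elsewhere in this thread; the function name add_source_link and the example URL and title are hypothetical.

```python
from html import escape


def add_source_link(item_html: str, post_url: str, post_title: str) -> str:
    """Return feed item HTML with a link back to the original post appended."""
    # Escape the URL and title so they are safe to embed in HTML.
    link = '<p>Originally published at <a href="{url}">{title}</a>.</p>'.format(
        url=escape(post_url, quote=True),
        title=escape(post_title),
    )
    return item_html + "\n" + link


if __name__ == "__main__":
    # Hypothetical post used only for demonstration.
    print(add_source_link("<p>Post body</p>", "http://example.com/my-post/", "My Post"))
```

A scraper that republishes the feed verbatim then carries the link along with the content, which is the whole point of the approach.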
Tell Google about these blogs too, so they can put in ignore rules for them: http://www.google.com/dmca.html
Some products claim to detect these scrapers as well. I have not used them and cannot comment on them: http://copydefender.com/
video.ezineaerticles.com is a real problem for me right now. They steal my posts just minutes after I publish them.
I was looking at the plugin called Project Honey Pot Http:BL, but since the validation e-mail isn’t working, I cannot comment on that either. Maybe somebody who uses it can.
I’ve reported these sites to Google and all I get is their long, drawn-out canned response for filing a DMCA. To be honest, I think Google loves these sites because it’s content for their AdSense service.
I guess I’ll have to use that plugin that includes your web site URL at the top of each post.
Thanks for the link, Otto.
If you can trace what IP address they’re scraping you from (by looking at the webserver access logs), then simply block them outright in an .htaccess file:
order allow,deny
deny from 22.214.171.124
deny from 12.34.5.
allow from all
The first deny blocks 22.214.171.124 specifically.
The second deny blocks the entire block of 12.34.5.*.
Block as many as you want that way. They can’t access your site at all then, period. No feeds, no nothing.
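To find which addresses to block, the access log can be tallied by client IP. The following is a minimal Python sketch, assuming the server writes the Apache common or combined log format and that the feed lives under a path starting with /feed; both are assumptions, and the function name feed_hits_by_ip is made up for illustration.

```python
import re
from collections import Counter

# Matches the start of a common/combined log line:
# client IP, identity, user, [timestamp], "METHOD path ..."
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|HEAD|POST) (\S+)')


def feed_hits_by_ip(lines, feed_prefix="/feed"):
    """Count feed requests per client IP, most frequent first."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        # Only count requests whose path starts with the feed prefix.
        if match and match.group(2).startswith(feed_prefix):
            counts[match.group(1)] += 1
    return counts.most_common()


if __name__ == "__main__":
    # The log path is an assumption; adjust it to your server's setup.
    with open("access.log", encoding="utf-8", errors="replace") as log:
        for ip, hits in feed_hits_by_ip(log)[:10]:
            print(ip, hits)
```

A high count alone does not prove an address is a scraper, since legitimate feed readers also poll regularly, so check what the address is before adding a deny line for it.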
The problem with that is that bots sometimes use the same IPs as legitimate visitors. I know this to be true because a few people from the Philippines have mentioned it, due to the small number of IPs available there.
I’m just going to use the header link as suggested in another thread.