referral spam from random wordpress blogs (87 posts)

  1. James
    Happiness Engineer
    Posted 7 years ago #

    if you get things settled, can you post how you did it.

    This shouldn't be too hard to block. We'll combine the information from Whooami's list item #4 and this Codex article. From that combination we can conclude that adding the following to our .htaccess file should stop the wave:

    SetEnvIfNoCase User-Agent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) spammer=yes

    Order allow,deny
    allow from all
    deny from env=spammer

  2. whooami
    Member
    Posted 7 years ago #

    There's only one minor problem with that: that's a viable user-agent.

    For instance, my IE displays:
    Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; Q312461; .NET CLR 1.0.3705; .NET CLR 1.1.4322)

    I don't think anyone is interested in blocking access to all IE 6.0* users. I've tested it, and it does block my browser.

    Besides that, I believe you need to escape your dots (\.), which, by the way, doesn't fix the blocking of IE.

    I'm simply going to take care of them based on how they're calling the page. I have no links on my site that include the string "category_name"; WP handles that within its own generated mod_rewrite rules, so it's just a little tweak in the .htaccess.
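
    A minimal sketch of that kind of tweak, not whooami's actual rule (which was not posted). It assumes mod_rewrite is available and that no legitimate link on the site puts "category_name" in the requested URL. It tests %{THE_REQUEST}, the original request line, rather than %{QUERY_STRING}, so that requests which WordPress's own rules rewrite internally to a category_name query are not caught as well.

    ```apache
    RewriteEngine On
    # Assumption: visitors never request category_name directly on this site.
    # THE_REQUEST holds the raw request line, e.g. "GET /?category_name=x HTTP/1.0",
    # and is not changed by internal rewrites.
    RewriteCond %{THE_REQUEST} category_name [NC]
    # Answer with 403 Forbidden and stop processing further rules.
    RewriteRule .* - [F,L]
    ```

    A rule like this would need to sit above the WordPress-generated rules so that it is evaluated first.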

  3. James
    Happiness Engineer
    Posted 7 years ago #

    As far as I know, SetEnvIfNoCase User-Agent Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1) spammer=yes should block exactly this user agent value: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1). According to my logs, I have no visitors with exactly that user agent. In the end, Whooami is right: always check your access logs for legitimate visitors before using .htaccess to block anything.

  4. whooami
    Member
    Posted 7 years ago #

    One sec... aha.

    I know what the problem is with what you have: no quotes. The correct line should look like this:

    SetEnvIfNoCase User-Agent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" keep_out

    It doesn't seem to matter if the dots are escaped; I make it a habit to do so, though.
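
    For reference, a sketch of a stricter variant of the line above, assuming the goal is to match only a User-Agent header that is exactly this string. SetEnvIfNoCase treats its second argument as a regular expression, so unescaped parentheses form a group instead of matching a literal "(" and ")", and an unanchored pattern also matches longer strings that contain it. The variable name spammer is carried over from James's post, and the access lines are the Apache 2.2-style directives used earlier in the thread.

    ```apache
    # ^ and $ anchor the pattern to the whole header value.
    # \( \) and \. match literal parentheses and dots.
    SetEnvIfNoCase User-Agent "^Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1\)$" spammer=yes

    # Allow everyone except requests that set the spammer variable.
    Order allow,deny
    Allow from all
    Deny from env=spammer
    ```

    On Apache 2.4 without mod_access_compat, the last three lines would need the newer Require syntax instead. Either way, check the rule against your own access logs before relying on it.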

  5. angsuman
    Member
    Posted 7 years ago #

    One interesting issue I have noticed is that nowadays many comment spams also carry a payload of referrer spam in the same HTTP GET request. So it is a double spam load, reducing their bandwidth usage. It is actually a benefit in disguise for us.

    I realized that actually blocking the referrer spammers aggressively (my referrer spammer blacklist) gets rid of most comment and trackback spammers!

    There is another trend I have noticed in comment spamming. First there is a meaningless, yet harmless comment in the blog like "Hi" or "Good article", which most bloggers approve. Then a deluge of spam comes in from the same source, which directly passes through because one comment from the same author has already been approved before.

  6. James
    Happiness Engineer
    Posted 7 years ago #

    Thanks for catching that, Whooami.

  7. whooami
    Member
    Posted 7 years ago #

    angsuman, off topic, heh: I used to have a Linux server sitting on a T1 in my other room. I did small-time shell and web site hosting, mainly for friends. One of the funnest things I ever did was to redirect all the codered (remember that?) (default.ida?) back to microsoft.com. In fact, to this day I still do that if I see anything in my logs that resembles something sketchy in IIS.

    They welcome the traffic. :)

    On another note: in the last hour I've seen 2 more of the same hits, with 2 different referers. I'll know by tomorrow whether or not this helps (for the time being, at least).

  8. NuclearMoose
    Member
    Posted 7 years ago #

    Now that this whole thing has been analysed to death, I have one simple question:

    What's the sense?

    All of this time and energy wasted because the internet is a wild west of people who are thoughtless and greedy. What's the sense of even having a site when most of your effort goes towards ensuring that your back door is locked from intruders? Gone are the days when this was fun and exciting. Now it's drudgery and it leads reasonable people into situations where negative energy overwhelms the spirit of community. The spammers have successfully turned us on ourselves, and while we bicker, they proliferate their filth and drive up the cost of being part of what many consider to be humankind's crowning achievement so far.

    I am no longer willing to pay the price.

  9. whooami
    Member
    Posted 7 years ago #

    well anyway...

    Now that I've gotten a hit in my log from, of all places, alexking.org, I tested the code again to make sure it was actually blocking something (before, I had only checked that it wasn't blocking legitimate browsers), and in fact, it wasn't.

    Here are the 2 correct ways to block the user-agent I described one page back (if it happens to come across your site):

    No mod_rewrite, or just don't like to use it? Use macmanx's way above, but with this line:

    SetEnvIfNoCase User-Agent ".*(compatible; MSIE 6.0; Windows NT 5.1)" spammer=yes

    Want to use mod_rewrite? Use this:

    RewriteCond %{HTTP_USER_AGENT} "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" [NC]
    RewriteRule ^(.*) http://%{REMOTE_ADDR}/ [R=301,L]

    Personally, I prefer the mod_rewrite way, since the other way actually shows them a page, whereas this mod_rewrite rule redirects them back to the proxy IP (perhaps putting a glitch in their crawling).
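
    A hedged variant of the mod_rewrite approach, with the pattern anchored and its parentheses and dots escaped so that it matches the literal string only. It returns 403 Forbidden rather than redirecting; that is a swapped-in choice for illustration, not whooami's rule.

    ```apache
    RewriteEngine On
    # Match the whole User-Agent header: ^...$ anchors it,
    # and \( \) \. stand for literal characters rather than regex syntax.
    RewriteCond %{HTTP_USER_AGENT} "^Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1\)$" [NC]
    # "-" means no substitution; F sends 403 Forbidden.
    RewriteRule .* - [F,L]
    ```

    To keep the redirect behaviour instead, the RewriteRule line from the post above can be used in place of the last line.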

    NM, again, I respectfully submit that everyone is free to do what they like regarding any kind of spam. It seems to me there is already so much energy on this forum dedicated to spam, spam plugins, trackback spam, comment spam, you-name-it spam, that surely one more thread won't be the end of civilization?

    It isn't anyone here's fault that this is "drudgery" and no longer "fun and exciting" for you, I guess?

    For the record, it's been shown that, statistically speaking, very few "people" are responsible for the vast majority of spam on the 'net. They are outnumbered by us, if that's any consolation.

    "What's the sense of even having a site when most of your effort goes towards ensuring that your back door is locked from intruders?"

    Firstly, there are entire sites dedicated to nothing else but combatting spam. Surely these people must derive some fun from it?

    Secondly, it's not really a case of "most" of the effort; it's a little bit of effort, at least for me, with a lot of reward.

    Third, do you leave your front door unlocked when you go to bed at night? I'm guessing not. :)

    The entries above are testable using wannabrowser.com or Firefox's User Agent Switcher, btw.

    And here is another awesome tool for creating kickass .htaccess files:
    http://joseluis.pellicer.org/ua/configure.html

  10. Mark (podz)
    Support Maven
    Posted 7 years ago #

    A Codex search comes up with nothing for 'spam' that also has referer there (could be wrong). It would seem that there is a need so unless someone else feels like jumping in, I'll try and collate some information there so we have somewhere to point people.
    I've a ton still to learn about this though so expect to have to edit anything I do slot together.

    If a bot can generate enough processes that it upsets the host, then a blog is in trouble.
    If a bot can generate so much traffic that it breaches your bandwidth limit, then a blog is in trouble.
    Blog in trouble = costs owner money = posts here.
    Let's head'em off :)

  11. ColdForged
    Member
    Posted 7 years ago #

    4. They all send the same user-agent [ "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" ]

    Thanks for that track-down, whooami. I've gotten a slew of these over the past couple of days, all illegitimate.

  12. davidchait
    Member
    Posted 7 years ago #

    CG-Referrer does a good job for me (and a few other sites) at stemming the referrer-spam tide, at least when there's actual detectable spam to be blocked.

    If it's blacklisted words (I've got a long blacklist), it catches it and stops further page processing. I've caught about 100 spammer-attempts per day (none getting through to the best of my knowledge) without htaccess mods -- and that also means that CG-Referrer can track and display 'blocked' accesses.

    It also supports an IP block table, and a UserAgent table (which has a few known bot-agents). It'd be pretty simple to add the above user-agent string to my UA table if it was guaranteed to never be anything other than a bot... ;)

    I keep going back to the root Q: is there actually a legal/procedural way to track down and stop some of these guys? I've done whois lookups on the last dozen or so spam domains, they all link back to the same fake support domain contact...

    -d

  13. whooami
    Member
    Posted 7 years ago #

    david,

    Sure, you have legal avenues. The trick is to have something besides a proxy IP and a legitimate domain running WP to blame :)

    Here, right out of my Apache logs :):

    80.58.11.107 - - [16/May/2005:00:47:59 -0700] "GET /index.php?year=2004&monthnum=12&day=&name=&page=&paged=12 HTTP/1.0" 200 47595 "http://www.alexking.org/blog/2003/09/09/arches-national-park/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

    Not a lot you can do legally about something like that, I s'pose. :P

  14. Mark (podz)
    Support Maven
    Posted 7 years ago #

    Still learning here....
    70.85.148.194 - - [07/May/2005:12:26:46 -0400] "GET /T2/cosmos.php HTTP/1.0" 200 0 "-" "-"

    IP resolves here: http://www.dnsstuff.com/tools/whois.ch?ip=70.85.148.194

    But no UA?

  15. jpettit
    Member
    Posted 7 years ago #

    I've noticed this referrer spam in my web logs all of a sudden as well.
    In some cases it's coming from people's WP sites that are well known and respected here in the forums, so I don't believe it's them actually spamming me.
    Same as above, this is not visible to anyone but me in my awstats logs, but it's kind of concerning that it's only WP blogs that are showing up as spammed entries.
    Can .htaccess block this type of spam WITHOUT actually blocking legit referrers in my logs from those sites? (The sites are legit; just the entries are not.)
    Thanks.
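
    One possible answer, sketched under assumptions: the rules in this thread key on the request's User-Agent header, not on the referring site, so real visitors arriving from those blogs with an ordinary browser string are unaffected. The sketch below narrows it further by refusing only requests that carry that exact User-Agent together with a referrer from outside your own site. example.com is a placeholder for your domain.

    ```apache
    RewriteEngine On
    # Exact match on the User-Agent reported in this thread.
    RewriteCond %{HTTP_USER_AGENT} "^Mozilla/4\.0 \(compatible; MSIE 6\.0; Windows NT 5\.1\)$" [NC]
    # Only act when a Referer header is present...
    RewriteCond %{HTTP_REFERER} !^$
    # ...and it does not point at this site (example.com is a placeholder).
    RewriteCond %{HTTP_REFERER} !^https?://(www\.)?example\.com(/|$) [NC]
    # Refuse the request with 403 Forbidden.
    RewriteRule .* - [F,L]
    ```

    A genuine browser that sends exactly this string and follows an external link would still be refused, so it is worth checking your logs for such visitors first.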

  16. whooami
    Member
    Posted 7 years ago #

    They have port 8080 open for a proxy, but I think that's where you are hosted? Or maybe not; I've just recently seen them mentioned on here somewhere.

    It resolves to tau.asmallorange.com.

  17. dualravens
    Member
    Posted 7 years ago #

    jpettit, that's it exactly for me. I'm not getting any "spam" as generally known. It's more like referrer flotsam, random sites coming up on my stats, all from respectable WP blogs, and more than one noting they just did the upgrade. It's been about twenty over the last couple of days, none more than once (referrer spam always hits a bunch of times from the same site).

    I wish I could help out with the tech side, but I'm a little lost with the language. Thanks again to those who are helping out.

  18. Mark (podz)
    Support Maven
    Posted 7 years ago #

    Weird... I'd ask how you get all this info, but it's one thing getting it, another knowing what to do with it, and yet another to not break things doing so. Tricky stuff, this spamming, eh?

  19. James
    Happiness Engineer
    Posted 7 years ago #

    A Codex search comes up with nothing for 'spam' that also has referer there (could be wrong).

    http://codex.wordpress.org/Plugins/Spam_Tools#Referrer_Spam

    http://codex.wordpress.org/Combating_Comment_Spam/Denying_Access#Editing_.htaccess_To_Deny_Access_Referrer_Spammers

    I think those are what you're looking for. I'm not sure why they don't show up when searching though.

  20. Mark (podz)
    Support Maven
    Posted 7 years ago #

    Cheers macmanx - I'll try and read them and consider cobbling something together in the next few days.

  21. whooami
    Member
    Posted 7 years ago #

    Since this has come up in another thread now, I thought I would let anyone who's experiencing this know that since I added my mod_rewrite rule above to my .htaccess, I've at least temporarily stopped them. :)

  22. jpettit
    Member
    Posted 7 years ago #

    Sorry, whooami, which rule did you add? There are a few of them in this thread. Thanks.

  23. angsuman
    Member
    Posted 7 years ago #

    whooami > One of the funnest things I ever did was to redirect all the codered (remember that?) (default.ida?) back to microsoft.com.

    That is funny and ironic too :)

    Maybe someone could write a plugin which redirects all requests for exploiting microsoft bugs (on IIS server) to Microsoft :)
    I am willing to pay for that.

    > On another note: in the last hour I've seen 2 more of the same hits, with 2 different referers. I'll know by tomorrow whether or not this helps (for the time being, at least).

    This appears to be a new phenomenon as it is also reported by others. Personally I would let it slide till traffic from such sources becomes a concern.
    Worst case scenario would be to stop trackback and rely only on pingback for auto-linking.

    NuclearMoose> What's the sense?
    Tell that to Nick Denton & Rafat Ali ;)

  24. whooami
    Member
    Posted 7 years ago #

    jpettit,
    if you are able to use mod_rewrite:

    RewriteCond %{HTTP_USER_AGENT} "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" [NC]
    RewriteRule ^(.*) http://%{REMOTE_ADDR}/ [R=301,L]

    Those lines go in your .htaccess; just make sure that they are somewhere under:

    RewriteEngine On
    RewriteBase /

    Since the forum didn't format it as 2 lines, heads up: there are 2 lines there. One's a condition, one's a rule.

    It worked like a charm for me. :)

    ---

    Hahaha, angsuman... great idea, but really not feasible, since the plugin author would be spending more time updating the plugin than bloggin', if ya know what I mean.

  25. KarenD
    Member
    Posted 7 years ago #

    Many thanks, whooami. I added those lines to my .htaccess yesterday, and my log this morning was free of bogus referrers for the first time since this problem started.

  26. whooami
    Member
    Posted 7 years ago #

    You're welcome!

  27. jonlandrum
    Member
    Posted 7 years ago #

    I guess I'll chime in, now. I've upgraded to 1.5.1 with no setbacks. I've seen no spam links on my "dashboard". The only links I've seen on there were from other sites of mine (my blog is only about 2 weeks old and no one knows about it yet :). However, I think I might know why I'm not getting spam, or not. Maybe it's just a coincidence mixed with the youthfulness of my site! I have a link in the meta part of my head:

    <link rel="section" type="text/html" href="http://www.anti-leech.com/spam/spambot_stopper.php" />

    I would just emulate this page with my own, but he has a “copyright” on it, and my conscience won’t let me! Maybe the crapbots are getting caught up in this endless loop before they even get to my site…?
    Just a thought.

    ~Jonathan

Topic Closed

This topic has been closed to new replies.
