I thought this might have some interest: http://simon.incutio.com/archive/2003/10/13/linkRedirects
Yes, I had read that yesterday, and I was really excited until I read the update about page-rank persisting across redirects.
The development team has been discussing how to handle comment spam, and we have some ideas that we'll be fleshing out soon.
Is the team also considering comment registration validation (as in pMachine)?
I'm seeing MT users who use the MT-Blacklist plugin written by Jay Allen, although I am more interested in the "security code authentication" used on some sites. It's faster and more painless, at least for comments. See this entry as an example. I'm not sure whether this would keep trackbacks compatible across different CMSs; you guys know the architecture better than I do. Just another method to consider, that's all.
An ironic thing in my case was that I missed the first viagra spam comment because SpamAssassin flagged the message from my blog as spam, and I deleted it without really looking. I didn't notice the viagra comment until someone else posted another spam comment.
So my suggestion is, if you are considering a comment approval system, perhaps it would be better not to include the body of the post in the email to the admin. That way SA or other mail filters won't flag that email.
You could always try configuring SA to whitelist emails from WordPress.... I don't know how to do that off the top of my head, though. I use a combination of SpamAssassin, SpamBouncer, and a customized procmail whitelist, so I just whitelist things in my procmail rules.
Interesting idea over at http://feedster.com/blog/ (today's entry). Basically, he wonders whether the spammers are simply targeting file names with 'comments' in the name, and suggests that randomly renaming the files (e.g. wpcomments.php to 34kfak23.php) might solve the problem.
I just started getting these, too. Looking over the access logs, all 3 comment spams have one thing in common: no referrer. I think I'm going to hack the comment handler to drop anything not referred by my base URL. Maybe it should be even more stringent and drop comments not referred by the parent post's comment URL.
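A minimal sketch of that referrer check, in Python rather than WordPress's PHP, purely to illustrate the idea. `BASE_URL` and `referer_allowed` are made-up names, not anything in the WordPress code base.

```python
from urllib.parse import urlparse

BASE_URL = "http://example.com/blog/"  # hypothetical blog address


def referer_allowed(referer, base_url=BASE_URL):
    """Return True only if the Referer header points inside base_url."""
    if not referer:
        # The spam seen in the access logs carried no referrer at all.
        return False
    ref, base = urlparse(referer), urlparse(base_url)
    same_host = ref.netloc.lower() == base.netloc.lower()
    return same_host and ref.path.startswith(base.path)
```

The more stringent variant would pass the parent post's comment URL as `base_url` instead of the blog's base URL.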
Raising the bar...
Rantor, did you figure out the hack?
Rantor's idea certainly sounds rather nice actually. Clean and simple :) I like it. It's not Bayesian spam filtering for blog comments, but it might be rather effective, for a time at least.
Referers can easily be faked by spambots. There are actually a lot of indications that the bots parse forms, so they most probably hit the post itself and then send the comment using the form on the page.
Referrer checking will raise the bar for the moment, but will be useless as soon as a reasonable number of blogs use it.
he he Bayesian filtering for blog comments (on MT).
I don't really have anything else constructive to add to this discussion, although I will mention (again) the extreme unusability of captcha-style blocking, and the impracticality of 'moderated' comments.
On second thought, I have two ideas ;-)
First thought: treat commenters as 'users' and submit for moderation only comments by first-time users. Once you've approved a comment from a given user (a combination of name/email/url), all future comments from that user would be automatically approved. I'm assuming that spammers don't care enough to figure this out or, having figured it out, to go through the process of posting a real comment just so they can get the right to post spam.
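As a rough illustration of first-time-only moderation (Python, with hypothetical names; a real version would keep the approved set in the database rather than in memory):

```python
# (name, email, url) triples whose first comment the admin has approved.
approved_users = set()


def _key(name, email, url):
    # Normalize so trivial differences in case or spacing still match.
    return (name.strip().lower(), email.strip().lower(), url.strip().lower())


def needs_moderation(name, email, url):
    """A comment is held only if this commenter was never approved."""
    return _key(name, email, url) not in approved_users


def approve(name, email, url):
    """Called when the admin approves a commenter's first comment."""
    approved_users.add(_key(name, email, url))
```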
Another idea would be a kind of 'registration' requirement for blog posting ... where non-registered users would get an email when they posted a comment, and would just click a link in the email. Actually, you could bypass their having to click, if you just embed an image in the email ... then they could validate their comment by either:
a) they get the email, and it loads an image: http://myurl.com/commentvalidation.php?comment=commentID ... which shows up in the browser as a graphic that says something along the lines of "By viewing this image you have verified your identity and your comment has been accepted"
b) they get the email in a text-only email client (or with images disabled) and they have to click the link at the bottom (which goes to the same location)
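A sketch of how the validation link might be built and checked, in Python for illustration. The `token` parameter and `SECRET_KEY` are assumptions added here, not part of the suggestion above: a link carrying only the comment ID could be requested by anyone who guesses the ID, so the sketch signs the ID with a per-blog secret.

```python
import hashlib
import hmac

SECRET_KEY = b"change-me"  # hypothetical per-blog secret


def make_token(comment_id):
    """Sign the comment ID so the link cannot be forged by guessing IDs."""
    return hmac.new(SECRET_KEY, str(comment_id).encode(), hashlib.sha256).hexdigest()


def validation_url(comment_id):
    """URL used both for the embedded image and for the text-only link."""
    return ("http://myurl.com/commentvalidation.php"
            f"?comment={comment_id}&token={make_token(comment_id)}")


def is_valid(comment_id, token):
    """Check the token that arrives when the image loads or the link is clicked."""
    return hmac.compare_digest(make_token(comment_id), token)
```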
I like that first-time only moderation.
@Jaykul: Personally I don't like the idea of having to register at each blog where I want to leave a quick comment. I agree that spammers wouldn't bother to figure out how to get past this "bar"... as long as they find enough blogs that are completely open for comments. As soon as a critical mass of blogs has introduced spam countermeasures, they will take the time to figure out how they can resume their "job". Writing a bot that visits each link mentioned in the e-mails it fetches from a fake mailbox is trivial, so that won't help for long, I fear. And another problem with this method: how will you treat trackbacks/pingbacks?
I think the spammers should suffer from the spam countermeasures, not the legitimate users. It should still be possible to have anonymous comments, and it should still be possible to quickly drop a line without the hassle of a registration procedure. This is what makes blogging as interesting as it is.
If the first post was a very minimal registration (email address/name), then as long as they used the same name and email address, they'd be good, and there wouldn't be any more of a registration hassle than most of the blogs I visit already have in requiring an email address. We'd be able to keep it open and usable, but provide a light layer of control over who blogs. Not much, but we don't want/need much.
Where is the problem in creating tons of e-mail accounts? There isn't one. Of course we could start filtering email addresses, but that's just another step that has to be taken. I think there are better methods with fewer side effects.
If you (the spammer) can spare the bandwidth to create tons of email accounts, set up scripts to check them repeatedly for emails and click on all links ... then I have several good ideas:
1) We'll hunt you down using your ISP information, and a subpoena if necessary.
2) We'll sign up your email account (after all, we know it's valid) for some ... uhm .... commercial email ... bucket-loads of it ... if the sheer volume of email doesn't cause problems on its own, maybe paying for the bandwidth of clicking all those links will give you pause.
3) Maybe someone will even send you viruses, after all, we know your email address, and that you automatically process HTML email and click on links, and not everyone is as nice as I am.
4) If you think that's nasty, wait until the guys over at 2600 hear we have the email accounts of spammers ...
Honestly, I think the idea that a spammer would be willing to go to such lengths, and take such risks, just to post in your blog in the hopes of increasing his Google rating before you spot and delete his post ... is a bit egotistical ... but then, maybe my site's just not very important.
The spammers are now willing to submit comment spam manually, so why shouldn't they be willing to write a small script that creates tons of e-mail accounts on one of their domains (or on foreign domains as well) if necessary? What should stop them from throwing together some simple code that polls the list of accounts the script created, with a bot pulling each URL out of the incoming e-mails and feeding it into wget? If that is all they have to do to get around these measures, they will do it.
But if they can get past these procedures this easily, why should we bother to introduce them and burden regular users with additional steps? All this will achieve is that we'll lose users who write comments on our blogs. And I, at least, am not willing to partly destroy my blog this way just to get rid of spammers. I think there are better methods to do this, which have fewer side effects on regular (non-spamming) users.
Do you think they care about the traffic that is generated? They don't at all. Do you think they care whether their own bot-accounts get spammed by others? They don't, why should they? It's just some more URLs their bots have to visit, so what? Viruses? Their bots don't care at all about viruses, because they won't execute them.
Just my 0.02$.
If we see their comments have nothing to do with the entry on the blog, then they are spam; if they do, we pass the comment through, and all subsequent comments with that name and email address continue to work. We don't share the email address on the site, so spammers can't steal it for posting on our blog.
It doesn't matter what their name or email address is. If the comments don't have to do with the entry and have references like "come to my site for free viagra with purchase of 3 bottles" then it's pretty obvious. I don't even need to see the name or email address. It's just the computer that needs to see it for authentication purposes. It would be very difficult to write a script that generated a comment that made sense on my blog entry. It would have to be manual. If they come once and then start using their account to advertise, I ban them and maybe their IP if they do it again.
This seems really easy, and perhaps a bit of work on the blogger's part, but not on the commenter's. Any solution will require some labor by the blogger to maintain. I don't see any real holes in this. Does anyone else? Is WP going to grab hold of this?
@flickerfly: Banning IPs only helps against those who use static IPs. Spammers who post their spam from static IPs are dumb, and the bad thing is that most spammers are not that dumb. Bottom line: IP blacklisting will be counterproductive, at least from my point of view. Chances are good that you'll keep out wanted users instead of spammers.
Ever seen the spam comments here? If you look quickly over them, they seem to fit somehow, not being of the type of "go there and buy my stuff". Such comments are very generic, and can easily be posted by a script.
You ask for the hole in that idea? Well, apart from the problem "how do we handle trackbacks and pingbacks - the *backers won't register at my blog and there is no way of authentication in these 'protocols' at all", there is the biggest one: it's a hassle for users of the blogs (those people who post comments on your posts) - and that's the worst effect an anti-spam method for blogs can bring along. This way we'll sooner or later lose what made blogging what it is now: it won't really be open to everyone anymore.
Again, just my 0.02 $ (and I fear I'll bring up many people against me because I'm arguing this way :))
*backers are the easiest of all to deal with: we just use the moderate once method:
When a (track|ping)back comes in, it gets sent (email?) to me for moderation. I would then add some portion of the URL to my whitelist, and any further *backs from that domain would be let through. (of course, I'd be able to revoke that permission at any time).
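The moderate-once approach for *backs could look roughly like this (an illustrative Python sketch; it assumes the "portion of the URL" is the host name, and all function names are hypothetical):

```python
from urllib.parse import urlparse

# Host names whose first *back the blog owner has approved.
trusted_domains = set()


def domain_of(url):
    """Extract the lower-cased host name, or '' if the URL has none."""
    return (urlparse(url).hostname or "").lower()


def trust(url):
    """Whitelist the domain after moderating its first *back."""
    domain = domain_of(url)
    if domain:
        trusted_domains.add(domain)


def revoke(url):
    """Withdraw the permission at any time."""
    trusted_domains.discard(domain_of(url))


def auto_approve(url):
    """Later *backs from a trusted domain skip moderation."""
    domain = domain_of(url)
    return bool(domain) and domain in trusted_domains
```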
This ends up with having a mix of comment moderation and URL filtering - which is what I promote all the time :) Or did I miss something?
It sounds like we are agreed, then: whitelist the URL for blogs to *back. Something like this in wp_config.php:
(maybe we could combine the *back and email lists, but that would be more confusing for the user I think)
Seems like an easy implementation. Just add a little bit of code to disallow display of a comment/*back until approval and then allow automated approval or deletion according to being matched in the above list.
I'd suggest that if it's in the blacklists that it not even get emailed to the user, but just get logged in a file to check later, just in case a false-positive seems to slip by somehow.
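A rough sketch of the routing described in the last few posts, in Python rather than WordPress's PHP. The list names, their contents, and the `triage` function are all hypothetical, and the logger is a stand-in for the log file mentioned above.

```python
import logging

# Hypothetical lists, kept separate for e-mail addresses and *back domains.
EMAIL_WHITELIST = {"friend@example.com"}
EMAIL_BLACKLIST = {"pills@example.net"}
BACK_WHITELIST = {"weblog.example.org"}
BACK_BLACKLIST = {"spam.example.net"}

log = logging.getLogger("comment_triage")


def triage(kind, key):
    """kind is "comment" (key is an e-mail) or "back" (key is a domain)."""
    if kind == "comment":
        white, black = EMAIL_WHITELIST, EMAIL_BLACKLIST
    else:
        white, black = BACK_WHITELIST, BACK_BLACKLIST
    key = key.strip().lower()
    if key in black:
        # Logged for later review instead of being e-mailed to the admin.
        log.warning("blocked %s from %s", kind, key)
        return "blocked"
    if key in white:
        return "approved"   # displayed immediately
    return "moderate"       # hidden until the admin approves or deletes it
```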
What would be even better would be the ability to send a command to add a commenter to one of the lists through an email kinda like a mailing-list command email, but that might be difficult to manage. Nice feature if it can be done.
> That won't work because if a comment's headers indicate it comes
> from the commenter. We're probably going to go the no-body route.
This can easily be taken care of without removing the body of the notes.
Upon WP installation, ask the admin to pick a special comment-moderation password. Have that word used in a WP header ("X-WP-moderation: Fr,37qp" for example) and then it's a simple exercise on nearly any e-mail client to whitelist any message with that header.
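To make the header idea concrete, here is a small Python sketch using the standard `email` package. The header name and value are the ones from the example above; the function names and subject line are invented for illustration.

```python
from email.message import EmailMessage

MODERATION_HEADER = "X-WP-moderation"
MODERATION_SECRET = "Fr,37qp"  # the admin's chosen password, per the example


def build_notification(to_addr, comment_body):
    """Build the admin notification, body included, tagged with the header."""
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = "Comment awaiting moderation"
    msg[MODERATION_HEADER] = MODERATION_SECRET
    msg.set_content(comment_body)
    return msg


def is_trusted_notification(msg):
    """What a mail filter rule would test before any spam scoring."""
    return msg.get(MODERATION_HEADER) == MODERATION_SECRET
```

In practice the check would live in the mail client or in a procmail or SpamAssassin rule rather than in Python; the function only shows what such a rule has to match.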
Why don't you put this mod into the WordPress release? It's a necessary feature.
b2evolution has an anti-spam feature already implemented.
I seriously don't understand the idea of blacklists. How can they ever hope to work without a high incidence of false positives? And besides, comment spammers aren't trying to attract blog-readers, they are trying to attract search bots. That's the idea behind comment redirects with white listing... </shameless-plug> ;)
Actually when I was working on that hack I found WP does have some natural defenses against comment throttling, however anyone who views the source code can see how to get around it.
If you're interested in stopping comment spam you should check out one of the latest builds of WP. For reasons documented elsewhere, I think centralized blacklists and registration hoops are destined to fail and are at best stopgap solutions.
I was SUPER surprised to learn that there is no option "Only registered users may comment". There ought to be one, and that would solve most problems.