• Resolved Tony G

    (@starbuck)


    I have a blacklist with over 5000 terms. I’ve used code to strip redundant entries from this list, so I just have ‘asp’ and not ‘01asp’, ‘02asp’, etc. The list includes invalid characters which shouldn’t be in a user name/email, improbable character sequences, and recognized abusive text in several languages. I’ve added the numbers 101-999, because while two digits in an ID field is common, three digits indicates a generated address and I’d rather flag it. I understand that this list is extensive and that I’m almost certainly blocking valid names, so instead of simply banning and turning away users that might be registering, I’m providing a link to a Contact page where valid users can request a manual account creation. I will log accounts that get by the filters and enhance the blacklist as required.
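
    The redundancy cleanup and the substring check described here can be sketched roughly as follows. This is a minimal illustration in Python, not Ban Hammer's code or the poster's actual script; the function names `reduce_blacklist` and `first_match` and the sample terms are assumptions made up for the example.

```python
def reduce_blacklist(terms):
    """Drop any term that already contains a shorter kept term as a substring."""
    kept = []
    cleaned = {t.strip().lower() for t in terms if t.strip()}
    # Shortest first, so 'asp' is kept before '01asp' is examined.
    for term in sorted(cleaned, key=lambda t: (len(t), t)):
        if not any(short in term for short in kept):
            kept.append(term)
    return kept


def first_match(value, blacklist):
    """Return the first blacklist term found inside value, or None."""
    lowered = value.lower()
    for term in blacklist:
        if term in lowered:
            return term
    return None


# Hypothetical sample entries, including redundant ones.
raw_terms = ["asp", "01asp", "02asp", "ASP", "longchamp", "cheap longchamp bags"]
# Stand-in for the numeric entries 101-999 mentioned in the post.
raw_terms += [str(n) for n in range(101, 1000)]

blacklist = reduce_blacklist(raw_terms)

print(len(blacklist))                      # redundant entries are gone
print(first_match("user512", blacklist))   # three-digit run is flagged
print(first_match("jane42", blacklist))    # two digits pass
```

    The reduction compares every term against every kept term, which is quadratic, but it only runs when the list is rebuilt, not on each registration.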

    So to my questions:
    1) I know 5000+ terms is extensive but since this is registration and there’s a tight loop processing the registration values against the list, I don’t think this is a performance issue.
    2) Related: I’m going to extend my installation Ban Hammer to support regular expressions. This will significantly reduce the size of the list because one regexp can eliminate the need for many individual filters.
    3) I was going to publish my list at GitHub but then I was thinking that would just tell the bad guys exactly what patterns they should avoid in order to keep doing what they do. Comments?
    4) Is there already a common list that we might use? I’ve downloaded lists from sites with blacklists for bad words, anti-spam, anti-malware, etc. My list is a hybrid of a few of these. I’m wondering if we might find value in an enhancement to Ban Hammer, where a common list is downloaded rather than maintained by each site owner. That’s not a great idea because everyone has their own sensibilities about what they want to block for specific sites. But other people might welcome the option.
    5) Similarly I was thinking about a method where site owners can contribute from their list to a common pool. To prevent abuse (like adding the single letter ‘a’ to the list, which would effectively block all registrations), I’d link each entry to the source domain of the list it came from. This would require regular curation of the list, which I’m not very inclined to do, but in the spirit of FOSS, we all benefit from community contributions, so doing this would be as much for my own site administration as it is for others.
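
    As an illustration of point 2, the 899 literal entries for 101-999 could in principle collapse into one regular expression. The sketch below is an assumption about how that might look in Python, not a description of any existing Ban Hammer feature; the names `THREE_DIGIT`, `blocked_by_literals`, and `blocked_by_regex` are hypothetical.

```python
import re

# Literal entries for the three-digit rule: "101", "102", ..., "999".
literal_terms = [str(n) for n in range(101, 1000)]

# One pattern covering the same substrings: 101-109, 110-199, 200-999.
THREE_DIGIT = re.compile(r"10[1-9]|1[1-9][0-9]|[2-9][0-9]{2}")


def blocked_by_literals(value):
    """Substring test against every literal entry."""
    return any(term in value for term in literal_terms)


def blocked_by_regex(value):
    """Single regex search standing in for all 899 entries."""
    return THREE_DIGIT.search(value) is not None


# Hypothetical sample values; both approaches should agree on each one.
samples = ["jane42", "user512", "bob100", "x1000y", "a2024", "007agent", "tony"]
for s in samples:
    assert blocked_by_literals(s) == blocked_by_regex(s)
    print(s, blocked_by_regex(s))
```

    The pattern is deliberately written to match exactly the same substrings as the literal list, so swapping one for the other does not change which values are flagged.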

    Comments on all of the above are welcome. Please ID your responses with the numbers above. Thanks.
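
    For the shared pool in point 5, a rough sketch of the kind of guard that could stop an entry like ‘a’ from blocking everyone is shown below. Everything in it is hypothetical: the `Contribution` record, the `MIN_TERM_LENGTH` threshold, and the idea of checking against a list of known-good names are assumptions for illustration, not part of any existing plugin or service.

```python
from dataclasses import dataclass

MIN_TERM_LENGTH = 3  # assumed threshold; would need tuning per site


@dataclass(frozen=True)
class Contribution:
    term: str
    source_domain: str  # kept so a domain's entries can be reviewed or revoked


def review(contribution, known_good_names):
    """Return (accepted, reason) for a proposed pool entry."""
    term = contribution.term.strip().lower()
    if len(term) < MIN_TERM_LENGTH:
        return False, "term too short"
    for name in known_good_names:
        if term in name.lower():
            return False, f"would block known-good name {name!r}"
    return True, "ok"


def revoke_domain(pool, domain):
    """Drop every entry contributed by one source domain."""
    return [c for c in pool if c.source_domain != domain]


# Hypothetical data for the example.
good_names = ["tony", "jane42"]
pool = [
    Contribution("longchamp", "example.org"),
    Contribution("asp", "example.net"),
]

print(review(Contribution("a", "example.com"), good_names))
print(review(Contribution("ton", "example.com"), good_names))
print(review(Contribution("longchamp", "example.org"), good_names))
print(len(revoke_domain(pool, "example.net")))
```

    Automated checks like these would reduce, but not remove, the curation work the post mentions.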

Viewing 4 replies - 1 through 4 (of 4 total)
  • Plugin Author Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    You’re probably looking for a plugin like https://wordpress.org/plugins/stop-signup-spam/

    1) I don’t think 5000 is a huge issue, but at that point, I’d be totally rethinking my strategy. A blacklist that large is difficult to maintain and becomes unwieldy.

    2) Regexp can be cool but also will net you a lot more false positives. Be careful there 🙂

    3 and 4) There are some common lists; I mentioned Stop Signup Spam. Stop Forum Spam is a big one, but reading this whole post, I don’t think Ban Hammer is quite right for your need.

    It sounds like you’re not having a troll problem as much as a SPAM problem. That is, you’re getting spammers signing up for your site, right?

    If that’s the case, you may want to look at https://wordpress.org/plugins/stop-spammer-registrations-plugin/

    There’s also the idea of using Akismet – https://stackoverflow.com/questions/5414232/registration-spammer-detection-with-akismet

    And finally, the last barrier I’d use is a plugin to manually approve registrations. It’s work, but it’s better to be safe than sorry.

    ETA: Also https://github.com/splorp/wordpress-comment-blacklist !! I forgot about that one!

    Thread Starter Tony G

    (@starbuck)

    I see the Stop Signup Spam plugin as complementary to Ban Hammer, not a replacement – a user/address that does not get filtered against common text can then be filtered against the email address (or the other way around, doesn’t matter). This is where hooks would help as discussed in my other recent note.

    1-2) Agreed on the size and yes, regex is always a double-edged sword.

    As to trolls vs spam, I’m not trying to solve one problem. We all deal with different issues over time with different sites and software platforms. I’m really trying to build a general purpose WP toolkit, my own super plugin if you will, for keeping out as much abuse as possible. That includes guest-role registration sanitization, banning IPs for excessive 404s, captcha, email verification, honey pots, optional 2FA, banning IP blocks and addresses, subscriber-role filtering via Akismet, language filters, link filtering, first-time comment moderation, and whatever else seems appropriate. Or in short, I’d rather spend my time on site content than fighting bad guys.

    3-4) Given my intent, the Stop Spammer Registrations plugin also seems to fit well in this toolkit, addressing areas not addressed elsewhere. Thanks for that recommendation.

    About splorp/wordpress-comment-blacklist, that’s a great reference! I think that’s the closest thing to an answer for this thread that I’m going to get and I thank you sincerely for remembering it. Notice that the blacklist there is highly redundant. The single line “longchamp” makes all other tests that include that text redundant. For this reason I’ve reduced my list size from 16000 entries to 5000, using code to eliminate all redundancies. The same could probably be done for that list (which I will merge with mine). What I have found is that the blacklist for comments needs to be different than for registration. My blacklist has the single line ‘asp’ and all other redundant lines removed. For registration that makes more sense, but not for comments where the text is commonly found in words like Aspect, Aspiring, Grasp, and ASP.NET.
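
    The difference between the two use cases can be sketched as below: plain substring matching for registration values versus whole-word matching for comment text. This is only an illustration in Python under assumed names (`blocked_registration`, `blocked_comment`); it is not how Ban Hammer or WordPress comment moderation is implemented.

```python
import re


def blocked_registration(value, terms):
    """Substring match: strict, suited to short fields like a user name."""
    lowered = value.lower()
    return any(term in lowered for term in terms)


def blocked_comment(text, terms):
    """Whole-word match: only flags a term that stands alone in the text."""
    return any(
        re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE)
        for term in terms
    )


terms = ["asp"]

print(blocked_registration("grasp99", terms))
print(blocked_comment("Aspiring writers grasp every aspect", terms))
print(blocked_comment("I build sites in ASP.NET", terms))
```

    Whole-word matching lets words such as "aspect" and "grasp" through, but it still flags "ASP.NET" because "ASP" is followed by a period, which is one reason a separate comment list may still be needed.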

    Of course there is such a thing as overkill and false positives. This is why my rejection message links to the Contact page, which is more forgiving and gives visitors an opportunity to request manual registration. I’ll try to automate this as time permits.

    At the end of the day, we just want legitimate users to be able to interact with our sites, quickly, and without being treated like potential criminals. So I’m trying to be as defensive as possible, without being offensive against the active and prospective base of site users.

    After over 20 years of dealing with various kinds of site abuse, spam, etc., I’m just striving to be more proactive than reactive, to save some time, improve tranquility, and have more fun.

    Thanks for your time.

    Plugin Author Ipstenu (Mika Epstein)

    (@ipstenu)

    🏳️‍🌈 Advisor and Activist

    I’m an opponent of “One perfect plugin” for spam/trolls 🙂 Yeah, I’m against it. I’ve been doing this a long time and it doesn’t work. I’ve never met a SINGLE plugin (or tool) that was sustainable and manageable.

    The story behind Ban Hammer is literally that I was facing a moron on a public registration site, and I didn’t want him commenting or registering. So I blocked his email. It happens that the logic worked well for swaths of email, but I still see that as a bonus.

    The additional tools I used at the time included Akismet, Bad Behavior, D’arcy Norman’s anti-splog trick, requiring email verification, requiring registration approval, and a moderation team.

    In the end though, since I wanted to write content and not manage idiots, I disabled the membership stuff. It wasn’t worth it to me. I still use this, but not as much (hence why I said it’s in maintenance mode in the other thread, though pull requests welcome).

    “What I have found is that the blacklist for comments needs to be different than for registration.”

    And that’s why I designed it out of the box to only be emails 🙂 It was too aggro otherwise and I didn’t really feel the need to let people shoot themselves in the foot not understanding that. Less to have to answer on my end.

    Thread Starter Tony G

    (@starbuck)

    Sincere thanks again for the commentary. You’ve responded to all points – beyond this plugin. 🙂

  • The topic ‘LONG blacklist / sharing?’ is closed to new replies.