LONG blacklist / sharing?
I have a blacklist with over 5000 terms. I’ve used code to strip redundant entries from the list, so I have just ‘asp’ rather than ’01asp’, ’02asp’, etc. The list includes characters that shouldn’t appear in a user name or email address, improbable character sequences, and recognized abusive text in several languages. I’ve added the numbers 101-999 because, while two digits in an ID field are common, three digits suggest a generated address and I’d rather flag it. I understand that this list is extensive and that I’m almost certainly blocking some valid names, so instead of simply turning away people who are trying to register, I’m providing a link to a Contact page where legitimate users can request manual account creation. I will log accounts that get past the filters and extend the blacklist as required.
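To illustrate the cleanup step, here is a minimal Python sketch of that kind of pruning. It assumes terms are matched as case-insensitive substrings, so any term that already contains a shorter term is redundant. The function name `prune_redundant` and the sample terms are made up for the example; this is not the code I actually ran.

```python
def prune_redundant(terms):
    """Drop any term that already contains a shorter kept term as a substring."""
    # Normalise case, drop blanks (an empty term would match everything),
    # and remove exact duplicates.
    unique = {t.strip().lower() for t in terms if t.strip()}
    kept = []
    # Shortest first, so a short term is kept before the longer ones it covers.
    for term in sorted(unique, key=lambda t: (len(t), t)):
        if not any(short in term for short in kept):
            kept.append(term)
    return kept


# Hypothetical sample, not taken from the real list.
sample = ["01asp", "asp", "02asp", "ASP", "xyzzy", ""]
print(prune_redundant(sample))
```

The check is quadratic in the number of terms, which should be fine for a one-off cleanup of a few thousand entries.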
So to my questions:
1) I know 5000+ terms is extensive, but since this only runs at registration and a tight loop checks the registration values against the list, I don’t think this is a performance issue.
2) Related: I’m going to extend my installation of Ban Hammer to support regular expressions. This should significantly reduce the size of the list, because one regexp can replace many individual filters.
3) I was going to publish my list on GitHub, but then I thought that would just tell the bad guys exactly which patterns to avoid in order to keep doing what they do. Comments?
4) Is there already a common list that we might use? I’ve downloaded lists from sites with blacklists for bad words, anti-spam, anti-malware, etc. My list is a hybrid of a few of these. I’m wondering if we might find value in an enhancement to Ban Hammer, where a common list is downloaded rather than maintained by each site owner. That’s not a great idea because everyone has their own sensibilities about what they want to block for specific sites. But other people might welcome the option.
5) Similarly, I was thinking about a method where site owners can contribute entries from their own lists to a common pool. To prevent abuse (like adding the single letter ‘a’ to the list, which would effectively block all registrations), I’d link each entry to the domain it came from. This would require regular curation of the list, which I’m not very inclined to do, but in the spirit of FOSS we all benefit from community contributions, so doing this would be as much for my own site administration as for others.
Comments on all of the above are welcome. Please ID your responses with the numbers above. Thanks.
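For points 1 and 2, here is a rough Python sketch of the kind of check I have in mind. It is only an illustration under my own assumptions and is not Ban Hammer's code: the names `PLAIN_TERMS`, `PATTERNS`, and `find_match` are hypothetical, and the sample terms are placeholders. It assumes plain terms are matched as case-insensitive substrings, and it shows one regular expression standing in for the 899 separate entries 101 through 999.

```python
import re

# Hypothetical sample data; the real list is not published here.
PLAIN_TERMS = ["asp", "xyzzy"]

# One pattern replaces the individual entries 101, 102, ..., 999.
# It matches any three consecutive digits that form a number in that range,
# which is what substring-matching each of those entries would do.
PATTERNS = [re.compile(r"10[1-9]|1[1-9]\d|[2-9]\d\d")]


def find_match(value, terms=PLAIN_TERMS, patterns=PATTERNS):
    """Return the first blacklist entry that matches value, or None."""
    lowered = value.lower()
    # The "tight loop": one substring test per plain term.
    for term in terms:
        if term in lowered:
            return term
    # Then one search per regular expression.
    for pattern in patterns:
        if pattern.search(lowered):
            return pattern.pattern
    return None


for candidate in ["Jasper", "user123", "user42", "user100"]:
    print(candidate, "->", find_match(candidate))
```

Returning the matching entry rather than just True/False makes it easier to log why a registration was flagged. Note that ‘Jasper’ is flagged by ‘asp’ here, which is exactly the kind of false positive the Contact page is meant to cover.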