greetings,
there's a plethora of plugins available to combat Referer Spam at the user level. but if you're a server admin who would like to try yet another approach that will protect all of your <virtualhost> blocks, please have a read. :)
this one's for Apache users. the problem with plugins & PHP hacks is that they don't sanitize what gets written to your access_log, a file widely used for statistics, among other things.
Referer Spam can be spawned simultaneously using the same URL-naming schemes coming from multiple ip/hostname addresses.
let's say your machine is hosting 5,000 domains, each with its own <virtualhost> block and access_log & error_log files. at any given time, 1,500 of those can get attacked simultaneously and guess what? the access_log file of every <virtualhost> block under attack will contain that Referer Spam.
will platform plugins block what gets written to access_log? nope.
will prepending a custom script through PHP block what gets written to access_log? nope.
wouldn't you rather keep the Referer Spam out of all 5,000 <virtualhost> access_log files and pipe it into a single log file instead? =)
there's a mod_rewrite technique that does the trick but it's cryptic. using Apache's built-in SetEnvIf, conditional logging & mod_security, you can pipe Referer Spam to a single log file for later deletion and/or analysis. you can even invoke "tail -f /path/to/referer-spam.log" to observe in realtime.
you can implement this globally or per <virtualhost> block from within the Apache configuration file, httpd.conf.
tested under FreeBSD. kudos to you if you get it to work on other OS platforms! =)
Apache 1.3.33 & related modules:
config_log_module
setenvif_module
security_module (mod_security)
##############################################################################
## global settings in httpd.conf
## must be outside of any blocks, e.g. <virtualhost>, <directory>, etc.
##############################################################################
UseCanonicalName On
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" common
CustomLog "/path/to/access_log" common env=!do_not_log
ErrorLog "/path/to/error_log"
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" lamerbouncer
CustomLog "/path/to/referer_spam_log" lamerbouncer env=do_not_log
# for each offendingterm, a SetEnvIf & SecFilterSelective pair must be specified.
SetEnvIf Referer "offendingterm" do_not_log
SecFilterSelective "HTTP_REFERER" "offendingterm"
# for each offendinghost.tld, a SetEnvIf & SecFilterSelective pair must be specified.
SecFilterSelective "HTTP_REFERER" "offendinghost.tld"
SetEnvIf Referer "offendinghost.tld" do_not_log
SecFilterDefaultAction "deny,status:412"
##############################################################################
## matching the above directives to a <virtualhost> block
## substitute 127.0.0.1 with a valid IP address.
## substitute "htdocs" with "public_html" if need be.
##############################################################################
NameVirtualHost 127.0.0.1:80
Listen 127.0.0.1:80
<VirtualHost 127.0.0.1:80>
ServerName domain.tld
DocumentRoot "/path/to/domain.tld/htdocs/"
ErrorLog "/path/to/domain.tld/error_log"
CustomLog "/path/to/domain.tld/access_log" common env=!do_not_log
CustomLog "/path/to/referer_spam_log" lamerbouncer env=do_not_log
</VirtualHost>
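one thing the config above takes for granted: mod_security's filter engine has to be switched on, otherwise the SecFilterSelective lines won't do anything. a minimal sketch, assuming mod_security 1.x (where, as far as i know, the engine is off by default) and assuming it isn't already enabled somewhere else in your httpd.conf:

```apache
# assumption: mod_security 1.x, loaded into Apache as mod_security.c
<IfModule mod_security.c>
    # switch the filtering engine on so the SecFilterSelective rules get evaluated
    SecFilterEngine On
</IfModule>
```

this would go in the global section, outside of any <virtualhost> block, next to the SecFilterSelective lines.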
make the necessary modifications, save your changes to httpd.conf and exit out of your editor.
as root, type: apachectl configtest
if the output is: Syntax OK
then, as root, type: apachectl restart
now it's time to issue a tail -f on the <virtualhost> access_log and the global referer_spam_log at the same time, in separate windows, and observe the beauty of it all. =)
window 1 - as root, type: tail -f /path/to/referer_spam_log
window 2 - as root, type: tail -f /path/to/domain.tld/access_log
a 412 response code will be sent for any request that matches a SecFilterSelective "HTTP_REFERER" filter.
anything that matches a SetEnvIf Referer filter will be piped to /path/to/referer_spam_log, NOT /path/to/domain.tld/access_log.
you can get creative with your filters. so long as you attach "do_not_log" to a SetEnvIf directive, matching requests will be piped to /path/to/referer_spam_log.
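a rough sketch of what "getting creative" might look like. the terms and hostnames below are made-up placeholders, not real offenders. SetEnvIfNoCase is the case-insensitive sibling of SetEnvIf, and both take regular expressions, so several terms can share one line:

```apache
# placeholders only: swap in the terms/hosts you actually see in your logs

# one case-insensitive regex covering several terms
SetEnvIfNoCase Referer "(offendingterm1|offendingterm2|offendingterm3)" do_not_log

# escape the dots so they match literally; the optional group also catches subdomains
SetEnvIfNoCase Referer "^https?://([^/]+\.)?offendinghost\.tld" do_not_log

# matching SecFilterSelective lines are still needed if you also want the 412
SecFilterSelective "HTTP_REFERER" "(offendingterm1|offendingterm2|offendingterm3)"
SecFilterSelective "HTTP_REFERER" "offendinghost\.tld"
```

the broader the regex, the better the odds of flagging a legit referer, so it's probably worth watching the spam log for false positives for a while.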
use accordingly & test before you deploy it globally. :)