• Unknown robot (identified by ‘bot*’) 67,478+181 4.06 GB
    Unknown robot (identified by ‘spider’) 1,954+100 154.07 MB
    Unknown robot (identified by ‘*bot’) 1,517+207 52.22 MB
    Unknown robot (identified by ‘robot’) 1,368+21 88.71 MB
    Unknown robot (identified by ‘crawl’) 827+104 39.89 MB

Viewing 2 replies - 1 through 2 (of 2 total)
  • Some research ideas.

    How to block bots using .htaccess

    @draigus check the access logs on the server and see if there is anything more than “bot” in the User-Agent string.
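
    If you want a quick way to do that check, here is a minimal sketch that tallies the full User-Agent strings containing “bot”. It assumes the server writes Apache's combined log format; the function name `count_bot_agents` and the sample lines are made up for illustration.

```python
import re
from collections import Counter

# Assumption: combined log format, where each line ends with
# "referer" "user-agent". Adjust the pattern if your host logs differently.
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')


def count_bot_agents(lines, needle="bot"):
    """Tally full User-Agent strings that contain `needle` (case-insensitive)."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match and needle.lower() in match.group(1).lower():
            counts[match.group(1)] += 1
    return counts


# Illustrative sample lines, not real traffic.
sample = [
    '203.0.113.5 - - [10/Oct/2013:13:55:36 +0000] "GET / HTTP/1.1" 200 2326 "-" "ExampleBot/1.0 (+http://example.com/bot)"',
    '203.0.113.5 - - [10/Oct/2013:13:55:40 +0000] "GET /feed/ HTTP/1.1" 200 5120 "-" "ExampleBot/1.0 (+http://example.com/bot)"',
    '198.51.100.7 - - [10/Oct/2013:13:56:02 +0000] "GET / HTTP/1.1" 200 2326 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '192.0.2.44 - - [10/Oct/2013:13:57:11 +0000] "GET / HTTP/1.1" 200 2326 "http://example.com/" "Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0"',
]

if __name__ == "__main__":
    # Print the most frequent matching User-Agent strings first.
    for agent, hits in count_bot_agents(sample).most_common():
        print(hits, agent)
```

    To run it against a real log, pass an open file instead of `sample`; the log's location depends on your host. The longer strings it prints are what you would pick block-list names from.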

    Here is a blocking example that uses the U-A to send unwanted bots straight home.

    RewriteEngine On
    # Match unwanted crawlers by User-Agent substring, case-insensitively
    RewriteCond %{HTTP_USER_AGENT} (Ahrefs|Baidu|BlogScope|Butterfly|DCPbot|discoverybot|domain|Ezooms|ImageSearcherFree) [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} (ips-agent|linkdex|MJ12|Netcraft|NextGenSearchBot|SISTRIX|Sogou|soso|TweetmemeBot|Unwind|Yandex) [NC]
    # Redirect matching requests to the client's own loopback address
    RewriteRule ^ http://127.0.0.1/ [R=302,L]

    You could just return a 403 with the [F] flag (RewriteRule ^ - [F]), but I’ve found that telling them to go search themselves makes them stop coming after a while, whereas a 403 does not.

    Note, however, that if you block “bot” you will also block Googlebot and Bingbot, which you probably want indexing your site. So if you cannot get a more specific string out of the User-Agent, you’ll need to use other tools.
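
    To see why the bare “bot” pattern is too broad, here is a small sketch that tests User-Agent strings the same way a case-insensitive substring rule would. The block list borrows a few names from the rules above; the allow list and the `classify` helper are assumptions for illustration, not part of those rules. Keep in mind that a User-Agent can be spoofed, so an allow list like this is only a first filter.

```python
import re

# A few names taken from the block list in the rules above.
BLOCKED = re.compile(r"Ahrefs|MJ12|Ezooms|SISTRIX", re.IGNORECASE)
# Hypothetical allow list: crawlers you probably want indexing the site.
ALLOWED = re.compile(r"Googlebot|bingbot", re.IGNORECASE)
# The generic pattern from the stats report.
GENERIC = re.compile(r"bot", re.IGNORECASE)


def classify(user_agent):
    """Return "allow" or "block", checking the allow list first."""
    if ALLOWED.search(user_agent):
        return "allow"
    if BLOCKED.search(user_agent) or GENERIC.search(user_agent):
        return "block"
    return "allow"


GOOGLEBOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BINGBOT = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"
AHREFS = "Mozilla/5.0 (compatible; AhrefsBot/5.0; +http://ahrefs.com/robot/)"
BROWSER = "Mozilla/5.0 (Windows NT 6.1; rv:24.0) Gecko/20100101 Firefox/24.0"

if __name__ == "__main__":
    for agent in (GOOGLEBOT, BINGBOT, AHREFS, BROWSER):
        # Show whether the bare "bot" pattern hits, and the final decision.
        print(bool(GENERIC.search(agent)), classify(agent), agent)
```

    The bare pattern hits Googlebot and Bingbot as well as AhrefsBot, which is why the allow list has to be checked before the generic match.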

    I’m an enthusiast of ZBBlock, as it provides defense against much more than unwanted bots, is free, is actively supported by its developer and his community, and can provide cover for any PHP on your server, not just WordPress. If you follow the step-by-step instructions, it is painless to install. One tip: the standard advice is to load the hook (copy-paste a string, nothing complicated) in wp-load.php. Do it in wp-config.php instead and you won’t have to worry about re-editing after WP updates.

  • The topic ‘how to block Unknown robot (identified by 'bot*')’ is closed to new replies.