New Bad Bot (9 posts)

  1. Mark (podz)
    Support Maven
    Posted 7 years ago #

    http://www.openwebspider.org/

    Ignores robots.txt

    82.55.69.126
    OpenWebSpider/0.3

  2. vkaryl
    Member
    Posted 7 years ago #

    Ugh. Thanks for the heads up Podz.

  3. Lorelle
    Member
    Posted 7 years ago #

    Podz, remind us how to deal with these bad bots. If we can't bounce them out in robots.txt, then how?

  4. whooami
    Member
    Posted 7 years ago #

    .htaccess:

    order allow,deny
    allow from all
    deny from 82.55.69.126
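
    The directives above use the Apache 2.2 access-control syntax. On Apache 2.4 and later those directives are deprecated and only available through mod_access_compat, so the same rule might look roughly like the sketch below. It assumes mod_authz_core and mod_authz_host are loaded and that the server's AllowOverride setting permits AuthConfig in .htaccess; neither assumption comes from this thread.

    ```apache
    # Apache 2.4+ sketch: allow everyone except the single address reported above.
    <RequireAll>
        Require all granted
        Require not ip 82.55.69.126
    </RequireAll>
    ```

    Either form only blocks this one address, so it stops working as soon as the bot connects from somewhere else.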

  5. James
    Happiness Engineer
    Posted 7 years ago #

    Yes, but wouldn't the bot eventually come back with a new IP, leaving the blocked address to be reassigned to a legitimate visitor?

  6. vkaryl
    Member
    Posted 7 years ago #

    Probably. It's almost a Catch-22.... *sigh*

  7. whooami
    Member
    Posted 7 years ago #

    .htaccess:

    SetEnvIf User-Agent ^OpenWebSpider/0\.3 keep_out
    order allow,deny
    allow from all
    deny from env=keep_out
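
    For Apache 2.4 and later, a rough equivalent of the rule above is sketched below. It assumes mod_setenvif and mod_authz_core are loaded and that AllowOverride permits FileInfo and AuthConfig in .htaccess. The variable name keep_out is taken from the rule above; the broader, case-insensitive pattern is an assumption added here to cover every version of the spider rather than 0.3 alone.

    ```apache
    # Flag any request whose User-Agent starts with "OpenWebSpider",
    # regardless of letter case or version number.
    SetEnvIfNoCase User-Agent "^OpenWebSpider" keep_out

    # Apache 2.4+ sketch: allow everyone except requests flagged above.
    <RequireAll>
        Require all granted
        Require not env keep_out
    </RequireAll>
    ```

    Matching on User-Agent follows the bot across IP changes, but the header is supplied by the client, so this only works while the spider keeps identifying itself honestly.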

  8. vkaryl
    Member
    Posted 7 years ago #

    Don't you love it? "eviltime (dot) com"....

    Thanks whooami, I'll have to try that one. I'm not very good with .htaccess stuff....

  9. shen139
    Member
    Posted 7 years ago #

    As posted here: http://www.tamba2.org.uk/T2/archives/2005/04/02/bad-bot/

    No, openwebspider 0.3 (and older versions) doesn’t read robots.txt!
    I’m working on it! Version 0.3 is not public yet and I intend to implement robots.txt support as soon as possible, maybe in the next release!
    I’m sorry for any problem caused by the spider and thanks for the feedback!

    Shen139
