• Resolved alexandergish

    (@alexandergish)


    I’m trying to run the RavenCrawler on a client’s website, but iThemes Security is preventing it from happening. I looked into whitelisting the ip ranges that AmazonAWS uses, but they’re provided in a JSON format, and it doesn’t look like iThemes supports that.

    Is there a way that I could allow the RavenCrawler in the .htaccess file to give it root level permissions to crawl the entire site? If so, how would I go about doing that? This is a fairly urgent matter that I need to resolve ASAP.

    Please help me out.

    https://wordpress.org/plugins/better-wp-security/

Viewing 8 replies - 1 through 8 (of 8 total)
  • @alexandergish

    What makes you think it is the iTSec plugin stopping RavenCrawler ?
    Does it work when you temporarily deactivate the iTSec plugin ?

    dwinden

    Thread Starter alexandergish

    (@alexandergish)

    I disabled the security plugin and the crawl was working, but I can’t leave it disabled because the client’s user base can’t log in when it’s not active.

    @alexandergish

    Ok, I see.

    It’s probably something in the .htaccess file that is blocking it.
    Reactivate the iTSec plugin (if not already).

    And then if possible please post the content of the .htaccess file while obfuscating any sensitive info.

    Also confirm you are using the iTSec plugin release 5.2.1

    dwinden

    Thread Starter alexandergish

    (@alexandergish)

    # BEGIN iThemes Security – Do not modify or remove this line
    # iThemes Security Config Details: 2
    # Enable HackRepair.com’s blacklist feature – Security > Settings > Banned Users > Default Blacklist
    # Start HackRepair.com Blacklist
    RewriteEngine on
    # Start Abuse Agent Blocking
    RewriteCond %{HTTP_USER_AGENT} “^Mozilla.*Indy” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Mozilla.*NEWT” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^$” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Maxthon$” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^SeaMonkey$” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Acunetix” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^binlar” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^BlackWidow” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Bolt 0” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^BOT for JCE” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Bot mailto\:craftbot@yahoo\.com” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^casper” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^checkprivacy” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^ChinaClaw” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^clshttp” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^cmsworldmap” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^comodo” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Custo” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Default Browser 0” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^diavol” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^DIIbot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^DISCo” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^dotbot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Download Demon” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^eCatch” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^EirGrabber” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^EmailCollector” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^EmailSiphon” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^EmailWolf” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Express WebPictures” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^extract” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^ExtractorPro” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^EyeNetIE” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^feedfinder” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^FHscan” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^FlashGet” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^flicky” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^g00g1e” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^GetRight” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^GetWeb\!” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Go\!Zilla” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Go\-Ahead\-Got\-It” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^grab” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^GrabNet” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Grafula” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^harvest” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^HMView” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^ia_archiver” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Image Stripper” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Image Sucker” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^InterGET” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Internet Ninja” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^InternetSeer\.com” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^jakarta” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Java” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^JetCar” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^JOC Web Spider” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^kanagawa” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^kmccrew” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^larbin” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^LeechFTP” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^libwww” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Mass Downloader” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^microsoft\.url” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^MIDown tool” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^miner” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Mister PiX” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^MSFrontPage” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Navroad” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^NearSite” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Net Vampire” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^NetAnts” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^NetSpider” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^NetZIP” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^nutch” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Octopus” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Offline Explorer” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Offline Navigator” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^PageGrabber” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Papa Foto” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^pavuk” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^pcBrowser” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^PeoplePal” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^planetwork” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^psbot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^purebot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^pycurl” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^RealDownload” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^ReGet” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Rippers 0” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^sitecheck\.internetseer\.com” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^SiteSnagger” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^skygrid” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^SmartDownload” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^sucker” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^SuperBot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^SuperHTTP” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Surfbot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^tAkeOut” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Teleport Pro” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Toata dragostea mea pentru diavola” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^turnit” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^vikspider” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^VoidEYE” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Web Image Collector” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Web Sucker” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebAuto” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebBandit” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebCopier” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebFetch” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebGo IS” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebLeacher” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebReaper” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebSauger” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Website eXtractor” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Website Quester” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebStripper” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebWhacker” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WebZIP” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Wget” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Widow” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WPScan” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WWW\-Mechanize” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^WWWOFFLE” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Xaldon WebSpider” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^Zeus” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “^zmeu” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “360Spider” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “AhrefsBot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “CazoodleBot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “discobot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “EasouSpider” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “ecxi” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “GT\:\:WWW” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “heritrix” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “HTTP\:\:Lite” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “HTTrack” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “ia_archiver” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “id\-search” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “IDBot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “Indy Library” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “IRLbot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “ISC Systems iRc Search 2\.1” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “LinksCrawler” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “LinksManager\.com_bot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “linkwalker” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “lwp\-trivial” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “MFC_Tear_Sample” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “Microsoft URL Control” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “Missigua Locator” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “MJ12bot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “panscient\.com” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “PECL\:\:HTTP” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “PHPCrawl” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “PleaseCrawl” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “SBIder” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “SearchmetricsBot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “SeznamBot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “Snoopy” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “Steeler” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “URI\:\:Fetch” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “urllib” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “Web Sucker” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “webalta” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “WebCollage” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “Wells Search II” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “WEP Search” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “XoviBot” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “YisouSpider” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “zermelo” [NC,OR]
    RewriteCond %{HTTP_USER_AGENT} “ZyBorg” [NC,OR]
    # End Abuse Agent Blocking
    # Start Abuse HTTP Referrer Blocking
    RewriteCond %{HTTP_REFERER} “^https?://(?:[^/]+\.)?semalt\.com” [NC,OR]
    RewriteCond %{HTTP_REFERER} “^https?://(?:[^/]+\.)?kambasoft\.com” [NC,OR]
    RewriteCond %{HTTP_REFERER} “^https?://(?:[^/]+\.)?savetubevideo\.com” [NC]
    # End Abuse HTTP Referrer Blocking
    RewriteRule ^.* – [F,L]
    # End HackRepair.com Blacklist, http://pastebin.com/u/hackrepair

    # Enable the hide backend feature – Security > Settings > Hide Login Area > Hide Backend
    RewriteRule ^(/)?wwlogin/?$ /wp-login.php [QSA,L]

    # Protect System Files – Security > Settings > System Tweaks > System Files
    <files .htaccess>
    <IfModule mod_authz_core.c>
    Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
    Order allow,deny
    Deny from all
    </IfModule>
    </files>
    <files readme.html>
    <IfModule mod_authz_core.c>
    Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
    Order allow,deny
    Deny from all
    </IfModule>
    </files>
    <files readme.txt>
    <IfModule mod_authz_core.c>
    Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
    Order allow,deny
    Deny from all
    </IfModule>
    </files>
    <files install.php>
    <IfModule mod_authz_core.c>
    Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
    Order allow,deny
    Deny from all
    </IfModule>
    </files>
    <files wp-config.php>
    <IfModule mod_authz_core.c>
    Require all denied
    </IfModule>
    <IfModule !mod_authz_core.c>
    Order allow,deny
    Deny from all
    </IfModule>
    </files>

    # Disable Directory Browsing – Security > Settings > System Tweaks > Directory Browsing
    Options -Indexes

    <IfModule mod_rewrite.c>
    RewriteEngine On

    # Protect System Files – Security > Settings > System Tweaks > System Files
    RewriteRule ^wp-admin/includes/ – [F]
    RewriteRule !^wp-includes/ – [S=3]
    RewriteCond %{SCRIPT_FILENAME} !^(.*)wp-includes/ms-files.php
    RewriteRule ^wp-includes/[^/]+\.php$ – [F]
    RewriteRule ^wp-includes/js/tinymce/langs/.+\.php – [F]
    RewriteRule ^wp-includes/theme-compat/ – [F]

    # Filter Request Methods – Security > Settings > System Tweaks > Request Methods
    RewriteCond %{REQUEST_METHOD} ^(TRACE|DELETE|TRACK) [NC]
    RewriteRule ^.* – [F]

    # Filter Suspicious Query Strings in the URL – Security > Settings > System Tweaks > Suspicious Query Strings
    RewriteCond %{QUERY_STRING} \.\.\/ [NC,OR]
    RewriteCond %{QUERY_STRING} ^.*\.(bash|git|hg|log|svn|swp|cvs) [NC,OR]
    RewriteCond %{QUERY_STRING} etc/passwd [NC,OR]
    RewriteCond %{QUERY_STRING} boot\.ini [NC,OR]
    RewriteCond %{QUERY_STRING} ftp\: [NC,OR]
    RewriteCond %{QUERY_STRING} http\: [NC,OR]
    RewriteCond %{QUERY_STRING} https\: [NC,OR]
    RewriteCond %{QUERY_STRING} (\<|%3C).*script.*(\>|%3E) [NC,OR]
    RewriteCond %{QUERY_STRING} mosConfig_[a-zA-Z_]{1,21}(=|%3D) [NC,OR]
    RewriteCond %{QUERY_STRING} base64_encode.*\(.*\) [NC,OR]
    RewriteCond %{QUERY_STRING} ^.*(%24&x).* [NC,OR]
    RewriteCond %{QUERY_STRING} ^.*(127\.0).* [NC,OR]
    RewriteCond %{QUERY_STRING} ^.*(globals|encode|localhost|loopback).* [NC,OR]
    RewriteCond %{QUERY_STRING} ^.*(request|concat|insert|union|declare).* [NC]
    RewriteCond %{QUERY_STRING} !^loggedout=true
    RewriteCond %{QUERY_STRING} !^action=jetpack-sso
    RewriteCond %{QUERY_STRING} !^action=rp
    RewriteCond %{HTTP_COOKIE} !^.*wordpress_logged_in_.*$
    RewriteCond %{HTTP_REFERER} !^http://maps\.googleapis\.com(.*)$
    RewriteRule ^.* – [F]

    # Reduce Comment Spam – Security > Settings > System Tweaks > Comment Spam
    RewriteCond %{REQUEST_METHOD} POST
    RewriteCond %{REQUEST_URI} /wp-comments-post\.php$
    RewriteCond %{HTTP_USER_AGENT} ^$ [OR]
    RewriteCond %{HTTP_REFERER} !^https?://(([^/]+\.)?231\.124|jetpack\.wordpress\.com/jetpack-comment)(/|$) [NC]
    RewriteRule ^.* – [F]
    </IfModule>
    # END iThemes Security – Do not modify or remove this line

    # HTTP AUTH

    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ – [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>

    # END WordPress

    Thread Starter alexandergish

    (@alexandergish)

    It looks like the versioning on the theme is 5.1.1

    @alexandergish

    Ok, try and disable the Enable HackRepair.com’s blacklist feature setting in the Banned Users section of the iTSec plugin Settings page.

    I haven’t seen ‘RavenCrawler’ in the list but who knows …

    dwinden

    @alexandergish

    No news, good news ?
    If so please mark this topic as ‘resolved’.

    dwinden

    Thread Starter alexandergish

    (@alexandergish)

    It went through, no clue what changed. Will chalk it up to a resolved mystery.

Viewing 8 replies - 1 through 8 (of 8 total)

The topic ‘How do I allow AmazonAWS to crawl?’ is closed to new replies.