  • @internationils

    Can you present any evidence to back up your observation?

    Even if the robots.txt file is inaccessible, Googlebot and other crawlers will still be able to crawl and index your website…

    dwinden

    Thread Starter internationils

    (@internationils)

    Sure…
    – both were inaccessible.
    – disabled iThemes: still inaccessible
    – went back to the minimal WP .htaccess file: works
    – enabled iThemes: still works
    – after securing the site, both are blocked again. I did have the blocklists enabled.

    Do you want the .htaccess file?

    @internationils

    both were inaccessible.

    So the URL http://www.yourdomain.com/robots.txt returns a 403 (Forbidden)?

    Are you using an actual robots.txt file that exists in the root of your WordPress install?
    (The reason I ask is that WordPress uses a virtual robots.txt by default.)
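
    (A quick way to see the virtual file in action from a shell, using a hypothetical domain and path; the exact output varies by WordPress version and site settings:)

    $ # No physical file exists in the WordPress root...
    $ ls /path/to/wordpress/robots.txt
    ls: cannot access '/path/to/wordpress/robots.txt': No such file or directory
    $ # ...yet WordPress still answers the URL with generated content.
    $ curl -s http://www.yourdomain.com/robots.txt
    User-agent: *
    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php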

    dwinden

    Thread Starter internationils

    (@internationils)

    Both files returned a 403. I’m using the virtual WP robots.txt file, and Yoast SEO is generating the sitemap.xml.

    If there is anything I can do to help debug, please let me know.

    @internationils

    Thank you for your feedback.

    Did you actually try to manually access the URL:

    http://www.yourdomain.com/robots.txt

    from a browser (Google Chrome or Mozilla Firefox)?

    Or are those 403s only reported in Google Search Console?

    Please upload your .htaccess file after obscuring any sensitive info.

    dwinden

    Thread Starter internationils

    (@internationils)

    Yes, I manually tried them both from Firefox. Google Search Console also showed they were inaccessible. I didn’t try curl or wget, but I would expect the same result. I’ll post the .htaccess.

    Thread Starter internationils

    (@internationils)

    Same error with wget:

    $ wget http://mysite.com/robots.txt http://mysite.com/sitemap.xml
    --2016-08-02 12:07:05--  http://mysite.com/robots.txt
    Resolving mysite.com (mysite.com)...
    Connecting to mysite.com (mysite.com)... connected.
    HTTP request sent, awaiting response... 403 Forbidden
    2016-08-02 12:07:05 ERROR 403: Forbidden.
    
    --2016-08-02 12:07:05--  http://mysite.com/sitemap.xml
    Reusing existing connection to mysite.com:80.
    HTTP request sent, awaiting response... 403 Forbidden
    2016-08-02 12:07:05 ERROR 403: Forbidden.
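
    (Worth noting: wget’s default user agent begins with “Wget”, which the blacklist posted below matches with its “^Wget” rule, so a 403 from wget alone doesn’t prove browsers are blocked. A hypothetical way to isolate the user-agent rules; if they are the only trigger, the second request should succeed:)

    $ # Default UA is "Wget/1.x", which matches the blacklist's "^Wget" rule.
    $ wget -S --spider http://mysite.com/robots.txt
      HTTP/1.1 403 Forbidden
    $ # Retry with a browser-like UA; a 200 here points at the UA rules.
    $ wget -S --spider -U "Mozilla/5.0 (X11; Linux x86_64)" http://mysite.com/robots.txt
      HTTP/1.1 200 OK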

    Thread Starter internationils

    (@internationils)

    # BEGIN iThemes Security - Do not modify or remove this line
    # iThemes Security Config Details: 2
        # Enable HackRepair.com's blacklist feature - Security > Settings > Banned Users > Default Blacklist
        # Start HackRepair.com Blacklist
        RewriteEngine on
        # Start Abuse Agent Blocking
        RewriteCond %{HTTP_USER_AGENT} "^Mozilla.*Indy" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Mozilla.*NEWT" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^$" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Maxthon$" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^SeaMonkey$" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Acunetix" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^binlar" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^BlackWidow" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Bolt 0" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^BOT for JCE" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Bot mailto\:craftbot@yahoo\.com" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^casper" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^checkprivacy" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^ChinaClaw" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^clshttp" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^cmsworldmap" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^comodo" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Custo" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Default Browser 0" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^diavol" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^DIIbot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^DISCo" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^dotbot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Download Demon" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^eCatch" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^EirGrabber" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^EmailCollector" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^EmailSiphon" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^EmailWolf" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Express WebPictures" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^extract" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^ExtractorPro" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^EyeNetIE" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^feedfinder" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^FHscan" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^FlashGet" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^flicky" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^g00g1e" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^GetRight" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^GetWeb\!" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Go\!Zilla" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Go\-Ahead\-Got\-It" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^grab" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^GrabNet" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Grafula" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^harvest" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^HMView" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^ia_archiver" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Image Stripper" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Image Sucker" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^InterGET" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Internet Ninja" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^InternetSeer\.com" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^jakarta" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Java" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^JetCar" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^JOC Web Spider" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^kanagawa" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^kmccrew" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^larbin" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^LeechFTP" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^libwww" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Mass Downloader" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^microsoft\.url" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^MIDown tool" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^miner" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Mister PiX" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^MSFrontPage" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Navroad" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^NearSite" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Net Vampire" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^NetAnts" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^NetSpider" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^NetZIP" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^nutch" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Octopus" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Offline Explorer" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Offline Navigator" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^PageGrabber" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Papa Foto" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^pavuk" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^pcBrowser" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^PeoplePal" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^planetwork" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^psbot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^purebot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^pycurl" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^RealDownload" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^ReGet" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Rippers 0" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^sitecheck\.internetseer\.com" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^SiteSnagger" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^skygrid" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^SmartDownload" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^sucker" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^SuperBot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^SuperHTTP" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Surfbot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^tAkeOut" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Teleport Pro" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Toata dragostea mea pentru diavola" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^turnit" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^vikspider" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^VoidEYE" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Web Image Collector" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Web Sucker" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebAuto" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebBandit" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebCopier" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebFetch" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebGo IS" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebLeacher" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebReaper" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebSauger" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Website eXtractor" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Website Quester" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebStripper" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebWhacker" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WebZIP" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Wget" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Widow" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WPScan" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WWW\-Mechanize" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^WWWOFFLE" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Xaldon WebSpider" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^Zeus" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "^zmeu" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "360Spider" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "AhrefsBot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "CazoodleBot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "discobot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "EasouSpider" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "ecxi" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "GT\:\:WWW" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "heritrix" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "HTTP\:\:Lite" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "HTTrack" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "ia_archiver" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "id\-search" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "IDBot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "Indy Library" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "IRLbot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "ISC Systems iRc Search 2\.1" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "LinksCrawler" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "LinksManager\.com_bot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "linkwalker" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "lwp\-trivial" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "MFC_Tear_Sample" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "Microsoft URL Control" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "Missigua Locator" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "MJ12bot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "panscient\.com" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "PECL\:\:HTTP" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "PHPCrawl" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "PleaseCrawl" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "SBIder" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "SearchmetricsBot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "SeznamBot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "Snoopy" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "Steeler" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "URI\:\:Fetch" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "urllib" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "Web Sucker" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "webalta" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "WebCollage" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "Wells Search II" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "WEP Search" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "XoviBot" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "YisouSpider" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "zermelo" [NC,OR]
        RewriteCond %{HTTP_USER_AGENT} "ZyBorg" [NC,OR]
        # End Abuse Agent Blocking
        # Start Abuse HTTP Referrer Blocking
        RewriteCond %{HTTP_REFERER} "^https?://(?:[^/]+\.)?semalt\.com" [NC,OR]
        RewriteCond %{HTTP_REFERER} "^https?://(?:[^/]+\.)?kambasoft\.com" [NC,OR]
        RewriteCond %{HTTP_REFERER} "^https?://(?:[^/]+\.)?savetubevideo\.com" [NC]
        # End Abuse HTTP Referrer Blocking
        RewriteRule ^.* - [F,L]
        # End HackRepair.com Blacklist, http://pastebin.com/u/hackrepair
    
        # Protect System Files - Security > Settings > System Tweaks > System Files
        <files .htaccess>
            <IfModule mod_authz_core.c>
                Require all denied
            </IfModule>
            <IfModule !mod_authz_core.c>
                Order allow,deny
                Deny from all
            </IfModule>
        </files>
        <files readme.html>
            <IfModule mod_authz_core.c>
                Require all denied
            </IfModule>
            <IfModule !mod_authz_core.c>
                Order allow,deny
                Deny from all
            </IfModule>
        </files>
        <files readme.txt>
            <IfModule mod_authz_core.c>
                Require all denied
            </IfModule>
            <IfModule !mod_authz_core.c>
                Order allow,deny
                Deny from all
            </IfModule>
        </files>
        <files install.php>
            <IfModule mod_authz_core.c>
                Require all denied
            </IfModule>
            <IfModule !mod_authz_core.c>
                Order allow,deny
                Deny from all
            </IfModule>
        </files>
        <files wp-config.php>
            <IfModule mod_authz_core.c>
                Require all denied
            </IfModule>
            <IfModule !mod_authz_core.c>
                Order allow,deny
                Deny from all
            </IfModule>
        </files>
    
        # Disable Directory Browsing - Security > Settings > System Tweaks > Directory Browsing
        Options -Indexes
    
        <IfModule mod_rewrite.c>
            RewriteEngine On
    
            # Protect System Files - Security > Settings > System Tweaks > System Files
            RewriteRule ^wp-admin/includes/ - [F]
            RewriteRule !^wp-includes/ - [S=3]
            RewriteCond %{SCRIPT_FILENAME} !^(.*)wp-includes/ms-files.php
            RewriteRule ^wp-includes/[^/]+\.php$ - [F]
            RewriteRule ^wp-includes/js/tinymce/langs/.+\.php - [F]
            RewriteRule ^wp-includes/theme-compat/ - [F]
    
            # Disable PHP in Uploads - Security > Settings > System Tweaks > Uploads
            RewriteRule ^wp\-content/uploads/.*\.(?:php[1-6]?|pht|phtml?)$ - [NC,F]
    
            # Filter Request Methods - Security > Settings > System Tweaks > Request Methods
            RewriteCond %{REQUEST_METHOD} ^(TRACE|DELETE|TRACK) [NC]
            RewriteRule ^.* - [F]
            # Filter Suspicious Query Strings in the URL - Security > Settings > System Tweaks > Suspicious Query Strings
            RewriteCond %{QUERY_STRING} \.\.\/ [NC,OR]
            RewriteCond %{QUERY_STRING} ^.*\.(bash|git|hg|log|svn|swp|cvs) [NC,OR]
            RewriteCond %{QUERY_STRING} etc/passwd [NC,OR]
            RewriteCond %{QUERY_STRING} boot\.ini [NC,OR]
            RewriteCond %{QUERY_STRING} ftp\:  [NC,OR]
            RewriteCond %{QUERY_STRING} http\:  [NC,OR]
            RewriteCond %{QUERY_STRING} https\:  [NC,OR]
            RewriteCond %{QUERY_STRING} (\<|%3C).*script.*(\>|%3E) [NC,OR]
            RewriteCond %{QUERY_STRING} mosConfig_[a-zA-Z_]{1,21}(=|%3D) [NC,OR]
            RewriteCond %{QUERY_STRING} base64_encode.*\(.*\) [NC,OR]
            RewriteCond %{QUERY_STRING} ^.*(%24&x).* [NC,OR]
            RewriteCond %{QUERY_STRING} ^.*(127\.0).* [NC,OR]
            RewriteCond %{QUERY_STRING} ^.*(globals|encode|localhost|loopback).* [NC,OR]
            RewriteCond %{QUERY_STRING} ^.*(request|concat|insert|union|declare).* [NC]
            RewriteCond %{QUERY_STRING} !^loggedout=true
            RewriteCond %{QUERY_STRING} !^action=jetpack-sso
            RewriteCond %{QUERY_STRING} !^action=rp
            RewriteCond %{HTTP_COOKIE} !^.*wordpress_logged_in_.*$
            RewriteCond %{HTTP_REFERER} !^http://maps\.googleapis\.com(.*)$
            RewriteRule ^.* - [F]
    
            # Filter Non-English Characters - Security > Settings > System Tweaks > Non-English Characters
            RewriteCond %{QUERY_STRING} ^.*(%0|%A|%B|%C|%D|%E|%F).* [NC]
            RewriteRule ^.* - [F]
        </IfModule>
    
        # Enable the hide backend feature - Security > Settings > Hide Login Area > Hide Backend
        RewriteRule ^(/)?wplogin/?$ /wp-login.php [QSA,L]
        RewriteRule ^(/)?wp-register-php/?$ /wplogin?action=register [QSA,L]
    # END iThemes Security - Do not modify or remove this line
    
    # BEGIN WordPress
    <IfModule mod_rewrite.c>
    RewriteEngine On
    RewriteBase /
    RewriteRule ^index\.php$ - [L]
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . /index.php [L]
    </IfModule>
    
    # END WordPress
    
    # BEGIN MainWP
    
    # END MainWP
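
    (How the blacklist blocks: all of the OR’d RewriteCond lines above feed a single catch-all “RewriteRule ^.* - [F,L]”, so any matching user agent or referrer receives a 403 for every URL, robots.txt and sitemap.xml included. A hypothetical manual tweak for testing only, which the plugin may overwrite when its settings are saved:)

    # Hypothetical exemption sketch: placed directly after "RewriteEngine on"
    # at the top of the blacklist, [S=1] skips the blacklist's single blocking
    # RewriteRule for these two URLs, while the WordPress rewrite rules
    # further down still run, so the virtual robots.txt keeps working.
    RewriteRule ^(robots\.txt|sitemap\.xml)$ - [S=1]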
    Thread Starter internationils

    (@internationils)

    Not enabling the Banned Users options (Default Blacklist – Enable HackRepair.com’s blacklist feature … and … Ban Lists) keeps it working, so it’s something there.

    @internationils

    OK, so then it must be the HackRepair.com blacklist.

    Try disabling just that feature and see what happens.

    Note that wget and empty user agents are blocked by the HackRepair.com blacklist. Both are known to cause trouble in some cases.

    Your Ban Lists are probably empty, because I don’t see anything related to them in your .htaccess file.

    dwinden

    @internationils

    If you require no further assistance, please mark this topic as ‘resolved’.

    dwinden

    Thread Starter internationils

    (@internationils)

    I would like to know which rule is causing it so I can report it to the blacklist. Preferably, iThemes should also detect when these two important files are blocked and at least warn the user. According to quite a few posts I’ve read, dropping a sitemap has a big impact on Google rankings.

    Is there any logging I can use to see which rule was triggered, or why robots.txt was blocked, to help track down the problem?
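
    (Apache’s rewrite tracing can show exactly which RewriteCond/RewriteRule fires, though it requires server-config access rather than .htaccess. A sketch, assuming Apache 2.4 and an illustrative log location; trace output lands in the error log:)

    # In the vhost or main server config (not valid in .htaccess), Apache 2.4:
    LogLevel alert rewrite:trace3
    # Apache 2.2 and earlier used dedicated directives instead:
    # RewriteLog "/var/log/apache2/rewrite.log"
    # RewriteLogLevel 3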

    Thanks

    @internationils

    I think you should know that this is not a general iTSec plugin issue.
    I’m unable to reproduce it, so it must be something specific to the user agent or hosting environment on your side.

    You could try temporarily commenting out the line in the .htaccess file that blocks empty user agents, like this:

    # RewriteCond %{HTTP_USER_AGENT} "^$" [NC,OR]

    But this is a bit of a wild guess.
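
    (A quicker check of the same hypothesis, with a hypothetical domain: curl’s -A "" removes the User-Agent header entirely, which is exactly what the “^$” condition matches. A 403 for the first request and a 200 for the second would confirm that rule is the trigger:)

    $ # No User-Agent header at all; this is what the "^$" condition matches.
    $ curl -s -o /dev/null -w "%{http_code}\n" -A "" http://mysite.com/robots.txt
    $ # Normal browser-like agent for comparison.
    $ curl -s -o /dev/null -w "%{http_code}\n" -A "Mozilla/5.0" http://mysite.com/robots.txt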

    Otherwise, try to determine which user agent is being detected.
    To do so, follow these steps:

    • First make sure the HackRepair.com blacklist is disabled.
    • Use Firefox with the Firebug extension installed.
    • In the Firebug console, click the little arrow next to the Net menu option and enable it.
    • Then access the robots.txt file from the Firefox browser.
    • Since the blacklist is disabled, this should render the content of the virtual robots.txt file.
      In the Firebug console (Net → All) a single GET request for robots.txt will be displayed.

    • Click the + sign in front of the request.
    • This shows the response and request headers. Look for the value of the User-Agent header among the request headers. (A command-line alternative follows after these steps.)
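
    (The command-line alternative, with a hypothetical domain: curl’s -v echoes the request headers it actually sends, so the User-Agent line can be read directly; the version shown is illustrative:)

    $ # -v prints outgoing request headers (prefixed ">") on stderr.
    $ curl -v http://www.yourdomain.com/robots.txt 2>&1 | grep '^> User-Agent'
    > User-Agent: curl/7.47.0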

    Good luck!

    dwinden

    @internationils

    Just another thought: if the website is behind a proxy, the proxy server might be changing the user agent…

    dwinden

    @internationils

    Have you been able to identify the offending RewriteCond line?

    dwinden

  • The topic ‘robots.txt and sitemap.xml blocked’ is closed to new replies.