Blocking "fake" Googlebot (Support, Plugin: Wordfence Security - Firewall & Malware Scan)

  • I received an email from Google today that Googlebot couldn’t access my site:

    Over the last 24 hours, Googlebot encountered 67 errors while attempting to access your robots.txt. To ensure that we didn’t crawl any pages listed in that file, we postponed our crawl. Your site’s overall robots.txt error rate is 65.7%.

    I checked my site and found that Wordfence was blocking several IPs that WHOIS reported as belonging to Google. The most frequent was 173.194.99.134.

    So that Googlebot’s attempts to read my site don’t overload the CPU, I have changed my settings under Firewall Rules to “Treat Google like any other Crawler” and unchecked “Block fake Google crawlers”.

    I don’t know why Wordfence was blocking this IP, but I don’t think it’s fake, as I have watched the live activity on my site. I hope this can be corrected so I can change my settings back and block the crawlers that are fake.

    https://wordpress.org/plugins/wordfence/

Viewing 15 replies - 1 through 15 (of 15 total)
  • Plugin Author WFSupport

    (@wfsupport)

    Hi

    Thanks for using Wordfence and sorry that you are seeing this error.

    I’ve escalated this issue to our dev team to look at. Can you provide a screenshot of the settings page and check the permissions on your robots.txt file?

    Thanks!

    tim

    EnduringEpilepsy

    (@enduringepilepsy)

    FYI – As I said, I’ve changed the Settings so that my site will work with the error right now…

    • I unchecked Immediately block Fake Google Crawlers
    • I selected Treat Google like any other crawler

    WHOIS of IP that was being blocked
    Current Firewall Settings
    Wordfence Activity Log

    Yoast SEO says I don’t have a robots.txt created but Google sent me a link which gave me this ??? Robots.txt

    Don’t know if this might play a role either, but I copied and saved a txt version of my .htaccess file: .htaccess. I edit it in Yoast and know it’s a bit messy with the additions made automatically from settings in W3TC.

    Thank you for your help

    Katrina

    EnduringEpilepsy

    (@enduringepilepsy)

    Maybe this? Google Forum on Robots.txt

    Code in my .htaccess that I think may be related:

    <IfModule mod_headers.c>
        Header append Vary User-Agent env=!dont-vary
    </IfModule>

    I just need to figure this out. Literally Google is crawling my site constantly right now. I have a hit in my live activity every minute or two from either 173.194.99.134 or 74.125.178.17. And I have no clue why…

    BTW- I changed settings to Security Level: Lockdown after receiving email about Bash vulnerability. My hosting is aware of issue and supposed to be addressing it, but I took the step for extra precaution.

    Thomas O.

    (@thomas-o)

    I do not believe this IP is a legitimate Google Bot:
    173.194.99.134
    http://googlewebmastercentral.blogspot.com/2006/09/how-to-verify-googlebot.html

    It might be one of their other services though.
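
    For reference, the check described at that link is a reverse DNS lookup on the IP, followed by a forward lookup on the returned hostname that must give back the same IP. A rough Python sketch of that idea is below. It is not how Wordfence implements its check; the function names are made up for illustration, and the accepted hostname suffixes (googlebot.com and google.com) are an assumption based on Google’s published guidance.

```python
import socket
from typing import Callable, List

# Assumed hostname suffixes for genuine Google crawlers (per Google's guidance).
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_dns(ip: str) -> str:
    """Return the PTR hostname for ip; raises OSError if there is none."""
    return socket.gethostbyaddr(ip)[0]


def forward_dns(hostname: str) -> List[str]:
    """Return the IPv4 addresses that hostname resolves to."""
    return socket.gethostbyname_ex(hostname)[2]


def is_verified_googlebot(
    ip: str,
    reverse: Callable[[str], str] = reverse_dns,
    forward: Callable[[str], List[str]] = forward_dns,
) -> bool:
    """Hypothetical helper: True only if ip passes both DNS checks."""
    try:
        host = reverse(ip).rstrip(".").lower()
        # Step 1: the PTR name must sit under a Google crawler domain.
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # Step 2: the name must resolve back to the same IP, otherwise
        # anyone controlling their own reverse DNS could fake step 1.
        return ip in forward(host)
    except OSError:
        # Lookup failures (no PTR record, NXDOMAIN) count as "not verified".
        return False
```

    A WHOIS result that says “Google” only shows the address block belongs to Google. It would not by itself pass a check like this, which may be why traffic from other Google services gets flagged as a fake crawler.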

    EnduringEpilepsy

    (@enduringepilepsy)

    Thanks Thomas, but that doesn’t solve the problem… This is still killing my CPU, and I’m actually getting quite a few different IPs now. That’s just the most prominent in my activity log.

    I do use PageSpeed and have for quite a while now, but I never had a problem like this until this week. So something has changed. Hopefully Tim and I can find the solution quickly.

    EnduringEpilepsy

    (@enduringepilepsy)

    Ok – this is getting a whole lot worse. I looked and saw that there was a login from Mountain View, United States. The IP address is 74.125.19.14. The WHOIS says it’s Google. But I know that PageSpeed is set up through the server and does not require a login.

    I tried blocking the IP and Wordfence says I can’t block my own IP. So whoever or whatever this is has clearly made changes.

    I changed my password (50 characters from 1Password) last night and still received 3 emails this morning about blocked login attempts from IPs 74.125.19.14, 74.125.16.6 and 74.125.41.4, using ‘admin’ as the username.

    Please help!

    EnduringEpilepsy

    (@enduringepilepsy)

    I looked at the activity on my hosting (GoDaddy) and saw that almost all of the activity says it’s from Wordfence and my admin-ajax.php file ???

    Recent Activity

    Each hit is coming back with this as the “referring” URL, which makes no sense to me: http://www.enduringepilepsy.com/wp-admin/admin.php?page=WordfenceBlockedIPs

    I’m desperate here guys. I gotta get whatever this is fixed so it isn’t ruling my site. I’m trying to track down as much information as possible about the potential source of the issue. But I’m reaching a point where I’m a little lost in the maze and unsure where to go without causing more problems.

    Help… please. Thank you.

    indigokj

    (@indigokj)

    I have something similar happening. The offending IP address is our own server IP, which doesn’t make sense.

    Plugin Author WFSupport

    (@wfsupport)

    @indigo I know there is an option in the “how Wordfence gets its IPs” settings that specifically addresses IPs that look like they come from your own server.

    @ee Still researching this.

    tim

    indigokj

    (@indigokj)

    @wfsupport, I don’t think you understand. ALL traffic to our sites is being reported as being from our server’s IP (216.194.164.150). We no longer see IP addresses for users’ actual networks.

    Plugin Author WFSupport

    (@wfsupport)

    This is the option I’m referring to

    http://www.filedropper.com/screenshot-howwfgetsips

    tim

    EnduringEpilepsy

    (@enduringepilepsy)

    I contacted PageSpeed and received this response:

    Thank you for trying PageSpeed Service and for reporting the issue.

    To implement X-Forwarded-For on your domain, we suggest the following steps:

    1. Edit the wp-config.php file located in the root of your site. At the top of the file (after the opening <?php tag), add this code snippet:

    // ** WordPress x-forwarded-for ip fix ** //
    if (isset($_SERVER['HTTP_X_FORWARDED_FOR'])) {
        // The header can hold several comma-separated addresses; the first is the original client.
        $xffaddrs = explode(',', $_SERVER['HTTP_X_FORWARDED_FOR']);
        $_SERVER['REMOTE_ADDR'] = trim($xffaddrs[0]);
    }

    2. Now, disable the rewriters as per the first screenshot attached.

    3. Then, flush caches as per the second screenshot and then turn the rewriters on again.

    It will take 24 to 48 hours to propagate successfully.

    http://www.seeyar.fr/pss-proxy-publisher-failure/ – for screenshots

    Please let us know if you face any other issues.

    I just added the code to my wp-config a few minutes ago so I’m waiting to see if this helps.

    I don’t know what exactly X-Forwarded-For is, or whether this was caused by an upgrade my hosting did, but I’ve got my fingers crossed this helps.
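
    For anyone else wondering what the header does: X-Forwarded-For is added by a proxy to record the address of the visitor it is forwarding for. Without it, a site behind a proxy sees every request as coming from the proxy’s own IP, which appears to be the situation in this thread. Below is a rough Python sketch of the same logic as the wp-config snippet, with one extra precaution: the header is only trusted when the request actually arrives from a known proxy, since any visitor can send a forged X-Forwarded-For. The function name and the trusted_proxies parameter are hypothetical, and the addresses in the usage note are documentation placeholders, not real PageSpeed ranges.

```python
import ipaddress
from typing import Iterable, Optional


def client_ip(
    remote_addr: str,
    xff_header: Optional[str],
    trusted_proxies: Iterable[str],
) -> str:
    """Hypothetical helper: pick the address to treat as the visitor's IP."""
    trusted = [ipaddress.ip_network(net) for net in trusted_proxies]
    peer = ipaddress.ip_address(remote_addr)

    # Ignore the header unless the direct peer is one of our own proxies.
    if not xff_header or not any(peer in net for net in trusted):
        return remote_addr

    # Like the PHP snippet: take the first comma-separated entry.
    first = xff_header.split(",")[0].strip()
    try:
        ipaddress.ip_address(first)  # reject garbage values
    except ValueError:
        return remote_addr
    return first
```

    With a placeholder proxy range of 203.0.113.0/24, a call such as client_ip("203.0.113.10", "198.51.100.23, 203.0.113.10", ["203.0.113.0/24"]) gives back the visitor address 198.51.100.23, while the same header arriving from an address outside that range is ignored.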

    Plugin Author WFSupport

    (@wfsupport)

    Not sure but let me know so I can document for other users. Great catch 🙂

    tim

    EnduringEpilepsy

    (@enduringepilepsy)

    No luck. I’m still getting hits constantly from the IP 74.125.19.14.

    My latest activity file on GoDaddy says the referring URL is:

    http://www.enduringepilepsy.com/wp-admin/admin.php?page=WordfenceActivity

    And the User Agent is:

    Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/37.0.2062.124 Safari/537.36

    I’m stumped and still a bit desperate here…

    I am experiencing the same issue. My website has been under bot attack for 10 days.

    I have tried everything from Wordfence to other plugins; all failed.
    I use my dedicated server’s firewall; still useless.
    The logs show abnormal traffic and some SQL injection attempts.

    I tried blocking from .htaccess etc.; all failed. I got attacks from various IPs, including many hits with Google referrers. What I finally tried was Cloudflare Pro: switching it to “under attack” mode blocks the attack and keeps my website up. But this is not a true solution.
    http://tinypic.com/view.php?pic=ve83zo&s=8# (cloudflare stats)

    The attack targets the domain name only; when I copy the site content to another domain, there is no problem.

  • The topic ‘Blocking "fake" Googlebot’ is closed to new replies.