• Recently it looks like Wordfence is blocking legitimate Google IPs, or at least that is how it seems:

    Wordfence Live Activity: Blocking fake Googlebot at IP 66.249.67.161 (Reason: Fake Google crawler automatically blocked)

    All my settings are identical and I haven’t changed anything recently:
    Immediately block fake Google crawlers: CHECKED
    How should we treat Google’s crawlers: Verified Google crawlers have unlimited access to this site

    Searching today’s logs for some of my websites, I can find these blocked Mountain View IPs, which a simple IP location query associates with Google Inc:
    66.249.64.161
    66.249.64.167
    66.249.67.161
    66.249.74.107
    66.249.78.43
    66.249.78.51
    66.249.78.177

    Aren’t these actually REAL Google bots?

    https://wordpress.org/plugins/wordfence/

Viewing 13 replies - 1 through 13 (of 13 total)
  • Plugin Author WFMattR

    (@wfmattr)

    This may be a DNS issue on your host, as it sounds like the reverse lookup is not working correctly, but I am not positive yet — it is working correctly on our test servers, but we will investigate further too.

    For now, you may need to temporarily set “How should we treat Google’s crawlers” to the second option, “Anyone claiming to be Google has unlimited access”, to make sure they can crawl the site properly. Disabling “Immediately block fake Google crawlers” may be necessary too.

    If you have SSH access to the server, can you try running the command:
    dig -x 66.249.67.161
    and post the results here?
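
The reverse lookup mentioned above is usually paired with a forward lookup, so that a PTR record is only trusted if its hostname resolves back to the same address. The following is a minimal sketch of that check, not Wordfence's actual code: the function names and the `GOOGLE_SUFFIXES` list are assumptions made for illustration, and the resolver functions are passed in as parameters so the logic can be tried without network access.

```python
import socket

# Hostname suffixes treated as Google-owned here; this list is an assumption
# for illustration, not Wordfence's actual configuration.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the list of IPv4 addresses for hostname (empty on failure)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_verified_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Forward-confirmed reverse DNS check for a claimed Googlebot address."""
    hostname = reverse(ip)
    if hostname is None:
        return False  # no PTR record, or the resolver failed
    hostname = hostname.rstrip(".").lower()
    if not hostname.endswith(GOOGLE_SUFFIXES):
        return False  # PTR points outside the expected domains
    # The forward lookup must map back to the same address, otherwise
    # anyone controlling their own reverse zone could fake the PTR record.
    return ip in forward(hostname)
```

Note that this sketch returns `False` when the resolver itself fails, so a DNS outage on the host would make a genuine crawler look fake, which matches the symptom described in this thread.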

    -Matt R

    Thread Starter OviLiz

    (@ovib)

    The mentioned websites are on 3 completely different servers:

    root@webserver1 [~]# dig -x 66.249.67.161

    ; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.37.rc1.el6_7.4 <<>> -x 66.249.67.161
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 50557
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;161.67.249.66.in-addr.arpa. IN PTR

    ;; ANSWER SECTION:
    161.67.249.66.in-addr.arpa. 21599 IN PTR crawl-66-249-67-161.googlebot.com.

    ;; Query time: 101 msec
    ;; SERVER: 8.8.8.8#53(8.8.8.8)
    ;; WHEN: Wed Oct 21 09:09:07 2015
    ;; MSG SIZE rcvd: 91

    root@webserver2:~# dig -x 66.249.67.161

    ; <<>> DiG 9.8.1-P1 <<>> -x 66.249.67.161
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 23979
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;161.67.249.66.in-addr.arpa. IN PTR

    ;; ANSWER SECTION:
    161.67.249.66.in-addr.arpa. 21527 IN PTR crawl-66-249-67-161.googlebot.com.

    ;; Query time: 21 msec
    ;; SERVER: 8.8.8.8#53(8.8.8.8)
    ;; WHEN: Wed Oct 21 09:10:02 2015
    ;; MSG SIZE rcvd: 91

    [root@webserver3 ~]# dig -x 66.249.67.161

    ; <<>> DiG 9.8.2rc1-RedHat-9.8.2-0.37.rc1.el6_7.4 <<>> -x 66.249.67.161
    ;; global options: +cmd
    ;; Got answer:
    ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1934
    ;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

    ;; QUESTION SECTION:
    ;161.67.249.66.in-addr.arpa. IN PTR

    ;; ANSWER SECTION:
    161.67.249.66.in-addr.arpa. 21599 IN PTR crawl-66-249-67-161.googlebot.com.

    ;; Query time: 50 msec
    ;; SERVER: 8.8.8.8#53(8.8.8.8)
    ;; WHEN: Wed Oct 21 09:21:47 2015
    ;; MSG SIZE rcvd: 91

    Plugin Author WFMattR

    (@wfmattr)

    Thanks for checking that. Yesterday on one host for me, “dig -x” was returning no PTR records for the above Google addresses, and the “status” said SERVFAIL instead of NOERROR, but it cleared up about 4 hours after my initial reply above. I think it was a temporary DNS issue, but I can’t tell for sure: reverse lookups for other addresses were working at the time on the same host.

    If you have Live Traffic enabled in Wordfence, you should be able to see if Google is currently being identified correctly by looking at the “Google Crawlers” tab.

    The dev team is looking into this, to add a fallback if reverse DNS lookups are not working, since that seems to be the most likely issue here. Our internal reference number for this is FB1034.
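
One possible shape for such a fallback is to treat a failed lookup as inconclusive rather than as proof of a fake crawler. The sketch below is only an illustration under that assumption and is not the fix tracked as FB1034; `Verdict`, `classify_crawler` and `should_block` are hypothetical names, and the forward confirmation step is left out for brevity.

```python
import socket
from enum import Enum


class Verdict(Enum):
    VERIFIED = "verified"
    FAKE = "fake"
    UNKNOWN = "unknown"  # DNS could not answer; do not block on this


def ptr_lookup(ip):
    """Return the PTR hostname for ip; raises OSError on any lookup failure."""
    return socket.gethostbyaddr(ip)[0]


def classify_crawler(ip, lookup=ptr_lookup):
    """Classify a claimed Googlebot address without blocking on DNS errors."""
    try:
        hostname = lookup(ip)
    except OSError:
        # Resolver failure (for example SERVFAIL or a timeout): inconclusive.
        return Verdict.UNKNOWN
    hostname = hostname.rstrip(".").lower()
    if hostname.endswith((".googlebot.com", ".google.com")):
        return Verdict.VERIFIED
    return Verdict.FAKE


def should_block(verdict):
    """Block only when the lookup positively identified a non-Google host."""
    return verdict is Verdict.FAKE
```

The trade-off is that an address with no PTR record at all is also left unblocked in this sketch, because every lookup failure is treated the same way.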

    If Google crawlers still aren’t being identified today, let me know. If it does happen again, could you also run “dig -x” for the IP that has a problem, as soon as you notice it?

    -Matt R

    Thread Starter OviLiz

    (@ovib)

    Thank you. Actually, today I haven’t seen any blocked Mountain View IPs.

    Plugin Author WFMattR

    (@wfmattr)

    Ok, thanks. Let us know if it comes up again, and the dev team will still be looking at adding a fallback for future reverse lookup issues, mentioned above.

    -Matt R

    Wordfence is blocking all 66.249.75.x GoogleBot visitors on my site for a few days now.

    Just one example below…. [notwithstanding lack of productive src allowances by WP]

    src=”http://itnorthwest.ca/img/dig-google.jpg”

    That said, another Site at the same Host has no problems allowing all 66.249.65’s.

    – I always seem to be baby-sitting WF. Unfortunate really.

    Plugin Author WFMattR

    (@wfmattr)

    @anna: Thanks for the report, and sorry for the trouble. Was the ‘dig -x’ command run on the same host that had this problem, and was the Google IP the same as one that was blocked on that host? (The PTR record shows the expected result in your screenshot, but the TTL is 21599 seconds, just under the 6-hour default that Google uses, so the result wasn’t served from a local cache.)
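
The TTL reasoning above can be sketched in a few lines: a cached record counts down from its full TTL, so a value right at the top suggests a fresh answer. The full value of 21600 seconds is an assumption inferred from the 21599 seen in this thread, and `ttl_from_answer` and `looks_uncached` are hypothetical helper names.

```python
from datetime import timedelta


def ttl_from_answer(line):
    """Extract the TTL in seconds from a dig answer line (name TTL class type data)."""
    fields = line.split()
    return int(fields[1])


def looks_uncached(ttl, full_ttl=21600, slack=5):
    """True if the TTL is within `slack` seconds of the assumed full value."""
    return full_ttl - ttl <= slack


line = "161.67.249.66.in-addr.arpa. 21599 IN PTR crawl-66-249-67-161.googlebot.com."
ttl = ttl_from_answer(line)
print(timedelta(seconds=ttl))   # 5:59:59
print(looks_uncached(ttl))      # True
```

By the same test, the 21527 reported for webserver2 earlier in the thread would count as a cached answer, since it had already counted down by more than a minute.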

    Changing these two options should allow the crawlers through, on sites that are having this problem, for now — it may be that the bad results have been cached, and will likely clear up:
    Set “How should we treat Google’s crawlers” to the second option, “Anyone claiming to be Google has unlimited access”
    Disable “Immediately block fake Google crawlers”

    If any IPs are still blocked, can you post screenshots of them as well? Then I can verify that there isn’t a new issue causing this. (We haven’t seen this problem on our sites, and the one host where I saw the DNS error doesn’t have a web site, so I don’t have much else I can check on our end. The dev team is already looking at handling failed DNS lookups, in case that is the issue.)

    -Matt R

    Thread Starter OviLiz

    (@ovib)

    Looks like it is happening again…

    Plugin Author WFMattR

    (@wfmattr)

    Can you post the IPs that are showing up, and try “dig -x” on those IPs again, from the affected host(s)?

    Also, do you still have the options set that I had listed above? They should stop the blocking from happening, if the host is having trouble with the reverse lookups, or other possible issues.

    -Matt R

    Thread Starter OviLiz

    (@ovib)

    Hi Matt,
    Sorry, but I just unblocked the Google Mountain View IP without taking notes.

    However, in the meantime I have changed the specific settings back. It was just a temporary change in the end, right? 🙂

    Plugin Author WFMattR

    (@wfmattr)

    Yes, it should be only temporary, but if there are consistent problems with this on your host, the fix may have to wait for a future release of Wordfence.

    If you can still try “dig -x” on the affected host for the affected IP, or a few of the other IPs you mentioned in the original post above, that might still help. If any of them come up with a missing “ANSWER SECTION” and/or a status other than “NOERROR”, it would help to confirm the issue.

    If any of the Google IPs that you look up have issues, please copy and paste the entire output from dig, like you did before. (If they’re successful and the “answer section” has a googlebot.com address, then we don’t need the output.)
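
For anyone checking several IPs at once, the two conditions described above (a status other than NOERROR, or no googlebot.com name in the answer section) can be tested against saved `dig -x` output. This is a minimal sketch that assumes the default text format shown earlier in the thread; `summarize_dig` and `needs_full_output` are hypothetical names.

```python
import re


def summarize_dig(output):
    """Return (status, ptr_targets) parsed from the text output of `dig -x`."""
    status_match = re.search(r"status:\s*([A-Z]+)", output)
    status = status_match.group(1) if status_match else None
    targets = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line.startswith(";"):
            continue  # skip comments, headers and the question section
        fields = line.split()
        # Answer lines look like: name TTL class type data
        if len(fields) >= 5 and fields[3] == "PTR":
            targets.append(fields[4])
    return status, targets


def needs_full_output(output):
    """True when the lookup failed or returned no googlebot.com PTR record."""
    status, targets = summarize_dig(output)
    if status != "NOERROR" or not targets:
        return True
    return not any(t.rstrip(".").endswith(".googlebot.com") for t in targets)
```

If `needs_full_output` returns `True` for a lookup, that is a result worth pasting in full.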

    -Matt R

    Plugin Author WFMattR

    (@wfmattr)

    @cristian Balan and @anna:

    We are trying to narrow down the cause of this issue — can you tell me which hosting providers you are using, and whether the sites that were affected are on a VPS or shared hosting?

    Also, are the sites using CloudFlare or any other reverse proxy? This is most likely unrelated, but I want to check just in case.

    -Matt R

    Digital Ocean. Not shared.

The topic ‘Blocking fake Googlebot’ is closed to new replies.