WordPress.org

YandexBot, wp-cron and a word to the wise (1 post)

  1. spherical
    Member
    Posted 3 years ago

    Just a heads-up. Our server got hosed yesterday: accessing a blog front page returned a white page reading Error Connecting To Database. The outage lasted several minutes. Server load, normally at 0.06 or lower, was at 6.47 when I checked. NOC said that Apache had 500 connections open, the CPU was at 100%, and the server had run out of physical RAM and nearly run out of swap space before the situation finally began to die off.

    The time at which NOC support said the spike occurred coincided with my own access of the blog, when I got the error. Huh? What is this, Schrödinger's cat?

    A check of the server logs indicated, however, that wp-cron.php had started a job less than 3 minutes before the spike. Researching this, I found lots of reports of high CPU use from this section of WP. It was apparent that wp-cron was the cause of the spike. But why?

    Going back into the logs to see what else might stick out, I saw that just prior to that there was an access from yandex.ru with YandexBot as the user-agent. Hmmm. Researching that, I found reports that this bot doesn't play well: it reportedly doesn't read robots.txt at all, and many users reported it sucking down 2GB of bandwidth per day. It tries to access /categories/*, /tags/*, and short URLs (which cause a redirect), not just actual posts and pages, a practice that could detrimentally affect your duplicate content rating. There are also indications that it is related to WP comment spam blooms.

    I installed the wp-cron-dashboard plugin to check which valid cron jobs are scheduled to run; each of the 3 blogs has 8 listed in a 24-hour period.

    Checking yesterday's logs again, there were 305(!) runs of wp-cron, a serious hit on server resources. All but a few (the 24 that are scheduled) were preceded by accesses from yandex.ru YandexBot, usually within 5 lines and 2 minutes ahead of the wp-cron run; most were nearly immediate. There's your smoking gun!
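
    For anyone who wants to run the same check on their own logs, here is a rough Python sketch of the correlation described above. It assumes Apache's "combined" log format; the function name, the 5-line window default and the sample lines are made up for illustration and are not taken from my actual logs.

```python
import re

# Assumption: Apache "combined" log format. The request path is pulled from
# the quoted request field, e.g. "POST /wp-cron.php?doing_wp_cron HTTP/1.0".
REQUEST_RE = re.compile(r'"(?:GET|POST|HEAD) (?P<path>\S+)')


def count_cron_runs(lines, bot="yandex", window=5):
    """Return (total wp-cron runs, runs within `window` lines after a bot hit)."""
    total = 0
    after_bot = 0
    last_bot_line = None  # index of the most recent line mentioning the bot
    for i, line in enumerate(lines):
        match = REQUEST_RE.search(line)
        path = match.group("path") if match else ""
        if "wp-cron.php" in path:
            total += 1
            if last_bot_line is not None and i - last_bot_line <= window:
                after_bot += 1
        elif bot in line.lower():
            last_bot_line = i
    return total, after_bot


# Invented sample lines, for illustration only.
SAMPLE_LINES = [
    '203.0.113.7 - - [01/Jan/2011:10:00:00 +0000] "GET /tags/foo/ HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; YandexBot/3.0)"',
    '192.0.2.1 - - [01/Jan/2011:10:00:01 +0000] "POST /wp-cron.php?doing_wp_cron HTTP/1.0" 200 0 "-" "WordPress/3.0"',
    '198.51.100.9 - - [01/Jan/2011:11:00:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
    '192.0.2.1 - - [01/Jan/2011:12:00:00 +0000] "POST /wp-cron.php?doing_wp_cron HTTP/1.0" 200 0 "-" "WordPress/3.0"',
]

if __name__ == "__main__":
    print(count_cron_runs(SAMPLE_LINES, window=1))
```

    To run it against a real log, pass the open file object in place of SAMPLE_LINES. It only counts by line distance; checking the 2-minute gap as well would mean parsing the timestamps.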

    Do your server and yourself a favor and ban it. The only way I know of is in .htaccess:

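    # Flag any request whose User-Agent contains "Yandex" (case-insensitive)
    SetEnvIfNoCase User-Agent "Yandex.*" badbot

    # Order/Allow/Deny is Apache 2.2 syntax; Apache 2.4 needs mod_access_compat for it
    <Limit GET POST PUT HEAD>
    Order Allow,Deny
    Allow from all
    # block specific badbot by user agent
    Deny from env=badbot
    </Limit>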

    Since this was added to my loooooooong list of bad bots, the Apache access logs show it getting it in the teeth with 403s, and the server is loafing along at 0.02. My error logs are getting bigger, but that's a small tradeoff for the recovered stability and speed.
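
    If you want to confirm the ban is actually biting, something like the following Python sketch tallies the response codes served to matching user-agents. It again assumes the "combined" log format; the helper name and sample lines are hypothetical. The regex mirrors the SetEnvIfNoCase pattern above (unanchored, case-insensitive).

```python
import re
from collections import Counter

# Mirrors SetEnvIfNoCase User-Agent "Yandex.*": unanchored, case-insensitive.
BADBOT_RE = re.compile(r"Yandex.*", re.IGNORECASE)

# Assumption: Apache "combined" log format, where the status code follows the
# quoted request and the User-Agent is the last quoted field on the line.
LINE_RE = re.compile(
    r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def badbot_statuses(lines):
    """Count HTTP status codes returned to user-agents matching BADBOT_RE."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line.strip())
        if match and BADBOT_RE.search(match.group("agent")):
            counts[match.group("status")] += 1
    return counts


# Invented sample lines, for illustration only.
SAMPLE_LOG = [
    '203.0.113.7 - - [02/Jan/2011:10:00:00 +0000] "GET /tags/foo/ HTTP/1.1" 403 210 "-" "Mozilla/5.0 (compatible; YandexBot/3.0)"',
    '203.0.113.8 - - [02/Jan/2011:10:00:05 +0000] "GET /?p=12 HTTP/1.1" 403 - "-" "yandexbot"',
    '198.51.100.9 - - [02/Jan/2011:10:01:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    print(dict(badbot_statuses(SAMPLE_LOG)))
```

    Anything other than 403 in that tally would suggest the rule isn't being applied to some requests, for instance methods not listed in the Limit block.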
