WordPress.org

[resolved] Load spikes - any ideas? (28 posts)

  1. jazbek
    Member
    Posted 3 years ago #

    I am a developer for two sites. On both of these servers I am having problems with the exact same symptoms:

    At various times throughout the week, the CPU load will suddenly start shooting up exponentially until Apache is unable to serve pages. This clears up when Apache is restarted. Neither server is short on RAM or using too much swap.

    Site 1:
    dedicated server with 2GB ram
    ~410k http requests per day
    ~1700 visits/day
    not using multisite
    w3 total cache (using memcached)

    Site 2:
    Rackspace cloud server instance with 4GB of ram
    Database is on another instance also with 4GB of ram
    uses multisite (5 blogs via subdomain)
    ~560k http requests per day
    ~5000 visits per day
    WP Super Cache

    Both sites have wp 3.0.1 installed. One of them is a new site (built in 3.0.1). The other was running 2.7 for over a year just fine, and this problem didn't start happening until immediately after I upgraded to 3.0.1 in September. Neither have any plugins in common. Any ideas where I should look to see where the problem might be coming from?

    I have been pulling my hair out over this for almost two months, and am hoping for some insight.

  2. jazbek
    Member
    Posted 3 years ago #

    Can a moderator move this to the WP-Advanced forum? Thanks.

  3. mrmist
    Forum Janitor
    Posted 3 years ago #

    Go on then.

  4. maxk
    Member
    Posted 3 years ago #

    Have you checked your Apache error logs, MySQL error logs, and set up slow query logging?

    Sounds like a table locking / data processing issue.

  5. jazbek
    Member
    Posted 3 years ago #

    Nothing in the apache error logs. I will check the MySQL logs.

  6. lochinvar
    Member
    Posted 3 years ago #

    We were running into the same issue for a number of months. Web logs pointed to requests that were constantly being redirected and would eventually max out the redirect tries. At some point this would bring down our server, usually just after we published a set of posts. Well, four posts, but it was always at 9pm PST and we would see the server go down about 25 minutes after that.

    We turned on caching and the problem disappeared. I know that I should go back and figure out why the redirects were maxing out but now that the server has stopped crashing it is low priority.

  7. lochinvar
    Member
    Posted 3 years ago #

    Sorry I should have mentioned that we saw this issue with both 2.x and 3.x installed.

    And I should learn to read the original post fully before replying. Ignore the bit about caching.

  8. Donncha O Caoimh
    Member
    Posted 3 years ago #

    Could be some sort of Apache or PHP memory leak? Try setting the max number of requests per Apache child to a lower number so the processes are recycled faster.

  9. jazbek
    Member
    Posted 3 years ago #

    @lochinvar - I too had the exact same problem on a 2.9 MU site this past January. When this started happening, I thought it was the same thing, but I've been scanning the apache logs like crazy and can't find any redirect loops.

    @donncha - Yes, it does seem sort of like a memory leak, but aren't memory leaks kind of slow? I have seen the server load go from .5 to 140 in less than 5 minutes.

    EDIT: good idea about recycling processes though. I will change that setting and cross my fingers!
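
Scanning the Apache logs for redirect loops, as described in the posts above, can be partly automated. The following is a minimal sketch, assuming the access log is in the common or combined log format. The threshold, the sample client address and the `/loop/` path are made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
# 192.0.2.10 - - [17/Nov/2010:01:50:00 +0000] "GET /loop/ HTTP/1.1" 302 0
LINE = re.compile(
    r'^(?P<client>\S+) .*?"(?:GET|HEAD|POST) (?P<url>\S+) [^"]*" (?P<status>\d{3}) '
)

def redirect_hotspots(log_lines, min_hits=10):
    """Return (client, url) pairs that received at least min_hits 301/302 responses."""
    hits = Counter()
    for line in log_lines:
        match = LINE.match(line)
        if match and match.group("status") in {"301", "302"}:
            hits[(match.group("client"), match.group("url"))] += 1
    return [(key, count) for key, count in hits.most_common() if count >= min_hits]

# Hypothetical sample data: one client bouncing on the same URL 12 times.
LOOP_LINE = '192.0.2.10 - - [17/Nov/2010:01:50:00 +0000] "GET /loop/ HTTP/1.1" 302 0'
OK_LINE = '192.0.2.11 - - [17/Nov/2010:01:50:01 +0000] "GET /about/ HTTP/1.1" 200 5120'

for (client, url), count in redirect_hotspots([LOOP_LINE] * 12 + [OK_LINE]):
    print(f"{client} was redirected {count} times on {url}")
```

A client that keeps getting redirected on the same URL is only a hint of a loop, not proof, so any hits still need to be checked by hand.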

  10. jazbek
    Member
    Posted 3 years ago #

    I just checked and MaxRequestsPerChild is already @ 1000 on both sites -- that already seems low enough, no? Considering the site is getting hundreds of thousands of http requests per day, that would mean it's recycling hundreds of times a day already.
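
The recycling estimate in the post above can be worked through with the traffic figures from the first post. This is a rough sketch: it assumes every HTTP request counts once toward MaxRequestsPerChild. With KeepAlive on, Apache counts connections rather than individual requests, so the result is best read as an upper bound.

```python
# Rough upper bound on how often Apache children are recycled per day.
# Assumption: each HTTP request counts once toward MaxRequestsPerChild.

def recycles_per_day(requests_per_day: int, max_requests_per_child: int) -> float:
    """Total child-process recycles per day, summed across all children."""
    return requests_per_day / max_requests_per_child

# Daily request counts quoted in the first post of the thread.
sites = {"Site 1": 410_000, "Site 2": 560_000}

for name, requests in sites.items():
    print(f"{name}: ~{recycles_per_day(requests, 1000):.0f} recycles/day")
```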

  11. maxk
    Member
    Posted 3 years ago #

    Again -- it sounds like it could be a locking issue on the tables. Set up your slow query log to look for any abnormally lengthy MySQL queries, and check your crontabs to make sure there isn't some maintenance process doing analytical crunching on the database, or interrupting the server.

  12. jazbek
    Member
    Posted 3 years ago #

    I did set up slow query logging today, and since then, we've experienced two load spikes. The most recent was at around 01:53 (server time).

    Here is the slow query log from around that time:

    # Time: 101117  1:40:22
    # User@Host: x_] @ localhost []
    # Query_time: 2  Lock_time: 0  Rows_sent: 0  Rows_examined: 0
    SELECT tt.term_id, tt.term_taxonomy_id FROM wp_terms AS t INNER JOIN wp_term_taxonomy as tt ON tt.term_id = t.term_id WHERE t.term_id = 1532 AND tt.taxonomy = 'link_category';
    # Time: 101117  1:55:12
    # User@Host: x_] @ localhost []
    # Query_time: 2  Lock_time: 0  Rows_sent: 1  Rows_examined: 0
    SELECT t.*, tt.* FROM wp_terms AS t INNER JOIN wp_term_taxonomy AS tt ON t.term_id = tt.term_id WHERE tt.taxonomy IN ('category')  AND t.slug = 'book-reviews' ORDER BY t.name ASC;
    # Time: 101117  1:58:11
    # User@Host: x_] @ localhost []
    # Query_time: 8  Lock_time: 0  Rows_sent: 0  Rows_examined: 0
    SELECT tt.term_id, tt.term_taxonomy_id FROM wp_terms AS t INNER JOIN wp_term_taxonomy as tt ON tt.term_id = t.term_id WHERE t.term_id = 1532 AND tt.taxonomy = 'link_category';
    # User@Host: x_] @ localhost []
    # Query_time: 8  Lock_time: 0  Rows_sent: 5  Rows_examined: 3646
    SELECT *, ((0.1800 * (MATCH (`title`) AGAINST ( "alternative medicine's flawed reasoning one_ true cause all_ disease " ))) + (2.4429 * (MATCH (`content`) AGAINST ( " medicine disease alternative treat science pain energy true causation infection claims genetic evidence bacteria practitioners strep simple life underlying treatment" )))  ) as score FROM `wp_similar_posts` LEFT JOIN `wp_posts` ON `pID` = `ID` WHERE (MATCH (`title`) AGAINST ( "alternative medicine's flawed reasoning one_ true cause all_ disease " ) OR MATCH (`content`) AGAINST ( " medicine disease alternative treat science pain energy true causation infection claims genetic evidence bacteria practitioners strep simple life underlying treatment" ))  AND post_status IN ('publish') AND post_type='post' AND ID != 13757 AND post_password ='' ORDER BY score DESC LIMIT 0, 5;

    The longest query here took 8 seconds, and it completed 5 minutes after the load spike. This is my first time using the slow query log, but it doesn't seem to me like this is the issue. Please correct me if I'm wrong.
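
Reading a log like the one above by eye gets tedious once it grows. The following is a minimal sketch of ranking entries by query time, assuming the header layout shown in the post (`# Query_time: ... Lock_time: ... Rows_sent: ... Rows_examined: ...`). The sample text uses placeholder queries, not the real ones.

```python
import re

# Header line as it appears in the log pasted above.
HEADER = re.compile(
    r"# Query_time: (?P<query_time>[\d.]+)\s+Lock_time: (?P<lock_time>[\d.]+)\s+"
    r"Rows_sent: (?P<rows_sent>\d+)\s+Rows_examined: (?P<rows_examined>\d+)"
)

def parse_slow_log(text):
    """Return one dict per logged query, in the order they appear."""
    entries, current = [], None
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER.match(line)
        if match:
            current = {key: float(value) for key, value in match.groupdict().items()}
            current["sql"] = ""
            entries.append(current)
        elif current is not None and line and not line.startswith("#"):
            # Statement text; may span several lines.
            current["sql"] = (current["sql"] + " " + line).strip()
    return entries

def worst_queries(entries, limit=3):
    """Slowest queries first; ties broken by rows examined."""
    key = lambda entry: (entry["query_time"], entry["rows_examined"])
    return sorted(entries, key=key, reverse=True)[:limit]

# Placeholder log text in the same layout as the excerpt in the post.
SAMPLE = """\
# Time: 101117  1:55:12
# Query_time: 2  Lock_time: 0  Rows_sent: 1  Rows_examined: 0
SELECT 1;
# Time: 101117  1:58:11
# Query_time: 8  Lock_time: 0  Rows_sent: 5  Rows_examined: 3646
SELECT 2;
"""

for entry in worst_queries(parse_slow_log(SAMPLE)):
    print(f"{entry['query_time']:.0f}s, {entry['rows_examined']:.0f} rows examined: {entry['sql']}")
```

Sorting by rows examined as a tiebreaker brings full-text queries like the `wp_similar_posts` one to the top, which is usually where to look first.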

    Edit: took out the server name to protect my client's anonymity. :)

  13. maxk
    Member
    Posted 3 years ago #

    Well the next step is to set up a crontab to run uptime periodically and record the results of top when the load average spikes...
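
The suggestion above could be sketched as a small script that cron runs every minute. This is an illustration only: the threshold and log path are hypothetical values, and it assumes a Linux `top` that supports batch mode (`-b -n 1`).

```python
import os
import subprocess
import time

LOAD_THRESHOLD = 5.0              # hypothetical cutoff; tune to the server's core count
LOG_PATH = "/tmp/load-spike.log"  # hypothetical location for the captured snapshots

def should_capture(load_1min, threshold=LOAD_THRESHOLD):
    """True when the 1-minute load average is at or above the threshold."""
    return load_1min >= threshold

def snapshot():
    """Return one batch-mode iteration of top, or a note if top cannot be run."""
    try:
        result = subprocess.run(
            ["top", "-b", "-n", "1"], capture_output=True, text=True, check=False
        )
    except OSError as error:
        return f"could not run top: {error}\n"
    return result.stdout

def check_once(log_path=LOG_PATH, get_load=os.getloadavg, take_snapshot=snapshot):
    """Append a timestamped top snapshot to log_path if the load is high."""
    load_1min = get_load()[0]
    if not should_capture(load_1min):
        return False
    with open(log_path, "a") as log:
        log.write(f"--- {time.strftime('%Y-%m-%d %H:%M:%S')} load={load_1min:.2f}\n")
        log.write(take_snapshot())
        log.write("\n")
    return True

if __name__ == "__main__":
    check_once()
```

Since the load here reportedly went from .5 to 140 in under five minutes, a one-minute interval seems like the coarsest schedule that would still catch the start of a spike.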

  14. ghas
    Member
    Posted 3 years ago #

    Luckily I haven't run across this issue yet, but thanks for the info. In the beginning I did set up my slow query log to look for abnormally lengthy queries.

  15. Frederick Townes
    Member
    Posted 3 years ago #

    @jazbek, I'd recommend some different settings for W3TC, contact me for tips.

  16. gje
    Member
    Posted 3 years ago #

    I'm facing the same sort of problems
    Just migrated to VPS.net
    Might contact you as well

  17. jazbek
    Member
    Posted 3 years ago #

    Hey all, sorry for not replying sooner. One of the clients and I are no longer working together. The other one seems to have been completely fixed by doubling the RAM on their hosting instances from 4GB to 8GB each.

    Does this seem like a lot for a site that has wp-super-cache and about 16k visits a day?

  18. Donncha O Caoimh
    Member
    Posted 3 years ago #

    That's a lot of memory! Chances are there's a memory hogging plugin running on the server somewhere.

  19. skyetek
    Member
    Posted 3 years ago #

    I found a company (no affiliation) online a while back when this was an issue for a friend of mine.

    citrusleaf.net/clsolutions.html

  20. The Hack Repair Guy
    Member
    Posted 3 years ago #

    Hi folks,
    I encountered almost exactly the same issue with a WP site as well.

    Obviously just adding more RAM isn't the solution-- I mean at some point WP is going to bomb once it uses up 8GB of RAM, so adding RAM only seems to postpone the inevitable server crash.

    Any other sage advice?

    Thanks,
    Jim

  21. Donncha O Caoimh
    Member
    Posted 3 years ago #

    Are you using Feedburner and the utm_source GET parameters to track where visitors come from? Unfortunately Supercache doesn't cache those requests so visitors from Twitter or feed readers will visit those URLs and they won't be cached.
    Check out Joost's comment on this post about the issue. (That feature has since been removed from Supercache) And check out Google Analytics plugin. I should make this a FAQ...

    Oh, it already is in the dev version!
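
To get a feel for how much traffic falls into the uncached `utm_source` case described above, the access log can be checked for requests carrying `utm_` parameters. This is a minimal sketch, assuming common or combined log format. The sample lines, addresses and paths are invented for illustration.

```python
import re
from urllib.parse import parse_qs, urlsplit

# Pulls the requested URL out of the quoted request part of a log line.
REQUEST = re.compile(r'"(?:GET|HEAD) (?P<url>\S+) HTTP/[\d.]+"')

def has_utm_params(url):
    """True if the URL's query string contains any utm_* parameter."""
    query = urlsplit(url).query
    return any(key.startswith("utm_") for key in parse_qs(query, keep_blank_values=True))

def utm_share(log_lines):
    """Return (requests with utm_* parameters, total parsed requests)."""
    total = tagged = 0
    for line in log_lines:
        match = REQUEST.search(line)
        if not match:
            continue
        total += 1
        if has_utm_params(match.group("url")):
            tagged += 1
    return tagged, total

# Hypothetical sample lines.
SAMPLE_LINES = [
    '192.0.2.1 - - [17/Nov/2010:01:40:00 +0000] "GET /post/?utm_source=feedburner&utm_medium=feed HTTP/1.1" 200 5120',
    '192.0.2.2 - - [17/Nov/2010:01:40:01 +0000] "GET /post/ HTTP/1.1" 200 5120',
    '192.0.2.3 - - [17/Nov/2010:01:40:02 +0000] "GET /other/?utm_source=twitter HTTP/1.1" 200 4096',
]

tagged, total = utm_share(SAMPLE_LINES)
print(f"{tagged} of {total} requests carry utm_ parameters")
```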

  22. Frederick Townes
    Member
    Posted 3 years ago #

    To be clear, you can use the debug mode in W3TC to understand which queries are expensive and may be blocking PHP from returning so that Apache can answer requests. W3TC's disk enhanced mode and WPSC get the same performance advantage from letting Apache serve cached pages from disk, so on a single server, W3TC with page caching to memcached is not going to perform as well as disk enhanced mode or WPSC. Memcached is ideally suited for persistent storage either across web server restarts or across multiple servers. Anyway, you'll want to find your expensive queries and optimize them so you can build your page cache efficiently, and object caching should reduce the number of queries that are ultimately made.

  23. GRAQ
    Member
    Posted 3 years ago #

    If the single change made, without being in conjunction with anything else, was adding RAM, and that solved your issues, then RAM was your problem. Either your services were badly configured, or there was not enough RAM for the services to serve your website.

    8GB would be excessive for most, but it is very difficult to judge without some solid facts. Essentially, you need to measure how much RAM is being used, and by what and when.

    There are many factors.

    @Frederick I've also found that sites that generate large pages are sometimes not well suited to memcached. It works better with many (many!) small buckets rather than several large ones.
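
The measurement suggested above (how much RAM is used, by what, and when) can start with a per-process summary. This is a minimal sketch that totals resident memory by command name from `ps -eo rss,comm` output. It assumes a Linux `ps` that reports RSS in KiB, and the sample figures are invented.

```python
import subprocess
from collections import Counter

def rss_by_command(ps_output):
    """Sum resident set size (KiB) per command name from `ps -eo rss,comm` output."""
    totals = Counter()
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        # Skips the header row and any malformed line.
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        totals[parts[1].strip()] += int(parts[0])
    return totals

def current_ps_output():
    """Run ps on the local machine and return its text output."""
    return subprocess.run(
        ["ps", "-eo", "rss,comm"], capture_output=True, text=True, check=True
    ).stdout

# Hypothetical sample in the layout ps prints.
SAMPLE = """\
  RSS COMMAND
45000 apache2
52000 apache2
310000 mysqld
 1200 cron
"""

# RSS counts shared pages once per process, so totals for forked
# Apache children overstate their real footprint.
for command, kib in rss_by_command(SAMPLE).most_common(3):
    print(f"{command}: {kib / 1024:.1f} MiB")
```

Passing `current_ps_output()` instead of `SAMPLE` reads the live process table. Logging that at intervals would cover the "when" part of the question.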

  24. Frederick Townes
    Member
    Posted 3 years ago #

    By default memcached has a 1MB slab size, which caps the size of a single cached item. So if for some reason you're not using compression (it is on by default), or you have a huge page (often due to comments) that is still greater than 1MB when compressed, you will have lots of cache misses.
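
Whether a given page would fit under that limit can be estimated offline. This is a minimal sketch, assuming zlib compression as a stand-in for whatever the caching plugin actually uses and ignoring memcached's per-item overhead. The generated pages are synthetic.

```python
import os
import zlib

MEMCACHED_ITEM_LIMIT = 1024 * 1024  # the 1MB default mentioned in the post

def fits_in_memcached(page, compress=True, limit=MEMCACHED_ITEM_LIMIT):
    """Return (fits, stored_size) for a page body given as bytes."""
    payload = zlib.compress(page) if compress else page
    return len(payload) <= limit, len(payload)

# Synthetic page: a long, repetitive comment list, a bit over 2MB uncompressed.
comment_heavy_page = b"<li class='comment'>Great post!</li>\n" * 60000

# Random bytes barely compress; a stand-in for a page that stays large.
incompressible_page = os.urandom(2 * 1024 * 1024)

for label, page in [("comments", comment_heavy_page), ("random", incompressible_page)]:
    for compress in (False, True):
        fits, size = fits_in_memcached(page, compress=compress)
        print(f"{label}, compress={compress}: {size} bytes, fits={fits}")
```

Real pages fall somewhere between the two extremes, so running this against an actual rendered page gives a more useful answer.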

  25. lochinvar
    Member
    Posted 2 years ago #

    We found it. Core Control HTTP logging was turned on and then forgotten. wp_posts: 300,000 records at 3.6 GB.
    We turned it off and cleaned the table. The site is running very well now.

  26. The Hack Repair Guy
    Member
    Posted 2 years ago #

    Can you clarify what you mean by:
    core control http logging
    ?

    Thanks.

  27. a1wsn
    Member
    Posted 2 years ago #

    Can you clarify what you mean by:
    core control http logging

    http://wordpress.org/extend/plugins/core-control/

  28. Dion Hulse
    WordPress Dev
    Posted 2 years ago #

    We found it. core control http logging was turned on and then forgotten.

    I've done that myself, and fully intend on adding an admin notice to mention that it's enabled in the next release.

Topic Closed

This topic has been closed to new replies.
