Forums

Google XML Sitemaps
Performance issues on large wp_posts tables (3 posts)

  1. ccranford
    Member
    Posted 4 months ago #

    Hello,

    We have thousands of customers that utilize this plugin, however on sites that have a large amounts of posts ( 10k+ ), the query ends up causing performance issues because of the amount of results returned, specifically from this query:

    --
    SELECT ID, post_author, post_date, post_date_gmt, post_status, post_name, post_modified, post_modified_gmt, post_parent, post_type FROM wp_posts WHERE ( (post_status = 'publish' AND (post_type = 'post' OR post_type = '')) OR (post_status = 'publish' AND post_type = 'page') ) AND post_password='' ORDER BY post_modified DESC
    --

    It should be feasible to keep the data that has already been indexed in the sitemap and only get entries that have been inputted in the days/hours since the previous sitemap creation run, then perhaps have an option to refresh the entire sitemap in cases where changes/edits occurred.

    This issue is causing performance issues on many large wp_posts sites that we host ending up with CPU and IO usage that can take 3+ minutes to parse through the results, whats worse is the users are running this multiple times per day due to autobloggers. Granted, we can address the individual users and educate them on the process which we do attempt to accomplish, however it might be more beneficial to address it from an application stand point which is what brought us here. Anything that we can do to assist, please let us know.

    thanks,

    Cruz Cranford
    hostgator.com

    http://wordpress.org/extend/plugins/google-sitemap-generator/

  2. gingerdog
    Member
    Posted 4 months ago #

    I can confirm this - I'm seeing the same problem with a site which has approx 85k posts. When the sitemap generation occurs, MySQL is locked up for a number of minutes, in our case trying to run the following query :

    SELECT DISTINCT YEAR(post_date_gmt) AS year, MONTH(post_date_gmt) AS month, MAX(post_date_gmt) as last_mod, count(ID) as posts FROM wp_posts WHERE post_date < '2012-01-20 22:03:49' AND post_status = 'publish' AND post_type = 'post' GROUP BY YEAR(post_date_gmt), MONTH(post_date_gmt) ORDER BY post_date_gmt DESC

    (See google-sitemap-generator/sitemap-core.php approx line 2004; v3.6.2).

  3. ccranford
    Member
    Posted 4 months ago #

    Hello,

    I have data on queries you posted, this data wont paste well most likely. Some of the issues are due to lack of proper indexes on the table of the data being requested, however, using the primary keys would be a good reference instead of using timestamps when rehashing data that was already gathered. We're open to any ideas as currently we have to occasionally recommend users with large amounts of post to use a different plugin which is not something that we like to do.

    -------------------------------------------------------------------------+
    | rows_examined | query_time | sql_text                                                                                                                                                                                                                                                                               |
    +---------------+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    |         14526 | 00:00:50   | SELECT DISTINCT YEAR(post_date) AS <code>year</code>, MONTH(post_date) AS <code>month</code>, count(ID) as posts FROM wp_posts WHERE post_date < '2012-01-21 00:27:09' AND post_date != '0000-00-00 00:00:00' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC |
    |            52 | 00:00:41   | SELECT DISTINCT YEAR(post_date) AS <code>year</code>, MONTH(post_date) AS <code>month</code>, count(ID) as posts FROM wp_posts  WHERE post_type = 'post' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC                                                      |
    |          2376 | 00:00:39   | SELECT DISTINCT YEAR(post_date) AS <code>year</code>, MONTH(post_date) AS <code>month</code>, count(ID) as posts FROM wp_posts  WHERE post_type = 'post' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC                                                      |
    |         17013 | 00:00:33   | SELECT DISTINCT YEAR(post_date) AS <code>year</code>, MONTH(post_date) AS <code>month</code>, count(ID) as posts FROM wp_posts  WHERE post_type = 'post' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC                                                      |
    |           122 | 00:00:31   | SELECT DISTINCT YEAR(post_date) AS <code>year</code>, MONTH(post_date) AS <code>month</code>, count(ID) as posts FROM wp_posts  WHERE post_type = 'post' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC                                                      |
    |         17013 | 00:00:24   | SELECT DISTINCT YEAR(post_date) AS <code>year</code>, MONTH(post_date) AS <code>month</code>, count(ID) as posts FROM wp_posts  WHERE post_type = 'post' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC                                                      |
    |         17013 | 00:00:23   | SELECT DISTINCT YEAR(post_date) AS <code>year</code>, MONTH(post_date) AS <code>month</code>, count(ID) as posts FROM wp_posts  WHERE post_type = 'post' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC                                                      |
    |         14526 | 00:00:22   | SELECT DISTINCT YEAR(post_date) AS <code>year</code>, MONTH(post_date) AS <code>month</code>, count(ID) as posts FROM wp_posts WHERE post_date < '2012-01-20 16:50:00' AND post_date != '0000-00-00 00:00:00' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC |
    |          1037 | 00:00:21   | SELECT DISTINCT YEAR(post_date) AS <code>year</code>, MONTH(post_date) AS <code>month</code>, count(ID) as posts FROM wp_posts WHERE post_date < '2012-01-21 01:02:43' AND post_date != '0000-00-00 00:00:00' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC |
    +---------------+------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
    9 rows in set (0.59 sec)

Reply

You must log in to post.

About this Plugin

About this Topic