Optimizing SQL query for metadata (3 posts)

  1. ekgermann
    Posted 11 years ago #

    I'm looking at the limits of WordPress and have an optimization question. For the designers of WP, I'm trying to understand the following query in admin-functions.php:

    SELECT meta_key from $wpdb->postmeta GROUP BY meta_key ORDER BY meta_id DESC LIMIT 10

    I built a blog with the contents of a mailing list (http://blogs.semperen.com/nblog). The code I wrote to do the import (in *cough* perl) stuffs the info in to the db and saves the original RFC822 headers as meta data. You can see this from the front page of a post as the author of the post, which is a meta data retrieval. That's pretty quick.

    I have 83,147 blog posts in the db with a total of 1,180,243 rows in the meta data table. When editing a post, the above query takes about 5 minutes and 11 seconds, which doesn't make it terribly interactive :(

    So my question is, what is the point of the query? Is it to retrieve the metadata for that particular post? If so, why not do something like

    SELECT meta_key FROM $wpdb->postmeta WHERE post_id = *thepostid* ORDER BY meta_id DESC

    Takes about 0.01 seconds. In the original query, what relates it to the post. Looking for some guidance on this and I'm putting an articles together on it.

  2. davidchait
    Posted 11 years ago #

    That's just bad code it seems. Though I'm sure that's one of the largest WP blogs at the moment. ;) You'll probably be very helpful in trying to hunt down such issues.

    I'm not sure what that query is for... given the LIMIT 10, which is odd. My guess is it is trying to pre-populate up to the first 10 potential meta field names for the edit form, but should be caching those as a field in the options table or something.


  3. ekgermann
    Posted 11 years ago #

    Ok, I did some poking around and optimized some really horrid SQL query and got it to work I THINK the way they intended.

    It looks like the intention of the original query was to give you a select box of the 10 last used meta keys in the db. Unfortunately, it's REALLLLLLLLYYYYY slow when you have piles of metadata.

    The offending function for editing is in admin-functions.php and is meta_form().

    If all you're looking for is the last 10 keys, the following code works just fine:

    $distinctkeys = $wpdb->get_col ("
    FROM $wpdb->postmeta
    ORDER BY meta_id DESC LIMIT 10");

    $sql = "SELECT meta_key FROM $wpdb->postmeta WHERE meta_id IN (" . join(",", $distinctkeys) . ")";
    $keys = $wpdb->get_col($sql);

    do a SELECT with a DISTINCT and LIMIT to give you the meta_id's of the last 10, then do a select of the key names from that set of id's. It could be done in one SQL query using sub-selects and the IN clause, but that only works for MySQL 4.1 and above, so it isn't portable (really).

    Looking at get_meta_keys in the same file shows the same ugly SQL query style, so I optimized it to:

    $keys = $wpdb->get_col ("SELECT DISTINCT meta_key FROM $wpdb->postmeta");

    return $keys;

    This again should run a lot faster against large metakey tables.

    The diff for admin-functions.php (against a stock install) can be found here if anyone wants to try it. My thoughts to the WP team are to look at incorporating it in to the next update if it doesn't break anything else.

    As I go through this project, I'll keep optimizing for speed and posting.

Topic Closed

This topic has been closed to new replies.

About this Topic