WordPress.org

Make WordPress Core

Opened 7 years ago

Closed 7 years ago

#30814 closed defect (bug) (invalid)

Large wp_postmeta table causes core database queries to become extremely slow.

Reported by: turkeybucket Owned by:
Milestone: Priority: normal
Severity: normal Version: 4.1
Component: Query Keywords:
Focuses: performance Cc:

Description

On a site that I just updated to WordPress 4.1, I found that I was getting 504 gateway time-out errors at least half the time I tried to load the site. The site would load intermittently and was generally much slower than it had been before the upgrade. On the server (Nginx PHP 5.4 MySQL 5.5.40), the MySQL Process Log showed that a lot of the queries for this site were hanging up and taking 30 to 60 seconds to complete.

After digging through the site's database to see what could be causing the slow queries I found that an old plugin had left 250,000 rows of data in the wp_postmeta table. I deleted those rows out of the postmeta table which left about 20,000 rows -- after deleting those rows, the site started loading quickly again and I no longer was getting time out errors.

Change History (9)

#1 @pento
7 years ago

  • Component changed from Database to Query
  • Focuses performance added

#2 @boonebgorges
7 years ago

turkeybucket - Do you have examples of specific queries that were running slowly?

#3 @turkeybucket
7 years ago

The MySQL Process Log in WHM just showed a truncated version of the queries, so I don't have the full queries. One that I consistently showed up had "SQL_CALC_FOUND_ROWS" and "LEFT JOIN" and it looked like it was joining the postmeta table. Another query that showed up had "INNER JOIN". Sorry I don't have anything more specific -- cleaning out that database table fixed the problem before I needed to start logging the slow queries. From the public side of the site, the performance problem seemed to hit whenever I would view a page that used the archive.php file, although other pages were timing out as well.

#4 @boonebgorges
7 years ago

One that I consistently showed up had "SQL_CALC_FOUND_ROWS" and "LEFT JOIN" and it looked like it was joining the postmeta table

Unfortunately, this doesn't narrow things down much - most WP_Query queries look like this. SQL syntax related to the postmeta table was changed in a few places in 4.1, but without more data about the specific queries that were causing problems in your case, it's hard to guess which, if any, is the source of your problem. Are you using any plugins or theme customizations that modify the posts that are displayed on archive pages or in other loops? Are you doing anything custom with post order/orderby?

#5 @turkeybucket
7 years ago

I made a backup of that site's database before I cleared out the wp_postmeta table. I'll recreate the site and see if I can get you the specific queries.

#6 @turkeybucket
7 years ago

I tracked down the slow query -- it was caused by a modification the theme made to the main query with pre_get_posts. It looks like the modification was designed to exclude posts that met one of two conditions based on custom fields. I've included the query below. I'm guessing this is a problem with the theme and not with WordPress.

SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts LEFT JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id ) LEFT JOIN wp_postmeta AS mt1 ON (wp_posts.ID = mt1.post_id AND mt1.meta_key = 'story_format' ) WHERE 1=1 AND wp_posts.post_type = 'post' AND (wp_posts.post_status = 'publish') AND (

( wp_postmeta.meta_key = 'story_format' AND CAST(wp_postmeta.meta_value AS CHAR) NOT LIKE '%Long-Form Chapter%' )
OR
mt1.post_id IS NULL

) GROUP BY wp_posts.ID ORDER BY wp_posts.post_date DESC LIMIT 0, 12;

#7 @boonebgorges
7 years ago

This is helpful - thanks, turkeybucket. Can you share the code that's being used in the 'pre_get_posts' filter to generate this SQL? I'm guessing it's adding a meta_query that looks something like:

array(
    'relation' => 'OR',
    array(
        'key' => 'story_format',
        'value' => 'Long-Form Chapter',
        'compare' => 'NOT LIKE',
    ),
    array(
        'key' => 'story_format',
        'compare' => 'NOT EXISTS',
    ),
),

It'd be helpful for me to see the exact syntax being used, so I can run some comparisons of the SQL syntax generated in 4.0 and 4.1.

#8 @turkeybucket
7 years ago

Here's the function being called by the pre_get_posts filter:

function sno_lf_exclusion( $query ) {
	if( $query->is_main_query() && !is_admin() ) {
		$meta_query = array(
     				'relation' => 'OR',
			array(
				'key' => 'story_format',
				'value' => 'Long-Form Chapter',
				'compare' => 'NOT LIKE'
			),
			array(
				'key' => 'story_format',
				'value' => 'Long-Form Chapter',
				'compare' => 'NOT EXISTS'
			)
		);
		$query->set( 'meta_query', $meta_query );
	}

}

#9 @boonebgorges
7 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to invalid
  • Status changed from new to closed

Thanks again for digging up the implementation details.

I've run some tests between 4.0 and 4.1. The queries are almost identical, except that in 4.1 all of the JOINS are LEFT JOINS:

4.0: INNER JOIN wp_postmeta ON (wp_posts.ID = wp_postmeta.post_id) LEFT JOIN wp_postmeta AS mt1 ...
4.1: LEFT JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id ) LEFT JOIN wp_postmeta AS mt1 ...

This change dates from [29890]. See #29062 for the back story. In short, mixing INNER JOINs with LEFT JOINs when using NOT EXISTS was leading to incorrect results. It's true (as noted in the ticket) that the change to LEFT JOINs means a performance hit - especially in cases where the postmeta table is filled with a bunch of irrelevant data. This is what you experienced, and that's why it went away when you cleaned up the old plugin data. But this is not something we can roll back without reintroducing the bug.

Depending on your data set, you may be able to write a version of the 'pre_get_posts' filter that performs better by doing separate direct queries against postmeta to get your post_ids - one for the NOT LIKE and one for the NOT EXISTS - then joining the results with array_merge and passing them as a post__in param to WP_Query.

See #30044 for a discussion of using subqueries for WP_Meta_Query itself. Thanks again for the report.

Note: See TracTickets for help on using tickets.