Make WordPress Core

Opened 8 years ago

Last modified 3 weeks ago

#38173 assigned enhancement

Meta query creates unecessary multiple left joins when using the same meta key

Reported by: neonwired's profile neonWired Owned by: pbearne's profile pbearne
Milestone: Future Release Priority: normal
Severity: normal Version: 3.2
Component: Query Keywords: needs-patch
Focuses: performance Cc:

Description

If you specify the below as a meta_query wordpress creates an extremely bad and inefficient query, it seems to unnecessarily create a left join for each array even though they have the same key when it could use the same join

<?php
array(
  'relation' => 'OR',
  array(
     'key' => 'product',
     'value' => '1',
     'compare' => '!='
  ),
  array(
    'key' => 'product',
    'compare' => 'NOT EXISTS'
  )
);
SELECT SQL_CALC_FOUND_ROWS  vvc_posts.ID FROM vvc_posts  LEFT JOIN vvc_postmeta ON ( vvc_posts.ID = vvc_postmeta.post_id )  LEFT JOIN vvc_postmeta AS mt1 ON (vvc_posts.ID = mt1.post_id AND mt1.meta_key = 'product' ) WHERE 1=1  AND ( 
  ( vvc_postmeta.meta_key = 'product' AND CAST(vvc_postmeta.meta_value AS CHAR) != '1' ) 
  OR 
  mt1.post_id IS NULL
) AND vvc_posts.post_type = 'news' AND ((vvc_posts.post_status = 'publish')) GROUP BY vvc_posts.ID ORDER BY vvc_posts.post_date DESC LIMIT 0, 10

On my site this query takes a huge 6.640 sec, more than 80% of the page's ttfb.

SELECT SQL_CALC_FOUND_ROWS vvc_posts.ID 
FROM vvc_posts 
LEFT JOIN vvc_postmeta ON ( vvc_posts.ID = vvc_postmeta.post_id && vvc_postmeta.meta_key = 'product') 
WHERE 1=1 
AND (CAST(vvc_postmeta.meta_value AS CHAR) != '1' OR vvc_postmeta.post_id IS NULL ) 
AND vvc_posts.post_type = 'news' 
GROUP BY vvc_posts.ID 
ORDER BY vvc_posts.post_date 

whereas an optimized version takes only 0.969 sec.

Change History (4)

#1 @tamiassibiricus
8 years ago

Got same situation when have about 10000 posts in WP setup and need to filter by meta that filled for all posts quantity(each post have value for meta). So we have same meta qty that could joined twice or more.
When mysql join meta vertical tables it join each 10000 records and then shrink results by conditions placed in Where section.
it could really slow down mysql selects at first run.
working and tested solution - https://gist.github.com/TamiasSibiricus/917b3dfc2b13b97e644ea739a1a1a306
Just note: solution is a quick fix and demonstrate query efficiency when conditions moved to join sections

Last edited 8 years ago by tamiassibiricus (previous) (diff)

#2 @tamiassibiricus
8 years ago

Bug was registered 6 months ago and no one care about this?

#3 @boonebgorges
8 years ago

  • Component changed from Database to Query
  • Keywords needs-patch added
  • Milestone changed from Awaiting Review to Future Release
  • Type changed from defect (bug) to enhancement
  • Version changed from 4.6 to 3.2

Bug was registered 6 months ago and no one care about this?

There are many open enhancement requests in WordPress, and many of them are worthwhile, but there are only so many people reviewing and writing patches.

@neonWired is correct that no table join is required for subclauses that are joined by OR that both operate on the same meta_key when one of them uses the compare operator NOT EXISTS. But this syntax requires that the meta_key comparison be put into the ON clause for that particular JOIN clause. This already happens for NOT EXISTS: https://core.trac.wordpress.org/browser/tags/4.7.2/src/wp-includes/class-wp-meta-query.php#L536 But the logic becomes a good deal more complicated when other clauses share a JOIN with a NOT EXISTS clause, because of the order in which WP_Meta_Query currently builds SQL. Take @neonWired's query as an example. Here's what happens internally:

  1. SQL is generated for the first subclause ('compare' => '!='). The JOIN clause looks like INNER JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id )
  2. SQL is generated for the second subclause after this. WP_Meta_Query realizes it's a 'NOT EXISTS', so it knows that the JOIN clause must be like LEFT JOIN wp_postmeta ON ( wp_posts.ID = wp_postmeta.post_id AND wp_postmeta.meta_key = 'product' )
  3. But in order to share the table join with the clause generated in 1, we'd have to go back and modify the SQL that's already generated there, so that it has the proper JOIN syntax.

SQL generation for clauses is more or less linear at the moment; there's no easy way to jump back and change the syntax of a clause that already been generated. We'd need to either (a) change the way that the clauses are generated so that it's easier to change existing clauses, or (b) create some sort of look-ahead device that can tell whether there's going to be a 'NOT EXISTS' clause involved, and if so, process it first.

Either of these is quite possible, but it's not a trivial task. I've just spent an hour or so trying to come up with a proof-of-concept, but I haven't been successful. If anyone would like to have a go at writing a patch, I'd be happy to help to review it.

#4 @pbearne
3 weeks ago

  • Owner set to pbearne
  • Status changed from new to assigned
Note: See TracTickets for help on using tickets.