Opened 5 years ago
Last modified 3 weeks ago
#49278 reopened enhancement
Improve meta query
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 5.3.2 |
Component: | Query | Keywords: | has-patch dev-feedback early early-like-actually-early has-unit-tests needs-refresh |
Focuses: | performance | Cc: |
Description
When having a couple of Meta Query statements in WP_Query the query becomes very slow. I think this is because of the way the JOINs are created.
Currently the JOINs are only done on the Post ID. The JOIN can become enormous, which means that filtering (the WHERE part) will take a lot of time.
I checked /wp-includes/class-wp-meta-query.php and posted the code between line 557 and 573 .
// JOIN clauses for NOT EXISTS have their own syntax. if ( 'NOT EXISTS' === $meta_compare ) { $join .= " LEFT JOIN $this->meta_table"; $join .= $i ? " AS $alias" : ''; if ( 'LIKE' === $meta_compare_key ) { $join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key LIKE %s )", '%' . $wpdb->esc_like( $clause['key'] ) . '%' ); } else { $join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key = %s )", $clause['key'] ); } // All other JOIN clauses. } else { $join .= " INNER JOIN $this->meta_table"; $join .= $i ? " AS $alias" : ''; $join .= " ON ( $this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column )"; }
Apparantly when using the 'NOT EXISTS' compare the 'AND $alias.meta_key' part is added to the JOIN, but when NOT using the 'NOT EXISTS' compare this part is not there.
This means that when NOT using the 'NOT EXISTS' compare the a lot of data is joined in the temporary data set. I played with this part a bit and when adding the 'AND $alias.meta_key' part to those JOINs as well it sped up my query a lot. My query went from 38 seconds to 0.01 seconds with the same results.
My 'test' code:
// JOIN clauses for NOT EXISTS have their own syntax. if ( 'NOT EXISTS' === $meta_compare ) { $join .= " LEFT JOIN $this->meta_table"; $join .= $i ? " AS $alias" : ''; if ( 'LIKE' === $meta_compare_key ) { $join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key LIKE %s )", '%' . $wpdb->esc_like( $clause['key'] ) . '%' ); } else { $join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key = %s )", $clause['key'] ); } // All other JOIN clauses. } else { $join .= " INNER JOIN $this->meta_table"; $join .= $i ? " AS $alias" : ''; $valid_compares = array( '=', '!=', '>', '>=', '<', '<=', 'IN', 'NOT IN', 'EXISTS', ); if( in_array($meta_compare, $valid_compares ) && !empty($clause['key']) && 'LIKE' !== $meta_compare_key ) { $join .= $wpdb->prepare( " ON ( $this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key = %s )", $clause['key']); } else { $join .= " ON ( $this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column )"; } }
I'm not really sure if this works in all cases (with all compare/compare_key variations), but I think it would be good to check it out (on Github I've seen that the last improvements here have been done at least 2 years ago).
For now I 'solved' my slow query by parsing the JOIN and WHERE on the filter 'get_meta_sql' and add the 'AND' part in the JOIN.
Below the query that gets created before and after the changes.
Query before (38 seconds):
SELECT SQL_CALC_FOUND_ROWS riff19_posts.ID FROM riff19_posts INNER JOIN riff19_postmeta ON ( riff19_posts.ID = riff19_postmeta.post_id ) INNER JOIN riff19_postmeta AS mt1 ON ( riff19_posts.ID = mt1.post_id ) INNER JOIN riff19_postmeta AS mt2 ON ( riff19_posts.ID = mt2.post_id ) INNER JOIN riff19_postmeta AS mt3 ON ( riff19_posts.ID = mt3.post_id ) JOIN riff19_icl_translations wpml_translations ON riff19_posts.ID = wpml_translations.element_id AND wpml_translations.element_type = CONCAT('post_', riff19_posts.post_type) WHERE 1=1 AND ( ( riff19_postmeta.meta_key = 'pinplugin_event_start_date' AND CAST(riff19_postmeta.meta_value AS DATE) < '2020-01-23' ) OR ( ( ( mt1.meta_key = 'pinplugin_event_start_date' AND CAST(mt1.meta_value AS DATE) = '2020-01-23' ) AND mt2.meta_key = 'pinplugin_event_start_time' AND ( mt3.meta_key = 'pinplugin_event_end_time' AND CAST(mt3.meta_value AS TIME) <= '17:19:19' ) ) ) ) AND riff19_posts.post_type = 'event' AND (riff19_posts.post_status = 'publish' OR riff19_posts.post_status = 'acf-disabled' OR riff19_posts.post_status = 'private') AND ( ( ( wpml_translations.language_code = 'nl' OR 0 ) AND riff19_posts.post_type IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' ) ) OR riff19_posts.post_type NOT IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' ) ) GROUP BY riff19_posts.ID ORDER BY riff19_posts.menu_order, CAST(riff19_postmeta.meta_value AS DATE) DESC, CAST(mt2.meta_value AS TIME) DESC, CAST(mt3.meta_value AS TIME) DESC LIMIT 0, 12
Query after (0.0028 seconds):
SELECT SQL_CALC_FOUND_ROWS riff19_posts.ID FROM riff19_posts INNER JOIN riff19_postmeta ON ( riff19_posts.ID = riff19_postmeta.post_id AND riff19_postmeta.meta_key = 'pinplugin_event_start_date') INNER JOIN riff19_postmeta AS mt1 ON ( riff19_posts.ID = mt1.post_id AND mt1.meta_key = 'pinplugin_event_start_date') INNER JOIN riff19_postmeta AS mt2 ON ( riff19_posts.ID = mt2.post_id AND mt2.meta_key = 'pinplugin_event_start_time') INNER JOIN riff19_postmeta AS mt3 ON ( riff19_posts.ID = mt3.post_id AND mt3.meta_key = 'pinplugin_event_end_time') JOIN riff19_icl_translations wpml_translations ON riff19_posts.ID = wpml_translations.element_id AND wpml_translations.element_type = CONCAT('post_', riff19_posts.post_type) WHERE 1=1 AND ( ( riff19_postmeta.meta_key = 'pinplugin_event_start_date' AND CAST(riff19_postmeta.meta_value AS DATE) < '2020-01-23' ) OR ( ( ( mt1.meta_key = 'pinplugin_event_start_date' AND CAST(mt1.meta_value AS DATE) = '2020-01-23' ) AND mt2.meta_key = 'pinplugin_event_start_time' AND ( mt3.meta_key = 'pinplugin_event_end_time' AND CAST(mt3.meta_value AS TIME) <= '17:18:05' ) ) ) ) AND riff19_posts.post_type = 'event' AND (riff19_posts.post_status = 'publish' OR riff19_posts.post_status = 'acf-disabled' OR riff19_posts.post_status = 'private') AND ( ( ( wpml_translations.language_code = 'nl' OR 0 ) AND riff19_posts.post_type IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' ) ) OR riff19_posts.post_type NOT IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' ) ) GROUP BY riff19_posts.ID ORDER BY riff19_posts.menu_order, CAST(riff19_postmeta.meta_value AS DATE) DESC, CAST(mt2.meta_value AS TIME) DESC, CAST(mt3.meta_value AS TIME) DESC LIMIT 0, 12
Change History (34)
This ticket was mentioned in Slack in #core-media by joedolson. View the logs.
4 years ago
This ticket was mentioned in Slack in #core by peterwilsoncc. View the logs.
4 years ago
This ticket was mentioned in Slack in #core by jeffpaul. View the logs.
4 years ago
#5
@
4 years ago
- Keywords needs-patch needs-unit-tests added; dev-feedback removed
Hi there, welcome to WordPress Trac! Thanks for the report, sorry it took a while for someone to get back to you.
A patch and some unit tests would be great to move the ticket forward.
This ticket was mentioned in PR #1229 on WordPress/wordpress-develop by markparnell.
4 years ago
#6
- Keywords has-patch added; needs-patch removed
Trac ticket: https://core.trac.wordpress.org/ticket/49278
#7
@
4 years ago
- Keywords needs-patch added; has-patch removed
Just added a patch which applies the discussed changes. As I flagged above it will definitely need someone with more background knowledge than me here - we may need a more nuanced approach as there are quite possibly edge cases I've not considered.
I also have no idea how to go about setting up unit tests for this, or even what we would actually be testing - there are already tests in place for meta queries, but I don't know whether they cover this sufficiently.
peterwilsoncc commented on PR #1229:
4 years ago
#8
Thanks @markparnell!
I've run the actions on this PR and am noticing a few failures in the PHP unit tests in Tests_Query_MetaQuery
. Most of them test the returned posts so it looks like they're meaningful (rather than a result of the query changing).
The coding standards check is also failing on two lines but if you don't want to worry about that until the functionality is sorted out, that's cool.
markparnell commented on PR #1229:
4 years ago
#9
Thanks Peter. Looks like I'm going to need to try and work out how to run the test suite on my local dev so I can investigate those errors properly. :-)
markparnell commented on PR #1229:
4 years ago
#10
I've resolved all of the test failures now. I think I've also fixed the CS issues, though I wasn't sure how to run those locally so I may have missed something.
#11
@
4 years ago
- Keywords has-patch added; needs-patch removed
PR updated to resolve all the test failures. I'm not entirely enamoured with the resulting code - particularly having to pass additional parameters between methods - so if anyone has any suggestions for a better approach that would be very welcome.
This ticket was mentioned in Slack in #core by markparnell. View the logs.
4 years ago
#13
@
4 years ago
- Keywords dev-feedback added; needs-unit-tests removed
- Milestone changed from Awaiting Review to 5.8
Per the discussion in devchat, moving this to 5.8
for more visibility and adding dev-feedback
in the hope that someone with solid SQL knowledge can take a look.
I've also removed the needs-unit-tests
flag as there are quite a number of existing unit tests (which pass), though I recognise it's possible that additional tests may be desirable.
This ticket was mentioned in Slack in #core by antpb. View the logs.
4 years ago
#15
@
4 years ago
- Milestone changed from 5.8 to 5.9
Since feature freeze for 5.8
is imminent, this isn't going to get the review it needs before then. Moving to 5.9
and will push for it to get some attention early in the next cycle.
This ticket was mentioned in Slack in #core by markparnell. View the logs.
4 years ago
This ticket was mentioned in Slack in #core by markparnell. View the logs.
3 years ago
#18
@
3 years ago
- Keywords early added
As per today's bugscrub, let's move it to next cycle and mark it early
.
#19
@
3 years ago
- Milestone changed from 5.9 to Future Release
Query changes by design need careful consideration and plenty of soak time to ensure nothing breaks. With 5.9 feature freeze ~2 days away, it's too late in the release cycle for a change to query. Punting this out of 5.9. As the next milestone is not yet available, moving to Future Release
. Once available, feel free to move into the next release.
#20
@
3 years ago
- Keywords early-like-actually-early added
Whoopsie forgot to mark it for really really early in the cycle.
#21
@
3 years ago
- Milestone changed from Future Release to 6.0
Per comment:19 moving this to 6.0 now that the milestone exists.
This ticket was mentioned in Slack in #core by costdev. View the logs.
3 years ago
#23
@
3 years ago
- Keywords has-unit-tests added
Per the discussion in the bug scrub:
- I'm adding the
needs-unit-tests
keyword so that this can be evaluated by the test team. - @craigfrancis has kindly offered to take a look at the patch when he's available.
- While this ticket already has the
performance
focus, it may be worth reaching out in#performance
to get eyes on this ticket when the approach has been given feedback.
This ticket was mentioned in Slack in #core by craigfrancis. View the logs.
3 years ago
#25
@
3 years ago
As we have a Dev Chat today, I'll summarise my conversation with Mark so far:
- The current patch does not change anything if the query uses
OR
(ref$this->has_or_relation
), which does not help @jillebehm (rough example). This is becausefind_compatible_table_alias()
will try to re-use an existing alias, which would have specified themeta_key
from the first condition. - In my own basic tests, which does not represent real data, I was able to get similar improvements by changing the field to
varchar(191)
(ref #33885), or runningOPTIMIZE TABLE
, because the updated INDEX info would lead to a much better query execution plan. - I have a small concern about how the
get_meta_sql
filter/hook might be affected; e.g. a plugin that modifies the 'join' string (e.g. to apply their own version of this fix), but I think they should expect changes.
This ticket was mentioned in Slack in #core by costdev. View the logs.
3 years ago
#27
@
3 years ago
- Milestone changed from 6.0 to Future Release
This was discussed in a bug scrub today.
Due to the complexity of the task and the patch still requiring some work, it was decided to move it off the 6.0 milestone.
#28
@
2 years ago
CAST(mt1.meta_value AS DATE) = '2020-01-23'
For MySQL and for DATE or DATETIME, that can be simplified to
mt1.meta_value = '2020-01-23'
This change will allow an index, even INDEX(32)
to be of some use for optimization. Cf, the plugin: https://wordpress.org/plugins/index-wp-mysql-for-speed/
Caveat, if meta_value
is actually a DATETIME and the intent is to truncate to DATE, then the CAST
(or equivalent) is needed. The following would be better, but probably not worth considering:
mt1.meta_value LIKE '2020-01-23%'
#29
@
2 years ago
Re: pinplugin_event_start_date
and pinplugin_event_start_time
It is almost always better (in MySQL) to use a single column instead of two columns for a DATETIME. (I don't know the details of the plugin, so I can't be more specific than to suggest you consider this change.)
#30
@
3 weeks ago
- Keywords needs-refresh added
- Milestone Future Release deleted
- Resolution set to maybelater
- Status changed from new to closed
There hasn't been any activity on this ticket for a few years, and the PR was opened almost 4 years ago without having moved forward since.
Moving it forward is going to require very specific domain knowledge to make it happen. FWIW I personally feel not comfortable approving SQL changes like this due to my lack of familiarity with its performance nuances.
Aside: It would be great to iterate on the ticket summary to clarify what it is about, since "Improve meta query" is extremely vague.
Due to the lack of traction, I'm going to close this as maybelater
for now. If you see this ticket and are interested in working on it, please reopen.
#32
@
3 weeks ago
@rjasdfiii could you please add some thoughts about why this ticket is reopened after comment:30 by @flixos90? Thanks :)
#33
@
3 weeks ago
Looking at the report again [belatedly!], I have to agree that moving the AND to the ON clause is important and "required":
Bad:
INNER JOIN riff19_postmeta AS mt3 ON ( riff19_posts.ID = mt3.post_id ) WHERE ( blah-blah ) OR ( this-and-that AND ( mt3.meta_key = 'pinplugin_event_end_time' ) )
Good:
INNER JOIN riff19_postmeta AS mt3 ON ( riff19_posts.ID = mt3.post_id AND mt3.meta_key = 'pinplugin_event_end_time'
)
Explanation: The only "meta_key" of interest for "mt3" is 'pinplugin_event_end_time'. This is what the "Good" version says. The "Bad" version tests the meta_key only some of the time.
(Similarly for the other JOINs to postmeta.)
Putting the test in the ON emphasizes my point. This produces identical results, but is less clear:
INNER JOIN riff19_postmeta AS mt3 ON ( riff19_posts.ID = mt3.post_id ) WHERE mt3.meta_key = 'pinplugin_event_end_time' AND ( blah-blah ) OR ( this-and-that AND ( mt3.meta_key = 'pinplugin_event_end_time' ) -- now redundant )
With the Good version, plus the improved indexes, the lookup in mt3 can use the combined index on post_id and meta_key, hence the dramatic speedup.
#34
@
3 weeks ago
@audrasjb - I reopened because I feel I have "very specific domain knowledge" in MySQL to justify the change and explain it. See comment:33 .
I've independently come to the same conclusion here. Particularly on large sites (i.e. millions of rows in the meta table), this makes a massive difference to performance. I've run some quick tests and got similar results to @jillebehm - with the additional clause in the
JOIN
the returned data is the same, but the query runs several orders of magnitude quicker.Would love for someone with more SQL (and
WP_Meta_Query
) knowledge than me to take a look, but from where I'm sitting this appears to be a simple change that could drastically improve performance for a lot of sites.