Opened 5 years ago
Last modified 2 years ago
#49278 new enhancement
Improve meta query
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | 5.3.2 |
Component: | Query | Keywords: | has-patch dev-feedback early early-like-actually-early has-unit-tests |
Focuses: | performance | Cc: |
Description
When having a couple of Meta Query statements in WP_Query the query becomes very slow. I think this is because of the way the JOINs are created.
Currently the JOINs are only done on the Post ID. The JOIN can become enormous, which means that filtering (the WHERE part) will take a lot of time.
I checked /wp-includes/class-wp-meta-query.php and posted the code between line 557 and 573 .
// JOIN clauses for NOT EXISTS have their own syntax. if ( 'NOT EXISTS' === $meta_compare ) { $join .= " LEFT JOIN $this->meta_table"; $join .= $i ? " AS $alias" : ''; if ( 'LIKE' === $meta_compare_key ) { $join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key LIKE %s )", '%' . $wpdb->esc_like( $clause['key'] ) . '%' ); } else { $join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key = %s )", $clause['key'] ); } // All other JOIN clauses. } else { $join .= " INNER JOIN $this->meta_table"; $join .= $i ? " AS $alias" : ''; $join .= " ON ( $this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column )"; }
Apparantly when using the 'NOT EXISTS' compare the 'AND $alias.meta_key' part is added to the JOIN, but when NOT using the 'NOT EXISTS' compare this part is not there.
This means that when NOT using the 'NOT EXISTS' compare the a lot of data is joined in the temporary data set. I played with this part a bit and when adding the 'AND $alias.meta_key' part to those JOINs as well it sped up my query a lot. My query went from 38 seconds to 0.01 seconds with the same results.
My 'test' code:
// JOIN clauses for NOT EXISTS have their own syntax. if ( 'NOT EXISTS' === $meta_compare ) { $join .= " LEFT JOIN $this->meta_table"; $join .= $i ? " AS $alias" : ''; if ( 'LIKE' === $meta_compare_key ) { $join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key LIKE %s )", '%' . $wpdb->esc_like( $clause['key'] ) . '%' ); } else { $join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key = %s )", $clause['key'] ); } // All other JOIN clauses. } else { $join .= " INNER JOIN $this->meta_table"; $join .= $i ? " AS $alias" : ''; $valid_compares = array( '=', '!=', '>', '>=', '<', '<=', 'IN', 'NOT IN', 'EXISTS', ); if( in_array($meta_compare, $valid_compares ) && !empty($clause['key']) && 'LIKE' !== $meta_compare_key ) { $join .= $wpdb->prepare( " ON ( $this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key = %s )", $clause['key']); } else { $join .= " ON ( $this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column )"; } }
I'm not really sure if this works in all cases (with all compare/compare_key variations), but I think it would be good to check it out (on Github I've seen that the last improvements here have been done at least 2 years ago).
For now I 'solved' my slow query by parsing the JOIN and WHERE on the filter 'get_meta_sql' and add the 'AND' part in the JOIN.
Below the query that gets created before and after the changes.
Query before (38 seconds):
SELECT SQL_CALC_FOUND_ROWS riff19_posts.ID FROM riff19_posts INNER JOIN riff19_postmeta ON ( riff19_posts.ID = riff19_postmeta.post_id ) INNER JOIN riff19_postmeta AS mt1 ON ( riff19_posts.ID = mt1.post_id ) INNER JOIN riff19_postmeta AS mt2 ON ( riff19_posts.ID = mt2.post_id ) INNER JOIN riff19_postmeta AS mt3 ON ( riff19_posts.ID = mt3.post_id ) JOIN riff19_icl_translations wpml_translations ON riff19_posts.ID = wpml_translations.element_id AND wpml_translations.element_type = CONCAT('post_', riff19_posts.post_type) WHERE 1=1 AND ( ( riff19_postmeta.meta_key = 'pinplugin_event_start_date' AND CAST(riff19_postmeta.meta_value AS DATE) < '2020-01-23' ) OR ( ( ( mt1.meta_key = 'pinplugin_event_start_date' AND CAST(mt1.meta_value AS DATE) = '2020-01-23' ) AND mt2.meta_key = 'pinplugin_event_start_time' AND ( mt3.meta_key = 'pinplugin_event_end_time' AND CAST(mt3.meta_value AS TIME) <= '17:19:19' ) ) ) ) AND riff19_posts.post_type = 'event' AND (riff19_posts.post_status = 'publish' OR riff19_posts.post_status = 'acf-disabled' OR riff19_posts.post_status = 'private') AND ( ( ( wpml_translations.language_code = 'nl' OR 0 ) AND riff19_posts.post_type IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' ) ) OR riff19_posts.post_type NOT IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' ) ) GROUP BY riff19_posts.ID ORDER BY riff19_posts.menu_order, CAST(riff19_postmeta.meta_value AS DATE) DESC, CAST(mt2.meta_value AS TIME) DESC, CAST(mt3.meta_value AS TIME) DESC LIMIT 0, 12
Query after (0.0028 seconds):
SELECT SQL_CALC_FOUND_ROWS riff19_posts.ID FROM riff19_posts INNER JOIN riff19_postmeta ON ( riff19_posts.ID = riff19_postmeta.post_id AND riff19_postmeta.meta_key = 'pinplugin_event_start_date') INNER JOIN riff19_postmeta AS mt1 ON ( riff19_posts.ID = mt1.post_id AND mt1.meta_key = 'pinplugin_event_start_date') INNER JOIN riff19_postmeta AS mt2 ON ( riff19_posts.ID = mt2.post_id AND mt2.meta_key = 'pinplugin_event_start_time') INNER JOIN riff19_postmeta AS mt3 ON ( riff19_posts.ID = mt3.post_id AND mt3.meta_key = 'pinplugin_event_end_time') JOIN riff19_icl_translations wpml_translations ON riff19_posts.ID = wpml_translations.element_id AND wpml_translations.element_type = CONCAT('post_', riff19_posts.post_type) WHERE 1=1 AND ( ( riff19_postmeta.meta_key = 'pinplugin_event_start_date' AND CAST(riff19_postmeta.meta_value AS DATE) < '2020-01-23' ) OR ( ( ( mt1.meta_key = 'pinplugin_event_start_date' AND CAST(mt1.meta_value AS DATE) = '2020-01-23' ) AND mt2.meta_key = 'pinplugin_event_start_time' AND ( mt3.meta_key = 'pinplugin_event_end_time' AND CAST(mt3.meta_value AS TIME) <= '17:18:05' ) ) ) ) AND riff19_posts.post_type = 'event' AND (riff19_posts.post_status = 'publish' OR riff19_posts.post_status = 'acf-disabled' OR riff19_posts.post_status = 'private') AND ( ( ( wpml_translations.language_code = 'nl' OR 0 ) AND riff19_posts.post_type IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' ) ) OR riff19_posts.post_type NOT IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' ) ) GROUP BY riff19_posts.ID ORDER BY riff19_posts.menu_order, CAST(riff19_postmeta.meta_value AS DATE) DESC, CAST(mt2.meta_value AS TIME) DESC, CAST(mt3.meta_value AS TIME) DESC LIMIT 0, 12
Change History (29)
This ticket was mentioned in Slack in #core-media by joedolson. View the logs.
4 years ago
This ticket was mentioned in Slack in #core by peterwilsoncc. View the logs.
4 years ago
This ticket was mentioned in Slack in #core by jeffpaul. View the logs.
4 years ago
#5
@
4 years ago
- Keywords needs-patch needs-unit-tests added; dev-feedback removed
Hi there, welcome to WordPress Trac! Thanks for the report, sorry it took a while for someone to get back to you.
A patch and some unit tests would be great to move the ticket forward.
This ticket was mentioned in PR #1229 on WordPress/wordpress-develop by markparnell.
4 years ago
#6
- Keywords has-patch added; needs-patch removed
Trac ticket: https://core.trac.wordpress.org/ticket/49278
#7
@
4 years ago
- Keywords needs-patch added; has-patch removed
Just added a patch which applies the discussed changes. As I flagged above it will definitely need someone with more background knowledge than me here - we may need a more nuanced approach as there are quite possibly edge cases I've not considered.
I also have no idea how to go about setting up unit tests for this, or even what we would actually be testing - there are already tests in place for meta queries, but I don't know whether they cover this sufficiently.
peterwilsoncc commented on PR #1229:
4 years ago
#8
Thanks @markparnell!
I've run the actions on this PR and am noticing a few failures in the PHP unit tests in Tests_Query_MetaQuery
. Most of them test the returned posts so it looks like they're meaningful (rather than a result of the query changing).
The coding standards check is also failing on two lines but if you don't want to worry about that until the functionality is sorted out, that's cool.
markparnell commented on PR #1229:
4 years ago
#9
Thanks Peter. Looks like I'm going to need to try and work out how to run the test suite on my local dev so I can investigate those errors properly. :-)
markparnell commented on PR #1229:
4 years ago
#10
I've resolved all of the test failures now. I think I've also fixed the CS issues, though I wasn't sure how to run those locally so I may have missed something.
#11
@
4 years ago
- Keywords has-patch added; needs-patch removed
PR updated to resolve all the test failures. I'm not entirely enamoured with the resulting code - particularly having to pass additional parameters between methods - so if anyone has any suggestions for a better approach that would be very welcome.
This ticket was mentioned in Slack in #core by markparnell. View the logs.
4 years ago
#13
@
4 years ago
- Keywords dev-feedback added; needs-unit-tests removed
- Milestone changed from Awaiting Review to 5.8
Per the discussion in devchat, moving this to 5.8
for more visibility and adding dev-feedback
in the hope that someone with solid SQL knowledge can take a look.
I've also removed the needs-unit-tests
flag as there are quite a number of existing unit tests (which pass), though I recognise it's possible that additional tests may be desirable.
This ticket was mentioned in Slack in #core by antpb. View the logs.
4 years ago
#15
@
4 years ago
- Milestone changed from 5.8 to 5.9
Since feature freeze for 5.8
is imminent, this isn't going to get the review it needs before then. Moving to 5.9
and will push for it to get some attention early in the next cycle.
This ticket was mentioned in Slack in #core by markparnell. View the logs.
4 years ago
This ticket was mentioned in Slack in #core by markparnell. View the logs.
3 years ago
#18
@
3 years ago
- Keywords early added
As per today's bugscrub, let's move it to next cycle and mark it early
.
#19
@
3 years ago
- Milestone changed from 5.9 to Future Release
Query changes by design need careful consideration and plenty of soak time to ensure nothing breaks. With 5.9 feature freeze ~2 days away, it's too late in the release cycle for a change to query. Punting this out of 5.9. As the next milestone is not yet available, moving to Future Release
. Once available, feel free to move into the next release.
#20
@
3 years ago
- Keywords early-like-actually-early added
Whoopsie forgot to mark it for really really early in the cycle.
#21
@
3 years ago
- Milestone changed from Future Release to 6.0
Per comment:19 moving this to 6.0 now that the milestone exists.
This ticket was mentioned in Slack in #core by costdev. View the logs.
3 years ago
#23
@
3 years ago
- Keywords has-unit-tests added
Per the discussion in the bug scrub:
- I'm adding the
needs-unit-tests
keyword so that this can be evaluated by the test team. - @craigfrancis has kindly offered to take a look at the patch when he's available.
- While this ticket already has the
performance
focus, it may be worth reaching out in#performance
to get eyes on this ticket when the approach has been given feedback.
This ticket was mentioned in Slack in #core by craigfrancis. View the logs.
3 years ago
#25
@
3 years ago
As we have a Dev Chat today, I'll summarise my conversation with Mark so far:
- The current patch does not change anything if the query uses
OR
(ref$this->has_or_relation
), which does not help @jillebehm (rough example). This is becausefind_compatible_table_alias()
will try to re-use an existing alias, which would have specified themeta_key
from the first condition. - In my own basic tests, which does not represent real data, I was able to get similar improvements by changing the field to
varchar(191)
(ref #33885), or runningOPTIMIZE TABLE
, because the updated INDEX info would lead to a much better query execution plan. - I have a small concern about how the
get_meta_sql
filter/hook might be affected; e.g. a plugin that modifies the 'join' string (e.g. to apply their own version of this fix), but I think they should expect changes.
This ticket was mentioned in Slack in #core by costdev. View the logs.
3 years ago
#27
@
3 years ago
- Milestone changed from 6.0 to Future Release
This was discussed in a bug scrub today.
Due to the complexity of the task and the patch still requiring some work, it was decided to move it off the 6.0 milestone.
#28
@
2 years ago
CAST(mt1.meta_value AS DATE) = '2020-01-23'
For MySQL and for DATE or DATETIME, that can be simplified to
mt1.meta_value = '2020-01-23'
This change will allow an index, even INDEX(32)
to be of some use for optimization. Cf, the plugin: https://wordpress.org/plugins/index-wp-mysql-for-speed/
Caveat, if meta_value
is actually a DATETIME and the intent is to truncate to DATE, then the CAST
(or equivalent) is needed. The following would be better, but probably not worth considering:
mt1.meta_value LIKE '2020-01-23%'
#29
@
2 years ago
Re: pinplugin_event_start_date
and pinplugin_event_start_time
It is almost always better (in MySQL) to use a single column instead of two columns for a DATETIME. (I don't know the details of the plugin, so I can't be more specific than to suggest you consider this change.)
(updated comment...)
Yes, I think that would simplify the JOIN
, which seems to be where the first version of the query was inefficient.
I've independently come to the same conclusion here. Particularly on large sites (i.e. millions of rows in the meta table), this makes a massive difference to performance. I've run some quick tests and got similar results to @jillebehm - with the additional clause in the
JOIN
the returned data is the same, but the query runs several orders of magnitude quicker.Would love for someone with more SQL (and
WP_Meta_Query
) knowledge than me to take a look, but from where I'm sitting this appears to be a simple change that could drastically improve performance for a lot of sites.