WordPress.org

Make WordPress Core

Opened 21 months ago

Last modified 2 months ago

#49278 new enhancement

Improve meta query

Reported by: jillebehm Owned by:
Milestone: 5.9 Priority: normal
Severity: normal Version: 5.3.2
Component: Query Keywords: has-patch dev-feedback
Focuses: performance Cc:

Description

When having a couple of Meta Query statements in WP_Query the query becomes very slow. I think this is because of the way the JOINs are created.

Currently the JOINs are only done on the Post ID. The JOIN can become enormous, which means that filtering (the WHERE part) will take a lot of time.

I checked /wp-includes/class-wp-meta-query.php and posted the code between line 557 and 573 .

// JOIN clauses for NOT EXISTS have their own syntax.
			if ( 'NOT EXISTS' === $meta_compare ) {
				$join .= " LEFT JOIN $this->meta_table";
				$join .= $i ? " AS $alias" : '';

				if ( 'LIKE' === $meta_compare_key ) {
					$join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key LIKE %s )", '%' . $wpdb->esc_like( $clause['key'] ) . '%' );
				} else {
					$join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key = %s )", $clause['key'] );
				}

				// All other JOIN clauses.
			} else {
				$join .= " INNER JOIN $this->meta_table";
				$join .= $i ? " AS $alias" : '';
				$join .= " ON ( $this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column )";
			}

Apparantly when using the 'NOT EXISTS' compare the 'AND $alias.meta_key' part is added to the JOIN, but when NOT using the 'NOT EXISTS' compare this part is not there.
This means that when NOT using the 'NOT EXISTS' compare the a lot of data is joined in the temporary data set. I played with this part a bit and when adding the 'AND $alias.meta_key' part to those JOINs as well it sped up my query a lot. My query went from 38 seconds to 0.01 seconds with the same results.

My 'test' code:

// JOIN clauses for NOT EXISTS have their own syntax.
			if ( 'NOT EXISTS' === $meta_compare ) {
				$join .= " LEFT JOIN $this->meta_table";
				$join .= $i ? " AS $alias" : '';

				if ( 'LIKE' === $meta_compare_key ) {
					$join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key LIKE %s )", '%' . $wpdb->esc_like( $clause['key'] ) . '%' );
				} else {
					$join .= $wpdb->prepare( " ON ($this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key = %s )", $clause['key'] );
				}

				// All other JOIN clauses.
			} else {
				$join .= " INNER JOIN $this->meta_table";
				$join .= $i ? " AS $alias" : '';

				$valid_compares = array(
                    '=',
                    '!=',
                    '>',
                    '>=',
                    '<',
                    '<=',
                    'IN',
                    'NOT IN',
                    'EXISTS',
                );
				if( in_array($meta_compare, $valid_compares ) && !empty($clause['key']) && 'LIKE' !== $meta_compare_key ) {
                    $join .= $wpdb->prepare( " ON ( $this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column AND $alias.meta_key = %s )", $clause['key']);
                }
				else {
                    $join .= " ON ( $this->primary_table.$this->primary_id_column = $alias.$this->meta_id_column )";
                }
			}

I'm not really sure if this works in all cases (with all compare/compare_key variations), but I think it would be good to check it out (on Github I've seen that the last improvements here have been done at least 2 years ago).

For now I 'solved' my slow query by parsing the JOIN and WHERE on the filter 'get_meta_sql' and add the 'AND' part in the JOIN.

Below the query that gets created before and after the changes.

Query before (38 seconds):

SELECT SQL_CALC_FOUND_ROWS  riff19_posts.ID FROM riff19_posts  INNER JOIN riff19_postmeta ON ( riff19_posts.ID = riff19_postmeta.post_id )  INNER JOIN riff19_postmeta AS mt1 ON ( riff19_posts.ID = mt1.post_id )  INNER JOIN riff19_postmeta AS mt2 ON ( riff19_posts.ID = mt2.post_id )  INNER JOIN riff19_postmeta AS mt3 ON ( riff19_posts.ID = mt3.post_id ) JOIN riff19_icl_translations wpml_translations
							ON riff19_posts.ID = wpml_translations.element_id
								AND wpml_translations.element_type = CONCAT('post_', riff19_posts.post_type)  WHERE 1=1  AND ( 
  ( riff19_postmeta.meta_key = 'pinplugin_event_start_date' AND CAST(riff19_postmeta.meta_value AS DATE) < '2020-01-23' ) 
  OR 
  ( 
    ( 
      ( mt1.meta_key = 'pinplugin_event_start_date' AND CAST(mt1.meta_value AS DATE) = '2020-01-23' ) 
      AND 
      mt2.meta_key = 'pinplugin_event_start_time' 
      AND 
      ( mt3.meta_key = 'pinplugin_event_end_time' AND CAST(mt3.meta_value AS TIME) <= '17:19:19' )
    )
  )
) AND riff19_posts.post_type = 'event' AND (riff19_posts.post_status = 'publish' OR riff19_posts.post_status = 'acf-disabled' OR riff19_posts.post_status = 'private') AND ( ( ( wpml_translations.language_code = 'nl' OR 0 ) AND riff19_posts.post_type  IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' )  ) OR riff19_posts.post_type  NOT  IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' )  ) GROUP BY riff19_posts.ID ORDER BY riff19_posts.menu_order, CAST(riff19_postmeta.meta_value AS DATE) DESC, CAST(mt2.meta_value AS TIME) DESC, CAST(mt3.meta_value AS TIME) DESC LIMIT 0, 12

Query after (0.0028 seconds):

SELECT SQL_CALC_FOUND_ROWS  riff19_posts.ID FROM riff19_posts  INNER JOIN riff19_postmeta ON ( riff19_posts.ID = riff19_postmeta.post_id AND riff19_postmeta.meta_key = 'pinplugin_event_start_date') INNER JOIN riff19_postmeta AS mt1 ON ( riff19_posts.ID = mt1.post_id AND mt1.meta_key = 'pinplugin_event_start_date') INNER JOIN riff19_postmeta AS mt2 ON ( riff19_posts.ID = mt2.post_id AND mt2.meta_key = 'pinplugin_event_start_time') INNER JOIN riff19_postmeta AS mt3 ON ( riff19_posts.ID = mt3.post_id AND mt3.meta_key = 'pinplugin_event_end_time') JOIN riff19_icl_translations wpml_translations
							ON riff19_posts.ID = wpml_translations.element_id
								AND wpml_translations.element_type = CONCAT('post_', riff19_posts.post_type)  WHERE 1=1  AND ( 
  ( riff19_postmeta.meta_key = 'pinplugin_event_start_date' AND CAST(riff19_postmeta.meta_value AS DATE) < '2020-01-23' ) 
  OR 
  ( 
    ( 
      ( mt1.meta_key = 'pinplugin_event_start_date' AND CAST(mt1.meta_value AS DATE) = '2020-01-23' ) 
      AND 
      mt2.meta_key = 'pinplugin_event_start_time' 
      AND 
      ( mt3.meta_key = 'pinplugin_event_end_time' AND CAST(mt3.meta_value AS TIME) <= '17:18:05' )
    )
  )
) AND riff19_posts.post_type = 'event' AND (riff19_posts.post_status = 'publish' OR riff19_posts.post_status = 'acf-disabled' OR riff19_posts.post_status = 'private') AND ( ( ( wpml_translations.language_code = 'nl' OR 0 ) AND riff19_posts.post_type  IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' )  ) OR riff19_posts.post_type  NOT  IN ('post','page','attachment','wp_block','location','person','news','blog','case','service','event','vacancy','whitepaper' )  ) GROUP BY riff19_posts.ID ORDER BY riff19_posts.menu_order, CAST(riff19_postmeta.meta_value AS DATE) DESC, CAST(mt2.meta_value AS TIME) DESC, CAST(mt3.meta_value AS TIME) DESC LIMIT 0, 12

Change History (16)

This ticket was mentioned in Slack in #core-media by joedolson. View the logs.


7 months ago

#2 @markparnell
6 months ago

I've independently come to the same conclusion here. Particularly on large sites (i.e. millions of rows in the meta table), this makes a massive difference to performance. I've run some quick tests and got similar results to @jillebehm - with the additional clause in the JOIN the returned data is the same, but the query runs several orders of magnitude quicker.

Would love for someone with more SQL (and WP_Meta_Query) knowledge than me to take a look, but from where I'm sitting this appears to be a simple change that could drastically improve performance for a lot of sites.

This ticket was mentioned in Slack in #core by peterwilsoncc. View the logs.


6 months ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


6 months ago

#5 @SergeyBiryukov
6 months ago

  • Keywords needs-patch needs-unit-tests added; dev-feedback removed

Hi there, welcome to WordPress Trac! Thanks for the report, sorry it took a while for someone to get back to you.

A patch and some unit tests would be great to move the ticket forward.

This ticket was mentioned in PR #1229 on WordPress/wordpress-develop by markparnell.


5 months ago

  • Keywords has-patch added; needs-patch removed

#7 @markparnell
5 months ago

  • Keywords needs-patch added; has-patch removed

Just added a patch which applies the discussed changes. As I flagged above it will definitely need someone with more background knowledge than me here - we may need a more nuanced approach as there are quite possibly edge cases I've not considered.

I also have no idea how to go about setting up unit tests for this, or even what we would actually be testing - there are already tests in place for meta queries, but I don't know whether they cover this sufficiently.

#8 @prbot
5 months ago

peterwilsoncc commented on PR #1229:

Thanks @markparnell!

I've run the actions on this PR and am noticing a few failures in the PHP unit tests in Tests_Query_MetaQuery. Most of them test the returned posts so it looks like they're meaningful (rather than a result of the query changing).

The coding standards check is also failing on two lines but if you don't want to worry about that until the functionality is sorted out, that's cool.

#9 @prbot
5 months ago

markparnell commented on PR #1229:

Thanks Peter. Looks like I'm going to need to try and work out how to run the test suite on my local dev so I can investigate those errors properly. :-)

#10 @prbot
5 months ago

markparnell commented on PR #1229:

I've resolved all of the test failures now. I think I've also fixed the CS issues, though I wasn't sure how to run those locally so I may have missed something.

#11 @markparnell
5 months ago

  • Keywords has-patch added; needs-patch removed

PR updated to resolve all the test failures. I'm not entirely enamoured with the resulting code - particularly having to pass additional parameters between methods - so if anyone has any suggestions for a better approach that would be very welcome.

This ticket was mentioned in Slack in #core by markparnell. View the logs.


5 months ago

#13 @markparnell
5 months ago

  • Keywords dev-feedback added; needs-unit-tests removed
  • Milestone changed from Awaiting Review to 5.8

Per the discussion in devchat, moving this to 5.8 for more visibility and adding dev-feedback in the hope that someone with solid SQL knowledge can take a look.

I've also removed the needs-unit-tests flag as there are quite a number of existing unit tests (which pass), though I recognise it's possible that additional tests may be desirable.

This ticket was mentioned in Slack in #core by antpb. View the logs.


5 months ago

#15 @markparnell
5 months ago

  • Milestone changed from 5.8 to 5.9

Since feature freeze for 5.8 is imminent, this isn't going to get the review it needs before then. Moving to 5.9 and will push for it to get some attention early in the next cycle.

This ticket was mentioned in Slack in #core by markparnell. View the logs.


2 months ago

Note: See TracTickets for help on using tickets.