Make WordPress Core

Opened 10 years ago

Last modified 23 months ago

#30044 new enhancement

Use subqueries for AND clauses in WP_Meta_Query

Reported by: boonebgorges's profile boonebgorges Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version: 4.1
Component: Query Keywords: needs-refresh
Focuses: performance Cc:

Description

See #24093, starting about here, for background: https://core.trac.wordpress.org/ticket/24093#comment:10

When multiple clauses are passed to WP_Meta_Query under the scope of an AND relation, a new table JOIN is required for each clause. With very large meta tables, this can lead to poor performance. We should investigate the performance implications of using subqueries for some of these queries to avoid the joins, much like what happens in WP_Tax_Query for NOT IN and AND operators.

Note that the improvements to OR queries in [29953] probably mean that subqueries will not be faster in most cases where relation=OR.

First step toward this would be to see some benchmarks using data of different kinds: small databases; large sets of posts with relatively small postmeta tables; large sets of posts with large amounts of postmeta. As pento notes here https://core.trac.wordpress.org/ticket/24093#comment:30, we'll also need to look at how performance differs between supported versions of MySQL.

Attachments (2)

wp-query-patch-test-data.zip (2.7 KB) - added by p51labs 10 years ago.
wp-query-patch.zip (2.9 KB) - added by p51labs 10 years ago.

Download all attachments as: .zip

Change History (15)

#1 @kevinfodness
10 years ago

I am the author of the original subquery patch for #24093. I'd like to take a crack at modifying it to extend the work that was done for the 4.1 milestone and expand the scope to other types of meta queries that perform JOIN operations in similar ways. I should have some time available next week or early November to work on this.

#2 @boonebgorges
10 years ago

kevinfodness - That sounds great. Looking forward to seeing what you come up with.

#3 @p51labs
10 years ago

The wp-query-patch plugin has a replacement for the get_sql_for_query function in wp-includes/meta.php. I opted for separate queries in this solution over joins and sub queries because there are no mysql version issues and they are faster. The wp-query-patch-test-plugin will create a custom post type called 'Test Posts' and will then populate the post type with 10,000 entries, each with 11 custom fields and values randomly chosen from 0-100. When the list page is loaded it will run a query against the test posts using a large nested meta query to test with. Below are some examples of execution times using the attached plugins.

WordPress: AND ~ 110s OR ~ Timeout
Patch (all queries totaled): AND ~ 100ms OR ~ 100ms

Obviously these show a dramatic difference in execution time. We have been doing something similar with the Piklist plugin (https://wordpress.org/plugins/piklist/) for the last 2+ years and have been very pleased with the speed, allowing even the most complex of queries.

Let me know if there are any questions, I thought this would be a good place to kick things off?

#4 @boonebgorges
10 years ago

  • Type changed from defect (bug) to enhancement

p51labs - Cool, this looks like an interesting start! A couple thoughts -

  • It's pretty hard to use in this format. You think you could generate a proper svn or git patch with your newly suggested get_sql_for_query()? This'd make it a lot easier to test. (Ideally you would also use tabs instead of spaces for indentation and adhere to other WP coding standards too - again, just to make this easier to read for the purposes of review.)
  • When running the unit test suite with your patch, quite a few syntax errors come up. It looks like you're doing something funny when joining subclauses with AND.
  • Performance improvements when using AND are very impressive. When using OR, it's not as dramatic when compared with current trunk, though there is still an improvement. I'm curious to know how performance will look with different kinds of data sets: a larger number of posts; a greater variety of meta_keys; much smaller amounts of posts/postmeta.

#5 @p51labs
10 years ago

boonebgorges - Thanks for the feedback. I have a patch going, just working on an orderby issue. Once that's resolved I will post an update.

#6 @Will Brownsberger
10 years ago

Thanks for your work on this team -- faster AND queries for custom fields will make a big difference for my app.

Not sure if this is an appropriate thread to raise this in, but it relates to the same problem -- slow custom field meta queries: The lack of indexing of metavalues slows things down a lot:

See this related thread: https://core.trac.wordpress.org/ticket/19738#comment:7

Last edited 10 years ago by Will Brownsberger (previous) (diff)

#7 in reply to: ↑ description @saggel
10 years ago

  • Version set to 4.1

Replying to boonebgorges:

First step toward this would be to see some benchmarks using data of different kinds: small databases; large sets of posts with relatively small postmeta tables; large sets of posts with large amounts of postmeta.

--

Hello all, I am writing cause I saw that you need to test this on sites with large databases – or at least that’s what I understood. I own a wordpress-based site which currently has 46.524 Posts and 431.935 Comments (let me know what other stats you’d need). If you’d like to apply the patch and get any performance metrics before and after let me know and I’ll help as much as I can.

#8 @chriscct7
9 years ago

  • Keywords needs-patch added

#9 @haydenkshaw
8 years ago

Hello. Just thought I'd bump the issue as I've had to take an altenative approach with a plug-in I'm developing based on this limitation.

I found that my page response time would jump from a couple seconds to 3 minutes once I had 9 seperate Meta_Query keys that were compared in the following way:

OR(OR(key1,AND(key2,key3)), OR(key4,AND(key5,key6)), OR(key7,AND(key8,key9)))
Last edited 8 years ago by haydenkshaw (previous) (diff)

#10 @boonebgorges
7 years ago

#41219 was marked as a duplicate.

#11 @Tyrannous
7 years ago

p51labs, it’s been three years since your first patch. Have you been able to work on boonebgorges’s comment in the meantime?
Performance has not improved since v4.1, so a (refreshed) patch is still needed.

Thanks.

#12 @chriswecreate
6 years ago

Hi,

I know this is an old ticket but I just wanted to say I used @p51labs plugin and it still work! I have 11 custom fields I'm querying and it reduced query time from 57 seconds to 1.7! Just wanted to say thanks and still seems to work well!

Ah ... sorry it's not working. It did reduce the query time but just seems to be pulling back all results

Last edited 6 years ago by chriswecreate (previous) (diff)

#13 @mukesh27
23 months ago

  • Keywords needs-refresh added; needs-patch removed

Thanks @boonebgorges for the ticket.

We check this ticket in today's Performance bug scrub. As first step convert patch to GitHub PR.

Note: See TracTickets for help on using tickets.