Opened 3 years ago
Last modified 10 months ago
#54346 accepted enhancement
Slow SQL queries fetching posts from specific categories
Reported by: | Krstarica | Owned by: | craigfrancis |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | |
Component: | Database | Keywords: | needs-testing has-patch has-unit-tests |
Focuses: | rest-api, performance | Cc: |
Description
Noticed that REST API queries used in mobile app are very slow and found that such SQL queries can be optimized to be 10 times faster (1.8134 seconds vs. 0.1804 seconds) for wp_posts table having 800k records, see below.
Very similar queries are used when displaying posts from specific categories on the web, meaning optimizing this could lead to significant speed up everywhere.
The one responsible for this is WP_Tax_Query->get_sql_for_clause function.
REST API to fetch posts from specific categories, e.g. /wp-json/wp/v2/posts/?per_page=10&_embed=1&categories=63545,63546,63547,63548,63549,63552,76287&page=1
executes the following SQL query:
SELECT wp_posts.ID FROM wp_posts LEFT JOIN wp_term_relationships ON ( wp_posts.ID = wp_term_relationships.object_id ) WHERE 1 = 1 AND ( wp_term_relationships.term_taxonomy_id IN ( 63545, 63546, 63547, 63548, 63549, 63552, 76287 ) ) AND wp_posts.post_type = 'post' AND( (wp_posts.post_status = 'publish') ) GROUP BY wp_posts.ID ORDER BY wp_posts.post_date DESC LIMIT 0, 10
Query took 1.8134 seconds.
This query can be optimized by using subquery:
SELECT wp_posts.ID FROM wp_posts WHERE wp_posts.ID IN ( SELECT object_id FROM wp_term_relationships WHERE wp_term_relationships.term_taxonomy_id IN ( 63545, 63546, 63547, 63548, 63549, 63552, 76287 ) ) AND wp_posts.post_type = 'post' AND wp_posts.post_status = 'publish' ORDER BY wp_posts.post_date DESC LIMIT 0, 10
Query took 0.1804 seconds seconds.
Attachments (2)
Change History (21)
#2
@
3 years ago
Running this for several days in production and speed improvement is impressive.
Definitely should use subqueries in get_sql_for_clause functions (class-wp-meta-query.php and class-wp-tax-query.php) and then completely remove this code from class-wp-query.php:
if ( ! empty( $this->tax_query->queries ) || ! empty( $this->meta_query->queries ) ) { $groupby = "{$wpdb->posts}.ID"; }
#3
@
3 years ago
- Keywords needs-patch needs-testing added
- Version 5.8.1 deleted
@Krstarica Thanks for working on this! Are you able to put together a pull request on GitHub so we can get the test suite run with these changes in place please? Docs here: https://make.wordpress.org/core/handbook/contribute/git/github-pull-requests-for-code-review/
#4
@
3 years ago
Like @Krstarica we have had identical SQL performance problems with the LEFT JOIN query from the "get_sql_for_clause" function in some WordPress installations with +500,000 records in the posts table.
We have been using their solution in production and works perfectly, returning optimal performance on SQL queries, many going from +150 seconds to milliseconds.
Hopefully, this change can be translated into an upcoming update.
This ticket was mentioned in PR #3644 on WordPress/wordpress-develop by ivptr.
2 years ago
#5
- Keywords has-patch added; needs-patch removed
Trac ticket: https://core.trac.wordpress.org/ticket/54346
#6
@
2 years ago
@johnbillion I've created pull request.
Some unit tests failed because they expect JOIN, which is removed from the query.
#7
@
19 months ago
- Milestone changed from Awaiting Review to 6.3
- Owner set to johnbillion
- Status changed from new to accepted
#8
@
16 months ago
- Keywords needs-unit-tests added
If Unit tests are failing because there are new rules in place, unit tests needs to be changed too.
I want to point that we are in 9 days before Beta 1 and changes such this needs proper testing and can have side effects, so, it will be more prudent to plan this to 6.4 already.
This ticket was mentioned in PR #4665 on WordPress/wordpress-develop by @spacedmonkey.
16 months ago
#9
- Keywords has-unit-tests added; needs-unit-tests removed
Trac ticket: https://core.trac.wordpress.org/ticket/54346
#10
@
16 months ago
@johnbillion I have updated the PR so that unit test pass.
https://github.com/WordPress/wordpress-develop/pull/4665
I am not sure what unit tests to add to this, there are some good existing ones. This is not new functionality this is testing existing functionality.
#11
@
16 months ago
@johnbillion @flixos90 I did some testing with 20K posts. I am seeing a net benefit of this change, but is extemely small. It is hard to test with out more posts.
#12
@
16 months ago
Performance benchmarks, 35k posts, on a category page. 2k requests.
Trunk | PR | |
Response Time (median) | 124.23 | 140.82 |
wp-load-alloptions-query (median) | 0.7 | 0.78 |
wp-before-template (median) | 82.21 | 93.67 |
wp-before-template-db-queries (median) | 18.71 | 20.53 |
wp-template (median) | 36.53 | 40.57 |
wp-total (median) | 119.19 | 134.82 |
wp-template-db-queries (median) | 1.89 | 2.11 |
#13
@
16 months ago
Are the trunk/PR columns the correct way around in that table? It's showing a reduction in performance.
#14
@
16 months ago
@johnbillion I did some more testing with 35k posts and I am seeing a increase in server response time.
#15
@
16 months ago
- Milestone changed from 6.3 to Future Release
Thanks, then this needs further investigation.
#16
@
16 months ago
@johnbillion If it helps, I used a plugin called fakerpress to generate 35k work of posts and categories.
I had around 500 categories as well. I think the nest query is not fast if you have lots of categories or posts. If you have 2k post of posts assigned to one category, the nested query will be slow.
#17
@
11 months ago
- Component changed from Taxonomy to Database
- Owner johnbillion deleted
- Status changed from accepted to assigned
@craigfrancis might this be something you can review?
#18
@
11 months ago
- Owner set to craigfrancis
- Status changed from assigned to accepted
Yep, will try this weekend.
@Krstarica or @joseaneto, would you be able to email the following to me (craig at craigfrancis dot co dot uk):
- The
EXPLAIN
output for this query, i.e.EXPLAIN SELECT wp_posts.ID ...
; - The output from
SHOW CREATE TABLE wp_term_relationships
; - The output from
SHOW CREATE TABLE wp_posts
; - Number of records you have in these tables (joseaneto, I believe you had "+500,000 records in the posts table" in December 2021).
I'll try to work out what's happening, and report back here.
#19
@
10 months ago
I haven't forgotten, I was able to replicate the issue, but need a bit more time to look into this.
For Ivan, with 793,227 Posts, and 53,391 Terms, where each Term is used on average by 22 Posts, and the relationship table has 1,198,905 records... it seems that, if the sub-query gets a small number of results, it's faster to join to the Posts table... but, using a sub-query can be a slower for websites with a different ratio of Posts to Terms (what I'll be looking at next).
Here is the solution.
In wp-includes/class-wp-tax-query.php replace:
with:
Such query takes 1.3058 seconds. That saves 0.5 seconds.
To further speed it up, in wp-includes/class-wp-query.php replace:
with:
Such query takes 0.1804 seconds.
Important: since we removed "GROUP BY wp_posts.ID" for tax queries, we need to make sure that all other cases in WP_Tax_Query->get_sql_for_clause use subqueries, too.