Make WordPress Core

Opened 13 months ago

Last modified 13 months ago

#59106 new enhancement

Add secondary index on `wp_posts` table to improve media queries performance

Reported by: ovidiul's profile ovidiul Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 6.4
Component: Database Keywords:
Focuses: Cc:

Description

I've been recently been involved in migrating a large site database to MySQL 8.0.

Upon testing the site, we've noticed that the MySQL 8 queries related to the Media library seem to take much longer than on MariaDB.

Debugging the issue, it seems that MySQL 8, due to its Cost optimiser https://dev.mysql.com/doc/refman/8.0/en/cost-model.html, seems to prefer to drop the type_status_date index and to a full table scan to retrieve the results.

Doing a FORCE INDEX(type_status_date) on the media query, like

SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts
FORCE INDEX(type_status_date)
WHERE 1=1
AND wp_posts.post_type = 'attachment'
AND ((wp_posts.post_status = 'inherit'
OR wp_posts.post_status = 'private'))
ORDER BY wp_posts.post_date DESC
LIMIT 0, 20

seems to improve the response time.

However, a second alternative seems also to show better results, specifically targeting only the post_type, post_date and ID, something like

create index idx_type_date on wp_posts(post_type, post_date, ID);

Running the explain analyze on the queries with each index, we seem to get a better cost number when the new index is used, 142 vs 744, which would indicate a better performance at the MySQL level, this being confirmed in our tests on a large database

EXPLAIN ANALYZE SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts
FORCE INDEX(type_status_date)
WHERE 1=1
AND wp_posts.post_type = 'attachment'
AND ((wp_posts.post_status = 'inherit'
OR wp_posts.post_status = 'private'))
ORDER BY wp_posts.post_date DESC
LIMIT 0, 20

-> Limit: 20 row(s) (no early end due to SQL_CALC_FOUND_ROWS)  (cost=744 rows=20) (actual time=12..13.1 rows=20 loops=1)
    -> Sort: wp_posts.post_date DESC  (cost=744 rows=3620) (actual time=12..12.8 rows=7407 loops=1)
        -> Filter: ((wp_posts.post_type = 'attachment') and ((wp_posts.post_status = 'inherit') or (wp_posts.post_status = 'private')))  (cost=744 rows=3620) (actual time=0.0505..8.37 rows=7407 loops=1)
            -> Index range scan on wp_posts using type_status_date over (post_type = 'attachment' AND post_status = 'inherit') OR (post_type = 'attachment' AND post_status = 'private')  (cost=744 rows=3620) (actual time=0.0459..5.63 rows=7407 loops=1)
EXPLAIN ANALYZE SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts
FORCE INDEX(idx_type_date)
WHERE 1=1
AND wp_posts.post_type = 'attachment'
AND ((wp_posts.post_status = 'inherit'
OR wp_posts.post_status = 'private'))
ORDER BY wp_posts.post_date DESC
LIMIT 0, 20

-> Limit: 20 row(s) (no early end due to SQL_CALC_FOUND_ROWS)  (cost=142 rows=20) (actual time=0.214..15.3 rows=20 loops=1)
    -> Filter: ((wp_posts.post_status = 'inherit') or (wp_posts.post_status = 'private'))  (cost=142 rows=688) (actual time=0.211..15 rows=7407 loops=1)
        -> Index lookup on wp_posts using idx_type_date (post_type='attachment') (reverse)  (cost=142 rows=3619) (actual time=0.207..13.4 rows=7407 loops=1)

To replicate this, we've installed latest WordPress version and MySQL 8, and create the following post_status count structure:

mysql> select post_status, count(*) from wp_posts group by post_status;
+-------------+----------+
| post_status | count(*) |
+-------------+----------+
| inherit     |     7407 |
| draft       |        1 |
| publish     |       23 |
| auto-draft  |        1 |
+-------------+----------+
4 rows in set (0.01 sec)

Can this be considered as a core improvement to the wp_posts table that would directly benefit the Media related queries?

Basically, if MySQL 8 decides that the new index is faster, specially for Media search queries, it will simply pick that on up as opposed to using the original index or dropping it fully if it considers its much more efficient to do a full table scan.

From our database, a search query for term test shows an improvement of 0.5 seconds of this basic search media query:

MySQL []> SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts FORCE INDEX(type_status_date) WHERE 1=1 AND wp_posts.post_type = 'attachment' AND ((wp_posts.post_status = 'inherit' OR
wp_posts.post_status = 'private')) and post_title like "%test%" ORDER BY wp_posts.post_date DESC LIMIT 0, 1;
+---------+
| ID      |
+---------+
| xxxxxxx |
+---------+
1 row in set, 1 warning (3.867 sec)

MySQL []> SELECT SQL_CALC_FOUND_ROWS wp_posts.ID FROM wp_posts FORCE INDEX(idx_type_date) WHERE 1=1 AND wp_posts.post_type = 'attachment' AND ((wp_posts.post_status = 'inherit' OR wppp_posts.post_status = 'private')) and post_title like "%test%" ORDER BY wp_posts.post_date DESC LIMIT 0, 1;
+---------+
| ID      |
+---------+
| xxxxxxx |
+---------+
1 row in set, 1 warning (3.244 sec)

Thanks for considering this.

Change History (1)

#1 @ovidiul
13 months ago

  • Summary changed from Add secondary index wp_posts table to improve media queries performance to Add secondary index on `wp_posts` table to improve media queries performance
Note: See TracTickets for help on using tickets.