Make WordPress Core

Opened 5 years ago

Last modified 3 months ago

#45354 new enhancement

Adding index for meta_key and post_id in postmeta

Reported by: duckdagobert's profile DuckDagobert Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: Database Keywords: needs-testing dev-feedback
Focuses: performance Cc:

Description

We should add an index on post_id and meta_key as queries that contain these columns are incredibly common (even in loop in core on every page load basically).

An index will improve query speed up to 100% even when only 1000 posts are in db.

Test it yourself:

ALTER TABLE `wp_postmeta` ADD INDEX `wp_postmeta_key_id` (`meta_key`,`post_id`);

Change History (11)

#1 follow-up: @mukesh27
5 years ago

  • Focuses performance added
  • Keywords needs-testing dev-feedback added

@DuckDagobert when we have to call above sql query in core software?

#2 in reply to: ↑ 1 @DuckDagobert
5 years ago

No, my code is the fix/patch for the problem.

For query example take the loop query (which gets post_id and _visibility for example).
But most WP core but also plugin queries like woocommerce,... will profit immensely from this index, as they select post_id inner join on postmeta where meta_key

#3 @fliespl
5 years ago

I can confirm that adding such might improve performance on many plugins.

Let's take imagify for example (with 100000 attachment in database).

Query:

SELECT SQL_NO_CACHE count(1)
FROM wp_posts AS p 
INNER JOIN wp_postmeta AS imrwpmt1 
ON ( p.ID = imrwpmt1.post_id
AND imrwpmt1.meta_key = '_wp_attached_file' ) 
INNER JOIN wp_postmeta AS imrwpmt2 
ON ( p.ID = imrwpmt2.post_id
AND imrwpmt2.meta_key = '_wp_attachment_metadata' ) 
INNER JOIN wp_postmeta AS mt1 
ON ( p.ID = mt1.post_id
AND mt1.meta_key = '_imagify_status' ) 
WHERE p.post_mime_type IN ( 'image/jpeg','image/png','image/gif','application/pdf' ) 
AND p.post_type = 'attachment' 
AND p.post_status IN ( 'inherit','private' ) 
AND mt1.meta_value IN ( 'success', 'already_optimized' ) 
AND imrwpmt1.meta_value NOT LIKE '%://%'
AND imrwpmt1.meta_value NOT LIKE '_:\\\%'
AND ( LOWER( imrwpmt1.meta_value ) LIKE '%.jpg'
OR LOWER( imrwpmt1.meta_value ) LIKE '%.jpeg'
OR LOWER( imrwpmt1.meta_value ) LIKE '%.jpe'
OR LOWER( imrwpmt1.meta_value ) LIKE '%.png'
OR LOWER( imrwpmt1.meta_value ) LIKE '%.gif'
OR LOWER( imrwpmt1.meta_value ) LIKE '%.pdf' );

a) default wordpress indexes:
1 row in set (28.22 sec)

b) after adding post_id + meta_key index

ALTER TABLE wp_postmeta ADD INDEX( `post_id`, `meta_key`);

1 row in set (2.51 sec)

It could be 10 times faster in some situations.

#4 @lukefiretoss
4 years ago

This index in the sites database would really help a number of WordPress sites.

#5 @galbaras
2 years ago

I'd say an index like this is beneficial if it helps frontend visitors, or if it helps backend users without causing any frontend delays. With the wide variety of plugins that use post meta, this may be difficult to determine, so maybe site admins should consider this on a site-by-site basis.

@fliespl @DuckDagobert Have you tested both "post_id, meta_key" and "meta_key, post_id" to see which one is faster? These columns vary in cardinality and speed of comparison, so one may be better than the other.

Also, have you measured the impact on both frontend and backend? Can you share more contextual details?

#6 @programmin
7 months ago

What are the potential downsides of this - why this has not been added in WP db upgrades?

#7 @galbaras
7 months ago

An additional index means slower updates and slightly slower queries that don't use the index (more decisions to make on the best query plan). On sites that update meta data a lot, or sites that don't use this index much, this may not be of benefit.

Also, as I've mentioned, the index can be created in 2 ways, and this will yield different results - one may help more and one may hinder more. It really depends on the context.

Looking at the very commonly-used core functions for getting meta data, they seem to fetch all the meta data of specific IDs into a cache at once, and then grab specific values from that cache by the key, so an index on the meta key will not help speed this up.

And index on post_id, meta_key would speed things up if WordPress fetched single records by the post ID and meta key.

Core Team, am I reading this correctly? Because this could be the start of a course change that will bring much performance improvement.

Importantly, if access to wp_postmeta is done by post_id, the table should have a primary index in that field, and it's currently indexed by meta_id, which doesn't help much.

A primary index by post_id should be lightning fast, because all the meta data records for a given ID would be contiguous and require the smallest possible number of block fetches in the database.

Last edited 7 months ago by galbaras (previous) (diff)

#8 @johnbillion
7 months ago

Additional indexes also take up disk space. Ironically the sites that would benefit the most from this additional index would also experience the greatest increase in space usage. (I'm not against the change, just pointing this out.)

#9 @galbaras
7 months ago

@johnbillion Any thoughts on making post_id the primary index (easier) or changing the way WordPress retrieves meta data (faster)?

Looks like disk space is now taken up by indexes that don't help (as much as they could), and some of those can be removed in favour of better ones.

#10 follow-up: @fliespl
3 months ago

@galbaras I don't have any big performance results for this index, but post_id + meta_key (80) was a winner in my case.

I have used limitation of 80 length for meta_key to reduce index space usage + because 99% of meta_keys were well below that value, while testing on a medium sized site (100k+ posts).

https://ss.codeone.pl/ss-2023-12-18-22-42-24-1702935744-a7HMU6OU.png

Most of the use cases for postmeta I have seen on many sites involved combination of post_id + meta_key (like filtering posts for specific ACF value or woocommerce value).

Also I disagree about post_id primary key (It's not even currently possible - it would totally change structure given that you have multiple meta_keys for single post + you can have duplicates! on post_id meta_key combination). meta_id makes sense here, but combination of post+meta_key index as well.

#11 in reply to: ↑ 10 @galbaras
3 months ago

Replying to johnbillion:

Additional indexes also take up disk space. Ironically the sites that would benefit the most from this additional index would also experience the greatest increase in space usage. (I'm not against the change, just pointing this out.)

In this day and age, disk space is cheap, but performance is golden.

Replying to fliespl:

Have you tested post_id, meta_key against meta_key, post_id?

Have you tested making either of those the primary index for the table?

It would be great if you could do this and report your findings.

Note: See TracTickets for help on using tickets.