WordPress.org

Make WordPress Core

Opened 2 years ago

Last modified 4 weeks ago

#33885 new defect (bug)

meta_form performs a potentially expensive query

Reported by: jorbin Owned by:
Milestone: 4.9 Priority: normal
Severity: critical Version: 4.3.1
Component: Administration Keywords: has-patch
Focuses: performance Cc:

Description (last modified by mikeschroder)

This is a follow-up to #24498 and the research that @tollmanz did on the query used in meta_form. To summarize:

The query in meta_form can be incredibly slow on sites with a lot of post meta.

The solution to me seems to be to cache the query and update the cache of meta to use every time a new piece of post_meta is saved. This will prevent the query from needing to be run on sites with a persistent cache.
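In code, the proposed caching approach might look roughly like the sketch below. This is purely illustrative of the description, not actual core code; the cache key and group names are assumptions.

```php
// Hypothetical sketch of the caching approach described above.
// Cache key/group names are illustrative, not real core identifiers.
function cached_meta_form_keys() {
	global $wpdb;

	$keys = wp_cache_get( 'meta_form_keys', 'post_meta' );
	if ( false === $keys ) {
		// The expensive query runs only on a cache miss.
		$keys = $wpdb->get_col(
			"SELECT DISTINCT meta_key
			FROM $wpdb->postmeta
			WHERE meta_key NOT BETWEEN '_' AND '_z'
			HAVING meta_key NOT LIKE '\\_%'
			ORDER BY meta_key
			LIMIT 30"
		);
		wp_cache_set( 'meta_form_keys', $keys, 'post_meta' );
	}
	return $keys;
}

// Invalidate the cached key list whenever a new piece of post meta is saved.
add_action( 'added_post_meta', function () {
	wp_cache_delete( 'meta_form_keys', 'post_meta' );
} );
```

As noted, this only avoids repeated queries on sites with a persistent object cache backend.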

Attachments (5)

33885.diff (1.5 KB) - added by pento 2 years ago.
33885.2.diff (1.8 KB) - added by ericmann 22 months ago.
Add a filter to set predefined keys (or override with a different query) if necessary.
33885.2.2.diff (1.8 KB) - added by ericmann 22 months ago.
Fix tabs vs spaces
33885.3.diff (1.3 KB) - added by johnbillion 4 months ago.
33885.4.diff (6.2 KB) - added by lopwal 3 months ago.
Patch from https://github.com/WordPress/WordPress/pull/300


Change History (65)

#1 @mikeschroder
2 years ago

  • Description modified (diff)

#2 follow-ups: @tollmanz
2 years ago

I think I found the issue with this query. While the index was updated to have a length of 191, the field itself was not updated to varchar(191). When I change the field to varchar(191), the index works on the field and the query is fast (i.e., a few ms).

This test run (https://travis-ci.org/tollmanz/utf8mb4-query-time/jobs/82861471) shows the results with 1 million rows, using the following table:

CREATE TABLE `wp_postmeta` (
  `meta_id` bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  `post_id` bigint(20) unsigned NOT NULL DEFAULT '0',
  `meta_key` varchar(191) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  `meta_value` longtext COLLATE utf8mb4_unicode_ci,
  PRIMARY KEY (`meta_id`),
  KEY `post_id` (`post_id`),
  KEY `meta_key` (`meta_key`(191))
) ENGINE=InnoDB AUTO_INCREMENT=212732257 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci;

The results are:

Query_ID	Duration	Query
1	0.00040875	SELECT DISTINCT meta_key\nFROM wp_postmeta\nWHERE meta_key NOT BETWEEN '_' AND '_z'\nHAVING meta_key NOT LIKE '\\_%'\nORDER BY meta_key\nLIMIT 30
id	select_type	table	type	possible_keys	key	key_len	ref	rows	Extra
1	SIMPLE	wp_postmeta	range	meta_key	meta_key	767	NULL	519624	Using where; Using index

I think we just need to update the schema to change the field to varchar(191) and unfortunately write some upgrade routines.
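In SQL terms, the proposed schema change is the one-line ALTER below (a sketch for a default single-site prefix; back up and test first, and upgrade routines would still be needed for existing installs):

```sql
-- Shrink the column to match the 191-character index prefix so the
-- utf8mb4 index can fully satisfy the meta_form query.
ALTER TABLE wp_postmeta MODIFY meta_key varchar(191);
```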

This ticket was mentioned in Slack in #core by tollmanz. View the logs. (2 years ago)

#5 @TheLastCicada
2 years ago

I've been following this ticket and have done some testing on a dataset from a large WordPress install that is currently unusable on 4.3 because of this bug. I can confirm that changing the field to VARCHAR(191) does result in the query using the index and brings performance to near 4.2 levels.

Here are the results of my testing:

CentOS 7
MariaDB 5.5.40

postmeta table has ~3.4 million rows and is 1.4 GB in size

4.1 query, UTF8mb4 meta_key:
SELECT meta_key
FROM wp_5_postmeta
GROUP BY meta_key
HAVING meta_key NOT LIKE '\_%'
ORDER BY meta_key
LIMIT 30;

14.47 seconds

4.3 query, UTF8mb4 meta_key:
SELECT DISTINCT meta_key
FROM wp_5_postmeta
WHERE meta_key NOT BETWEEN '_' AND '_z'
HAVING meta_key NOT LIKE '\_%'
ORDER BY meta_key
limit 30;

167 seconds

4.1 query, UTF8 meta_key (collation and encoding):
801 milliseconds

4.3 query, UTF8 meta_key (collation and encoding):
860 milliseconds

4.1 query, UTF8mb4, VARCHAR(191)
1.49 seconds

4.3 query, UTF8mb4, VARCHAR(191)
1.5 seconds

Doing an EXPLAIN on these queries confirms that before the VARCHAR(191) change, the query would not use the index when UTF8mb4 formatted. After the change, it uses the correct meta_key index (just as it did when UTF8 formatted).

From reading the MySQL documentation, this is not how I would expect indexes to work. The behavior described in the documentation seems to say that for a VARCHAR(255), the index would simply use the first 191 characters for the index and ignore the rest. I have not yet seen anything that suggests changing the field format itself is necessary.

@tollmanz's testing suite is pretty solid, but I've got a number of real sites with large postmeta tables impacted by this problem that I can test patches on if that would be helpful.

#6 in reply to: ↑ description @cmmmc
2 years ago

  • Severity changed from normal to critical
  • Version set to 4.3.1

Hi there,

I'm afraid I'm no genius, so I can't contribute much other than saying that this is still an issue in WP 4.3.1.

Is there a quick and dirty (i.e. temporary) fix to easily adjust this, whether in MySQL or by editing a core file?

We've got 10M+ rows (WooCommerce-enabled site) and this expensive query is crippling the back-end completely...

The front end seems to be OK, although it's touch and go because of the back-end issue.

I've been looking for a temporary solution and trying to find where this expensive query sits, but nothing I've found works (not even in wp-admin/includes/template.php).

I also already have applied these two solutions:

// Remove Custom Fields box from the Post edit screen.
function remove_post_custom_fields_now() {

  $post_types = get_post_types( '', 'names' ); 

  foreach ( $post_types as $post_type ) {
    remove_meta_box( 'postcustom' , $post_type , 'normal' );     
  }

}

add_action( 'admin_menu' , 'remove_post_custom_fields_now' );

and

//Fix Custom Field slow issue
function remove_metaboxes() {
     // remove_meta_box() takes three arguments: $id, $screen, $context.
     remove_meta_box( 'postcustom', 'page', 'normal' );
}
add_action('admin_menu', 'remove_metaboxes');

It's getting so bad that I'm really in trouble soon...

Any temporary fixes or suggestions are much, much appreciated (hopefully others can benefit too).

Thanks, CM

#7 @finzend
2 years ago

We also had a lot of performance problems because of this query (2 million records; query ran in 8-10 seconds). Just wanted to say that the updated index fixes the problem!

@pento
2 years ago

#8 @pento
2 years ago

  • Keywords has-patch needs-testing added
  • Milestone changed from Awaiting Review to 4.4

I've been playing around a bit with tweaking the query, with not much luck. The MySQL optimizer just seems to have some issues with this particular prefix index in utf8mb4.

I'm wondering if it would be faster to do the sorting in PHP, instead. 33885.diff is a first pass at that: grabbing a small chunk of rows, filtering them, then going back for more as needed.

For those of you with large data sets you can test on, could you please apply this patch, and tell me:

  • What does performance look like to you?
  • Does it give the same results as the current query?
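The chunked approach works roughly like the following sketch (illustrative only, not the actual patch code; the function name and chunk size are assumptions):

```php
// Illustrative sketch of "fetch a chunk, filter in PHP, repeat".
function meta_form_keys_chunked( $limit = 30, $chunk = 100 ) {
	global $wpdb;

	$keys   = array();
	$offset = 0;
	do {
		$rows = $wpdb->get_col( $wpdb->prepare(
			"SELECT DISTINCT meta_key FROM $wpdb->postmeta LIMIT %d, %d",
			$offset, $chunk
		) );
		foreach ( $rows as $key ) {
			// Skip protected (underscore-prefixed) keys in PHP rather
			// than asking MySQL to filter them.
			if ( ! is_protected_meta( $key, 'post' ) ) {
				$keys[] = $key;
			}
		}
		$offset += $chunk;
	} while ( count( $keys ) < $limit && count( $rows ) === $chunk );

	sort( $keys );
	return array_slice( $keys, 0, $limit );
}
```

The trade-off, discussed below, is that the looping can re-run the query several times on unlucky data distributions.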

This ticket was mentioned in Slack in #core by jorbin. View the logs. (22 months ago)

#10 @pento
22 months ago

Here's a little nudge for all of you - we need to get this solved pretty soon (as in, in the next day or two, with 4.4 RC 1 due on the 18th), if this change is going to make it into 4.4.

Please test the patch and see if it helps.

#11 @ericmann
22 months ago

The expensive query bit me last Friday when it took a client site down. We're not using custom fields in the site, so the meta box had been disabled pretty much everywhere except for on Pages. Someone was editing a page and this triggered several instances of the query stacking up on the database layer and, eventually, locking things up entirely.

The first change I attempted was altering the database to set the size of the meta_key column to the same as the index. That by itself went a long way to help - before the change, the original query took > 140s (s as in seconds) - after the change, the same query took only 0.015s.

So this is definitely a change I'd like to see us make.

However, the ask was on this specific patch. I've run both pre and post tests on Pages (again, the only post type utilizing custom fields):

Trunk: 96.7036ms

Patched: 39.9126ms

This is showing significant improvement. I like how we're doing the filtering in PHP as it prevents running a super long query in the database that could lock things up. However, the looping behavior of the patch could result in running this new query multiple times. In my situation, it was only running once. If meta is built in such a way that this were to run more than twice, the overall performance here would actually be 'worse' than with the existing query.

For comparison, running EXPLAIN on the existing query (without the DB schema change) shows the query is scanning 1.2 million rows in order to generate our list of 30 meta keys (this is a somewhat large database). Running EXPLAIN on the patched version of the query (again, without the DB schema change) shows that we're scanning even more rows (2.5 million) to generate 115 records, and then filtering them out to just the 30 we wanted.

Trunk:

id, select_type, table,       type,  possible_keys, key,      key_len, ref, rows,    Extra
1,  SIMPLE,      wp_postmeta, range, meta_key,      meta_key, 767,        , 1244198, Using where; Using temporary; Using filesort

Patched:

id, select_type, table,       type, possible_keys, key, key_len, ref, rows,    Extra
1,  SIMPLE,      wp_postmeta, ALL,               ,    ,        ,    , 2488394, Using temporary

So, in this one case, a single run of the query feels faster. But it ultimately grabs far more information and loads far more results into memory in order for PHP to do the filtering outside of the database. I'm convinced we have a problem, but I'm not sure this is the right solution. Great contender, but still raises some red flags for me.

This ticket was mentioned in Slack in #core by ericmann. View the logs. (22 months ago)

#13 @sc0ttkclark
22 months ago

+1 this so hard

This ticket was mentioned in Slack in #core by wonderboymusic. View the logs. (22 months ago)

#15 @jorbin
22 months ago

  • Milestone changed from 4.4 to Future Release

With no commits, I'm moving this out of 4.4.

@ericmann
22 months ago

Add a filter to set predefined keys (or override with a different query) if necessary.

#16 @ericmann
22 months ago

As an alternative approach, patch 33885.2.diff adds a query earlier in the function's execution to allow short-circuiting everything if necessary. If custom fields are a finite set, they can be hard-coded into an array. If developers instead want to pull them with a different query (or from a provider like ElasticSearch for performance) they can hook into the same filter to grab the data as necessary.

This has the advantage of leaving existing functionality in place while still giving affected sites the ability to override the defaults in favor of performance.
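For example, a site with a known, finite set of custom fields could short-circuit the query along these lines (a sketch; the key names are invented, and the filter name matches the one later committed in [35717]):

```php
// Short-circuit the expensive meta_form query with a fixed list of keys.
// The key names here are purely illustrative.
add_filter( 'postmeta_form_keys', function ( $keys ) {
	return array( 'subtitle', 'source_url', 'featured' );
} );
```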

@ericmann
22 months ago

Fix tabs vs spaces

#17 @nacin
22 months ago

So this is pretty terrible.

Really what I'd like to do here is just kill the damn meta box. Perhaps in 4.5 we can try to nuke it for all new installs.

If a site is affected, and they would like to use this filter, can't they just instead remove the meta box entirely? What's the need for the filter, in that case? The venn diagram of (1) sites that will apply this filter and (2) sites that use custom fields is, I'm pretty sure, just wired.com (per @tollmanz, who like @helen is sitting next to me at php[world]). As a workaround, they could either intercept the query, copy-paste the meta box, or build a better user experience. I will lose zero sleep over this very rare situation. Thanks @helen for getting me riled up about this.

But if we're going to do the filter, let's have it be null by default, that way someone can return an empty array.

#18 @helen
22 months ago

  • Milestone changed from Future Release to 4.4

I agree that this meta box is terrible and that the general goal should be to remove it entirely, but I think in the meantime, a short-circuit is smart and the idea of populating the dropdown with values that aren't necessarily existing meta keys is a pretty interesting one. I wouldn't advertise that as a good use, though, since we don't want to actually encourage using the default meta box :)

Strongly agree about using null. Going to commit this as a fix for 4.4 - longer term we shouldn't have the meta box registered by default. Not sure what can realistically be done about the actual underlying DB problem.

#19 @helen
22 months ago

In 35717:

Custom fields: Allow for short-circuiting the meta key dropdown.

Adds the postmeta_form_keys filter which allows for a potentially expensive query against postmeta to be avoided.

props ericmann, tollmanz, nacin.
see #33885.
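With the filter defaulting to null, a site can either supply its own keys or return an empty array to suppress the dropdown entirely, for example:

```php
// Returning any non-null value skips the expensive query entirely.
add_filter( 'postmeta_form_keys', function ( $keys ) {
	return array(); // Empty array: no dropdown, and no query either.
} );
```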

#20 follow-up: @SergeyBiryukov
22 months ago

Should we also pass the post type to the postmeta_form_keys filter, as #18979 suggests?

#21 in reply to: ↑ 20 @helen
22 months ago

Replying to SergeyBiryukov:

Should we also pass the post type to the postmeta_form_keys filter, as #18979 suggests?

Maybe the whole $post object, then.

#22 @helen
22 months ago

In 35730:

Pass the $post object as context to postmeta_form_keys.

see #33885, #18979.
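The $post context makes per-post-type key lists possible, e.g. (the key names are hypothetical):

```php
// Supply different key lists depending on the post type being edited.
add_filter( 'postmeta_form_keys', function ( $keys, $post ) {
	if ( 'product' === $post->post_type ) {
		return array( 'price', 'sku' ); // Hypothetical keys.
	}
	return $keys; // null: fall through to the default query.
}, 10, 2 );
```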

This ticket was mentioned in Slack in #core by wonderboymusic. View the logs. (22 months ago)

#24 follow-up: @helen
22 months ago

  • Milestone changed from 4.4 to Future Release

Moving to Future Release, aka punting; this isn't really "fixed" for 4.4 so much as you can now get around it without having to alter the table.

#25 @Val_
19 months ago

Just my $0.02.

Found this slow query while testing WP 4.4.2.
It returns 0.5 million rows (of 1.9 million). Using where; Using temporary; Using filesort. 2.22 seconds.

After modifying the meta_key field to varchar(191), everything seems to be a lot better:
it returns a couple of rows. Using where; Using index for group-by. Taking 0.0010 seconds.

Waiting for the final fix for this. Thank you.

#26 @DJosephDesign
19 months ago

I needed this! I have WordPress Multisite and one of the subsites has a large bbPress forum.

But I'm a little confused by the posts here. I'm on WordPress 4.4 and I need to know how I can implement this temporary fix.

I swapped out the code in template.php with the code changes in 33885.2.2.diff. Then, I ran "ALTER TABLE wp_##_postmeta MODIFY meta_key varchar(191);" (where "##" is the blog ID of my subsite).

Things are definitely faster! I went from 80-second load times on the editor to only a few seconds. But did I do this correctly?

Last edited 19 months ago by DJosephDesign (previous) (diff)

#27 follow-up: @robscott
18 months ago

@DJosephDesign I've got a large WooCommerce-filled DB running on WP 4.4.2. Did your solution work to speed up the admin, or do any issues persist?

#28 in reply to: ↑ 27 @DJosephDesign
18 months ago

Replying to robscott:

@DJosephDesign I've got a large WooCommerce-filled DB running on WP 4.4.2. Did your solution work to speed up the admin, or do any issues persist?

It's working for me, and it definitely sped up my post editor. But I don't think this affects the rest of the admin, because it only addresses post meta.

#29 @robscott
18 months ago

@DJosephDesign - the admin is peppered with slow queries, mainly based around the wp_postmeta table, which drives things such as admin reporting for WooCommerce and a variety of other not-so-necessary but nonetheless loaded features.

In staging I tested these updates and got some mild performance improvements. Queries are still slow, just not quite as painful.

#30 in reply to: ↑ 2 @jstensved
17 months ago

Spot on, I can confirm this is fixing critical issues.

I updated a client site with 20+ million rows in wp_postmeta and it didn't respond at all after the update, just waiting 200+ seconds on each page load. After changing the field length to 191 on meta_key, it's back to its original speed, returning pages in a few milliseconds again.

As a side note, I don't think it's good practice to have very long keys for meta_key anyway, so the best solution would probably be to shorten the field in an upcoming release.

It would be nice if we could add some proactive scripts to the WP upgrade wizard warning the user about breaking changes before an update. This kind of info would fit there, and I can think of a lot more good cases where both plugins and core could detect issues before a core update to avoid downtime.

Replying to tollmanz:

I think I found the issue with this query. While the index was updated to have a length of 191, the field itself was not updated to varchar(191). When I change the field to varchar(191), the index works on the field and the query is fast (i.e., a few ms).

#31 follow-up: @ocean90
17 months ago

#36762 was marked as a duplicate.

#32 in reply to: ↑ 31 ; follow-up: @patzo
17 months ago

Replying to ocean90:

#36762 was marked as a duplicate.

Has anyone been able to resolve this issue? I believe my ticket was a duplicate.

It's just really confusing when I read this page.

#33 in reply to: ↑ 32 @robscott
17 months ago

Replying to patzo:

Replying to ocean90:

#36762 was marked as a duplicate.

Has anyone been able to resolve this issue? I believe my ticket was a duplicate.

It's just really confusing when I read this page.

There are a number of factors at play, almost all related to sites with a lot stored in wp_postmeta.

Performance improvements can be found by implementing some of the solutions posted above, namely this one:

https://core.trac.wordpress.org/ticket/33885#comment:2

The SQL query which will get this done for you may be similar to this one:

https://core.trac.wordpress.org/ticket/33885?replyto=32#comment:26

If you're not using multisite and you have the standard table prefix, it will be very similar to the following:

ALTER TABLE wp_postmeta MODIFY meta_key varchar(191);

I highly recommend doing this in a test environment, and backing up the DB before running any ALTER TABLE statements. At the very least, back up wp_postmeta.

Then see if the slow queries are resolved. There are a few other things we needed to look at on the sites which had slow queries, mainly very large installs with a lot of data in meta.

#34 @andres.tarasco
14 months ago

Awesome! Our WordPress 4.5.3 + WPML + WooCommerce site (running a MariaDB 5.5.47 database on a server with 12 cores + 64 GB RAM) was facing several performance issues for a long time.

The ALTER TABLE command fixed the utf8mb4 index issue and our server is now working as expected. Thanks!

By the way, why is this fix not yet in WordPress core?

#35 @non-entity9
14 months ago

I'm running WP 4.5.3 + WooCommerce (60,000 products) on Nginx 1.9.14, MariaDB 10.0.24, PHP 5.4.45, with 8 cores and 6 GB RAM.

Before I test this out, is there any particular information I can provide that will be of use to those working on this fix?

As I seem to be the exact intended audience for this fix, I thought I'd chip in if possible.

#36 in reply to: ↑ 2 @amfriedman
14 months ago

Yes, this solution worked for me. I have a wp_postmeta table with 600M rows and was experiencing the exact same slow query (60-80 sec). I used this handy command-line utility from the Percona Toolkit to modify the meta_key column from varchar(255) to varchar(191) without affecting the current site:

pt-online-schema-change --alter "MODIFY COLUMN meta_key VARCHAR(191)" D=mydatabase,t=wp_postmeta

Page load time of the Edit Post screen improved by over 50x.

Replying to tollmanz:

I think I found the issue with this query. While the index was updated to have a length of 191, the field itself was not updated to varchar(191). When I change the field to varchar(191), the index works on the field and the query is fast (i.e., a few ms).

#37 @jb510
13 months ago

I was consistently seeing a 15s query time. I ran ALTER TABLE wp_postmeta MODIFY meta_key varchar(191); on the table, and now it's down to 0.3s.

The wp_postmeta table is about 2M rows.

TY @tollmanz. Hoping someone can get this into core soon!

#38 in reply to: ↑ 24 ; follow-up: @jb510
13 months ago

Replying to helen:

Moving to future release aka punting, this isn't really "fixed" for 4.4 so much as you can just get around it for now without having to alter the table.

Why avoid altering the table here? Is the concern that someone could be using a 191+ char key and updating could truncate it? Could we check the max field length and then update? Or are we hoping for an upstream fix from MariaDB?

#39 in reply to: ↑ 38 ; follow-up: @dd32
13 months ago

Replying to jb510:

Why avoid altering the table here? Is the concern that someone could be using a 191+ char key and updating could truncate it? Could we check the max field length and then update? Or are we hoping for an upstream fix from MariaDB?

Ideally MySQL/MariaDB/whomever else fixes whatever the bug is that we're running into.

We can't really decrease the length of the column, mostly due to problems arising from truncating keys, but also because it adds unexpected limits to key lengths which have existed this way for quite some time. Having a shorter index is really a bandaid to allow it to work across all hosts.

Ideally I think we're probably best removing this metabox query entirely and moving towards something else. We could leverage the meta registration process, for example, to pre-fill the meta keys without a query (or just remember the last X meta_keys used by users in a transient or option).

#40 @robscott
8 months ago

Hey,

Just posting back here because I often refer to this link (I just sent someone here) to add my 2 cents. On most managed WP hosts, when the meta table is somewhat large (warning: non-specific language!), this change can produce a big speed increase. This is most often (for me) on quite big WooCommerce installations. It almost always speeds up admin and uncached (logged-in user) page load times significantly.

So 8s down to 1.5s loading time on admin load.

This is on real live sites in the wild! Usually hosted on a dedicated managed WP host of one flavour or another.

I check for the existence of meta keys longer than 190 characters; if there are none, I run the query:

ALTER TABLE wp_postmeta MODIFY meta_key varchar(191);
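That check-then-alter sequence can be sketched in SQL (a sketch only; back up the table first):

```sql
-- 1. Confirm no meta_key would be truncated by the shorter column.
SELECT COUNT(*) FROM wp_postmeta WHERE CHAR_LENGTH(meta_key) > 191;

-- 2. Only if the count above is 0, shrink the column to match the index.
ALTER TABLE wp_postmeta MODIFY meta_key varchar(191);
```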

#41 @SergeyBiryukov
7 months ago

#39878 was marked as a duplicate.

#42 in reply to: ↑ 39 @matt_fw
6 months ago

Replying to dd32:

Ideally MySQL/MariaDB/whomever else fixes whatever the bug is that we're running into.

This is not a bug; it's well-documented and expected behaviour (https://dev.mysql.com/doc/refman/5.7/en/charset-unicode-conversion.html).

The query is slow because MySQL can't use the partial index on meta_key to optimize the ORDER BY clause (at least for now). There is a workaround (innodb_large_prefix) mentioned in the documentation I linked to, but it requires configuration changes to MySQL, so it's not a general solution.

Last edited 6 months ago by matt_fw (previous) (diff)

#43 follow-up: @pento
4 months ago

Unfortunately, I don’t see MySQL/MariaDB allowing sorting by partial indexes any time soon (and certainly not back porting it to older versions), which means we need to implement something in WordPress.

I don’t like the option of making meta_key shorter, as we’ll either end up truncating existing data, or (if we check for longer values before truncating) have some sites with meta_key as VARCHAR(191) and some with it as VARCHAR(255). Diverging database structures make ongoing maintenance much more difficult, which I would prefer to avoid. That leaves the remaining options of more explicitly recommending the postmeta_form_keys filter, removing the meta box, or caching the query result.

I don’t think that the postmeta_form_keys filter is the right solution, particularly for sites that have regularly changing meta keys. We’d effectively be asking folks to write their own caching solution.

Removing the meta box is probably a big task that will involve a lot of ongoing work unrelated to this particular ticket. Ultimately, that’s the right solution, but I’m not sure if it’s something we should tackle to solve this bug.

That leaves caching the query result. Now, we can either cache using WP_Object_Cache (so the cache is really only valuable to sites with a persistent caching setup), or cache the results in an option, so that it’s valuable to any site. I lean towards using the object cache, as the usual caching solution, but I’m open to arguments either way.

#44 in reply to: ↑ 43 @dd32
4 months ago

Replying to pento:

I find myself agreeing with @pento here. Unfortunately I don't think there's some magical quick fix which we've missed.

Diverging DB structures and shortening field lengths is pretty much off the table IMHO, if only to prevent data destruction now or later (in a few years when the reason for some people having different field lengths has been long forgotten)

Removing the meta box is probably a big task that will involve a lot of ongoing work unrelated to this particular ticket. Ultimately, that’s the right solution, but I’m not sure if it’s something we should tackle to solve this bug.

Personally I see the need for the metabox, but I also think it could benefit greatly from a re-think, which will probably happen as part of a new post interface at some point in the future, probably reliant upon meta keys being registered and some form of UI being assigned, but that isn't for this ticket :)

That leaves caching the query result. Now, we can either cache using WP_Object_Cache (so the cache is really only valuable to sites with a persistent caching setup), or cache the results in an option, so that it’s valuable to any site. I lean towards using the object cache, as the usual caching solution, but I’m open to arguments either way.

A transient seems like the best of both worlds here: memory-cached when available, DB-backed when not. The downside is that some people will be clearing all transients more often, and I fear those who are impacted by this the most are probably the same users; despite that, I still think it's the best option.

I'm thinking something along the logic lines of:

if transient doesn't exist then
   create transient from unique meta_keys

on initial db upgrade then
   queue a single cron task to generate the transient

on new meta_key addition then
   append the meta_key to the transient

for display then
   merge registered meta_keys with the transient and display

Last edited 4 months ago by dd32 (previous) (diff)
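A minimal PHP sketch of that transient logic, assuming the transient name and omitting the cron-priming step (illustrative only, not a committed implementation):

```php
// Illustrative sketch of a transient-backed meta-key list.
// The transient name 'postmeta_form_keys' is an assumption.
function get_meta_form_keys_via_transient() {
	global $wpdb;

	$keys = get_transient( 'postmeta_form_keys' );
	if ( false === $keys ) {
		// Fallback: run the query once, then cache the result.
		$keys = $wpdb->get_col(
			"SELECT DISTINCT meta_key FROM $wpdb->postmeta
			WHERE meta_key NOT BETWEEN '_' AND '_z'
			ORDER BY meta_key LIMIT 30"
		);
		set_transient( 'postmeta_form_keys', $keys );
	}
	return $keys;
}

// Append newly created keys so the cached list stays current.
add_action( 'added_post_meta', function ( $mid, $object_id, $meta_key ) {
	$keys = get_transient( 'postmeta_form_keys' );
	if ( false !== $keys && ! in_array( $meta_key, $keys, true ) ) {
		$keys[] = $meta_key;
		sort( $keys );
		set_transient( 'postmeta_form_keys', $keys );
	}
}, 10, 3 );
```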

#45 @johnbillion
4 months ago

This seems to be a lot of work when the simplest solution would be to remove the query which populates the meta keys dropdown entirely. The default behaviour would fall back to a text input for the meta key, and the postmeta_form_keys filter could still be used on sites which want to provide a list of keys.

#46 @robscott
4 months ago

Agree, though it may be as easy to make this an option which defaults to "off", rather than using a filter. Calling the expensive query when it isn't (necessarily) needed is the problem, IMO.

Is it not an option to update the database to a shorter meta_key (varchar(191)) after confirming the non-existence of keys longer than that? I don't see how that could destroy any data, ever.

#47 @johnjamesjacoby
4 months ago

Looking at this more, we are lucky that ordering by meta_key is not very popular. This appears to be the only place in core where that happens, and none of the more popular plugins do this either.

In addition, WP_Meta_Query does not support ordering by meta_key – any attempts to do so are ignored.


The simplest option is @pento's original patch. Removing the ORDER BY clause (or using ORDER BY NULL) prevents MySQL from Using filesort on the result set, which does result in an overall performance improvement.

Unfortunately, it isn't until the VARCHAR length matches the meta_key index length that Using temporary stops and the meta_key index is fully utilized.
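For reference, the filesort-free variant looks like this (a sketch; the key list comes back unsorted and would need sorting in PHP):

```sql
-- ORDER BY NULL tells MySQL not to sort, avoiding "Using filesort";
-- "Using temporary" remains until the column length matches the index.
SELECT DISTINCT meta_key
FROM wp_postmeta
WHERE meta_key NOT BETWEEN '_' AND '_z'
HAVING meta_key NOT LIKE '\\_%'
ORDER BY NULL
LIMIT 30;
```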


As far as directly altering _postmeta is concerned:

ALTER TABLE wp_postmeta MODIFY meta_key varchar(191);

I'd like to caution against doing this for a few reasons:

  • Doing this only to _postmeta and not the other meta-data tables goes against the normalization of the meta API. It's a change you will quickly forget you made here, and will inevitably cause a problem years from now in your application.
  • You'll want to run ANALYZE TABLE wp_postmeta; afterwards, or visit /wp-admin/maint/repair.php and follow the on-screen instructions. This is because ALTERing the database table alone does not rebuild the key distributions.
  • On tables with millions of rows, both the ALTER and the above ANALYZE TABLE queries will take several seconds (or minutes) to complete.
  • On installations with replication, you will want to monitor your secondaries, making sure their key distributions match the primary.
  • Running this ALTER has the hidden consequence of also updating the meta_key index to assume the length of the meta_key column, so future ALTERs on this column will directly affect the index too.
    KEY `meta_key` (`meta_key`(191))
    // Becomes:
    KEY `meta_key` (`meta_key`)
    // Because column & key share the same length, the key length definition is dropped
    

My conclusions:

  • Future versions of MySQL and MariaDB will ship with innodb_large_prefix ON by default. This won't directly avoid temporary tables & filesorts, but it will generally improve query performance in the way @matt_fw alluded to earlier.
  • For anyone who has added an index to their meta_value columns (not uncommon for sites that do complex JOINs), you'll want to keep a similar eye on your customizations.
  • Because of how rare & weird it is to ORDER BY meta_key, I do think it's OK not to optimize the meta database tables for this behavior.
  • Thankfully, even when their lengths are misaligned, the meta_key index still functions as intended for all queries other than ORDER BY meta_key; it is not wholly ineffective, only for ORDER BY queries.
  • @dd32's transient approach in #44 is a sound work-around for this isolated problem, for now.
  • For later, we should consider what a redux of this meta-box experience looks like.

Tangent ideas:

  • Like we have wp_is_large_network(), we could consider a wp_is_large_site() companion. In theory, it could be used by plugins like WooCommerce or bbPress to announce to the rest of the environment "hey, maybe the posts table will have more rows than is considered typical" so other aspects of the system (I.E. this meta-box query) can respond accordingly.
  • Pippin and I have started working on a Meta Manager plugin, which includes a WP_Meta_Data_Query class for querying meta tables directly (without JOINing their paired object table.) I'd written those classes to eventually make into a core ticket. It's a bit premature to mention them now, but they still might be useful if we need to build a caching strategy around meta-only queries.
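A minimal sketch of that hypothetical wp_is_large_site() companion (the function, its 10,000-post threshold, and its filter name are all invented here for illustration; only wp_count_posts() and apply_filters() are real core APIs):

// Hypothetical — not a core function. Mirrors the shape of wp_is_large_network().
function wp_is_large_site() {
	$counts   = wp_count_posts();
	$is_large = ( (int) $counts->publish > 10000 );

	// Let hosts and plugins like WooCommerce override the heuristic.
	return (bool) apply_filters( 'wp_is_large_site', $is_large );
}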
Last edited 4 months ago by johnjamesjacoby (previous) (diff)

#48 @robscott
4 months ago

@johnjamesjacoby nice points, well made :)

However, your caution aside, in reality, sites that exhibit this issue (like the large WooCommerce sites you mention — and wp_is_large_site() is a great idea, by the way) are managed by people who see 8-second load times every time they perform an action in their site's admin. That is not acceptable for such users, who may move to other software because of it...

Now, as WooCommerce is moving towards using "other places" to store data, in addition to the other forthcoming updates, I suspect this issue will wash through in time. BUT, where this is a problem, right now, it is a big one, which can drive site admin pretty crazy!

The solution, while a band-aid, is to perform some optimization on the database, and it does work, today, to make the admin faster.

That solution being ALTER TABLE wp_postmeta MODIFY meta_key varchar(191);

"It's a change you will quickly forget you made here, and will inevitably cause a problem years from now in your application."

What, specifically, are the inevitable problems here? (assuming I don't forget changes made to the db!)

#49 @johnjamesjacoby
4 months ago

Hey @robscott, thanks for the reply. I’ll cover my concern quickly here, and I’m happy to chat more with anyone more outside of this ticket (don’t want to pollute anyone’s inboxes, and don’t want to leave you hanging either.)

My concerns with one-off database alters are:

  • Future attempts to alter may fail without warning, and nobody will remember why that might be
  • Database alters aren’t in version control, so there is no history or log of that event having occurred to refer back to
  • One table performs differently than the others, using the same procedural API on top of it, again, without an obvious reason why to other people who work on the site
  • Reverting this change later (once WordPress itself is improved) may not be desirable, or forgotten completely

TL;DR - It’s confusing for other people if something else goes wrong. Improve without compromising future improvements.

#50 @jb510
4 months ago

If we can't/shouldn't truncate the column, it seems like the best answer is to not make the query in the first place. Either by reverting to a text field as John Billion suggested above, or by not loading the meta box itself unless explicitly wanted.

Would lazy loading that box or just lazy loading the drop down be viable?

Are we past the "everything must work in a no-JS world" era? Maybe Ajax/lazy-load the drop-down when focused, but for no-JS just leave a text input?

FWIW we just disable that box via filter on all sites globally now and have had no issues.

#51 @johnjamesjacoby
4 months ago

FWIW we just disable that box via filter on all sites globally now and have had no issues.

This is my probably permanent recommended fix for existing installations, because by the time you have this much meta-data in that table, you will not derive very much joy from trying to choose a meta-key from a drop-down anyways.

Add this to an mu-plugin of your choosing:

add_action( 'add_meta_boxes', function() {
	remove_meta_box( 'postcustom', null, 'normal' );
} );

Another approach:

// Could use 'load-post-new.php' instead of 'admin_init'
add_action( 'admin_init', function() {
	// The feature behind the Custom Fields meta box is 'custom-fields'.
	foreach ( get_post_types_by_support( 'custom-fields' ) as $type ) {
		remove_post_type_support( $type, 'custom-fields' );
	}
	}
} );

Long running queries are long running queries. Lazy loading the box or drop-down options only defers the problem, and "hiding" the meta-box doesn't help either because the same code still executes – it's just not visible. The meta-box needs to be removed after it's been added.

#52 @johnbillion
4 months ago

  • Keywords needs-patch added; has-patch needs-testing removed

Let's just remove the query from this meta box. If a site really wants it back, the existing postmeta_form_keys filter can be used to populate the list via whatever means is desired, including running the query that is currently in place.
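For sites that do want the drop-down back, here is a hedged sketch of using the existing postmeta_form_keys filter with a predefined list (the key names are examples only; the filter receives null and the current post object, and returning a non-null value causes meta_form() to skip its query):

add_filter( 'postmeta_form_keys', function ( $keys, $post ) {
	// Returning a non-null array short-circuits the expensive DISTINCT query.
	return array( 'subtitle', 'price', 'featured' );
}, 10, 2 );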

@johnbillion
4 months ago

#53 @johnbillion
4 months ago

  • Keywords has-patch added; needs-patch removed
  • Milestone changed from Future Release to 4.8.1

#54 @pento
4 months ago

Removing the query will be a problem for the 99% of sites that aren't running into this performance issue.

The best option still seems to be caching the query result.
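A rough sketch of that caching approach, done from outside core via the postmeta_form_keys filter (the transient name is arbitrary, the 30-key limit matches core's default, and the query approximates the one currently in meta_form()):

// Sketch only: cache the slow DISTINCT meta_key query in a transient and
// invalidate it whenever a new piece of post meta is written.
add_filter( 'postmeta_form_keys', function ( $keys, $post ) {
	global $wpdb;

	$cached = get_transient( 'cached_postmeta_form_keys' );
	if ( false !== $cached ) {
		return $cached;
	}

	// Approximates the query core runs in meta_form().
	$found = $wpdb->get_col( $wpdb->prepare(
		"SELECT DISTINCT meta_key FROM {$wpdb->postmeta}
		 WHERE meta_key NOT BETWEEN '_' AND '_z'
		 AND meta_key NOT LIKE %s
		 ORDER BY meta_key LIMIT %d",
		$wpdb->esc_like( '_' ) . '%',
		30
	) );

	set_transient( 'cached_postmeta_form_keys', $found, DAY_IN_SECONDS );

	return $found;
}, 10, 2 );

// Bust the cache whenever a new meta row is added.
add_action( 'added_post_meta', function () {
	delete_transient( 'cached_postmeta_form_keys' );
} );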

#55 @johnjamesjacoby
3 months ago

The best option still seems to be caching the query result.

I still agree with @pento, what @dd32 proposed is the best overall option.

@johnbillion's patch approach would be fine for a new meta-box, but it should really keep working as is for the existing one.

This ticket was mentioned in Slack in #core by jeffpaul. View the logs. — 2 months ago

#57 @jbpaul17
2 months ago

  • Milestone changed from 4.8.1 to 4.9

Per today's bug scrub and based on what appears to be lack of agreement on approach, we're punting this to 4.9.

This ticket was mentioned in Slack in #core by jjj. View the logs. — 4 weeks ago

#59 @jbpaul17
4 weeks ago

@johnbillion @lopwal are either of your latest patches in line with the approach described by @pento @dd32 @johnjamesjacoby? If so, then let's get this tested and moved along. If not, then it would be great to get a patch early in the 4.9 cycle so we can get solid testing/feedback... thanks!

#60 @robscott
4 weeks ago

@jbpaul17 I will put my hand up to test and benchmark this stuff when any patches are forthcoming. Most of my 'solutions' are not core enough to warrant any commentary here... but I have gone far too long (10 actual human years) without any meaningful contribution to wp core. I am convinced I can assist here! Particularly with some monster woo sites causing all manner of postmeta fun.

Last edited 4 weeks ago by robscott (previous) (diff)
Note: See TracTickets for help on using tickets.