#24498 closed enhancement (fixed)
Improving performance of meta_form()
Reported by: | lumaraf | Owned by: | pento |
---|---|---|---|
Milestone: | 4.3 | Priority: | normal |
Severity: | normal | Version: | 3.5 |
Component: | Administration | Keywords: | has-patch |
Focuses: | performance | Cc: |
Description (last modified by )
The query executed in meta_form() in wp-admin/includes/template.php took over 5 seconds to execute on my largest wordpress installation. I changed it to use a NOT BETWEEN to enable MySQL to execute the query by just using a small part of the meta_key index. My new query executes in less than 10 milliseconds.
Here is the improved query:
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT BETWEEN '_' AND '_z' HAVING meta_key NOT LIKE '\_%' ORDER BY meta_key
Attachments (9)
Change History (51)
#5
@
9 years ago
I tested different queries on a local environment and also on some production systems with real data.
Although the original query was never really slow, I've come to the conclusion that the proposed query is at least 40% faster in general.
However, maybe we don't really need this query at all. It's not that useful. I suggest either removing the select box completely (in favor of the text input) or adding an ajax autocomplete to the text input.
Heck, we could even remove the whole meta box for new installs. Who uses that anyway?
#6
@
9 years ago
- Keywords has-patch added; needs-patch removed
Btw, this issue got a bit of traction thanks to a CSS-Tricks article yesterday: https://css-tricks.com/swapping-a-wordpress-core-meta-box-to-speed-up-editing/
This ticket was mentioned in Slack in #core by dd32. View the logs.
9 years ago
@
9 years ago
This alternative patch removes the query entirely and only allows entering meta keys through the text input
#10
@
9 years ago
Just to chime in with testing results on a site with 2.5M postmeta rows...
The default query takes 6.85s
This takes 1.58s:
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT LIKE '\\_%' ORDER BY meta_key limit 30
And the proposed patch 24498.patch takes 2.5s
#11
@
9 years ago
Hi,
Recently, I sent an email to the Wordpress dev mailing list and Chris Christoff told me about this bug report.
So, I'm copy/pasting the my email here:
---
The problem was at line 674, where the SQL query was defined.
I'm putting the raw (already processed) SQL query here:
SELECT meta_key FROM wp_postmeta GROUP BY meta_key HAVING meta_key NOT LIKE '\_%' ORDER BY meta_key LIMIT 30
To run this query, MySQL will sort all meta_key lines, including all custom fields and all the other fields (starting with '_'), unify them, select the desired fields, and them return them.
The problem with its query is that it will run a sort over all of the entries (due to the GROUP BY part), and select the desired entries after the group by (HAVING part). A more efficient solution should consider excluding unnecessary entries before sorting/unifying them.
A better query could be:
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT LIKE '\_%' ORDER BY meta_key LIMIT 30
In my case, this simply change gave us a performance improvement of 4x (from 8s to less then 2s).
So, I'm proposing this patch to this file, and I hope that it can be useful to make the Wordpress project better.
--- wp-admin/includes/template.php.orig 2015-07-16 00:22:28.000000000 -0300 +++ wp-admin/includes/template.php 2015-07-16 00:23:06.000000000 -0300 @@ -671,10 +671,9 @@ * @param int $limit Number of custom fields to retrieve. Default 30. */ $limit = apply_filters( 'postmeta_form_limit', 30 ); - $sql = "SELECT meta_key + $sql = "SELECT DISTINCT meta_key FROM $wpdb->postmeta - GROUP BY meta_key - HAVING meta_key NOT LIKE %s + WHERE meta_key NOT LIKE %s ORDER BY meta_key LIMIT %d"; $keys = $wpdb->get_col( $wpdb->prepare( $sql, $wpdb->esc_like( '_' ) . '%', $limit ) );
It's interesting to see the @pross (comment 10) proposed the same patch that I did :).
#12
@
9 years ago
- Milestone changed from Awaiting Review to 4.3
- Owner set to pento
- Status changed from new to assigned
24498-test.php is a performance test I ran on these queries, the results are attached as CSVs.
For a quick explanation of the variables:
$connections
- I originally planned on running these tests on multiple versions of MySQL, but decided that it wasn't really necessary, as these are fairly simple queries.
$engines
- obvious.
$row_counts
- how many rows to insert into the test table.
$percent_underscores
- the percentage of rows to have starting with a _
, which affects the comparison part of the query.
$unique_keys_count
- the number of unique keys to insert into the table. This affects the index cardinality.
Summary: The original query is clearly slower, but there's very little difference between the two proposed queries, with the query in 24498.patch being slight quicker in most circumstances.
Unless anyone can show consistent data that one or the other is significantly quicker, I'm happy for 24498.patch to go into 4.3.
#13
@
9 years ago
Hi @pento,
The real optimization is using the WHERE (exclusions before sorting) instead of only using HAVING (exclusions after sorting).
About the query proposed in the patch:
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT BETWEEN '_' AND '_z' HAVING meta_key NOT LIKE '\_%' ORDER BY meta_key LIMIT 30
This patch runs an exclusion, after a sort, and after it runs an other exclusion.
The other one (comments 10 and 11) runs an exclusion and after a sort (there is no need of running another exclusion after the sort, because there is no HAVING clause).
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT LIKE '\_%' ORDER BY meta_key LIMIT 30
The execution plan is very similar in both cases. Both of them are practically equivalent in terms of efficient.
It's possible that excluding using a 'not in range' logic be a bit quicker that excluding using a 'not like' logic. But it will only make difference if there is no meta_key entry in the form _X, where X is a character greater than z. For example, '_Á' or '_ß'. In these cases, the second query would run in the same performance or even a little faster then the first one. More over, the NOT BETWEEN
performance depends on the used CHARSET and its results depend on the COLLATION. It's possible that for some cases the first query will be better and for others the second query will be better.
So, in this specific case, I prefer the simpler solution (comments 10/11). But I remember that both of them are correct and much more efficient than the original one.
Final note:
The first query could be rewritten also as:
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT BETWEEN '_' AND '_z' AND meta_key NOT LIKE '\_%' ORDER BY meta_key LIMIT 30
This moves the final exclude (after the sort) to before the sort. Technically, it will be a little better.
It's also worth to notice that using the next query is wrong (because the '_Á' and '_ß' bug and the problem of changing the COLATION). So, the NOT LIKE clause (after or before the sort) is still required.
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT BETWEEN '_' AND '_z' ORDER BY meta_key LIMIT 30
#14
@
9 years ago
Heya folks,
I just wanted to chime in real quick to say that @pento's 24498.patch was a very significant improvement for one of our customers here with ~3.8 million rows in _postmeta (InnoDB).
The single affected query went from ~17 seconds to ~0.03 seconds.
Original query:
SELECT meta_key FROM wp_postmeta GROUP BY meta_key HAVING meta_key NOT LIKE '\\_%' ORDER BY meta_key LIMIT 30; 30 rows in set (16.83 sec)
Initial improvement suggestion:
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT LIKE '\_%' ORDER BY meta_key LIMIT 30; 30 rows in set (4.03 sec)
@pento version
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT BETWEEN '_' AND '_z' AND meta_key NOT LIKE '\_%' ORDER BY meta_key LIMIT 30; 30 rows in set (0.03 sec)
It's worth noting that most of this site's _postmeta are from WooCommerce, which mostly begin with underscores, so excluding those more efficiently really has a big impact.
#16
follow-up:
↓ 17
@
9 years ago
@pento, the "range" one on your test, how does that perform when the meta starts with _Á
or _ß
as suggested in comment:13?
#17
in reply to:
↑ 16
@
9 years ago
Replying to chriscct7:
@pento, the "range" one on your test, how does that perform when the meta starts with
_Á
or_ß
as suggested in comment:13?
I feel like this is a edgecase that we don't necessarily need to optimize for heavily. The further regexp/having clauses would filter out those, and they're certainly going to be less common.
The other option is that it's written as meta_key NOT BETWEEN '_' AND '`'
which should filter out all which start with _
I think - but i'm not sure what performance implications that would have (`
is the next character up from _
)
#18
in reply to:
↑ 15
@
9 years ago
Replying to obenland:
@pento, can you make a decision for 4.3 on this before beta4?
Given the wide variety of behaviour seen here for the different query options, which I couldn't reproduce in testing, we'll probably need to punt this one.
I'm currently busy with 4.2.3, so if someone feels like having a go at this, please do. Beta4 is due out on the 22nd - if we can have something solid by then, I'm happy to include it.
#19
@
9 years ago
Given the wide variety of behaviour seen here for the different query options, which I couldn't reproduce in testing, we'll probably need to punt this one.
In your opinion @pento, do you expect either of the proposed queries here to be slower than what we currently have?
If it's faster in 80% of cases, we should get this in, and iterate/improve it in 4.4 (I'll happily do both of these)
#20
@
9 years ago
Either will be faster than the original - as @rarylson noted in comment:11, the aim is to exclude the full scan caused by the GROUP BY
in the original query. The worst case is that the original query and both of the proposed queries are about the same speed when the percentage of meta_keys beginning with _
is very low.
#21
@
9 years ago
I concur with @dd32. Let's get one of the two in and then we can improve upon it next release.
This ticket was mentioned in Slack in #core by obenland. View the logs.
9 years ago
#23
@
9 years ago
One main thing to keep in mind while testing this, is that (although you would think it wouldn't make that much of a difference) latin1 vs utf8 vs utf8mb4 seems to make a significant different here in performance, however the original being slow stands out under them all.
24498.patch seems to be safe to me, so lets go for it for 4.3.
If this still has performance issues for larger sites, I'd suggest that it should perhaps be cached in a transient (and cleared when an non-underscored field is set). This might be worth looking at in 4.4.
The other option is that it's written as meta_key
NOT BETWEEN '_' AND '`'
which should filter out all which start with _ I think - but i'm not sure what performance implications that would have (` is the next character up from _)
Just to loop back to this, the performance isn't great, MySQL doens't optimize it the same :)
#25
@
9 years ago
I have a site with 4.5 million wp_postmeta rows and this query is still slow (20 seconds).
It looks like the where clause is used but in my case that still means creating a 4.1 million row temp table because of the distinct. I can write a patch with a transient if that's the solution that will be accepted, but at least in my case the 20 second delay every time its empty for a feature that isn't useful isn't great (even if they used the meta_key drop down, showing 30 of the 2k keys isn't helpful).
Should I start a new ticket or reopen this one?
#26
@
9 years ago
@jeichorn Lets start a new ticket :) "Cache the list of keys in meta_form()" or something.
Curious though, if this takes 20s, how long did the previous one take? Unfortunately there isn't really much we can do about the temporary table here I don't think.
#27
@
9 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
There's a couple other low hanging optimizations it looks like we can do here.
First, I think we can get rid of the orderby, since we already run the result through natcasesort.
Also, we can limit the meta it searches through by limiting to the post_id in the where clause.
Attaching something I just wrote on a phone (so if you want to test it make sure to sanity check it for autocorrect gremlins before trying to apply it)
#28
@
9 years ago
- Resolution set to fixed
- Status changed from reopened to closed
Ah whoops, I'm thinking of the wrong form here. Both optimizations are invalid.
Stealing from DD32's feedback:
Limiting to $post’s post_id makes the function fast, but also useless, it’s designed for adding other meta to the current post that you’ve used elsewhere.. doing this just means that it’ll only show what you can already see. Ordering is done so that it gets the first 30(by default) results, ie. A, B, C, D instead of A, Z, G, R
#29
follow-up:
↓ 30
@
9 years ago
After chatting with @chriscct7 however, there is another angle which we could try here.
We could perform a query like this:
SELECT DISTINCT meta_key FROM wp_postmeta pm LEFT JOIN wp_posts p ON pm.post_id = p.id WHERE p.post_type IN ( 'post', 'page', 'current-post-type-of-the-post') AND meta_key NOT BETWEEN '_' AND '_z' HAVING meta_key NOT LIKE '\_%' ORDER BY meta_key DESC
I've not benchmarked it yet, but the idea is that plugins which add lots of meta_keys probably do it on "hidden" data post_types rather than the post_type that a user is actually editing. Since the user hasn't consciously added the meta keys to those posts, they shouldn't be shown here anyway.
If someone has a LOT of posts however (or only posts), this could potentially be slower under the conditions.
If anyone has a site that fits either of those extremes (thousands of meta_keys on non-post post_types - a large WooCommerce site for example I believe) or a site with many thousands of posts and can test that query out.. it'd be appreciated :)
#30
in reply to:
↑ 29
@
9 years ago
Replying to dd32:
After chatting with @chriscct7 however, there is another angle which we could try here.
We could perform a query like this:
SELECT DISTINCT meta_key FROM wp_postmeta pm LEFT JOIN wp_posts p ON pm.post_id = p.id WHERE p.post_type IN ( 'post', 'page', 'current-post-type-of-the-post') AND meta_key NOT BETWEEN '_' AND '_z' HAVING meta_key NOT LIKE '\_%' ORDER BY meta_key DESCI've not benchmarked it yet, but the idea is that plugins which add lots of meta_keys probably do it on "hidden" data post_types rather than the post_type that a user is actually editing. Since the user hasn't consciously added the meta keys to those posts, they shouldn't be shown here anyway.
If someone has a LOT of posts however (or only posts), this could potentially be slower under the conditions.
If anyone has a site that fits either of those extremes (thousands of meta_keys on non-post post_types - a large WooCommerce site for example I believe) or a site with many thousands of posts and can test that query out.. it'd be appreciated :)
That last query fixes the problem until you try to add a post of big type (This site has a location database with a couple 100k posts). The problem seems pretty intractable once you do any operation that requires a temp table with a couple 100k rows performance collapses. This effect seems to be made worse by utf8mb4.
At the end of the day this is pretty frustrating. Fighting a performance problems for a feature that spends most of its time hidden from view. Maybe just a add a hook for manually setting the meta_key list, and not running the query in that case would be a good solution. Its still a landmine waiting to blow up people's sites, but at least people can fix it.
#32
@
9 years ago
#18979 which was closed as duplicate, has it's own patch on the idea of post type limitation
This ticket was mentioned in Slack in #core by antpb. View the logs.
9 years ago
#35
follow-up:
↓ 36
@
9 years ago
I ran into this problem recently. When testing an update of a WP site with ~4 million post meta rows, I began seeing issues with this query. Before I knew this ticket existed, I worked on a reduced test case to illustrate the problem. Fortunately, I think it'll be really helpful for testing new queries as we move along.
I created a script and Travis CI integration (https://github.com/tollmanz/utf8mb4-query-time) that generates test data and a number of scenarios to test these queries. The Travis CI integration does the following:
- Creates a SQL file with 1 million insert statements to populate a
wp_postmeta
table - Creates a 4.1.x style
wp_postmeta
table without theutf8mb4
character set/collation and populates it with the test data- Runs the 4.1.x style query
- Runs the 4.3.x style query
- Creates a 4.3.x style
wp_postmeta
table with theutf8mb4
character set/collation and populates it with the test data- Runs the 4.1.x style query
- Runs the 4.3.x style query
- Creates a 4.3.x style
wp_postmeta
table with theutf8mb4
character set/collation; however, switches themeta_key
column back toutf8
. This helps isolate the character set as the concern (and additionally allows me to test a temporary workaround that I might use). The test data is populated into this table.- Runs the 4.1.x style query
- Runs the 4.3.x style query
To be clear, the "4.1.x style query" is:
SELECT meta_key FROM wp_postmeta GROUP BY meta_key HAVING meta_key NOT LIKE '\_%' ORDER BY meta_key LIMIT 30;
And the "4.3.x style query" is:
SELECT DISTINCT meta_key FROM wp_postmeta WHERE meta_key NOT BETWEEN '_' AND '_z' HAVING meta_key NOT LIKE '\_%' ORDER BY meta_key LIMIT 30;
If you view the Travis build, you'll be able to see query results, timing information, and EXPLAIN
information.
The TL;DR results of a recent run (https://travis-ci.org/tollmanz/utf8mb4-query-time/builds/80031807) were:
| utf8 | utf8mb4 | utf8 meta_key | |-------------|--------------|---------------| 4.1.x query | 0.00036725s | 21.15712650s | 0.00038975s | 4.3.x query | 0.00051800s | 21.58118725s | 0.00043800s |
If anyone would like me to add more test queries or change things up here, I'm more than happy to do so. This is really easy now that it's scripted. Additionally, you could just fork it, make changes, and PR, which will kick off a run.
Is there any reason that the meta_key
column has to be utf8mb4? Would it make sense to consider reverting it to its previous state? That feels kinda dirty to me, but it is at least worth consideration.
Note...I tried to run @dd32's query in these tests, but it's not set up to have a wp_posts
table yet, so I'd need to do some more work on that. I did, however, try the query against the DB that started me looking at this and the query was taking ~15s. The post's table is in the neighborhood of 500k rows.
h/t to @jorbin for working through parts of this with me.
#36
in reply to:
↑ 35
@
9 years ago
We should likely spin this off into it's own ticket. That said, I added an additional test case based off tollmanz's github and travis setup (Awesome work right there. Thank You) which sets the innodb_large_prefix option to on. This increases the limit beyond 767 byte limit.
| utf8 | utf8mb4 | utf8 meta_key | with option utf8 | with option mb4 | |-------------|--------------|---------------|------------------|-----------------| 4.1.x query | 0.00037275s | 39.45441000s | 0.00044975s | 0.00030775s | 17.67479500s | 4.3.x query | 0.00047425s | 22.14528975s | 0.00051775s | 0.00052575s | 20.47027650s |
The options doesn't reduce the query down to the pre utf8mb4 levels.
This ticket was mentioned in Slack in #core by jorbin. View the logs.
9 years ago
#38
follow-up:
↓ 39
@
9 years ago
Hey @tollmanz, sorry about taking so long to get back to you.
Could you please open up a new ticket? Also, would it be possible for you to run these tests against MariaDB 10.0? I'm reaching out to some MariaDB contacts who've offered to help, but it'd be good to show them numbers from MariaDB.
#39
in reply to:
↑ 38
@
9 years ago
Replying to pento:
@pento we've also got a site that we can test on MariaDB if you'd like. Just let me know what you're looking for (same EXPLAINs as @tollmanz ran?). Thanks!
Hey @tollmanz, sorry about taking so long to get back to you.
Could you please open up a new ticket? Also, would it be possible for you to run these tests against MariaDB 10.0? I'm reaching out to some MariaDB contacts who've offered to help, but it'd be good to show them numbers from MariaDB.
#40
@
9 years ago
@pento I tried MariaDB, but got the same results: https://travis-ci.org/tollmanz/utf8mb4-query-time/builds/82649692. @jorbin opened a new ticket here: https://core.trac.wordpress.org/ticket/33885. Shall we discuss there?
#41
@
9 years ago
I think I figured out the issue, and posted in the new ticket: https://core.trac.wordpress.org/ticket/33885#comment:2.
#42
@
20 months ago
Hi,
I'm following up to the slow wordpress dashboard.
I've noticed this post titles Improving performance of meta_form() with ticket #24498
I have been using the following code in the functions.php file. However, the following code has to be removed as it removes the custom fields section. Can anyone advise the best available code to add to the functions.php file to quicken the dashboard.
add_action( 'admin_menu', 's9_remove_post_custom_fields_metabox');
/
- Remove Ancient Custom Fields metabox from post editor
- because it uses a very slow query meta_key sort query
- so on sites with large postmeta tables it is super slow
- and is rarely useful anymore on any site
*/
functions9_remove_post_custom_fields_metabox() {
foreach( get_post_types( , 'names') as$post_type) {
remove_meta_box( 'postcustom', $post_type, 'normal');
}
}
add_action( 'admin_menu', 's9_remove_post_custom_fields_metabox');
When I update any product I receive the following slow query under the Query Monitor. The only way to quicken it is to use the full code further below but it removes the custom fields section.
SELECT DISTINCT meta_key
FROM orw_postmeta
WHERE meta_key NOT BETWEEN '_'
AND '_z'
HAVING meta_key NOT LIKE '
_%'
ORDER BY meta_key
LIMIT 30
meta_form()
WordPress Core 30 51.9917
/
- Remove Ancient Custom Fields metabox from post editor
- because it uses a very slow query meta_key sort query
- so on sites with large postmeta tables it is super slow
- and is rarely useful anymore on any site
*/
functions9_remove_post_custom_fields_metabox() {
foreach( get_post_types( , 'names') as$post_type) {
remove_meta_box( 'postcustom', $post_type, 'normal');
}
}
add_action( 'admin_menu', 's9_remove_post_custom_fields_metabox');
#32361 was marked as a duplicate.