#32264 closed enhancement (wontfix)
wp_enqueue_media() is slow on large sites
Reported by: | philipjohn | Owned by: | rhurling |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 3.9 |
Component: | Posts, Post Types | Keywords: | |
Focuses: | performance | Cc: |
Description
Following on from #27985:
Despite the improvements in #27985, it looks like wp_enqueue_media() is still slow on large sites.
On WordPress.com VIP we're seeing multiple sites reporting very slow post saving. I've identified that the queries in [28194] are taking longer than 3,500ms on saving posts.
Here's debug bar output:
SELECT ID FROM wp_x_posts WHERE post_type = 'attachment' AND post_mime_type LIKE 'audio%' LIMIT 1 include, wp_enqueue_media, wpdb->get_var, hyperdb->query #17 (3,599.6ms @ 6032.12ms)
and EXPLAIN on that query:
$wpdb->get_row("EXPLAIN SELECT ID FROM wp_x_posts WHERE post_type = 'attachment' AND post_mime_type LIKE 'audio%' LIMIT 1", ARRAY_A); array(10) { 'id' => string(1) "1" 'select_type' => string(6) "SIMPLE" 'table' => string(17) "wp_x_posts" 'type' => string(3) "ref" 'possible_keys' => string(16) "type_status_date" 'key' => string(16) "type_status_date" 'key_len' => string(2) "22" 'ref' => string(5) "const" 'rows' => string(6) "253750" 'Extra' => string(11) "Using where" }
Attachments (9)
Change History (29)
#2
@
9 years ago
- Keywords needs-patch good-first-bug added
- Type changed from defect (bug) to enhancement
I think as per the discussion on #27985, we should cache the result of each of these queries in a transient in much the same way we do for is_multi_author()
.
#3
@
9 years ago
- Keywords has-patch added; needs-patch removed
I'm thinking about adding another transient for the query that was added in [29271], but I'm not so sure about that.
Furthermore I'm not sure if I should prefix the transient name or not.
#4
@
9 years ago
@rhuruling: Has that made a difference to the queries in your testing? I tried a similar change but the value of the transient was false, causing the query to run every time still.
#5
@
9 years ago
Wait... the patch before worked...
Well, now both should work and it's more preference. And now I'm not sure why i rewrote it a bit.
@philipjohn: In my testing both patches prevent the query from being run more than once, since the transient value is an empty string for me, because I think boolean's get converted.
Did you test it with an object cache? Maybe that's interfering, because the boolean isn't converted to a string or something like that (which would mean that only the 2nd patch would work and I didn't write it for nothing).
#6
@
9 years ago
Did you test it with an object cache?
Ah yes I did. I will test this patch again today and report back.
#7
@
9 years ago
WFM! I'm adding a modified patch that also caches the months query, which is also slow on large sites.
#8
@
9 years ago
- Keywords needs-patch added; has-patch removed
None of the above patches work because they use a persistent transient which is never flushed. When an audio/video file is uploaded, the transient will contain a stale value.
These two transients should be flushed whenever an audio/video attachment is uploaded or deleted.
Same goes for the month query which should probably be flushed whenever any attachment is uploaded or deleted.
#9
@
9 years ago
- Keywords has-patch added; needs-patch removed
I created a new patch per @johnbillion's suggestion, which deletes the transient in wp_insert_post.
Edit: I forgot to delete the transients on attachment deletion, so I added that to the patch.
#11
@
9 years ago
Related: #31071 - transients here (for post_mime_type
) just mask the fact that the schema doesn't scale in these circumstances
Also, we shouldn't do false === ( $var = whatever() )
. Variable assignment shouldn't happen in the middle of checking conditions.
#12
follow-up:
↓ 14
@
9 years ago
Is there a strong reason to use transients for this instead of just having a object cached result that is keyed on something that includes the last_changed value for posts?
I guess the only reason would be to try and leverage this for sites without a persistent cache backend, I'm not sure I'd want to run without one for this many posts however ;)
#14
in reply to:
↑ 12
@
9 years ago
Replying to westi:
Is there a strong reason to use transients for this instead of just having a object cached result that is keyed on something that includes the last_changed value for posts?
I guess the only reason would be to try and leverage this for sites without a persistent cache backend, I'm not sure I'd want to run without one for this many posts however ;)
My biggest fear with using the caching backend is that we'd be putting something into the system that's not clear or visible. A landmine; magic functionality that takes place if your infrastructure is set up in a certain way. Unless you also add in an admin notice that says "hey, your site is pretty big, it's going to run slow unless you implement an object cache", then I think it be implemented in a way that any site can take advantage of.
A smaller reason is that dev and QA environments don't always have caching (on purpose) and that's where our dev team and product managers spend the greatest amount of their time. So we'd have to implement object caching, which would change the way our team works and the resources required for our dev environments, or deal with a really really slow dev environment. Neither one is the end of the world, but enough reason for me to agree with using a transient to store this info.
#15
@
9 years ago
Is there a reason to not add an index (as suggested in comment:1), in addition to caching the query? It's great to cache the result, but we're not fixing the core issue of having a slow query that is improved by (and needs) an index.
#16
@
9 years ago
On 32264.6.patch:
- There is some repeated code. I would recommend making a function to purge the media cache, which takes a post ID, and hooking into
add_attachment
anddelete_attachment
. This becomes a little more challenging when considering the next two notes... - It's only necessary to delete the
has_audio
,has_video
transients onadd_attachment
if the transient value is 0. For a site that adds a lot of audio or video, this can be a noticeable performance improvement, since that query would only have to be run the first time, and then only after an audio or video file is deleted. - It's only necessary to delete the
has_audio
,has_video
transients ondelete_attachment
if the transient value is 1. - It's only necessary to delete the
media_months
transient onadd_attachment
if the month and year of the attachment are not already included in the transient. Furthermore, if they aren't included, we could just add them to the transient instead of deleting it and forcing that query to run again.
@philipjohn: I'm happy to make these changes if you don't have time.
In general, I'm in favor of these being transients over using the object cache. I don't see a reason to run these queries on every single pageload of the new/edit post screen, regardless of the site's size or object cache preference.
#17
@
9 years ago
32264.7.patch makes a start on that @mboynes
It introduces four new functions:
media_has_audio()
is the new home for the transient setting and DB query
media_has_video()
as above but for videos
media_months()
as above but for the months query
wp_check_has_media()
hooks into add_attachment
and delete_attachment
, checking the current values of the relevant transient and deleting it where neccessary and thus causing the calls at wp-includes/media.php:2921-2922 to re-create the transient only when needed.
The transient deletion in wp-includes/post.php has been removed.
I'm not confident the new functions and add_action()
calls are really in the right place, so feedback welcome there.
The media_months
transient still needs to be deleted. I'm thinking an additional hook for that, rather than rolling into wp_check_has_media()
#18
@
9 years ago
I wonder if we should introduce the has_audio
and has_video
counters (in the options table) instead of using these potential slow SQL queries at all?
These counters could be updated when we upload or delete audio/video files.
This could of course introduce potential sync problems, but then we could adjust the options value if needed.
@wonderboymusic helped do some debugging on this issue. The problem is on a site with 1.7 million attachments and 0 audio or video attachments, this query has to do a ginormous table scan using a query that is not able to take advantage of existing indexes. His suggestion was to add an index to post_mime_type which reduces this query from taking several seconds, down to milliseconds.