Make WordPress Core

Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#48451 closed defect (bug) (worksforme)

Regression: wp_update_attachment_metadata filter now fires very often and without complete metadata

Reported by: ianmjones's profile ianmjones Owned by:
Milestone: Priority: normal
Severity: normal Version: 5.3
Component: Media Keywords: has-patch needs-testing close
Focuses: Cc:

Description

WP5.3 has changed how often the wp_update_attachment_metadata filter is fired when an image is uploaded to the Media Library.

It fires initially with just the full (and potentially original_image) details but no sizes, and then again as each image sub-size is generated and added to the sizes array.

With wp_update_attachment_metadata now firing a minimum of 5 times (10 for "large" files) instead of once, I predict quite a resource hit from plugins processing the filter. There's plenty of image related plugins that hang off that filter to trigger optimizations and other media related processes assuming that the upload is now complete and thumbnails available.

We regularly see customers with 25-40 or more registered image sizes, that's going to mean rather a lot of database calls to update the metadata as each sub-size is generated, followed by plugins processing the image(s) each time.

Maybe the solution is for wp_update_attachment_metadata() to gain an $unfiltered param that is set to true during initial upload and sub-size generation, with a final call to the function without $unfiltered when done and dusted?

A common pattern throughout WP5.2 (and still in WP5.3) is as follows...

wp_update_attachment_metadata( $id, wp_generate_attachment_metadata( $id, $file ) );

Such as in \media_handle_upload() and friends.

So that's why a lot of plugins wait for the complete metadata to be generated (wp_generate_attachment_metadata) and then saved (wp_update_attachment_metadata) before handling the new upload.

Attachments (1)

48451.diff (3.1 KB) - added by azaozz 5 years ago.

Download all attachments as: .zip

Change History (17)

#1 follow-up: @azaozz
5 years ago

  • Keywords needs-patch 2nd-opinion added
  • Milestone changed from Awaiting Review to 5.3

Moving to 5.3 for consideration.

At first look:

A common pattern throughout WP5.2 (and still in WP5.3) is as follows...
wp_update_attachment_metadata( $id, wp_generate_attachment_metadata( $id, $file ) );

Right, but wp_update_attachment_metadata() is also used in quite a few other cases. For example when editing an attachment post, in wp_ajax_crop_image(), to update the ID3 data for audio files in wp_ajax_save_attachment(), when editing an image (crop, rotate, ...), for custom background and header images, etc.

Seems using the same filter to detect when images have been uploaded successfully is problematic?

Until now the wp_generate_attachment_metadata filter was the (de facto) "upload is successful" hook. This is still the case in n 5.3, however when the initial upload request fails and the client makes another request to finish image post-processing wp_generate_attachment_metadata is not fired. There is a proposed "action" in (the new) wp_update_image_subsizes(), see https://core.trac.wordpress.org/attachment/ticket/47872/47872.3.diff.

Thinking that adding an $unfiltered param to wp_update_attachment_metadata() and avoid filtering would be wrong. Other plugins may be using that filter for other purposes, for example moving/updating/fixing the paths to the image sub-sizes.

It's also possible that the image attachment meta data is stored temporarily "somewhere" while image sub-sizes are being created (another post meta?), then "moved" to the proper place. At first look not liking much that workaround.

Any other ideas and/or possible fixes? :)

#2 @azaozz
5 years ago

Looking again at the proposed wp_update_image_subsizes action in https://core.trac.wordpress.org/attachment/ticket/47872/47872.3.diff.

Think the problem reported here is related and would be better to handle that on this ticket. Both are about firing a hook after an image has been uploaded and all sub-sizes were created successfully.

Perhaps it would be better to fire wp_generate_attachment_metadata and then save the meta again with wp_update_attachment_metadata() at the end of wp_update_image_subsizes()?

#3 in reply to: ↑ 1 @ianmjones
5 years ago

Replying to azaozz:

Right, but wp_update_attachment_metadata() is also used in quite a few other cases. For example when editing an attachment post, in wp_ajax_crop_image(), to update the ID3 data for audio files in wp_ajax_save_attachment(), when editing an image (crop, rotate, ...), for custom background and header images, etc.

Seems using the same filter to detect when images have been uploaded successfully is problematic?

I should clarify the problem here, it's not so much that the wp_update_attachment_metadata is used for knowing when an upload is complete, it is more used to know when the image and its sub-sizes have potentially changed.

So initially yes, on a new upload we (WP Offload Media) use it as a signal that an image has been uploaded and all of its thumbnails are present and correct and ready to be sent on their merry way to a storage provider.

However, if thumbnails are regenerated, or an image optimizer plugin batch job processes the image and updates the thumbnails, they too generally call wp_update_attachment_metadata when they are done with the Media Library item. And again, that means plugins like WP Offload Media can re-offload the images to S3/GCS/DO Spaces etc.

The problem with the new setup is that it fires wp_update_attachment_metadata a lot, and with partial metadata (e.g. no sizes array initially). Which is very different to how it has been.

This ticket was mentioned in Slack in #core by azaozz. View the logs.


5 years ago

#5 @ianmjones
5 years ago

  • Summary changed from Regression: wp_update_attachment_metadata filter now fires very often and without compete metadata to Regression: wp_update_attachment_metadata filter now fires very often and without complete metadata

@azaozz
5 years ago

#6 @azaozz
5 years ago

  • Keywords has-patch needs-testing added; needs-patch removed

Went through several possible options on how to fix this. It's true that using wp_update_attachment_metadata as a hook that means "file uploaded successfully" is somewhat non-standard. It is intended for filtering the actual data before saving it in the DB.

However it seems several plugins use it to continue post-processing after a file was uploaded. A better hook for that perhaps would be wp_generate_attachment_metadata. At the same time it's true there is a "change in behaviour" for the wp_update_attachment_metadata filter. As the attachment metadata is updated after each image sub-size is generated, now it runs several times in a row.

Seems the best way to fix that new behaviour would be to not run wp_update_attachment_metadata() for interim updates of the meta, only run it at the end after all post-processing has been done successfully. This change will also prevent firing the wp_update_attachment_metadata filter in cases where the post-processing of an image fails after all retry attempts and the upload is deemed "unsuccessful" and the attachment is deleted.

In 48451.diff:

  • Do not use wp_update_attachment_metadata() for interim updates of the data while an image is being post-processed after upload.
  • Introduce wp_interim_update_attachment_metadata filter for these interim updates.

This ticket was mentioned in Slack in #core-media by mike. View the logs.


5 years ago

#8 @joemcgill
5 years ago

After looking through the changes proposed here, I think that @azaozz's initial suggestion in comment #1 is the right approach.

Right now, wp_update_attachment_metadata() is used everywhere in Core where an attachment's metadata is being updated—and is the recommended function that should be used by plugins when updating attachment post meta for any number of reasons. People are able to reliably hook into the wp_update_attachment_metadata filter to modify attachment post metadata before it is saved to the database. With the proposed change in 48451.diff, we would be breaking that assumption and introducing complexity which means people who need to be able to detect and modify data before it's written to the database would now have to listen to multiple different filters depending on why attachment metadata is being updated. This additional complexity makes the system more brittle and introduces lots of potential for inconsistencies where bugs can/will occur.

For example, imagine that some plugin in the future decided to call wp_create_image_subsizes() for some reason. WordPress would update the post meta for an attachment, but an extension hooked to wp_update_attachment_metadata to move files elsewhere would never be notified.

I'd prefer that we keep the system consistent and we always call wp_update_attachment_metadata() when we're updating attachment data in the database for any reason and ensure we have hooks in place for people who need more efficient ways of knowing when subsize generation is completed or files have changed.

#9 follow-up: @johnbillion
5 years ago

  • Keywords close added; 2nd-opinion removed

Chatted with Joe about this at WCUS and we're in agreement that it's unsafe for an extension to assume that the image sizes array passed to this filter is complete, especially as it's called in different places in core for many reasons.

In addition to the points above, the change in the patch has potential to result in a situation where image size meta data has partially been generated and stored, but the final call to wp_update_attachment_metadata() never gets called due to memory allocation issue, which would result in an attachment with metadata that's made available but has never made its way through the wp_update_attachment_metadata filter. This is also the case in the interim while image sizes are being generated.

If an extension uses the wp_update_attachment_metadata filter to offload media elsewhere, it should keep track of which files of which sizes have been offloaded to avoid duplicate processing, regardless of the incremental image size generation changes that have gone into 5.3.

Recommending for close.

Version 0, edited 5 years ago by johnbillion (next)

#10 in reply to: ↑ 9 @ianmjones
5 years ago

Replying to johnbillion:

If an extension uses the wp_update_attachment_metadata filter to offload media elsewhere, it should keep track of which files of which sizes have been offloaded to avoid duplicate processing, regardless of the incremental image size generation changes that have gone into 5.3.

When I first discovered this change in behaviour a couple of weeks ago, I developed a solution for the next version of our plugin that works very similar to this. With any luck we'll have it ready for when WP5.3 is released.

Hopefully other plugin developers that relied on the age-old way wp_update_attachment_metadata has been used have caught this change too and have new versions in the works.

However, the tens, if not hundreds of thousands of sites that are going to be using plugins like ours that upgrade to WP5.3 but don't upgrade their plugins for whatever reason are going to feel some pain.

Recommending for close.

Makes sense if there is a chance that clearer indicators of the stages of upload might be added via the other ticket.

#11 @azaozz
5 years ago

Makes sense if there is a chance that clearer indicators of the stages of upload might be added via the other ticket.

The wp_generate_attachment_metadata filter now runs after an uploaded image has been post-processed, see #48472.

A possible alternative would be to pass more context to wp_update_attachment_metadata. It won't prevent it from running but may be useful when a plugin wants to avoid filtering on interim updates.

To be on the "safe side" thinking we should still keep this open for 1-2 days, hopefully there will be more feedback after https://make.wordpress.org/core/2019/11/05/use-of-the-wp_update_attachment_metadata-filter-as-upload-is-complete-hook/.

This ticket was mentioned in Slack in #core by david.baumwald. View the logs.


5 years ago

#13 follow-up: @vanyukov
5 years ago

So at the end, which is the recommended way to track images uploads? Use wp_update_attachment_metadata and track what sizes have been processed or hook into wp_generate_attachment_metadata and process all the image sizes after the upload finishes?

#14 in reply to: ↑ 13 @azaozz
5 years ago

Replying to vanyukov:

which is the recommended way to track images uploads?

The wp_generate_attachment_metadata filter was fixed in #48472 to always run at the end of images post-processing after upload. In addition it won't run if an image was uploaded but post-processing failed and the new attachment and the original image were deleted on the last retry request. It is the best hook to use when additional post-processing is needed after a successful upload.

This ticket was mentioned in Slack in #core-media by mike. View the logs.


5 years ago

#16 @azaozz
5 years ago

  • Milestone 5.3 deleted
  • Resolution set to worksforme
  • Status changed from new to closed

Based on the comments/reviews here and the feedback on https://make.wordpress.org/core/2019/11/05/use-of-the-wp_update_attachment_metadata-filter-as-upload-is-complete-hook/ thinking this shouldn't change for 5.3.

Closing as worksforme for now. Please open new ticket(s) with ideas/suggestions/patches on how to improve it in the future.

Last edited 5 years ago by azaozz (previous) (diff)
Note: See TracTickets for help on using tickets.