Make WordPress Core

Opened 7 years ago

Last modified 3 years ago

#41816 new defect (bug)

attachment_url_to_postid() should find post_id for URLs of intermediate size images

Reported by: pbiron's profile pbiron Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 4.9
Component: Media Keywords: has-patch dev-feedback 2nd-opinion has-unit-tests
Focuses: Cc:

Description

attachment_url_to_postid() fails to find the post ID when given a URL of an intermediate sized image.

For example,

$url_of_full_sized_image = 'http://example.com/wp-content/uploads/test.jpg';
$attachment_id = attachment_url_to_postid( $url_of_full_sized_image );
// $attachment_id now equals 3

$url_of_intermediate_sized_image = 'http://example.com/wp-content/uploads/test-150x150.jpg';
$attachment_id = attachment_url_to_postid( $url_of_intermediate_sized_image );
// $attachment_id now equals 0 but should equal 3, the same as for the full-sized image

This should be fixed.

Attachments (2)

41816.diff (1.9 KB) - added by pbiron 7 years ago.
41816-unit-tests.diff (4.6 KB) - added by pbiron 7 years ago.

Download all attachments as: .zip

Change History (12)

@pbiron
7 years ago

#1 @pbiron
7 years ago

  • Keywords has-patch dev-feedback 2nd-opinion added

The patch is based on the code in this post on WP Scholar, which is based on this wordpress.stackexchange answer by rarst.

#2 @pbiron
7 years ago

  • Keywords has-unit-tests added

In addition to adding new unit tests, I had to change the existing unit tests for attachment_url_to_postid()...which is always a sketchy proposition, so let me explain why I did this.

The old unit tests used WP_UnitTest_Factory->attachment->create_object(), which does not actually perform an "upload"...and hence does not generate the _wp_attachment_metadata postmeta that the mod relies on.

Hence, I changed the old tests to use WP_UnitTest_Factory->attachment->create_upload_object() which actually performs an "upload".

Because of this change in the old unit tests, I would really appreciate close scrutiny of both the patch and the unit tests!!

#3 @pbiron
7 years ago

Another thing I'd appreciate feedback on is the heuristic nature of the mod.

Notice that, when the attachment is an image, the query does a _wp_attachment_metadata LIKE xxx. Does anyone have any ideas for how to make the query any more "exact"?

#4 @pbiron
7 years ago

Yet another thing I'd appreciate feedback on is the following.

Suppose the thumbnail size is 150x150, an image is uploaded and the thumbnail sized image is added to a post, e.g., the post content looks like:

blah, blah, blah

<img src='http://example.com/wp-content/uploads/test-150x150.jpg' />

yada, yada, yada

Then at a later time, the thumbnail size is changed to 100x100 and attachment metadata is updated for all existing images. Even tho the image in that post will still display correctly,

$attachment_id = attachment_url_to_postid( 'http://example.com/wp-content/uploads/test-150x150.jpg' );

will not find it's post ID (since _wp_attachment_metadata['sizes'] will no longer contain test-150x150.jpg) and it would be good if it could.

The only ways I can think to solve this particular problem are very computationally expensive and/or prone to false positives. Can anyone think an air-tight solution that is efficient?

Note

This particular question arises from my attempt to fix what I consider a bug in the exporter (see Exporter Redux: Issue 3). An export with post_type = post will not contain the attachment and, hence, when imported into another site the image in that post will still point to the exporting site instead of being rewritten to point to a newly imported attachment.

To address that exporter bug, I've written code that extracts attachment URLs from posts being exported, gets the attachment post ID using attachment_url_to_postid() (actually, until the patch in this ticket is accepted/merged, it uses a differently named function whose implementation is the same as in the patch) and adds those attachment posts to the export, but because of this problem that code is missing attachments such as the above.

Granted, even if this problem is solved, there is still a chance the intermediate size in question is not available on the importing site, and hence, the image in the imported post wouldn't display. But, that particular problem is relatively easy to fix by having the exporter add "metadata" to the WXR file for such intermediate image sizes which the importer could then add (using add_image_size()) temporarily during the import.

This ticket was mentioned in Slack in #core-media by antpb. View the logs.


6 years ago

#6 @pbiron
6 years ago

I've abandoned the approach in 41816.diff (the "hueristic" isn't reliable).

Instead, a while ago I came up with an "exact" algorithm that I implemented as a function that can be hooked to atthement_url_to_postid in a gist.

However, as mentioned in that gist, there are several cases where the _wp_attachment_metadata postmeta it relies on can be incorrect and until those are fixed I don't think it is worth pursuing converting that algorithm into a patch.

Related: #44127, #42437, #44095

#7 follow-up: @leodandesign
6 years ago

I've implemented a very simple solution that uses preg_replace to remove the dimension element of the url and then run the attachment_url_to_postid() function.

<?php
$image = preg_replace('/-\d+[Xx]\d+\./', '.', $thumbnail );
$image_id = attachment_url_to_postid($image);

It's not elegant, but effective. Couldn't this be dropped into attachment_url_to_postid() function to get the url using a simple pattern instead of making another database lookup.

In the real world, I am using this to pull through a full size image url if the image is part of the $post->post_content for a Schema Json-LD object.

I've also used it in the past as part of function to filter the img src to replace it with the smaller img of the same ratios, for a front end javascript function to dynamically calculate the correct srcset img to get based on image width on-screen.

Again, not elegant, but works.

Last edited 6 years ago by leodandesign (previous) (diff)

#8 in reply to: ↑ 7 @pbiron
6 years ago

Replying to leodandesign:

I've implemented a very simple solution that uses preg_replace to remove the dimension element of the url and then run the attachment_url_to_postid() function.

Unfortunately, that approach can fail (badly), see #42437.

#9 @celloexpressions
4 years ago

#51058 is a slightly different result of this same problem. attachment_url_to_postid() should be able to identify any WordPress-generated files that are related back to the same attachment post/file. Or, there should be a separate function that serves this purpose.

There are several situations where plugins or themes are only able to access a single attachment-related URL (ex. one scaled image size), and they need to find the attachment ID to access other attachment data.

#10 @jdorner
3 years ago

I've submitted a patch to https://core.trac.wordpress.org/ticket/51058 that addresses this issue.

Note: See TracTickets for help on using tickets.