Opened 8 years ago
Closed 5 years ago
#39768 closed defect (bug) (fixed)
Incorrect image returned with attachment_url_to_postid()
Reported by: | ben.greeley | Owned by: | SergeyBiryukov |
---|---|---|---|
Milestone: | 5.3.3 | Priority: | normal |
Severity: | normal | Version: | 4.7.2 |
Component: | Media | Keywords: | has-patch commit fixed-major |
Focuses: | Cc: |
Description
When two files are uploaded to the media library during the same month that have the same name and differ in capitalization, the function
attachment_url_to_postid()
can return the incorrect image.
For example, if two images were uploaded named untitled.jpg and Untitled.jpg, the SQL query that is being run in
attachment_url_to_postid()
isn't case-sensitive so it will use the first image that is returned, which may not be correct if the desired image is Untitled.jpg, but the image information that is returned is for untitled.jpg. The meta_value for each _wp_attached_file is saved as 2017/02/untitled.jpg and 2017/02/Untitled-1.jpg, so a query to match http://url.dev/wp-content/uploads/sites/9/2017/02/Untitled.jpg returns 2017/02/untitled.jpg, which is the incorrect image.
I created a suggested workaround, which grabs matches in case there was a duplicate filename and returns the closest case-sensitive result. https://gist.github.com/bengreeley/29a6edaca5c370091dd2cafc3369e400
Attachments (5)
Change History (23)
#2
@
8 years ago
@tristangemus thanks for the prompt response. The use case is one image was named Unknown.jpg that was uploaded to a site. Later, unknown.jpg was added to the media library, specifically these two images: http://www.lamag.com/wp-content/uploads/sites/9/2017/01/Unknown.jpg and http://www.lamag.com/wp-content/uploads/sites/9/2017/01/unknown.jpg . When the full image is embedded onto a post, we have a link like <img class="aligncenter wp-image-671724 size-full" src="http://www.lamag.com/wp-content/uploads/sites/9/2017/01/unknown.jpg" />
There is some code that runs attachment_url_to_postid( 'http://www.lamag.com/wp-content/uploads/sites/9/2017/01/unknown.jpg' ), but the image that it returns is a reference to http://www.lamag.com/wp-content/uploads/sites/9/2017/01/Unknown.jpg . It appears this is happening because the table already has one value of 2017/01/Unknown.jpg, which matches its SQL query trying to find 2017/01/unknown.jpg since mySQL queries aren't case-sensitive.
I'm able to reproduce in my local environment.
#3
@
8 years ago
Right but if you use Wordpress to insert the image, it should be referencing the image as Unknown-1.jpg. The only way I see this being an issue is if the image is manually being typed into the editor/codebase. In that case, you will get the incorrect image, but Wordpress is selecting the proper one based on intended logic.
If you are able to take a quick screen recording and upload it to Youtube, that would be helpful so we can see if this is a bug or not.
Thanks!
#4
follow-up:
↓ 6
@
5 years ago
I'm not sure how, but we're also running into a situation when two images named, say, Home.png
and home.png
, are uploaded days apart, and WP doesn't assign -1
to one of them.
The issue is then that attachment_url_to_postid()
runs a query without checking the case, 2 rows are returned from the db, and it always picks the first one, which is wrong.
A simple solution seems to be forcing the case check in this function's query using BINARY
https://stackoverflow.com/a/5629121/47680.
I'm really not sure why -1
isn't appended, and in my test just now it did, but we have several examples in the db where mixed case files exist in the same dir.
We've worked around it by duplicating and fixing attachment_url_to_postid() ourselves, but would love to see the fix in the core.
#6
in reply to:
↑ 4
;
follow-up:
↓ 7
@
5 years ago
- Milestone changed from Awaiting Review to 5.4
Replying to archon810:
I'm really not sure why
-1
isn't appended, and in my test just now it did
Yes, it appears to be fixed with the introduction of _wp_check_existing_file_names()
in 5.3.1, [46822] / #42437, so should not be an issue going forward.
I'm up for updating the query in attachment_url_to_postid()
to be case-sensitive and handle already existing images, if that does not reduce performance.
#7
in reply to:
↑ 6
;
follow-up:
↓ 8
@
5 years ago
Replying to SergeyBiryukov:
I'm up for updating the query in
attachment_url_to_postid()
to be case-sensitive and handle already existing images, if that does not reduce performance.
On a site with ~32000 attachments, seeing no noticeable difference in performance:
SELECT post_id FROM wp_postmeta WHERE meta_key = '_wp_attached_file' AND meta_value = '2009/03/myimage.jpg' Query took 0.0292 seconds. SELECT post_id FROM wp_postmeta WHERE meta_key = '_wp_attached_file' AND BINARY meta_value = '2009/03/myimage.jpg' Query took 0.0294 seconds
Both queries use the index:
id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
1 | SIMPLE | wp_postmeta | ref | meta_key | meta_key | 768 | const | 30557 | Using index condition; Using where |
Should be good to go.
#8
in reply to:
↑ 7
;
follow-up:
↓ 9
@
5 years ago
I'm sure you saw this https://stackoverflow.com/a/56283818, but just in case.
Replying to SergeyBiryukov:
Replying to SergeyBiryukov:
I'm up for updating the query in
attachment_url_to_postid()
to be case-sensitive and handle already existing images, if that does not reduce performance.
On a site with ~32000 attachments, seeing no noticeable difference in performance
SELECT post_id FROM wp_postmeta WHERE meta_key = '_wp_attached_file' AND meta_value = '2009/03/myimage.jpg' Query took 0.0292 seconds. SELECT post_id FROM wp_postmeta WHERE meta_key = '_wp_attached_file' AND BINARY meta_value = '2009/03/myimage.jpg' Query took 0.0294 secondsBoth queries use the index:
id select_type table type possible_keys key key_len ref rows Extra 1 SIMPLE wp_postmeta ref meta_key meta_key 768 const 30557 Using index condition; Using where Should be good to go.
#9
in reply to:
↑ 8
@
5 years ago
- Keywords has-patch added
Replying to vsamoletov:
I'm sure you saw this https://stackoverflow.com/a/56283818, but just in case.
I missed that, thanks for bringing it to my attention.
39768.diff uses the BINARY
approach.
It looks like that could introduce some unintended regressions when dealing with accented characters, or if the column charset doesn't match the session charset, e.g. latin1
vs. utf8mb4
.
Using CONVERT()
appears to be a better option. While utf8mb4
should be available on any modern install, that may not always be the case, so we have to check for it specifically, and fall back to the current query otherwise. This option also has a minor performance difference from the queries above in my testing:
SELECT post_id FROM wp_postmeta WHERE meta_key = '_wp_attached_file' AND meta_value = CONVERT( '2009/03/myimage.jpg' USING utf8mb4 ) COLLATE utf8mb4_bin Query took 0.0315 seconds
39768.2.diff uses the CONVERT()
approach.
Given that the main issue of wp_unique_filename()
being case-sensitive is fixed as of 5.3.1, I'm now a bit less convinced attachment_url_to_postid()
needs this change. That said, if 39768.2.diff looks sensible, I guess we could give it a go.
#10
follow-up:
↓ 11
@
5 years ago
@vsamoletov has a good point regarding the warning.
But @SergeyBiryukov please consider the fact that while new cases may be resolved in 5.3.1, existing cases are still in the database and should be resolved properly.
An alternate solution, and one we went with, is to get all the rows in PHP and if >1, go through them and do a case sensitive match in PHP.
#11
in reply to:
↑ 10
@
5 years ago
- Keywords commit added
Replying to archon810:
An alternate solution, and one we went with, is to get all the rows in PHP and if >1, go through them and do a case sensitive match in PHP.
Thanks, that actually seems the most straightforward option here, without any side effects that I can think of.
Implemented in 39768.3.diff.
#14
@
5 years ago
- Keywords fixed-major added
- Milestone changed from 5.4 to 5.3.3
- Resolution fixed deleted
- Status changed from closed to reopened
Reopening for 5.3.3 consideration.
#16
@
5 years ago
- Milestone changed from 5.3.3 to 5.4
Moving all unfixed tickets from 5.3.3 to milestone 5.4, as there is no plan for a 5.3.3 maintenance release for now.
Welcome to core @ben.greeley
From what I can see Wordpress is handling this correctly. If a duplicate is found using the same case or not, that image is appended with 1, 2, and so on. If you uploaded an image as untitled.jpg twice, then the second time around the URL you need to pass should have the be appended with -1.
What's the use case where you're passing the original image URL even though Wordpress changes it before uploading?