Opened 3 years ago
Last modified 2 months ago
#55274 reviewing defect (bug)
Special chars in attachment filename breaks srcset
Reported by: | dravnic | Owned by: | audrasjb |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | 4.4 |
Component: | Media | Keywords: | |
Focuses: | Cc: |
Description
There is currently fix that handles spaces in filenames, but it doesn't handle other special characters e.g. $! or any other special char
Spaces issue was reported there:
https://core.trac.wordpress.org/ticket/36549
And fixed here:
https://core.trac.wordpress.org/changeset/38052
Attachments (5)
Change History (14)
#1
@
3 years ago
- Keywords has-patch has-unit-tests added
- Milestone changed from Awaiting Review to 6.0
- Owner set to audrasjb
- Status changed from new to reviewing
#3
@
3 years ago
Please consider patches 55274.3.diff and 55274.2.2.diff (don't know how to remove 55274.diff and 55274.2.diff)
rawurlencode is better to encode path names than urlencode. e.g. spaces will be converted to "%20" instead "+". As well updated unit test as there was a typo.
#4
@
3 years ago
Fixed another typo in the unit test. Sorry for multiple patches :)
Final patches are:
55274.3.diff
55274.2.3.diff
#5
@
3 years ago
I was reviewing this discussion: #36549
The patches proposed here are fixing url special chars before the filter. This could break filter behavior if filter function expects urls that aren't encoded or if the filter itself changes urls and introduces special chars.
The fix proposed by @joemcgill https://core.trac.wordpress.org/ticket/36549#comment:6 (https://core.trac.wordpress.org/attachment/ticket/36549/36549.2.diff) seems better as it would escape url after the filter.
Note that esc_attr()
that was mentioned in the above discussion didn't properly encode special chars for urls. so esc_url()
should be done before.
#6
@
3 years ago
I've tried using esc_url()
like in https://core.trac.wordpress.org/attachment/ticket/36549/36549.2.diff but it didn't help.
The image filename that I used to test this is smeće-1024x614.png
the special char in this filename is ć
- a Croatian letter. In the src attribute path is properly encoded sme%C4%87e-1024x614.png
}
With current code and with esc_url()
the image was encoded in srcset as smeće-1024x614.png
which is an invalid url.
When patched with https://core.trac.wordpress.org/attachment/ticket/55274/55274.3.diff it was encoded as sme%C4%87e-1024x614.png
which is ok.
So rawurlencode()
seems a better option, but it cannot be used after filter because then it would encode entire url not only filename. Maybe to extract filename from the full path and then encode it and append back on its path. Although directory name can as well contain special chars. To cover these cases it work require a bit more logic.
It would be great if we could encode before the filter and then leave it to filter to handle its urls properly.
#8
@
3 years ago
- Milestone changed from 6.0 to Future Release
It seems the various patches on this ticket need consolidation and a little more work so I am moving this of the 6.0 milestone.
Ideally, I think the tests would be separate and in addition to the spaces test. It would also be good to test some accents and non-latin characters as indicated in comment 6.
#9
@
2 months ago
Currently investigating an issue that may be related to this bug. MacOS generated screenshots have auto generated filenames that include spaces, but not all are stripped out by WP core:
Example file uploaded via media libary:
https://redacted.com/wp-content/uploads/2022/09/Screenshot-2022-09-19-at-12.09.42%E2%80%AFPM.jpg
It seems to miss that last space in there even though the original filename contains many spaces.
<meta property="og:image" content="https://redacted.com/wp-content/uploads/2022/09/Screenshot-2024-09-19- at-12.09.42 PM.jpg"/>
This renders as a meta tag, Facebook debugger reports:
Corrupted Image Provided og:image URL, https://redacted.com/wp-content/uploads/2022/09/Screenshot-2022-09-19-at-12.09.42 PM.jpg could not be processed as an image. It may be corrupted or may have an invalid format.
Potentially related thread:
https://wordpress.org/support/topic/spaces-in-filenames-leads-to-unloaded-image-invalid-error/
Hello @dravnic, Welcome to WordPress Core Trac and thank you for opening this ticket,
The proposal looks good at a glance. Also, thank you for updating unit tests.
Moving for 6.0 consideration.