Make WordPress Core

Opened 3 years ago

Last modified 2 months ago

#55274 reviewing defect (bug)

Special chars in attachment filename breaks srcset

Reported by: dravnic's profile dravnic Owned by: audrasjb's profile audrasjb
Milestone: Future Release Priority: normal
Severity: normal Version: 4.4
Component: Media Keywords:
Focuses: Cc:

Description

There is currently fix that handles spaces in filenames, but it doesn't handle other special characters e.g. $! or any other special char

Spaces issue was reported there:
https://core.trac.wordpress.org/ticket/36549

And fixed here:
https://core.trac.wordpress.org/changeset/38052

Attachments (5)

55274.diff (1.0 KB) - added by dravnic 3 years ago.
55274.2.diff (2.2 KB) - added by dravnic 3 years ago.
55274.3.diff (1.0 KB) - added by dravnic 3 years ago.
55274.2.2.diff (2.2 KB) - added by dravnic 3 years ago.
55274.2.3.diff (2.3 KB) - added by dravnic 3 years ago.

Download all attachments as: .zip

Change History (14)

@dravnic
3 years ago

@dravnic
3 years ago

#1 @audrasjb
3 years ago

  • Keywords has-patch has-unit-tests added
  • Milestone changed from Awaiting Review to 6.0
  • Owner set to audrasjb
  • Status changed from new to reviewing

Hello @dravnic, Welcome to WordPress Core Trac and thank you for opening this ticket,

The proposal looks good at a glance. Also, thank you for updating unit tests.
Moving for 6.0 consideration.

#2 @dravnic
3 years ago

  • Keywords has-patch has-unit-tests removed

Thanks @audrasjb, glad to contribute!

@dravnic
3 years ago

@dravnic
3 years ago

#3 @dravnic
3 years ago

Please consider patches 55274.3.diff and 55274.2.2.diff (don't know how to remove 55274.diff and 55274.2.diff)

rawurlencode is better to encode path names than urlencode. e.g. spaces will be converted to "%20" instead "+". As well updated unit test as there was a typo.

@dravnic
3 years ago

#4 @dravnic
3 years ago

Fixed another typo in the unit test. Sorry for multiple patches :)

Final patches are:
55274.3.diff
55274.2.3.diff

#5 @dravnic
3 years ago

I was reviewing this discussion: #36549

The patches proposed here are fixing url special chars before the filter. This could break filter behavior if filter function expects urls that aren't encoded or if the filter itself changes urls and introduces special chars.

The fix proposed by @joemcgill https://core.trac.wordpress.org/ticket/36549#comment:6 (https://core.trac.wordpress.org/attachment/ticket/36549/36549.2.diff) seems better as it would escape url after the filter.

Note that esc_attr() that was mentioned in the above discussion didn't properly encode special chars for urls. so esc_url() should be done before.

Last edited 3 years ago by SergeyBiryukov (previous) (diff)

#6 @dravnic
3 years ago

I've tried using esc_url() like in https://core.trac.wordpress.org/attachment/ticket/36549/36549.2.diff but it didn't help.

The image filename that I used to test this is smeće-1024x614.png the special char in this filename is ć - a Croatian letter. In the src attribute path is properly encoded sme%C4%87e-1024x614.png}

With current code and with esc_url() the image was encoded in srcset as smeće-1024x614.png which is an invalid url.

When patched with https://core.trac.wordpress.org/attachment/ticket/55274/55274.3.diff it was encoded as sme%C4%87e-1024x614.png which is ok.

So rawurlencode() seems a better option, but it cannot be used after filter because then it would encode entire url not only filename. Maybe to extract filename from the full path and then encode it and append back on its path. Although directory name can as well contain special chars. To cover these cases it work require a bit more logic.

It would be great if we could encode before the filter and then leave it to filter to handle its urls properly.

Version 0, edited 3 years ago by dravnic (next)

#7 @costdev
3 years ago

  • Version set to 4.4

#8 @peterwilsoncc
3 years ago

  • Milestone changed from 6.0 to Future Release

It seems the various patches on this ticket need consolidation and a little more work so I am moving this of the 6.0 milestone.

Ideally, I think the tests would be separate and in addition to the spaces test. It would also be good to test some accents and non-latin characters as indicated in comment 6.

#9 @bozzmedia
2 months ago

Currently investigating an issue that may be related to this bug. MacOS generated screenshots have auto generated filenames that include spaces, but not all are stripped out by WP core:

Example file uploaded via media libary:
https://redacted.com/wp-content/uploads/2022/09/Screenshot-2022-09-19-at-12.09.42%E2%80%AFPM.jpg

It seems to miss that last space in there even though the original filename contains many spaces.

<meta property="og:image" content="https://redacted.com/wp-content/uploads/2022/09/Screenshot-2024-09-19-
at-12.09.42 PM.jpg"/>

This renders as a meta tag, Facebook debugger reports:

Corrupted Image
Provided og:image URL, https://redacted.com/wp-content/uploads/2022/09/Screenshot-2022-09-19-at-12.09.42 PM.jpg could not be processed as an image. It may be corrupted or may have an invalid format.

Potentially related thread:
https://wordpress.org/support/topic/spaces-in-filenames-leads-to-unloaded-image-invalid-error/

Note: See TracTickets for help on using tickets.