#63585 closed defect (bug) (duplicate)
Common MacOS unicode characters in filenames break
Reported by: | | Owned by: |
---|---|---|---
Milestone: | | Priority: | normal
Severity: | normal | Version: |
Component: | Media | Keywords: | needs-patch
Focuses: | | Cc: |
Description
When uploading "Screenshot 2025-06-17 at 4.43.59 AM.png", the default naming for a screenshot in MacOS, I got this error both times. When I renamed it to "spotlight-with-shortcut.png" it worked fine.
The URL in the media library was: https://ma.tt/files/2025/06/Screenshot-2025-06-17-at-5.20.16?PM.png
This obviously breaks because of the "?" character, which a URL parser treats as the start of the query string.
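To illustrate the breakage, here is a minimal sketch that runs the URL from the report through the standard `URL` parser available in browsers and Node.js. The variable names are only illustrative.

```javascript
// The media library URL quoted in the report.
const reported =
  "https://ma.tt/files/2025/06/Screenshot-2025-06-17-at-5.20.16?PM.png";
const parsed = new URL(reported);

// Everything after the first "?" is parsed as the query string,
// so the path no longer names the uploaded file.
console.log(parsed.pathname); // "/files/2025/06/Screenshot-2025-06-17-at-5.20.16"
console.log(parsed.search);   // "?PM.png"
```

The request therefore goes to a path without the "PM.png" suffix, which does not exist on disk.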
@zieladam debugged it:
That last space is a Unicode ‘NARROW NO-BREAK SPACE’ (U+202F) – see the 8239 towards the end of the list of codepoints:
> [..."Screenshot 2025-06-17 at 4.43.59 AM.png"].map(s=>s.codePointAt(0)) (39) [83, 99, 114, 101, 101, 110, 115, 104, 111, 116, 32, 50, 48, 50, 53, 45, 48, 54, 45, 49, 55, 32, 97, 116, 32, 52, 46, 52, 51, 46, 53, 57, 8239, 65, 77, 46, 112, 110, 103]
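The same check can be wrapped in a small helper that reports only the characters outside printable ASCII. This is a sketch: `findNonAscii` is a hypothetical name, not part of WordPress or Gutenberg, and the sample string spells the narrow no-break space as the escape `\u202F` so the invisible character is explicit.

```javascript
// Hypothetical helper: list every character outside printable ASCII.
function findNonAscii(filename) {
  const hits = [];
  let index = 0;
  for (const ch of filename) {
    const cp = ch.codePointAt(0);
    if (cp < 0x20 || cp > 0x7e) {
      const hex = cp.toString(16).toUpperCase().padStart(4, "0");
      hits.push({ index, codePoint: "U+" + hex });
    }
    index += ch.length; // advance by UTF-16 code units
  }
  return hits;
}

const sample = "Screenshot 2025-06-17 at 4.43.59\u202FAM.png";
const hits = findNonAscii(sample);
console.log(hits); // one hit: index 32, code point U+202F
```

Index 32 matches the position of 8239 in the code point list above.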
I personally would prefer we returned to something along the lines of:
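<?php
// No "g" modifier: preg_replace() already replaces every match.
$filename = remove_accents( $filename );
$filename = preg_replace( '/[^a-z0-9 -]/', '', $filename );
$filename = preg_replace( '/\s+/', '-', $filename );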
I don't think there is any huge downside to being very conservative with uploaded filenames, and it also probably helps with portability if a site is migrating between different filesystems or OSes. I'm a fan of being ultra-paranoid any time we write user-supplied data to the filesystem.
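For the client-side upload path, a conservative approach along these lines could be sketched as follows. This is an assumption-laden illustration, not WordPress or Gutenberg code: `conservativeFileName` is a hypothetical helper, Unicode NFD normalization is only a rough stand-in for `remove_accents()`, and unlike the PHP snippet above it keeps uppercase letters and dots, preserves the extension, and turns disallowed characters into hyphens rather than deleting them.

```javascript
// Hypothetical helper: reduce a filename to a conservative ASCII subset.
function conservativeFileName(filename) {
  const dot = filename.lastIndexOf(".");
  const hasExt = dot > 0;
  const base = hasExt ? filename.slice(0, dot) : filename;
  const ext = hasExt ? filename.slice(dot + 1) : "";

  const clean = (part) =>
    part
      .normalize("NFD")                  // split accented letters into base + mark
      .replace(/[\u0300-\u036f]/g, "")   // drop the combining marks
      .replace(/[^A-Za-z0-9.-]+/g, "-")  // anything else (incl. U+202F) becomes "-"
      .replace(/-{2,}/g, "-")            // collapse runs of hyphens
      .replace(/^-+|-+$/g, "");          // trim leading/trailing hyphens

  const safeBase = clean(base) || "file"; // assumed fallback for an empty name
  const safeExt = clean(ext);
  return safeExt ? safeBase + "." + safeExt : safeBase;
}

console.log(conservativeFileName("Screenshot 2025-06-17 at 4.43.59\u202FAM.png"));
// "Screenshot-2025-06-17-at-4.43.59-AM.png"
console.log(conservativeFileName("café menu.png"));
// "cafe-menu.png"
```

Letters with no canonical decomposition, such as æ and ø, are not transliterated by this sketch; they are simply replaced, which is one reason a real implementation needs more than a single regex.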
Change History (8)
#3
@
4 weeks ago
This ticket is a duplicate of #62995, and it's probably a good opportunity to garden a bit and close out #39791 and #30495. I also want to reference #15955, which has a lot of prior art and seems to be one of the places where we decided to try to support Unicode in filenames. The simple regex above isn't going to cut it! I think there is some PHP version and library weirdness.
The compatibility issue is well raised by @compute: "This is really a problem in terms of sharing images on Facebook as they do not accept æøå in their filename."
#4
@
3 weeks ago
- Milestone Awaiting Review deleted
- Resolution set to duplicate
- Status changed from new to closed
Duplicate of #62995.
#5
@
9 days ago
Noting that I was only able to reproduce this with the media experiments plugin or client side media processing enabled in the Gutenberg plugin. Core seems to correctly handle these images now - on a vanilla install of trunk with no plugins, I was able to upload my screenshot to both the editor and the media library without issue.
#6
@
9 days ago
I added an issue upstream: https://github.com/swissspidy/media-experiments/issues/969
Hello and thanks for the report,
It looks like the Narrow No-Break Space character (U+202F) is handled in sanitize_title_with_dashes(): https://github.com/WordPress/wordpress-develop/blob/6.8/src/wp-includes/formatting.php#L2360
Since this function is used by sanitize_file_name(), I would assume that this character should be handled just fine when uploading media, but that does not seem to be the case. Maybe it's related to the conditional that uses seems_utf8() to determine whether sanitize_title_with_dashes() should be applied or not.