WordPress.org

Make WordPress Core

Opened 3 years ago

Last modified 10 months ago

#30495 new defect (bug)

Unicode character U+000B is not removed by sanitize_file_name

Reported by: Craxic
Owned by:
Milestone: Awaiting Review
Priority: normal
Severity: normal
Version: 4.0.1
Component: Formatting
Keywords: has-patch has-unit-tests
Focuses:
Cc:

Description

It seems that the following expression is true:

json_encode(sanitize_file_name(json_decode('"\u000B"'))) === '"\u000b"'

On Google App Engine, for example, a file name with a U+000B character cannot be saved.

Since the description of the function states:

Removes special characters that are illegal in filenames on certain
operating systems and special characters requiring special escaping
to manipulate at the command line.

... then I think this is a bug.
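The JSON round trip used in the expression above can be checked without WordPress loaded. This is only a minimal sketch in plain PHP: it does not call sanitize_file_name, it just shows which character is involved (U+000B, LINE TABULATION, also called vertical tab) and what json_encode produces for it.

```php
<?php
// json_decode turns the JSON escape \u000B into the single byte 0x0B.
$decoded = json_decode('"\u000B"');

var_dump(strlen($decoded)); // int(1)
var_dump(ord($decoded));    // int(11), i.e. 0x0B

// json_encode escapes the control character again (lowercase hex)
// and wraps the result in double quotes.
$encoded = json_encode($decoded);

var_dump($encoded === '"\u000b"'); // bool(true)
```

If sanitize_file_name returns the character unchanged, encoding its result gives the same string as $encoded here.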

Thanks!

Attachments (1)

line-tabulation-sanitization.30495.diff (1.7 KB) - added by sanchothefat 2 years ago.
Adds a step to remove all control characters in the 1-31 range, then mops up white space. Has a unit test.


Change History (5)

This ticket was mentioned in Slack in #core by jorbin.
2 years ago

#2 @jorbin
2 years ago

  • Component changed from Filesystem API to Formatting
  • Keywords needs-patch needs-unit-tests added

There might be a benefit in a check inside sanitize_file_name to remove everything that matches [:space:]. This is going to need both unit tests and a patch to move forward.
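A rough sketch of what such a check could look like, taken literally. The function name example_strip_space_class is hypothetical and is not part of WordPress; the actual change to sanitize_file_name might replace white space rather than delete it. In PCRE, the POSIX class [:space:] covers space, tab, line feed, vertical tab (U+000B), form feed and carriage return.

```php
<?php
// Hypothetical helper, not part of WordPress core.
function example_strip_space_class($filename) {
    // Delete every run of characters in the POSIX [:space:] class,
    // which includes the vertical tab (0x0B) from this ticket.
    return preg_replace('/[[:space:]]+/', '', $filename);
}

var_dump(example_strip_space_class("my\x0Bfile name.txt"));
```

Note that this also removes ordinary spaces, so it would have to run at a point where that is acceptable.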

@sanchothefat
2 years ago

Adds a step to remove all control characters in the 1-31 range, then mops up white space. Has a unit test.
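For readers without the diff at hand, the described step might look roughly like the following. This is an illustrative guess, not the contents of the attached patch: the function name example_strip_control_chars is hypothetical, and "mops up white space" is assumed here to mean collapsing runs of spaces and trimming the ends.

```php
<?php
// Hypothetical stand-in for the step described above; the attached diff may differ.
function example_strip_control_chars($filename) {
    // Remove control characters 0x01 to 0x1F (1-31), which includes 0x0B.
    $filename = preg_replace('/[\x01-\x1F]/', '', $filename);
    // Assumed meaning of "mop up": collapse runs of spaces, then trim the ends.
    $filename = preg_replace('/ +/', ' ', $filename);
    return trim($filename);
}

var_dump(example_strip_control_chars(" a\x0B  b "));
```

The 1-31 range also contains tab, line feed and carriage return, so those are removed by the first step as well.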

#3 @sanchothefat
2 years ago

  • Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed

#4 @mgutt
10 months ago

Regarding "then mops up white space", take a look at this changeset:
https://core.trac.wordpress.org/changeset/29715

Last edited 10 months ago by mgutt