WordPress.org

Make WordPress Core

Opened 18 months ago

Last modified 2 months ago

#50855 accepted defect (bug)

sanitize_file_name function not working as expected if there is '%20' in filename

Reported by: dishitpala Owned by: audrasjb
Milestone: 6.0 Priority: normal
Severity: normal Version: 5.4.2
Component: Formatting Keywords: has-patch has-screenshots has-unit-tests dev-feedback
Focuses: Cc:

Description

We have added '%' as a special character in $special_char variable
We are also replacing '%20' with '-'

Here the sequence of str_replace was not appropriate

We need to replace '%20' with '-' before all the special character are replaced

Current behavior:

  • Filename Before: this%20is%20example.png
  • Filename after sanitization: this20is20example.png

Expected behavior:

  • Filename Before: this%20is%20example.png
  • Filename after sanitization: this-is-example.png

File reference: https://github.com/WordPress/WordPress/blob/master/wp-includes/formatting.php

Function name: sanitize_file_name

Attachments (6)

sanitize-file-name.patch (841 bytes) - added by dishitpala 18 months ago.
Capture d’écran 2021-04-26 à 21.17.56.png (296.2 KB) - added by audrasjb 9 months ago.
Before 50855.diff
Capture d’écran 2021-04-26 à 21.18.08.png (323.0 KB) - added by audrasjb 9 months ago.
After 50855.diff
50855.diff (646 bytes) - added by audrasjb 9 months ago.
Formatting: Reprioritization of the characters to remove from filenames in sanitize_file_name()
50855.2.diff (1.6 KB) - added by boblinthorst 7 months ago.
50855-before-and-after.png (246.1 KB) - added by hellofromTonya 2 months ago.
Test: before and after applying 50855.2.diff patch. Works as expected ✅

Download all attachments as: .zip

Change History (18)

#1 @audrasjb
9 months ago

  • Keywords needs-refresh added

It would be better to keep this line after the sanitize_file_name_chars hook.

@audrasjb
9 months ago

Formatting: Reprioritization of the characters to remove from filenames in sanitize_file_name()

#2 @audrasjb
9 months ago

  • Keywords has-screenshots added; needs-refresh removed
  • Milestone changed from Awaiting Review to 5.8
  • Owner set to audrasjb
  • Status changed from new to accepted

Moving for 5.8 consideration.
@SergeyBiryukov since you committed the last few changes we introduced in WP 5.5, any thoughts on this change?

#3 @audrasjb
8 months ago

Just re-tested my patch and it still applies cleanly against trunk. Marking this for commit.

#4 @audrasjb
8 months ago

  • Keywords commit added

#5 @desrosj
8 months ago

  • Keywords needs-unit-tests added; commit removed

I'd love to see a unit test or two added here to demonstrate the incorrect behavior before, and confirm the correct behavior after, and to prevent this from happening again going forward.

#6 @desrosj
8 months ago

  • Milestone changed from 5.8 to 5.9

I'm going to punt this one as 5.8 beta 1 is being packaged within the hour. I think with proper unit tests, this could potentially be considered for a 5.8.x minor.

#7 @boblinthorst
7 months ago

Fixed the existing unit tests. Worked on this with @rolfsiebers and @sresok.

@hellofromTonya
2 months ago

Test: before and after applying 50855.2.diff patch. Works as expected ✅

#8 @hellofromTonya
2 months ago

  • Keywords has-unit-tests added; needs-unit-tests removed

While it worked in testing that specific use case, the tests for wp_unique_filename() failed. Investigating.

#10 @hellofromTonya
2 months ago

  • Keywords dev-feedback added

[29290] #16630 added the logic to search for both %20 and + and then replace them with +.

But then [35122] #16226 added the % and + as special characters which replaces each with an empty string (strips them out).

Notice the timing of these changes: [29290] happened first and then [35122].

sanitize-file-name.patch 50855.diff 50855.2.diff move both the %20 and + to - replacers above the special characters replacers. That could be a breaking change. Why?

  • The + character: since [35122] (in 10/13/2015), the + character has been removed (i.e. replaced with an empty string). Moving it above the special characters means it no longer is stripped out but instead is replaced with a -.
  • The % character: since [29290] (in 07/24/2014), the % character was stripped leaving the 20. Notice Pippin identified that behavior here.

Given the history, I'm seeking more input on:

  • Should the + character be stripped (as special character) or replaced with a -? If special character, then is the + in this code needed?
    $filename = str_replace( array( '%20', '+' ), '-', $filename );
    
  • Is leaving 20 (stripping the %) by design? If no, do you see any issues replacing %20 with -?

#12 @hellofromTonya
2 months ago

  • Milestone changed from 5.9 to 6.0

With 5.9 Beta 1 tomorrow and this ticket needing more discussion and consensus, moving it to 6.0.

Note: See TracTickets for help on using tickets.