Make WordPress Core

Opened 2 years ago

Last modified 3 months ago

#60864 new defect (bug)

URL sanitizing strips valid characters instead of encoding, documented use is invalid

Reported by: kkmuffme's profile kkmuffme Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: Security Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

  • esc_url and sanitize_url strip characters that don't need to be stripped but can be HTML encoded to make them safe, e.g. "<" causing some URLs to be broken.

Change History (5)

#1 @kkmuffme
2 years ago

#56160 was marked as a duplicate.

This ticket was mentioned in PR #6335 on WordPress/wordpress-develop by @kkmuffme.


2 years ago
#2

  • Keywords has-patch has-unit-tests added

#3 @kkmuffme
2 years ago

Before I fix all the broken tests, I want to gather some feedback on this - the tests are broken because the characters were stripped instead of encoded, which caused some URLs to be invalid, leading to 404s.

This ticket was mentioned in Slack in #core by kkmuffme. View the logs.


2 years ago

#5 @dmsnell
3 months ago

@kkmuffme looks like I missed wp_sanitize_redirect() when we updated UTF-8 support in WordPress 6.9.

Perhaps you might look that over and see if you want to update the patch.

I appreciate the comment about HTML4 but I suspect it’s not necessary, especially not to link to the old HTML4 specs. Is the change in line 4495/4506 meaningful? It looks like it might just move around some characters and shows up as a changed line, but maybe I’m overlooking something. If it’s not different semantically, can we avoid rearranging the characters so they don’t appear in the diff?

Can you speak to the differentiation in here of using the term URI instead of URL? Can you also verify that the changes will not mistakenly apply to parts of a URL that should not be transformed?

Note: See TracTickets for help on using tickets.