Opened 2 years ago
Last modified 3 months ago
#60864 new defect (bug)
URL sanitizing strips valid characters instead of encoding, documented use is invalid
| Reported by: |
|
Owned by: | |
|---|---|---|---|
| Milestone: | Awaiting Review | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | Security | Keywords: | has-patch has-unit-tests |
| Focuses: | Cc: |
Description
- wp_sanitize_redirect creates different URL instead of correctly percent-encoding e.g. for URLs that contain "<" - followup to https://core.trac.wordpress.org/ticket/31486 where this was partially fixed
- sanitize_url is documented to sanitize for redirect usage but the URI is not valid for redirects since it's not percent-encoded - followup to https://core.trac.wordpress.org/ticket/56160
- esc_url and sanitize_url strip characters that don't need to be stripped but can be HTML encoded to make them safe, e.g. "<" causing some URLs to be broken.
Change History (5)
This ticket was mentioned in PR #6335 on WordPress/wordpress-develop by @kkmuffme.
2 years ago
#2
- Keywords has-patch has-unit-tests added
Trac ticket: https://core.trac.wordpress.org/ticket/60864
#3
@
2 years ago
Before I fix all the broken tests, I want to gather some feedback on this - the tests are broken because the characters were stripped instead of encoded, which caused some URLs to be invalid, leading to 404s.
This ticket was mentioned in Slack in #core by kkmuffme. View the logs.
2 years ago
#5
@
3 months ago
@kkmuffme looks like I missed wp_sanitize_redirect() when we updated UTF-8 support in WordPress 6.9.
Perhaps you might look that over and see if you want to update the patch.
I appreciate the comment about HTML4 but I suspect it’s not necessary, especially not to link to the old HTML4 specs. Is the change in line 4495/4506 meaningful? It looks like it might just move around some characters and shows up as a changed line, but maybe I’m overlooking something. If it’s not different semantically, can we avoid rearranging the characters so they don’t appear in the diff?
Can you speak to the differentiation in here of using the term URI instead of URL? Can you also verify that the changes will not mistakenly apply to parts of a URL that should not be transformed?
#56160 was marked as a duplicate.