Opened 17 months ago
Last modified 5 weeks ago
#60347 new defect (bug)
wp_kses breaking text fragments links
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | |
Component: | Formatting | Keywords: | has-unit-tests changes-requested has-test-info |
Focuses: | Cc: |
Description
Hello.
It seems that wp_kses() (probably wp_kses_bad_protocol()) is breaking text fragments links (https://developer.mozilla.org/en-US/docs/Web/Text_fragments).
For example:
<a href="#:~:text=highlight>Link</a>
This issue became more prominent as recently ACF started escaping HTML using the wp_kses() function (https://www.advancedcustomfields.com/blog/acf-6-2-5-security-release/).
I confirmed the issue with ACF's support.
Attachments (1)
Change History (18)
#3
@
7 months ago
- Component changed from Security to Formatting
- Keywords needs-patch needs-unit-tests good-first-bug added
@
7 months ago
This patch fixes the issue where kses removes text fragment links (e.g., #:~:text) as invalid protocols. The fix adds a check in wp_kses_bad_protocol to allow such fragments. The patch also includes unit tests to verify the behavior.
#4
follow-up:
↓ 5
@
7 months ago
That's great. Thanks, @shanemuir.
Will this be a part of the next release?
#5
in reply to:
↑ 4
@
7 months ago
Replying to asafm7:
That's great. Thanks, @shanemuir.
Will this be a part of the next release?
The patch will first need to undergo a review. If no issues are identified, it will then be assigned to a milestone, which will determine the future release in which this patch will be included.
This ticket was mentioned in PR #7890 on WordPress/wordpress-develop by @shanemuir.
7 months ago
#7
This PR addresses the issue in Trac ticket #60347, where wp_kses() removes valid text fragment links (e.g., #:~:text=highlight) by treating them as invalid protocols.
Changes:
- Updated wp_kses_bad_protocol() in kses.php to explicitly allow text fragments (#:~:text).
- Added a unit test to verify that text fragments are preserved in the href attribute.
Testing:
- Run npm run test:php to confirm all tests pass.
- Apply the patch and manually test with an anchor tag containing #:~:text in the href attribute.
- Example: <a href="#:~:text=highlight">Link</a> should pass validation.
Impact:
This change ensures that text fragment links are correctly preserved by wp_kses(), aligning with modern URL standards and avoiding unnecessary sanitization.
Trac ticket: https://core.trac.wordpress.org/ticket/60347
This ticket was mentioned in Slack in #core by shanemuir. View the logs.
3 months ago
This ticket was mentioned in Slack in #core-test by shanemuir. View the logs.
3 months ago
3 months ago
#11
This patch looks generally okay. Left couple questions for possible changes/improvements.
Looking at the specs: https://wicg.github.io/scroll-to-text-fragment/#syntax and https://wicg.github.io/scroll-to-text-fragment/#the-fragment-directive it seems there aren't any security implications. URLs starting with #:~:
should conform to the existing standards for URLs starting with #
. On the other hand there is no bypass for the latter so thinking a more in depth review by the security team is in order.
#12
follow-up:
↓ 14
@
3 months ago
- Keywords changes-requested has-testing-info added; has-patch needs-testing removed
Test Report
Description
🟠 This report validates that the indicated patch works as expected, with some caveats
Patch tested: https://github.com/WordPress/wordpress-develop/pull/7890.diff
Environment
- WordPress: 6.9-alpha-60093-src
- PHP: 8.2.28
- Server: nginx/1.27.4
- Database: mysqli (Server: 8.4.4 / Client: mysqlnd 8.2.28)
- Browser: Chrome 134.0.0.0
- OS: Windows 10/11
- Theme: My Twenty Twenty Child Theme 1.0
- MU Plugins: None activated
- Plugins:
- Test Reports 1.2.0
- Text Fragment Link Tester 1.0
Bug Reproduction
- Add the supplemental code for testing to functions.php, a plugin, or anywhere you can execute it
- Go to the Tools > Text Fragment Tester page
- 🐞 Bug appears, the URL is not shown as expected
Expected Results
- We can confirm that the URL remains unmodified in the script
Actual Results
- 🟠 Issue resolved with the patch, but it appears that the patch could be improved.
Additional Notes
- As @azaozz has pointed out in the PR, there are some issues with filtering that should be covered without generating a security issue https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Fragment
- Check if it is possible to generalize to more potential fragments types, not just text fragments
- Also, the
wp_kses_bad_protocol
function is specifically checking for protocols, consider if it's out of context given that a fragment is not technically a protocol, maybe a new specific function makes sense for this likewp_kses_bad_fragment
- @johnbillion anything
kses
is never a good first bug...I fear anything kses 😱
Supplemental Artifacts
Here is the code for testing
<?php add_action('admin_menu', 'text_fragment_tester_menu'); function text_fragment_tester_menu() { add_management_page( 'Text Fragment Tester', 'Text Fragment Tester', 'manage_options', 'text-fragment-tester', 'text_fragment_tester_page' ); } function text_fragment_tester_page() { $original_link = '<a href="#:~:text=highlight">Original Link</a>'; $allowed_html = array( 'a' => array( 'href' => array(), ) ); $filtered_link = wp_kses($original_link, $allowed_html); echo '<div class="wrap">'; echo '<h1>Text Fragment Link Test</h1>'; echo '<h2>Original HTML:</h2>'; echo '<pre>' . esc_html($original_link) . '</pre>'; echo '<div>' . $original_link . '</div>'; echo '<h2>After wp_kses:</h2>'; echo '<pre>' . esc_html($filtered_link) . '</pre>'; echo '<div>' . $filtered_link . '</div>'; echo '</div>'; }
#14
in reply to:
↑ 12
;
follow-up:
↓ 15
@
3 months ago
@SirLouen thank you for your test report.
I've updated the PR to allow for disabling text fragments. We can add logic to allow for time based fragments but after a quick test wp_kses
doesn't impact these.
I think you are right in that these are not protocols and maybe would be better worked in a more specific function.
#15
in reply to:
↑ 14
;
follow-up:
↓ 16
@
3 months ago
Replying to shanemuir:
I think you are right in that these are not protocols and maybe would be better worked in a more specific function.
Yes, that solution can solve the issue for now, but if new problems arise in the future, it's gonna be bread for today and hunger for tomorrow because it's completely out of context (and probably someone else will be writing on top of this aother function to handle new scenarios)
I'm not knowledgeable of every other function existing, but issuing some searches, I can't seem to see any other better function to place this in the kses files. So personally, I would opt for a new function to parse this specifically (and no need to use the array)). Problem with this, is that the amount of wet code regarding the call of this function is brutal.
https://github.com/WordPress/wordpress-develop/blob/baab2b8f2273319817edede1b3360f212cf42241/src/wp-includes/kses.php#L1415-L1509
More specifically the entry point is here:
https://github.com/WordPress/wordpress-develop/blob/baab2b8f2273319817edede1b3360f212cf42241/src/wp-includes/kses.php#L1446
Again, this is not generally recommended because it's a lot of work for the reviewers, but still I would rather opt for a full refactor for all that wet code. It's not a straight and easy solution but get on the challenge.
And now you have plenty of time because we are in 6.9-alpha 💪
#16
in reply to:
↑ 15
@
3 months ago
@SirLouen Thanks for this, however I will need to spend more time debugging and understanding what kses.php
is doing since as soon as I extract my logic into it's own specific method text fragments are no longer preserved so something further on in wp_kses_bad_protocol
is removing text fragments when the check is not part of the 'short circuit' logic.
Need to wrap my head around it see what's going on...
I forgot to mention that it only happens to relative links, without a protocol.
Like in the example I provided:
<a href="#:~:text=highlight>Link</a>