Make WordPress Core

Opened 17 months ago

Last modified 5 weeks ago

#60347 new defect (bug)

wp_kses breaking text fragments links

Reported by: asafm7's profile asafm7 Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: Formatting Keywords: has-unit-tests changes-requested has-test-info
Focuses: Cc:

Description

Hello.

It seems that wp_kses() (probably wp_kses_bad_protocol()) is breaking text fragments links (https://developer.mozilla.org/en-US/docs/Web/Text_fragments).

For example:
<a href="#:~:text=highlight>Link</a>

This issue became more prominent as recently ACF started escaping HTML using the wp_kses() function (https://www.advancedcustomfields.com/blog/acf-6-2-5-security-release/).

I confirmed the issue with ACF's support.

Attachments (1)

60347-kses-text-fragments.patch (1.9 KB) - added by shanemuir 7 months ago.
This patch fixes the issue where kses removes text fragment links (e.g., #:~:text) as invalid protocols. The fix adds a check in wp_kses_bad_protocol to allow such fragments. The patch also includes unit tests to verify the behavior.

Download all attachments as: .zip

Change History (18)

#1 @asafm7
17 months ago

I forgot to mention that it only happens to relative links, without a protocol.

Like in the example I provided:

<a href="#:~:text=highlight>Link</a>

#2 @asafm7
12 months ago

Hi, is there any way to promote a fix for this issue?

#3 @johnbillion
7 months ago

  • Component changed from Security to Formatting
  • Keywords needs-patch needs-unit-tests good-first-bug added

@shanemuir
7 months ago

This patch fixes the issue where kses removes text fragment links (e.g., #:~:text) as invalid protocols. The fix adds a check in wp_kses_bad_protocol to allow such fragments. The patch also includes unit tests to verify the behavior.

#4 follow-up: @asafm7
7 months ago

That's great. Thanks, @shanemuir.

Will this be a part of the next release?

#5 in reply to: ↑ 4 @shanemuir
7 months ago

Replying to asafm7:

That's great. Thanks, @shanemuir.

Will this be a part of the next release?

The patch will first need to undergo a review. If no issues are identified, it will then be assigned to a milestone, which will determine the future release in which this patch will be included.

#6 @shanemuir
7 months ago

  • Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed

This ticket was mentioned in PR #7890 on WordPress/wordpress-develop by @shanemuir.


7 months ago
#7

This PR addresses the issue in Trac ticket #60347, where wp_kses() removes valid text fragment links (e.g., #:~:text=highlight) by treating them as invalid protocols.

Changes:

  • Updated wp_kses_bad_protocol() in kses.php to explicitly allow text fragments (#:~:text).
  • Added a unit test to verify that text fragments are preserved in the href attribute.

Testing:

  1. Run npm run test:php to confirm all tests pass.
  2. Apply the patch and manually test with an anchor tag containing #:~:text in the href attribute.
  • Example: <a href="#:~:text=highlight">Link</a> should pass validation.

Impact:

This change ensures that text fragment links are correctly preserved by wp_kses(), aligning with modern URL standards and avoiding unnecessary sanitization.

Trac ticket: https://core.trac.wordpress.org/ticket/60347

#8 @shanemuir
4 months ago

  • Keywords needs-testing added

This ticket was mentioned in Slack in #core by shanemuir. View the logs.


3 months ago

This ticket was mentioned in Slack in #core-test by shanemuir. View the logs.


3 months ago

@azaozz commented on PR #7890:


3 months ago
#11

This patch looks generally okay. Left couple questions for possible changes/improvements.

Looking at the specs: https://wicg.github.io/scroll-to-text-fragment/#syntax and https://wicg.github.io/scroll-to-text-fragment/#the-fragment-directive it seems there aren't any security implications. URLs starting with #:~: should conform to the existing standards for URLs starting with #. On the other hand there is no bypass for the latter so thinking a more in depth review by the security team is in order.

#12 follow-up: @SirLouen
3 months ago

  • Keywords changes-requested has-testing-info added; has-patch needs-testing removed

Test Report

Description

🟠 This report validates that the indicated patch works as expected, with some caveats

Patch tested: https://github.com/WordPress/wordpress-develop/pull/7890.diff

Environment

  • WordPress: 6.9-alpha-60093-src
  • PHP: 8.2.28
  • Server: nginx/1.27.4
  • Database: mysqli (Server: 8.4.4 / Client: mysqlnd 8.2.28)
  • Browser: Chrome 134.0.0.0
  • OS: Windows 10/11
  • Theme: My Twenty Twenty Child Theme 1.0
  • MU Plugins: None activated
  • Plugins:
    • Test Reports 1.2.0
    • Text Fragment Link Tester 1.0

Bug Reproduction

  1. Add the supplemental code for testing to functions.php, a plugin, or anywhere you can execute it
  2. Go to the Tools > Text Fragment Tester page
  3. 🐞 Bug appears, the URL is not shown as expected

Expected Results

  • We can confirm that the URL remains unmodified in the script

Actual Results

  • 🟠 Issue resolved with the patch, but it appears that the patch could be improved.

Additional Notes

  1. As @azaozz has pointed out in the PR, there are some issues with filtering that should be covered without generating a security issue https://developer.mozilla.org/en-US/docs/Web/URI/Reference/Fragment
  2. Check if it is possible to generalize to more potential fragments types, not just text fragments
  3. Also, the wp_kses_bad_protocol function is specifically checking for protocols, consider if it's out of context given that a fragment is not technically a protocol, maybe a new specific function makes sense for this like wp_kses_bad_fragment
  4. @johnbillion anything kses is never a good first bug...I fear anything kses 😱

Supplemental Artifacts

Here is the code for testing

<?php

add_action('admin_menu', 'text_fragment_tester_menu');

function text_fragment_tester_menu() {
    add_management_page(
        'Text Fragment Tester', 
        'Text Fragment Tester', 
        'manage_options', 
        'text-fragment-tester', 
        'text_fragment_tester_page'
    );
}

function text_fragment_tester_page() {
    $original_link = '<a href="#:~:text=highlight">Original Link</a>';
    $allowed_html = array(
        'a' => array(
            'href' => array(),
        )
    );
    $filtered_link = wp_kses($original_link, $allowed_html);
    
    echo '<div class="wrap">';
    echo '<h1>Text Fragment Link Test</h1>';
    
    echo '<h2>Original HTML:</h2>';
    echo '<pre>' . esc_html($original_link) . '</pre>';
    echo '<div>' . $original_link . '</div>';
    
    echo '<h2>After wp_kses:</h2>';
    echo '<pre>' . esc_html($filtered_link) . '</pre>';
    echo '<div>' . $filtered_link . '</div>';
    echo '</div>';
}
Last edited 3 months ago by SirLouen (previous) (diff)

#13 @johnbillion
3 months ago

  • Keywords good-first-bug removed

#14 in reply to: ↑ 12 ; follow-up: @shanemuir
3 months ago

@SirLouen thank you for your test report.

I've updated the PR to allow for disabling text fragments. We can add logic to allow for time based fragments but after a quick test wp_kses doesn't impact these.

I think you are right in that these are not protocols and maybe would be better worked in a more specific function.

#15 in reply to: ↑ 14 ; follow-up: @SirLouen
3 months ago

Replying to shanemuir:

I think you are right in that these are not protocols and maybe would be better worked in a more specific function.

Yes, that solution can solve the issue for now, but if new problems arise in the future, it's gonna be bread for today and hunger for tomorrow because it's completely out of context (and probably someone else will be writing on top of this aother function to handle new scenarios)

I'm not knowledgeable of every other function existing, but issuing some searches, I can't seem to see any other better function to place this in the kses files. So personally, I would opt for a new function to parse this specifically (and no need to use the array)). Problem with this, is that the amount of wet code regarding the call of this function is brutal.
https://github.com/WordPress/wordpress-develop/blob/baab2b8f2273319817edede1b3360f212cf42241/src/wp-includes/kses.php#L1415-L1509

More specifically the entry point is here:
https://github.com/WordPress/wordpress-develop/blob/baab2b8f2273319817edede1b3360f212cf42241/src/wp-includes/kses.php#L1446

Again, this is not generally recommended because it's a lot of work for the reviewers, but still I would rather opt for a full refactor for all that wet code. It's not a straight and easy solution but get on the challenge.

And now you have plenty of time because we are in 6.9-alpha 💪

#16 in reply to: ↑ 15 @shanemuir
3 months ago

@SirLouen Thanks for this, however I will need to spend more time debugging and understanding what kses.php is doing since as soon as I extract my logic into it's own specific method text fragments are no longer preserved so something further on in wp_kses_bad_protocol is removing text fragments when the check is not part of the 'short circuit' logic.

Need to wrap my head around it see what's going on...

#17 @wordpressdotorg
5 weeks ago

  • Keywords has-test-info added; has-testing-info removed
Note: See TracTickets for help on using tickets.