Make WordPress Core

Opened 2 years ago

Closed 2 years ago

#55407 closed defect (bug) (fixed)

Allow case-insensitive elements for KSES

Reported by: r-a-y's profile r-a-y Owned by: peterwilsoncc's profile peterwilsoncc
Milestone: 6.0 Priority: normal
Severity: normal Version:
Component: Formatting Keywords: has-patch needs-unit-tests
Focuses: Cc:

Description

I'm trying to allow some HTML elements that are capitalized.

eg. <Post attribute="value"></Post>

However, attempting to add a custom element with the 'wp_kses_allowed_html' filter fails due to the strtolower() additions in kses.php. I've attached a patch, which removes these strtolower() calls. Let me know what you think.

HTML elements do not have to be lowercase. The W3C notes this here: https://w3c.github.io/html-reference/documents.html#case-insensitivity

Attachments (4)

55407.01.patch (938 bytes) - added by r-a-y 2 years ago.
55407.02.patch (1.5 KB) - added by r-a-y 2 years ago.
Removes strtolower() call in force_balance_tags() as well
55407.diff (608 bytes) - added by peterwilsoncc 2 years ago.
55407.2.diff (621 bytes) - added by peterwilsoncc 2 years ago.

Download all attachments as: .zip

Change History (9)

@r-a-y
2 years ago

#1 @SergeyBiryukov
2 years ago

  • Keywords needs-unit-tests added
  • Milestone changed from Awaiting Review to 6.0

@r-a-y
2 years ago

Removes strtolower() call in force_balance_tags() as well

#2 @peterwilsoncc
2 years ago

The proposed patch is problematic in that it will make KSES case sensitive.

Consider the HTML <DIV CLASS="55407"></DIV>, the first altered line removes strtolower() from the check of allowed elements. This will cause a false negative and result in the element being removed.

See https://3v4l.org/5X0OJ for an example of the difference.

In it's current form, KSES will maintain the case in which the element is entered.

<DIV class="55407">Some div</DIV>

<Not-Allowed>Not Allowed element</Not-AlloWed>

Is stored in the database as

<DIV class="55407">Some div</DIV>

Not Allowed element

Adding elements to the allowed HTML array or via the appropriate filter do need to be in lowercase. I think this needs to remain the case as the allowed HTML shouldn't contain the same element in different cases, eg

<?php
$allowed_html = [
   'div' => [ /* etc */ ],
   'DIV' => [ /* etc */ ],
];

#3 @r-a-y
2 years ago

Consider the HTML <DIV CLASS="55407"></DIV>

Ah, that's a good point for existing content that might have uppercase elements already. Not sure how many WP installations this would actually affect though.

@peterwilsoncc
2 years ago

#4 @peterwilsoncc
2 years ago

I'll resolve this ticket with the documentation fix in 55407.diff.

As browsers treat HTML tags and attributes without case sensitivity, KSES needs to as well to match the behavior.

The affect of this is KSES does need to be a little stricter with developers when adding elements to the array of supported values. This requires the values all be in lowercase.

I did consider whether it could be resolved by code to allow both uppercase and lowercase elements in the allow list, but it would have a performance hit looping through the array each time and would cause problems for developers wishing to remove elements from the allow list. Removing div would also require removing DIV, Div, DIv, and the other possible combinations.

#5 @peterwilsoncc
2 years ago

  • Owner set to peterwilsoncc
  • Resolution set to fixed
  • Status changed from new to closed

In 53034:

KSES: Document HTML allow list is in lowercase.

Expand documentation of the wp_kses_allowed_html hook to indicate that developers must add permitted HTML tags and attributes in lowercase for KSES to recognise they are permitted.

Props r-a-y, SergeyBiryukov, peterwilsoncc.
Fixes #55407.
See #53399.

Note: See TracTickets for help on using tickets.