Make WordPress Core

Opened 4 weeks ago

Last modified 7 days ago

#61531 assigned enhancement

HTML API: Audit class name methods for consistency and correctness

Reported by: jonsurrell's profile jonsurrell Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version: 6.5
Component: HTML API Keywords: has-patch has-unit-tests
Focuses: Cc:

Description (last modified by jonsurrell)

The Tag Processor CSS class name methods class_list, add_class, remove_class, and has_class. These should behave consistently regarding case-sensistivity.

These methods are intended to align with CSS class selectors. CSS class selector matching behavior is complicated and may depend on the document. This makes it difficult to determine the correct behavior for these methods.

At the moment, class_list yields ASCII lower-cased class names, but remove_class and add_class match case sensitive class names.

Attachments (2)

quirks-mode.png (232.4 KB) - added by dmsnell 4 weeks ago.
Illustration of CSS selector behavior in quirks mode.
no-quirks-mode.png (222.7 KB) - added by dmsnell 4 weeks ago.
Illustration of CSS selector behavior in no-quirks mode.

Download all attachments as: .zip

Change History (10)

This ticket was mentioned in PR #6931 on WordPress/wordpress-develop by @jonsurrell.


4 weeks ago
#1

  • Keywords has-patch has-unit-tests added

Trac ticket: https://core.trac.wordpress.org/ticket/61531

The Tag Processor CSS class name methods class_list, add_class, and remove_class should be consistent regarding case sensitivity.

These methods are intended to align with CSS class names, meaning that matching should be done ASCII case-insensitive. class_list already yields lower case unique class names, but remove_class and add_class do not have similar behavior of treating case-insensitive matching classes as equal.

  • add_class should only add classes that are not already present (compared ASCII case-insensitive).
  • remove_class should remove all matching classes (compared ASCII case-insensitive).

This was discussed with @dmsnell on Slack here: https://wordpress.slack.com/archives/C05NFB818PQ/p1719403633636769

Related to #61520 which documents the lower-casing behavior of class_list.

#2 @jonsurrell
4 weeks ago

  • Description modified (diff)
  • Milestone changed from 6.7 to Future Release
  • Owner jonsurrell deleted
  • Summary changed from HTML API: Tag processor class name methods should behave consistently with case sensitivity to HTML API: Audit class name methods for consistency and correctness
  • Type changed from defect (bug) to enhancement

After further review, it's difficult to determine exactly the correct behavior of these class methods should be. I've updated the summary to talk about auditing the behavior and I've changed the milestone to "future release."

@jonsurrell commented on PR #6931:


4 weeks ago
#3

This doesn't seem like the right approach, it may not align with browser behavior.

@dmsnell
4 weeks ago

Illustration of CSS selector behavior in quirks mode.

@dmsnell
4 weeks ago

Illustration of CSS selector behavior in no-quirks mode.

#4 @dmsnell
4 weeks ago

After a lengthy and extremely valuable chat with @jonsurrell and some independent testing I think we can say comfortably that the behavior in the HTML API is wrong, but that making it right is a bit more complicated than a simple bug fix would be.

In the HTML API we have to make ourselves aware of the quirks mode of the document being parsed. For normative posts it's probably safe to assume no quirks mode, as most WordPress themes output an HTML doctype that enters no-quirks mode. But, we may encounter quirks-mode documents, and the HTML Processor will likely need to communicate eventually which one a given document is.

When matching against a document which is in quirks mode, class names must be matched ASCII case-insensitively; class selectors are otherwise case-sensitive, only matching class names they are identical to.

CSS Selectors Level 4

I created the following document to illustrate the relevant semantics: in no-quirks mode, CSS class selectors match byte-for-byte with what's in the decoded class attribute; in quirks mode, they match in an ASCII-case-insensitive manner.

Remove the <!DOCTYPE html> line to test in quirks mode.

<!DOCTYPE html>
<style>
.one { border-left: 3px solid red; }
.ONE { border-top: 3px solid red; }
.right { border-right: 3px solid blue; }
.green { color: green; }
.italic { font-style: italic; }
.underline { text-decoration: underline; }
</style>

<div>
	<p class="one">One</p>
	<p class="ONE">ONE</p>
	<p class="&#x6f;ne">&amp;#x64;ne</p>
	<p class="&#79;NE">&amp;#79;NE</p>
	<p class="oNe one ONE oNe &#x6f;ne oNe &#79;NE">All of them</p>
</div>

<script>
const by = s => document.querySelectorAll( s );

by( '.one' ).forEach( node => node.classList.add('grEen') );
by( '.ONE' ).forEach( node => node.classList.add('right') );
by( '[class~="one"]' ).forEach( node => node.classList.add( 'italic' ) );
by( '[class~="ONE"]' ).forEach( node => node.classList.add( 'underline' ) );
</script>
No-Quirks Mode Quirks Mode
Illustration of CSS selector behavior in no-quirks mode. Illustration of CSS selector behavior in quirks mode.

#5 @jonsurrell
8 days ago

#61655 was marked as a duplicate.

This ticket was mentioned in PR #6933 on WordPress/wordpress-develop by @jonsurrell.


8 days ago
#6

The HTML API remove_class regarding white space opts to maintain "inter-class" whitespace.

https://github.com/WordPress/wordpress-develop/blob/5effffccd005ca3dae6631ee73b1096b8f54a861/src/wp-includes/html-api/class-wp-html-tag-processor.php#L2223-L2230

This seems reasonable and justified, but normalization seems inconsistent. Leading space is maintained, but trailing space is not.

It seems more consistent to trim leading and trailing space, and actually maintain _only_ inter-class whitespace.

Trac ticket: https://core.trac.wordpress.org/ticket/61531 ~https://core.trac.wordpress.org/ticket/61655~

#7 @dmsnell
7 days ago

In 58740:

HTML API: Remove leading whitespace after removing class names.

In part of a larger review of CSS semantics and behaviors, this patch
takes the opportunity to remove leading whitespace in an updated class
attribute after the first class in the attribute has been removed.

Previously, if the first class name had been removed, the whitespace
that formerly followed it would remain in the class attribute. This
stood in contrast to removing other class names, which removed their
associated whitespace.

There should be no semantic or functional changes in this patch, only
a slightly-large diff for modified HTML documents that looks prettier
when removing the first class name in a class attribute.

Developed in https://github.com/WordPress/wordpress-develop/pull/6933
Discussed in https://core.trac.wordpress.org/ticket/61531

Props dmsnell, jonsurrell.
See #61531.

Note: See TracTickets for help on using tickets.