#61531 closed enhancement (fixed)
HTML API: Audit class name methods for consistency and correctness
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | 6.7 | Priority: | normal |
Severity: | normal | Version: | 6.5 |
Component: | HTML API | Keywords: | has-patch has-unit-tests |
Focuses: | Cc: |
Description (last modified by )
The Tag Processor CSS class name methods class_list
, add_class
, remove_class
, and has_class
. These should behave consistently regarding case-sensistivity.
These methods are intended to align with CSS class selectors. CSS class selector matching behavior is complicated and may depend on the document. This makes it difficult to determine the correct behavior for these methods.
At the moment, class_list
yields ASCII lower-cased class names, but remove_class
and add_class
match case sensitive class names.
Attachments (2)
Change History (26)
This ticket was mentioned in PR #6931 on WordPress/wordpress-develop by @jonsurrell.
8 months ago
#1
- Keywords has-patch has-unit-tests added
#2
@
8 months ago
- Description modified (diff)
- Milestone changed from 6.7 to Future Release
- Owner jonsurrell deleted
- Summary changed from HTML API: Tag processor class name methods should behave consistently with case sensitivity to HTML API: Audit class name methods for consistency and correctness
- Type changed from defect (bug) to enhancement
After further review, it's difficult to determine exactly the correct behavior of these class methods should be. I've updated the summary to talk about auditing the behavior and I've changed the milestone to "future release."
@jonsurrell commented on PR #6931:
8 months ago
#3
This doesn't seem like the right approach, it may not align with browser behavior.
#4
@
8 months ago
After a lengthy and extremely valuable chat with @jonsurrell and some independent testing I think we can say comfortably that the behavior in the HTML API is wrong, but that making it right is a bit more complicated than a simple bug fix would be.
In the HTML API we have to make ourselves aware of the quirks mode of the document being parsed. For normative posts it's probably safe to assume no quirks mode, as most WordPress themes output an HTML doctype that enters no-quirks mode. But, we may encounter quirks-mode documents, and the HTML Processor will likely need to communicate eventually which one a given document is.
When matching against a document which is in quirks mode, class names must be matched ASCII case-insensitively; class selectors are otherwise case-sensitive, only matching class names they are identical to.
I created the following document to illustrate the relevant semantics: in no-quirks
mode, CSS class selectors match byte-for-byte with what's in the decoded class
attribute; in quirks
mode, they match in an ASCII-case-insensitive manner.
Remove the <!DOCTYPE html>
line to test in quirks mode.
<!DOCTYPE html> <style> .one { border-left: 3px solid red; } .ONE { border-top: 3px solid red; } .right { border-right: 3px solid blue; } .green { color: green; } .italic { font-style: italic; } .underline { text-decoration: underline; } </style> <div> <p class="one">One</p> <p class="ONE">ONE</p> <p class="one">&#x64;ne</p> <p class="ONE">&#79;NE</p> <p class="oNe one ONE oNe one oNe ONE">All of them</p> </div> <script> const by = s => document.querySelectorAll( s ); by( '.one' ).forEach( node => node.classList.add('grEen') ); by( '.ONE' ).forEach( node => node.classList.add('right') ); by( '[class~="one"]' ).forEach( node => node.classList.add( 'italic' ) ); by( '[class~="ONE"]' ).forEach( node => node.classList.add( 'underline' ) ); </script>
No-Quirks Mode | Quirks Mode |
---|---|
![]() | ![]() |
This ticket was mentioned in PR #6933 on WordPress/wordpress-develop by @jonsurrell.
7 months ago
#6
The HTML API remove_class
regarding white space opts to maintain "inter-class" whitespace.
This seems reasonable and justified, but normalization seems inconsistent. Leading space is maintained, but trailing space is not.
It seems more consistent to trim leading and trailing space, and actually maintain _only_ inter-class whitespace.
Trac ticket: https://core.trac.wordpress.org/ticket/61531 ~https://core.trac.wordpress.org/ticket/61655~
This ticket was mentioned in PR #7169 on WordPress/wordpress-develop by @jonsurrell.
6 months ago
#9
Update that HTML Processor and Tag Processor to handle CSS classes in a case-sensitive way by default.
This aligns with "no-quirks" or "standards" mode behavior for class name handling.
Remove forced lowercasing in ::class_list
.
When the HTML Processor is in no-quirks mode, add_class
, remove_class
and has_class
operate in a _case-sensitive_ way:
$processor = WP_HTML_Processor::create_full_parser( '<!DOCTYPE html><div class="FOO">' );
$processor->next_tag( 'DIV' );
var_dump( $processor->has_class('foo') );
// bool(false)
var_dump( $processor->has_class('FOO') );
// bool(true)
When the HTML Processor is in quirks mode (only available via the fragment parser at the moment), add_class
, remove_class
and has_class
operate in a _case-sensitive_ way:
$processor = WP_HTML_Processor::create_fragment('<div class="FOO">', '<body>', 'UTF-8', 'quirks-mode');
$processor->next_tag( 'DIV' );
var_dump( $processor->has_class('foo') );
// bool(true)
var_dump( $processor->has_class('FOO') );
// bool(true)
add_class
and remove_class
get similar treatment. Case-insensitive classes matching remove_class( $class_name )
will be removed. Case insensitive duplicate classes will not be added when calling add_class( $class_name )
.
class_list
_does_ produce case-insensitive duplicates in all cases. This deduplication would be easy to perform in quirks mode, however it's unclear what form of casing should be yielded.
---
The Tag Processor and HTML Processor classes have several methods for dealing with CSS class names: class_list
, add_class
, remove_class
and has_class
.
These methods are intended to provide a CSS class selector-like interface with the class attribute.
Class name matching (CSS class selectors `.className {}` or `getElementsByClassName( "className" )`) is case sensitive in no-quirks mode and case-insensitive in quirks-mode.
The list of elements with class names classNames for a node root is the HTMLCollection returned by the following algorithm:
Let classes be the result of running the ordered set parser on classNames.
If classes is the empty set, return an empty HTMLCollection.
Return an HTMLCollection rooted at root, whose filter matches descendant elements that have all their classes in classes.
The comparisons for the classes must be done in an ASCII case-insensitive manner if root’s node document’s mode is "quirks"; otherwise in an identical to manner.
When matching against a document which is in quirks mode, class names must be matched ASCII case-insensitively; class selectors are otherwise case-sensitive, only matching class names they are identical to.
Trac ticket: https://core.trac.wordpress.org/ticket/61531
This ticket was mentioned in PR #7187 on WordPress/wordpress-develop by @jonsurrell.
6 months ago
#10
Replace null bytes with the � replacement character in ::class_list
.
The browser will replace null bytes with � in attribute values:-state)
U+0000 NULL
This is an unexpected-null-character parse error. Append a U+FFFD REPLACEMENT CHARACTER character to the current attribute's value.
If we consider that class_list is a processed version of the class attribute, this null-byte replacement is appropriate. At this time, get_attribute is closer to "raw" access to the attribute value as authored.
Related to https://github.com/WordPress/wordpress-develop/pull/7184.
Trac ticket: https://core.trac.wordpress.org/ticket/61531
@jonsurrell commented on PR #6931:
6 months ago
#11
https://github.com/WordPress/wordpress-develop/pull/7169 is a fresh attempt at this.
@jonsurrell commented on PR #7169:
6 months ago
#12
I plan to revisit this when https://github.com/WordPress/wordpress-develop/pull/7195 is complete. That will allow us to test with quirks mode using the full parser and not need to change the fragment factory method signature.
@jonsurrell commented on PR #7169:
6 months ago
#13
I've moved quirks mode into the Tag Processor, it impacts CSS class name handling and it makes sense to have that there.
Now that #7195 / [58925] has landed, changes in quirks mode handling are tested by using the full HTML parser in quirks mode and no-quirks mode.
Quirks mode is not exposed publicly at this time (no change from current trunk
). In a previous commit, an argument was added to ::create_fragment
to adjust the compat mode. This is no longer necessary and has been reverted.
@jonsurrell commented on PR #7169:
6 months ago
#14
I think the concerns have been addressed and this is ready for another review.
@jonsurrell commented on PR #7187:
6 months ago
#17
I'm pretty confident that with @coversDefaultClass
correctly set, then @covers ::methodName
works fine. There are existing examples of that in the codebase.
6 months ago
#18
Absolutely @sirreal - but I'm speaking more about WPCS conventions than working code. I know there are examples in the codebase already demonstrating ::
without the class and that it works fine. I just want to make sure we don't introduce code that is going to upset anyone.
6 months ago
#19
@sirreal I've updated some comments and changed the behavior of add_class()
and remove_class()
so that they preserve the casing of the first-provided class name for all lexical variations. I'll entertain ideas on this, but it's how we handle attributes, and I think it's the most respectful we can be to developers who are setting values when casing doesn't matter.
it's really quite a surprise, and I hope quirks mode is almost never used, given the conflict between CSS selectors matching the class
attribute value, and those given as class name selectors.
6 months ago
#20
@sirreal I've also moved the Trac ticket reference to the top of the description to make it easier to find, and used the shorthand notation instead of the full link.
Trac ticket: https://core.trac.wordpress.org/ticket/61531
Related to #61520 which documents the lower-casing behavior of
class_list
.