#61348 closed enhancement (fixed)
HTML API: Report real and virtual nodes in the HTML Processor.
Reported by: | dmsnell | Owned by: | dmsnell |
---|---|---|---|
Milestone: | 6.6 | Priority: | normal |
Severity: | normal | Version: | 6.6 |
Component: | HTML API | Keywords: | has-patch has-unit-tests needs-dev-note dev-reviewed |
Focuses: | Cc: |
Description
HTML is a kind of short-hand for a DOM structure. This means that there are many cases in HTML where an element's opening tag or closing tag is missing (or both). This is because many of the parsing rules imply creating elements in the DOM which may not exist in the text of the HTML.
The HTML Processor, being the higher-level counterpart to the Tag Processor, is already aware of these nodes, but since its inception has not paused on them when scanning through a document. Instead, these are visible when pausing on a child of such an element, but otherwise not seen.
The HTML Processor ought to fully represent the DOM structure a browser would see, which includes representing these "virtual" nodes which are implicitly created.
For example, the HTML string `<p><div>Content</p></div>` looks like it contains overlapping `P` and `DIV` elements, but in reality the first `P` is implicitly closed by the `<div>`, and the `</p>` that follows is unexpected and creates an empty `P` element.
The current HTML Processor in `trunk` will visit these tags in sequence (where a `+` indicates opening a node while `-` indicates closing one): `+P +DIV #text -P -DIV`.
The HTML Processor ought to represent this the way code traversing a DOM tree would: `+P -P +DIV #text +P -P -DIV`. Notably, in this sequence we can see the missing/implicit/virtual nodes that were created as part of applying the semantic HTML rules.
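To make the target sequence concrete, here is a minimal sketch in Python of "code traversing a DOM tree" for the example above. It is a toy model, not the HTML Processor: the `Node` class and `walk` function are hypothetical names introduced for illustration, and the tree is written out by hand from the description (an empty `P`, then a `DIV` holding the text and a second empty `P`).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical DOM node: an element name such as "P", or "#text"."""
    name: str
    children: list = field(default_factory=list)


def walk(node):
    """Yield +NAME when entering an element and -NAME when leaving it.

    Text nodes have no closer, so they yield only their name.
    """
    if node.name.startswith("#"):
        yield node.name
        return
    yield "+" + node.name
    for child in node.children:
        yield from walk(child)
    yield "-" + node.name


# Hand-built tree for <p><div>Content</p></div>, following the ticket text.
dom = [
    Node("P"),
    Node("DIV", [Node("#text"), Node("P")]),
]

sequence = " ".join(event for top in dom for event in walk(top))
print(sequence)
```

The joined events match the sequence the ticket asks for, including the closer for the first `P` and the opener for the second, neither of which appears in the HTML text.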
Change History (28)
This ticket was mentioned in PR #6348 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#1
#2
@
6 months ago
- Owner set to dmsnell
- Resolution set to fixed
- Status changed from new to closed
In 58304:
This ticket was mentioned in PR #6726 on WordPress/wordpress-develop by @jonsurrell.
6 months ago
#4
This is a follow-up to 163c3fd11026e0a6f93a7f88ab32dd1679ad0f88 / #6348 with two small fixes:
- Fix non-static is_tag_closer method called statically
- Fix call to parent is_tag_closer() that should call instance method
Trac ticket: https://core.trac.wordpress.org/ticket/61348
This ticket was mentioned in PR #6860 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#8
See Core-61348
Since the introduction of _visiting all nodes_ in the HTML Processor, the internal logic within the class can be confusing. This new method indicates if the currently-matched node is real or virtual, and can be used to clarify that logic.
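As a rough illustration of the real/virtual distinction, the sketch below tags each push to and pop from a stack of open elements with a provenance. It is a Python toy model under loud assumptions: `visit`, the token tuples, and the two parsing rules are hypothetical simplifications that cover only the `P`/`DIV` example from the ticket description, and none of it is the HTML Processor's actual code.

```python
def visit(tokens):
    """Return (event, provenance) pairs for tokens shaped like (kind, name).

    kind is "open", "close", or "text". An event is "real" when it comes
    directly from the token being processed, and "virtual" when the toy
    rules below imply it.
    """
    stack, events = [], []

    def push(name, provenance):
        stack.append(name)
        events.append(("+" + name, provenance))

    def pop(provenance):
        events.append(("-" + stack.pop(), provenance))

    for kind, name in tokens:
        if kind == "text":
            events.append(("#text", "real"))
        elif kind == "open":
            # Toy rule: opening a DIV implicitly closes an open P.
            if name == "DIV" and stack and stack[-1] == "P":
                pop("virtual")
            push(name, "real")
        else:  # "close"
            if name not in stack:
                if name != "P":
                    continue  # toy rule: ignore other unmatched closers
                # Toy rule: </p> with no open P implies an empty P element.
                push("P", "virtual")
            # Anything still open above the target is closed implicitly.
            while stack[-1] != name:
                pop("virtual")
            pop("real")
    return events


# Tokens for <p><div>Content</p></div>
tokens = [
    ("open", "P"), ("open", "DIV"), ("text", "Content"),
    ("close", "P"), ("close", "DIV"),
]
events = visit(tokens)
for event, provenance in events:
    print(event, provenance)
```

In this model the only virtual events are the closer for the first `P` and the opener for the second one, which are the two nodes missing from the raw text.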
@jonsurrell commented on PR #6860:
6 months ago
#9
This fixes a regression currently in 6.6, with tests in #6765, where `COMMENT_AS_PI_NODE_LOOKALIKE` comments did not expose their "tag name" (PI _target_); it was always `#comment`. So `<?target foo ?>` would have no way to access the `target` part of the comment.
#11
@
6 months ago
Dev note prepped in Updates to the HTML API (public preview)
This ticket was mentioned in PR #6914 on WordPress/wordpress-develop by @dmsnell.
5 months ago
#13
Follow-up to [58304]
See Core-61348
Previously the breadcrumbs were only generated for real nodes, and when visiting virtual nodes, the parser had already traversed past them to the next real node, advancing the breadcrumbs ahead of the matched token.
Test in the Playground
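The problem described above can be pictured with a small Python sketch that keeps the breadcrumbs in step with every visited event, virtual or real, instead of only with real nodes. This is a toy model: `breadcrumbs_per_event` is a hypothetical helper, the outer `HTML` and `BODY` context is left out, and the conventions that a closer reports the path of its parent and that text reports its enclosing elements are assumptions of the sketch, not confirmed behavior of `get_breadcrumbs()`.

```python
def breadcrumbs_per_event(events):
    """Pair each event with the element path in effect when it is visited."""
    crumbs, report = [], []
    for event in events:
        if event.startswith("+"):
            crumbs.append(event[1:])  # opener: the element joins the path
        elif event.startswith("-"):
            crumbs.pop()              # closer: the element leaves the path
        # Text nodes leave the path unchanged in this sketch.
        report.append((event, list(crumbs)))
    return report


# Event sequence from the ticket description, virtual nodes included.
events = "+P -P +DIV #text +P -P -DIV".split()
report = breadcrumbs_per_event(events)
for event, crumbs in report:
    # Depth is derived from the same path, so the two cannot drift apart.
    print(event, crumbs, len(crumbs))
```

Because the path is updated one event at a time, the second `+P` is reported inside `DIV` rather than with a path that has already moved on to the next real token.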
#14
@
5 months ago
- Keywords dev-feedback added
- Resolution fixed deleted
- Status changed from closed to reopened
#17
@
5 months ago
Requesting backport into 6.6, fixing a bug introduced during the 6.6 release cycle.
I prepared a commit message, in case it helps.
HTML API: Report breadcrumbs properly when visiting virtual nodes.

When [58304] introduced the ability to visit virtual nodes in the HTML document, those being the nodes which are implied by the HTML but not explicitly present in the raw text, a bug was introduced in the `get_breadcrumbs()` method because it wasn't updated to be aware of the virtual nodes. Therefore it would report the wrong breadcrumbs for virtual nodes.

Since the new `get_depth()` method is based on the same logic it was also broken for virtual nodes.

In this patch, the breadcrumbs have been updated to account for the virtual nodes and the depth method has been updated to rely on the fixed breadcrumb logic.

Developed in https://github.com/WordPress/wordpress-develop/pull/6914
Discussed in https://core.trac.wordpress.org/ticket/61348

Reviewed by jonsurrell, zieladam.
Merges [58588] to the 6.6 branch.

Follow-up to [58304].

Props dmsnell, hellofromtonya, joemcgill, jonsurrell, zieladam.
Fixes #61348.
This ticket was mentioned in Slack in #core-committers by dmsnell. View the logs.
5 months ago
5 months ago
#19
`get_current_depth()` may report the wrong depth in some cases that are fixed for breadcrumbs with this change.
good point. in my head it was using breadcrumbs() already, but now it actually is with https://github.com/WordPress/wordpress-develop/commit/21bc01b956ded3bef4ed0df0371ba376bc6c1a8f
What are the minimal test cases that would cover this issue? We are in the RC phase, so it is better to stay on the safe side. When testing with the Playground link provided, everything works as expected.
#20
@
5 months ago
- Keywords dev-reviewed added; dev-feedback removed
Okay to backport to the 6.6 branch.
@jonsurrell commented on PR #6914:
5 months ago
#22
`</p>` is a good test case for this. I'll add some unit tests in another PR.
In this PR that's been fixed.
5 months ago
#23
To close the loop here, I backported changes with https://core.trac.wordpress.org/changeset/58590 to the 6.6 release branch.
This ticket was mentioned in PR #6929 on WordPress/wordpress-develop by @jonsurrell.
5 months ago
#24
Add unit tests for virtual node breadcrumbs and depth. This includes tests for behaviors that were fixed in https://github.com/WordPress/wordpress-develop/pull/6914#issuecomment-2196299009.
Trac ticket: https://core.trac.wordpress.org/ticket/61348
@jonsurrell commented on PR #6914:
5 months ago
#25
It would be great to land https://github.com/WordPress/wordpress-develop/pull/6929 as a follow-up to this with unit tests.
5 months ago
#27
Committed with https://core.trac.wordpress.org/changeset/58592.
@jonsurrell commented on PR #6929:
5 months ago
#28
I'll address some follow-up test tweaks in https://github.com/WordPress/wordpress-develop/pull/7030.
Trac ticket: Core-61348
## Summary
Creates virtual nodes when pushing to and popping from the stack of open elements. It's these nodes that are returned by `next_tag()`, while subclassed methods intercept tag information, all within the HTML Processor.
## Splitting time!
- `get_current_depth()` to return the depth of the currently-matched element in the stack of open elements. This needs to account for `marker` and other not-yet-implemented items in the stack.
- `expects_closer()` or similar function to indicate if the currently-matched element needs or expects a closing element.
## Questions
- `get_depth()` from the HTML API itself instead of exporting that HTML nuance onto the caller.
## Related work
## Examples
cc: @sirreal