Make WordPress Core

#60455 closed defect (bug) (fixed)

HTML API: Trigger active format reconstruction when reaching text nodes.

Reported by: dmsnell's profile dmsnell Owned by: dmsnell's profile dmsnell
Milestone: 6.5 Priority: normal
Severity: normal Version: 6.5
Component: HTML API Keywords: has-patch has-unit-tests dev-reviewed fixed-major
Focuses: Cc:

Description

When encountering text nodes in an HTML document, the HTML Parser needs to run the active format reconstruction algorithm, even if it doesn't stop to visit those text nodes. This is because the formats, which might need reconstructing, will impact the breadcrumbs of all downstream nodes from the text node.

In this patch, this process is triggered, but the text nodes are then skipped, since the HTML Processor doesn't currently support visiting them.

Change History (15)

This ticket was mentioned in PR #6054 on WordPress/wordpress-develop by @dmsnell.


12 months ago
#1

Trac ticket: Core-60455

## Status

  • [ ] Needs tests to confirm behaviors.

## Description

When encountering text nodes in an HTML document, the HTML Parser needs to run the active format reconstruction algorithm, even if it doesn't stop to visit those text nodes. This is because the formats, which might need reconstructing, will impact the breadcrumbs of all downstream nodes from the text node.

In this patch, this process is triggered, but the text nodes are then skipped, since the HTML Processor doesn't currently support visiting them.

@jonsurrell commented on PR #6054:


12 months ago
#2

I expected this to fix some of the tests from #5794 but this seems to have no change at all on the html5lib tests results.

#3 @swissspidy
12 months ago

  • Milestone changed from Awaiting Review to Future Release

#4 @dmsnell
12 months ago

@swissspidy I’ve been on vacation the past week but I’d like to still get this in during the beta for 6.5

It took us unfortunately too long to realize this is crucial to support before even supporting text nodes directly. Without it the breadcrumbs, or HTML path, will be wrong in some supported situations.

I’ll be back on this in a couple days.

@dmsnell commented on PR #6054:


12 months ago
#5

@westonruter would you mind giving this another pass? I added tests to assert the behavior, and they fail in trunk for me. I'd like to merge this ASAP to ensure the bug doesn't leak into 6.5

@dmsnell commented on PR #6054:


12 months ago
#6

I'm not sure what is happening, because I thought this was all green, but it wasn't testing properly for me, and upon deeper inspection, I think many things need updating.

In 96975a249a I have made the tests pass, but I'm concerned this is too rushed and that we should try a different approach, namely, detecting if we _need_ reconstruction and bailing on a text node if we do.

#7 @dmsnell
11 months ago

  • Milestone changed from Future Release to 6.5

@swissspidy @westonruter would appreciate some review on this. It seems ready to go, but should be ported into the 6.5 branch. Should I go ahead and merge it into trunk?

The quickest context is that we initially thought that we could overlook text nodes in the HTML Processor (vs. the Tag Processor) because 6.5 wasn't going to support visiting them in that class. However, those text nodes can trigger a process called active format reconstruction which can change the stack of open elements. Thus, we're in a bind and have to support text nodes regardless.

It's thankfully a small patch, and technically removes an oddity of the Tag vs. HTML Processor interfaces in the process, because with this patch both can now visit all tokens in a document.

#8 @swissspidy
11 months ago

Sure, trunk is open for business. It‘s just the 6.5 branch that needs double sign-off

#9 @swissspidy
11 months ago

  • Keywords has-unit-tests added; needs-unit-tests removed
  • Owner set to dmsnell
  • Status changed from new to assigned

#10 @dmsnell
11 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 57806:

HTML API: Trigger active format reconstruction when reaching text nodes.

When encountering text nodes in an HTML document, the HTML parser needs
to run the active format reconstruction algorithm, even if it doesn't
stop to visit those text nodes. This is because the formats, which might
need reconstructing, will impact the breadcrumbs of all downstream nodes
from the text node.

In this patch, this process is triggered, which properly triggers the
active format reconstruction. It also enables the visiting of other token
types as is possible in the Tag Processor.

Developed in https://github.com/WordPress/wordpress-develop/pull/6054
Discussed in https://core.trac.wordpress.org/ticket/60170

Props: dmsnell, jonsurrell, westonruter.
Fixes: #60455.
Follow-up to: [57348].

#12 @dmsnell
11 months ago

  • Keywords dev-feedback added

Reopening for back-port into 6.5.

#13 @swissspidy
11 months ago

  • Keywords dev-reviewed fixed-major added; dev-feedback removed

#14 @gziolo
11 months ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

@dmsnell, you didn't reopen the ticket. I also don't see the commit backported to 6.5, but the milestone is still set to 6.5 with all required labels to proceed. I'm reopening so we don't miss it.

#15 @swissspidy
11 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 57823:

HTML API: Trigger active format reconstruction when reaching text nodes.

When encountering text nodes in an HTML document, the HTML parser needs
to run the active format reconstruction algorithm, even if it doesn't
stop to visit those text nodes. This is because the formats, which might
need reconstructing, will impact the breadcrumbs of all downstream nodes
from the text node.
In this patch, this process is triggered, which properly triggers the
active format reconstruction. It also enables the visiting of other token
types as is possible in the Tag Processor.

Developed in https://github.com/WordPress/wordpress-develop/pull/6054
Discussed in https://core.trac.wordpress.org/ticket/60170

Reviewed by swissspidy.
Merges [57806] to the to the 6.5 branch.

Props: dmsnell, jonsurrell, westonruter.
Fixes: #60455.
Follow-up to: [57348].

Note: See TracTickets for help on using tickets.