Make WordPress Core

Opened 4 weeks ago

Closed 4 weeks ago

Last modified 4 weeks ago

#60382 closed defect (bug) (fixed)

HTML API: WP_HTML_Processor::next_token nests void tags

Reported by: jonsurrell
Owned by: jonsurrell
Milestone: 6.5
Priority: normal
Severity: normal
Version: trunk
Component: HTML API
Keywords: has-patch has-unit-tests
Focuses:
Cc:


When processing HTML like <br><br> with WP_HTML_Processor::next_tag(), two sibling BR tags are correctly found.

However, WP_HTML_Processor::next_token() does not handle void tags correctly: the first BR stays on the stack of open elements, so the breadcrumbs for the second BR tag come back as [ 'HTML', 'BODY', 'BR', 'BR' ] instead of [ 'HTML', 'BODY', 'BR' ].

Change History (8)

This ticket was mentioned in PR #5975 on WordPress/wordpress-develop by @jonsurrell.

4 weeks ago

  • Keywords has-patch has-unit-tests added
  • Use $p variable for processor like other tests
  • Add failing test

Trac ticket:

#2 @jonsurrell
4 weeks ago

This was found thanks to the external test suite proposed in #60227.

#3 @jonsurrell
4 weeks ago

HTML API: WP_HTML_Processor::next_token is an internal API (with @access private) so this may not be considered a bug.

This ticket was mentioned in PR #5975 on WordPress/wordpress-develop by @jonsurrell.

4 weeks ago

Trac ticket: Core-60382

This currently only includes a failing test.

@jonsurrell commented on PR #5975:

4 weeks ago

@dmsnell your fix looks good to me and fixes the test failures I saw in

👍 This change looks good to me.

#6 @dmsnell
4 weeks ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 57507:

HTML API: Fix void tag nesting with next_token

When next_token() was introduced, it caused a regression in the HTML
Processor whereby void tags remained on the stack of open elements when they
shouldn't have. This led to invalid values returned from get_breadcrumbs().

The reason was that calling next_token() works through a different code path
than the HTML Processor runs everything else. To solve this, its sub-classed
next_token() called step( self::REPROCESS_CURRENT_TOKEN ) so that the proper
HTML accounting takes place.

Unfortunately that same reprocessing code path skipped the step whereby void
and self-closing elements are popped from the stack of open elements.

In this patch, that step is run with a third mode for step(), which is the
new self::PROCESS_CURRENT_TOKEN. This mode acts as if self::PROCESS_NEXT_NODE
were called, except it doesn't advance the parser.

Developed in
Discussed in

Follow-up to [57348]

Props dmsnell, jonsurrell
Fixes #60382
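
For readers following along, the behaviour described in the commit message can be sketched with a small toy model. This is not the WordPress implementation: ToyProcessor, Mode, advance() and crumbs() are hypothetical names, the mode names are borrowed from the commit message, and the model only handles opening tags. It assumes the void-element pop happens at the start of step() and is skipped when reprocessing, which is how the commit message describes the regression.

```python
from enum import Enum, auto

# Void elements per the HTML standard; they never stay open.
VOID_ELEMENTS = {
    "AREA", "BASE", "BR", "COL", "EMBED", "HR", "IMG",
    "INPUT", "LINK", "META", "SOURCE", "TRACK", "WBR",
}


class Mode(Enum):
    PROCESS_NEXT_NODE = auto()        # pop stale void tag, advance, process
    REPROCESS_CURRENT_TOKEN = auto()  # process only (skips the pop)
    PROCESS_CURRENT_TOKEN = auto()    # pop stale void tag, process, no advance


class ToyProcessor:
    """Toy stand-in for an HTML processor; handles opening tags only."""

    def __init__(self, tags):
        self.tags = list(tags)
        self.index = -1
        self.stack = ["HTML", "BODY"]  # stack of open elements

    def advance(self):
        # Hypothetical stand-in for the tokenizer moving to the next token.
        self.index += 1
        return self.index < len(self.tags)

    def step(self, mode):
        # A void element left by the previous token must be popped first.
        # The reprocessing path skips this, which is the regression.
        if mode is not Mode.REPROCESS_CURRENT_TOKEN:
            if self.stack[-1] in VOID_ELEMENTS:
                self.stack.pop()
        if mode is Mode.PROCESS_NEXT_NODE and not self.advance():
            return False
        self.stack.append(self.tags[self.index])
        return True

    def next_token(self, fixed):
        # The tokenizer has already advanced by the time step() is called,
        # so step() must not advance again.
        if not self.advance():
            return False
        mode = Mode.PROCESS_CURRENT_TOKEN if fixed else Mode.REPROCESS_CURRENT_TOKEN
        return self.step(mode)


def crumbs(tags, fixed):
    """Collect the breadcrumbs seen at each token."""
    p = ToyProcessor(tags)
    seen = []
    while p.next_token(fixed=fixed):
        seen.append(list(p.stack))
    return seen


print(crumbs(["BR", "BR"], fixed=False))
# [['HTML', 'BODY', 'BR'], ['HTML', 'BODY', 'BR', 'BR']]
print(crumbs(["BR", "BR"], fixed=True))
# [['HTML', 'BODY', 'BR'], ['HTML', 'BODY', 'BR']]
```

With fixed=False the second BR appears nested inside the first, matching the breadcrumbs reported in this ticket. With fixed=True the stale BR is popped before the current token is processed, so both BR tags are siblings under BODY.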

#8 @dmsnell
4 weeks ago

  • Milestone changed from Awaiting Review to 6.5
  • Version set to trunk