Make WordPress Core

Opened 7 months ago

Closed 7 months ago

Last modified 6 months ago

#61810 closed defect (bug) (fixed)

HTML API: Parser may hang on some unclosed script tag inputs

Reported by: jonsurrell's profile jonsurrell Owned by: jonsurrell's profile jonsurrell
Milestone: 6.7 Priority: normal
Severity: normal Version: 6.2
Component: HTML API Keywords: has-patch has-unit-tests dev-feedback
Focuses: Cc:

Description

When the Tag Processor (or HTML Processor) attempts to parse certain incomplete script tags, the parser enters an infinite loop and will hang indefinitely. The conditions to reach this situation are:

  • Input HTML ends with an open script tag.
  • The final character of input is - or <.

If these conditions are satisfied, the parser will enter an infinite loop and hang when it attempts to parse the script tag.

Example problematic inputs:

  • <script>-
  • <script><

Creating a processor and calling next_tag() will hang. In both cases, next_tag() should return false with the processor in incomplete token state.

Change History (6)

This ticket was mentioned in PR #7128 on WordPress/wordpress-develop by @jonsurrell.


7 months ago
#1

  • Keywords has-patch has-unit-tests added

Trac ticket: https://core.trac.wordpress.org/ticket/61810

When the Tag Processor (or HTML Processor) attempts to parse certain incomplete script tags, the parser enters an infinite loop and will hang indefinitely. The conditions to reach this situation are:

  • Input HTML ends with an open script tag.
  • The final character of input is - or <.


If these conditions are satisfied, the parser will enter an infinite loop and hang when it attempts to parse the script tag.

Example problematic inputs:

  • <script>-
  • <script><

Creating a processor and calling next_tag() will hang. In both cases, next_tag() should return false with the processor in incomplete token state.

## Diagnosis

This bug was caused by essential code for advancing the parser position ($at++) never being reached by a short-circuit condition: if ( $too_short || '<' !== $html[ $at++ ] ) {}. If the left-hand side is true, the RHS is never evaluated and the processor may not advance.

Other code in the block would advance the parser under most conditions, it was only when - or < were in the final position in the document that the LHS would evaluate to true and the parser would enter an infinite loop.

This was difficult to spot because of its similarity to many other blocks in the parser like this:

if (
  $at + 2 < $doc_length &&
  '-' === $html[ $at ] &&
  '-' === $html[ $at + 1 ] &&
  '>' === $html[ $at + 2 ]
) {}

---

This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

#2 @dmsnell
7 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 58845:

HTML API: Fix an infinite loop in certain unclosed SCRIPT tags.

When the Tag Processor (or HTML Processor) attempts to parse certain
incomplete script tags, the parser enters an infinite loop and will
hang indefinitely. The conditions to reach this situation are:

  • Input HTML ends with an open script tag.
  • The final character of input is - or <.

The infinite loop was caused by the parser-advancing increment not being
called when two || OR conditions short-circuited. If the first
condition was true, the $at++ code was never reached.

This path resolves the issue.

Developed in https://github.com/wordpress/wordpress-develop/pull/7128
Discussed in https://core.trac.wordpress.org/ticket/61810

Follow-up to [55203].

Props: dmsnell, jonsurrell.
Fixes #61810.

#4 @dmsnell
7 months ago

  • Milestone changed from 6.6.2 to 6.7
  • Version changed from 6.5 to 6.2

Re-labeling as 6.7 since this fixes a bug from 6.2, not from 6.6. Can someone confirm this is right?
In any case, if it's appropriate to backport into 6.6.2 I would be glad to do so, even to 6.5. I'm not sure how important it is to do so. Thankfully, the circumstances for this bug should be relatively rare.

#5 @jonsurrell
7 months ago

  • Keywords dev-feedback added

#6 @jonsurrell
6 months ago

This was discovered by the HTML5lib-tests full HTML API parser tests, in progress in https://github.com/WordPress/wordpress-develop/pull/7117.

Note: See TracTickets for help on using tickets.