#61810 closed defect (bug) (fixed)
HTML API: Parser may hang on some unclosed script tag inputs
Reported by: |
|
Owned by: |
|
---|---|---|---|
Milestone: | 6.7 | Priority: | normal |
Severity: | normal | Version: | 6.2 |
Component: | HTML API | Keywords: | has-patch has-unit-tests dev-feedback |
Focuses: | Cc: |
Description
When the Tag Processor (or HTML Processor) attempts to parse certain incomplete script tags, the parser enters an infinite loop and will hang indefinitely. The conditions to reach this situation are:
- Input HTML ends with an open script tag.
- The final character of input is
-
or<
.
If these conditions are satisfied, the parser will enter an infinite loop and hang when it attempts to parse the script tag.
Example problematic inputs:
<script>-
<script><
Creating a processor and calling next_tag()
will hang. In both cases, next_tag()
should return false with the processor in incomplete token state.
Change History (6)
This ticket was mentioned in PR #7128 on WordPress/wordpress-develop by @jonsurrell.
7 months ago
#1
- Keywords has-patch has-unit-tests added
#4
@
7 months ago
- Milestone changed from 6.6.2 to 6.7
- Version changed from 6.5 to 6.2
Re-labeling as 6.7 since this fixes a bug from 6.2, not from 6.6. Can someone confirm this is right?
In any case, if it's appropriate to backport into 6.6.2 I would be glad to do so, even to 6.5. I'm not sure how important it is to do so. Thankfully, the circumstances for this bug should be relatively rare.
#6
@
6 months ago
This was discovered by the HTML5lib-tests full HTML API parser tests, in progress in https://github.com/WordPress/wordpress-develop/pull/7117.
Trac ticket: https://core.trac.wordpress.org/ticket/61810
When the Tag Processor (or HTML Processor) attempts to parse certain incomplete script tags, the parser enters an infinite loop and will hang indefinitely. The conditions to reach this situation are:
-
or<
.If these conditions are satisfied, the parser will enter an infinite loop and hang when it attempts to parse the script tag.
Example problematic inputs:
<script>-
<script><
Creating a processor and calling
next_tag()
will hang. In both cases,next_tag()
should return false with the processor in incomplete token state.## Diagnosis
This bug was caused by essential code for advancing the parser position (
$at++
) never being reached by a short-circuit condition:if ( $too_short || '<' !== $html[ $at++ ] ) {}
. If the left-hand side is true, the RHS is never evaluated and the processor may not advance.Other code in the block would advance the parser under most conditions, it was only when
-
or<
were in the final position in the document that the LHS would evaluate to true and the parser would enter an infinite loop.This was difficult to spot because of its similarity to many other blocks in the parser like this:
if ( $at + 2 < $doc_length && '-' === $html[ $at ] && '-' === $html[ $at + 1 ] && '>' === $html[ $at + 2 ] ) {}
---
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.