Make WordPress Core

Opened 6 months ago

Closed 6 months ago

Last modified 5 months ago

#60108 closed defect (bug) (fixed)

HTML API: May attempt out of range string access

Reported by: jonsurrell
Owned by: jonsurrell
Milestone: 6.5
Priority: normal
Severity: normal
Version:
Component: HTML API
Keywords: has-patch has-unit-tests
Focuses:
Cc:


The HTML API Tag Processor may attempt an out-of-range string index access, for example by passing an offset beyond the end of the document to strpos(). This may manifest as:

ValueError: strpos(): Argument #3 ($offset) must be contained in argument #1 ($haystack)
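
For readers unfamiliar with this failure mode, here is a minimal sketch of how the error arises and one way to guard against it. It assumes PHP 8.0 or later, where strpos() throws a ValueError when the offset lies past the end of the haystack. The helper name find_from() is hypothetical and is not part of WordPress or of the actual patch.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): a bounds-checked strpos().
 *
 * Returns the byte position of $needle at or after $offset, or false when
 * the needle is absent or the offset is out of range. For simplicity this
 * sketch also rejects negative offsets, which strpos() itself would accept.
 *
 * @return int|false
 */
function find_from( string $haystack, string $needle, int $offset ) {
	// An offset equal to strlen() is still valid; anything beyond it is not.
	if ( $offset < 0 || $offset > strlen( $haystack ) ) {
		return false;
	}

	return strpos( $haystack, $needle, $offset );
}

// A document that ends in the middle of a tag.
$html = '<div';

// Unguarded: on PHP 8.0+ this offset is out of range and throws ValueError.
try {
	strpos( $html, '>', strlen( $html ) + 1 );
	$threw = false;
} catch ( ValueError $e ) {
	$threw = true;
}

// Guarded: the same lookup reports "not found" instead of throwing.
$guarded = find_from( $html, '>', strlen( $html ) + 1 );
```

The guard treats an out-of-range offset the same as a missing needle, which fits a parser that has simply run out of input.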

Change History (7)

This ticket was mentioned in PR #5793 on WordPress/wordpress-develop by @jonsurrell.

6 months ago

  • Keywords has-patch has-unit-tests added

@dmsnell commented on PR #5793:

6 months ago

Rebuilt onto #5725 in aa24abd8aacecdc80970723bcc3f85a053dbe856

@dmsnell commented on PR #5793:

6 months ago

@sirreal I've cherry-picked my rebuilt change in 9e7167a3d1b4b4d1e4525072b2f96779788fb92d onto #5725. If you want, we can re-target this PR against that branch or close it out. I'm worried about losing the change in the stacked PRs if we merge this one first, but if you prefer that, we can do it and I'll rebuild avoid-parsing-incomplete-tokens on top of it.

#4 @Bernhard Reiter
6 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 57211:

HTML API: Avoid processing incomplete tokens.

Currently the Tag Processor assumes that an input document is a full HTML document. Because of this, if there's lingering content after the last tag match it will treat that content as plaintext and skip over it. This is fine for the Tag Processor because if there is lingering content that isn't a valid tag then there's nothing for next_tag() to match.

However, in order to support a number of feature expansions it is important to recognize that the remaining content may involve partial syntax elements, such as incomplete tags, attributes, or comments.

In this patch we're adding a mode inside the Tag Processor which will flip when we start parsing HTML syntax but the document finishes before the token does. This will provide the ability to:

  • extend the input document,
  • avoid misinterpreting syntax as text, and
  • guess whether we have a complete document, and know when we have an incomplete one.

In the process of building this patch, a few bugs were identified and fixed in the Tag Processor, namely in the handling of incomplete syntax elements.

Props dmsnell, jonsurrell.
Fixes #60122, #60108.
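
The idea in the commit message above, telling apart a document that merely ends in text from one that ends partway through a syntax element, can be sketched in a few lines. This is an illustration under stated assumptions, not the Tag Processor's implementation: the function classify_tail() and the STATE_* constants are hypothetical, every "<" is treated as the start of a tag or comment, and a ">" inside a quoted attribute value is not handled.

```php
<?php
// Hypothetical names for illustration; these are not Tag Processor internals.
const STATE_COMPLETE   = 'complete';
const STATE_INCOMPLETE = 'incomplete';

/**
 * Reports whether $html ends cleanly or stops in the middle of a tag or comment.
 */
function classify_tail( string $html, int $at = 0 ): string {
	$end = strlen( $html );

	while ( $at < $end ) {
		$lt = strpos( $html, '<', $at );
		if ( false === $lt ) {
			// Only plain text remains, so nothing was cut off.
			return STATE_COMPLETE;
		}

		// Comments close with "-->"; anything else is treated as a tag closing with ">".
		$closer = ( '<!--' === substr( $html, $lt, 4 ) ) ? '-->' : '>';

		// $lt is a valid index, so $lt + 1 is at most strlen( $html ): always in range.
		$close_at = strpos( $html, $closer, $lt + 1 );
		if ( false === $close_at ) {
			// Syntax started but the document ended before the token did.
			return STATE_INCOMPLETE;
		}

		$at = $close_at + strlen( $closer );
	}

	return STATE_COMPLETE;
}

$complete   = classify_tail( '<p>Hello</p>' );         // Every token is closed.
$cut_tag    = classify_tail( '<p>Hello</p><div cla' ); // Ends inside a tag.
$cut_remark = classify_tail( 'text <!-- unfinished' ); // Ends inside a comment.
```

A caller that receives the incomplete state could wait for more input and rescan rather than treating the trailing bytes as text, which corresponds to the first two abilities listed in the commit message.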

#6 @kebbet
5 months ago

With the changeset committed to trunk during the 6.5 cycle, please milestone the ticket to 6.5, @bernhard-reiter. Thanks!

#7 @swissspidy
5 months ago

  • Milestone changed from Awaiting Review to 6.5