Make WordPress Core


Timestamp: 10/16/2023 02:00:01 PM
Author: SergeyBiryukov
Message:

HTML API: Avoid calling subclass method while internally scanning in Tag Processor.

After modifying tags in the HTML API, the Tag Processor backs up to before the tag being modified and then re-parses its attributes. This saves on the code complexity involved in applying updates, which have already been transformed to “lexical updates” by the time they are applied.

To do that, ::get_updated_html() called ::next_tag() to reuse its logic. However, because ::next_tag() is a public method, subclasses may change its behavior, and the HTML Processor does just this: it maintains a stack of open elements, and when the Tag Processor calls ::next_tag() to re-scan a tag and its attributes, the subclass logic runs again and breaks that stack.

This commit replaces the call to ::next_tag() with a more appropriate reapplication of its internal parsing logic to rescan the tag name and its attributes. Given the limited nature of what's occurring in ::get_updated_html(), this should bring with it certain guarantees that no HTML structure is being changed (that structure will only be changed by subclasses like the HTML Processor).
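
The pattern described above can be sketched in isolation. The classes below are hypothetical stand-ins written for illustration (they are not the WordPress classes, and `rescan_via_public_method()` / `rescan_via_internal_method()` are invented names). They only show why re-scanning through an overridable public method can disturb subclass state, while calling a private parsing step does not.

```php
<?php
// Hypothetical classes for illustration only; not the WordPress implementation.
class Sketch_Scanner {
	protected $tags;
	protected $cursor  = 0;
	protected $current = null;

	public function __construct( array $tags ) {
		$this->tags = $tags;
	}

	// Public and overridable: subclasses may attach extra behavior here.
	public function next_tag() {
		return $this->parse_next_tag();
	}

	// Internal scanning step; private, so subclasses cannot override it.
	private function parse_next_tag() {
		if ( $this->cursor >= count( $this->tags ) ) {
			$this->current = null;
			return false;
		}
		$this->current = $this->tags[ $this->cursor++ ];
		return true;
	}

	// Rewind one tag, then re-scan through the overridable public method.
	public function rescan_via_public_method() {
		$this->cursor--;
		$this->next_tag();
	}

	// Rewind one tag, then re-scan using only the internal step.
	public function rescan_via_internal_method() {
		$this->cursor--;
		$this->parse_next_tag();
	}
}

class Sketch_Stack_Scanner extends Sketch_Scanner {
	public $open_elements = array();

	// Subclass behavior: record every tag found by next_tag().
	public function next_tag() {
		$found = parent::next_tag();
		if ( $found ) {
			$this->open_elements[] = $this->current;
		}
		return $found;
	}
}

$buggy = new Sketch_Stack_Scanner( array( 'p', 'em' ) );
$buggy->next_tag();
$buggy->next_tag();
$buggy->rescan_via_public_method();
echo implode( ',', $buggy->open_elements ), "\n"; // p,em,em

$fixed = new Sketch_Stack_Scanner( array( 'p', 'em' ) );
$fixed->next_tag();
$fixed->next_tag();
$fixed->rescan_via_internal_method();
echo implode( ',', $fixed->open_elements ), "\n"; // p,em
```

In the first run the re-scan records `em` twice because the subclass override fires again; in the second run the subclass state is left untouched.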

Follow-up to [56274], [56702].

Props dmsnell, zieladam, nicolefurlan.
Fixes #59607.

File:
1 edited

  • trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php

--- trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php (r56703)
+++ trunk/src/wp-includes/html-api/class-wp-html-tag-processor.php (r56941)
@@ -2271,4 +2271,5 @@
      * @since 6.2.0
      * @since 6.2.1 Shifts the internal cursor corresponding to the applied updates.
+     * @since 6.4.0 No longer calls subclass method `next_tag()` after updating HTML.
      *
      * @return string The processed HTML.
@@ -2304,22 +2305,23 @@
          * move; a call to `next_tag()` will reparse the recently-updated attributes
          * and additional calls to modify the attributes will apply at this same
-         * location.
+         * location, but in order to avoid issues with subclasses that might add
+         * behaviors to `next_tag()`, the internal methods should be called here
+         * instead.
+         *
+         * It's important to note that in this specific place there will be no change
+         * because the processor was already at a tag when this was called and it's
+         * rewinding only to the beginning of this very tag before reprocessing it
+         * and its attributes.
          *
          * <p>Previous HTML<em>More HTML</em></p>
-         *                 ^  | back up by the length of the tag name plus the opening <
-         *                 \<-/ back up by strlen("em") + 1 ==> 3
-         */
-
-        // Store existing state so it can be restored after reparsing.
-        $previous_parsed_byte_count = $this->bytes_already_parsed;
-        $previous_query             = $this->last_query;
-
-        // Reparse attributes.
+         *                 ↑  │ back up by the length of the tag name plus the opening <
+         *                 └←─┘ back up by strlen("em") + 1 ==> 3
+         */
         $this->bytes_already_parsed = $before_current_tag;
-        $this->next_tag();
-
-        // Restore previous state.
-        $this->bytes_already_parsed = $previous_parsed_byte_count;
-        $this->parse_query( $previous_query );
+        $this->parse_next_tag();
+        // Reparse the attributes.
+        while ( $this->parse_next_attribute() ) {
+            continue;
+        }
 
         return $this->html;
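
The rewind arithmetic in the code comment above ("back up by strlen("em") + 1 ==> 3") can be checked with a short standalone sketch. The variable names here are assumptions chosen for illustration; the changeset does not show how the processor computes `$before_current_tag` internally.

```php
<?php
// Illustrative only: variable names are hypothetical, not the processor's internals.
$html = '<p>Previous HTML<em>More HTML</em></p>';

// Find the first tag opener at or after byte 3 (skipping "<p>") and capture
// the tag name together with its byte offset.
preg_match( '/<([a-z]+)/', $html, $match, PREG_OFFSET_CAPTURE, 3 );
$tag_name           = $match[1][0];
$tag_name_starts_at = $match[1][1];

// Byte offset immediately after the tag name.
$after_tag_name = $tag_name_starts_at + strlen( $tag_name );

// Back up by the length of the tag name plus the opening "<".
$before_current_tag = $after_tag_name - ( strlen( $tag_name ) + 1 );

echo $tag_name, "\n";                               // em
echo $after_tag_name, "\n";                         // 19
echo $before_current_tag, "\n";                     // 16
echo substr( $html, $before_current_tag, 3 ), "\n"; // <em
```

Rewinding to offset 16 places the cursor on the `<` of `<em>`, so re-parsing starts at the beginning of the same tag.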