Make WordPress Core

Changes between Version 1 and Version 3 of Ticket #60698


Timestamp:
05/09/2024 02:21:18 AM (7 weeks ago)
Author:
dmsnell
Comment:

  • Ticket #60698 – Description

    v1 v3  
    114114In [https://github.com/WordPress/wordpress-develop/pull/6387 #6387] I have built a spec-compliant HTML text decoder which utilizes this token map. The performance of the new decoder is approximately 20% slower than calling `html_entity_decode()` directly, except it properly decodes what PHP can't. In fact, the performance bottleneck in that work is not even in the token map, but in converting a sequence of digits into a UTF-8 string from the given code point.
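
To make the bottleneck mentioned above concrete, here is a minimal sketch of turning a code point into UTF-8 bytes. The function name `sketch_code_point_to_utf8()` is hypothetical and this is not the code from the linked PR; HTML's own replacement rules for invalid or special code points are left out, and the sketch simply returns `null` for values that cannot be encoded.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress): encodes one Unicode
 * code point as a UTF-8 byte string, or returns null if it is out
 * of range or a surrogate.
 */
function sketch_code_point_to_utf8( int $cp ): ?string {
	if ( $cp < 0 || $cp > 0x10FFFF || ( $cp >= 0xD800 && $cp <= 0xDFFF ) ) {
		return null;
	}

	if ( $cp < 0x80 ) {
		// One byte: 0xxxxxxx.
		return chr( $cp );
	}

	if ( $cp < 0x800 ) {
		// Two bytes: 110xxxxx 10xxxxxx.
		return chr( 0xC0 | ( $cp >> 6 ) )
			. chr( 0x80 | ( $cp & 0x3F ) );
	}

	if ( $cp < 0x10000 ) {
		// Three bytes: 1110xxxx 10xxxxxx 10xxxxxx.
		return chr( 0xE0 | ( $cp >> 12 ) )
			. chr( 0x80 | ( ( $cp >> 6 ) & 0x3F ) )
			. chr( 0x80 | ( $cp & 0x3F ) );
	}

	// Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx.
	return chr( 0xF0 | ( $cp >> 18 ) )
		. chr( 0x80 | ( ( $cp >> 12 ) & 0x3F ) )
		. chr( 0x80 | ( ( $cp >> 6 ) & 0x3F ) )
		. chr( 0x80 | ( $cp & 0x3F ) );
}

// The digits of a reference such as `&#x20AC;` are parsed first, then encoded.
$euro = sketch_code_point_to_utf8( intval( '20AC', 16 ) );
```

The digit parsing and the byte assembly both happen per reference, which is consistent with this step costing more than the token lookup itself.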
    115115
     116My proposal is adding a new class `WP_Token_Map` providing at least two methods for normal use:
     117
     118 - `contains( $token )` returns whether the passed string is in the set.
     119 - `read_token( $text, $offset = 0, &$skip_bytes )` indicates whether the character sequence starting at the given offset in the passed string forms a token in the set and, if so, returns the replacement for that token. It also sets `&$skip_bytes` to the length of the token so that calling code can advance past the matched token.
     120
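
The two methods above could be used roughly as follows. This is only a sketch: `Sketch_Token_Map` is a hypothetical stand-in built on a plain array with a naive longest-match loop, not the optimized structure proposed here, and returning `null` on a miss is an assumption since the description does not say what a failed read returns.

```php
<?php
// Hypothetical stand-in for the proposed class, for illustration only.
final class Sketch_Token_Map {
	/** @var array<string,string> Token => replacement. */
	private $mappings;

	/** @var int Byte length of the longest token. */
	private $longest = 0;

	public function __construct( array $mappings ) {
		$this->mappings = $mappings;
		foreach ( $mappings as $token => $replacement ) {
			$this->longest = max( $this->longest, strlen( (string) $token ) );
		}
	}

	public function contains( string $token ): bool {
		return isset( $this->mappings[ $token ] );
	}

	// Assumption: returns null when no token starts at $offset.
	public function read_token( string $text, int $offset = 0, &$skip_bytes = null ) {
		$remaining = strlen( $text ) - $offset;

		// Try the longest candidate first so `&notin;` wins over `&not`.
		for ( $length = min( $this->longest, $remaining ); $length > 0; $length-- ) {
			$candidate = substr( $text, $offset, $length );
			if ( isset( $this->mappings[ $candidate ] ) ) {
				$skip_bytes = $length;
				return $this->mappings[ $candidate ];
			}
		}

		return null;
	}
}

// Hypothetical caller: replaces every token found in $text.
function sketch_replace_tokens( Sketch_Token_Map $map, string $text ): string {
	$out = '';
	$at  = 0;
	$end = strlen( $text );

	while ( $at < $end ) {
		$skip_bytes  = 0;
		$replacement = $map->read_token( $text, $at, $skip_bytes );

		if ( null !== $replacement ) {
			$out .= $replacement;
			$at  += $skip_bytes; // Jump past the matched token.
		} else {
			$out .= $text[ $at ];
			$at++;
		}
	}

	return $out;
}

$map    = new Sketch_Token_Map( array( '&amp;' => '&', '&not' => '¬', '&notin;' => '∉' ) );
$result = sketch_replace_tokens( $map, 'a &notin; b &amp; c' );
```

Passing the length back by reference lets the caller move its cursor without re-measuring the token, which is the point of the third parameter.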
     121It also provides utility functions for precomputing these token maps, since they are designed for relatively static cases where the actual code is generated programmatically but stays static over time. For example, HTML5 defines the set of named character references and indicates that the list //shall not// change or be expanded ([https://html.spec.whatwg.org/#named-character-references-table HTML5 spec]). Precomputing can save the start-up cost of building the optimized lookup tables.
     122
     123 - `static::from_array( array $mappings )` generates a new token map from the given array, whose keys are tokens and whose values are the replacements.
     124 - `to_array()` dumps the set of mappings into an array suitable for passing back into `from_array()`.
     125 - `static::from_precomputed_table( ...$table )` instantiates a token set from a precomputed table, skipping the computation for building the table and sorting the tokens.
     126 - `precomputed_php_source_table()` generates PHP source code which can be loaded with the previous static method for maintenance of the core static token sets.
     127
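
The precomputation workflow described above might look like the following sketch. `Sketch_Precomputed_Map` is hypothetical, its internal table layout (two parallel arrays sorted longest-first) is an assumption chosen for brevity, and `from_precomputed_table()` takes a single array here rather than the variadic arguments shown in the proposal.

```php
<?php
// Hypothetical illustration of the build-once, load-many-times idea.
final class Sketch_Precomputed_Map {
	/** @var string[] Tokens, sorted longest-first. */
	private $tokens = array();

	/** @var string[] Replacements, parallel to $tokens. */
	private $replacements = array();

	// Slow path: normalizes and sorts the tokens.
	public static function from_array( array $mappings ): self {
		$map    = new self();
		$tokens = array_map( 'strval', array_keys( $mappings ) );

		usort(
			$tokens,
			static function ( string $a, string $b ): int {
				return ( strlen( $b ) <=> strlen( $a ) ) ?: strcmp( $a, $b );
			}
		);

		foreach ( $tokens as $token ) {
			$map->tokens[]       = $token;
			$map->replacements[] = $mappings[ $token ];
		}

		return $map;
	}

	// Fast path: trusts that the table is already sorted.
	public static function from_precomputed_table( array $table ): self {
		$map               = new self();
		$map->tokens       = $table['tokens'];
		$map->replacements = $table['replacements'];
		return $map;
	}

	public function to_array(): array {
		return array_combine( $this->tokens, $this->replacements );
	}

	// Emits PHP source that rebuilds this map through the fast path.
	public function precomputed_php_source_table(): string {
		$table = array(
			'tokens'       => $this->tokens,
			'replacements' => $this->replacements,
		);

		return 'Sketch_Precomputed_Map::from_precomputed_table( ' . var_export( $table, true ) . ' )';
	}
}

$mappings = array( '&amp;' => '&', '&lt;' => '<', '&notin;' => '∉' );
$built    = Sketch_Precomputed_Map::from_array( $mappings );
$source   = $built->precomputed_php_source_table();
```

In a maintenance script the generated `$source` string would be written into a PHP file, so that normal requests only pay for the fast path.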
    116128== Other potential uses
    117129