Make WordPress Core

Opened 4 weeks ago

Last modified 11 days ago

#64093 new enhancement

Interactivity API: Performance bottleneck in `data_wp_each_processor` due to repeated template parsing

Reported by: michelleeby Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: trunk
Component: Interactivity API Keywords: has-patch
Focuses: performance Cc:

Description

Summary

A performance bottleneck exists in WP_Interactivity_API::data_wp_each_processor when it processes a data-wp-each directive with a large number of items. The current implementation re-scans the entire inner template HTML for every item in the loop, so the cost grows with the product of the item count and the template size and degrades sharply as the number of items increases. This ticket proposes an optimized approach using pre-computation and caching to parse the template only once.

Problem Description

During a high-traffic load test (500 virtual users) on a production endpoint for a major enterprise customer, we identified WP_Interactivity_API::data_wp_each_processor as the primary bottleneck. New Relic traces consistently showed this single function consuming over 60% of the total request time (e.g., 40.89 seconds).

The root cause of the inefficiency lies in its nested-loop-like behavior:

  • The function iterates through the array provided to the data-wp-each directive (let's call the number of items N).
  • Inside every iteration, it instantiates a new WP_HTML_Tag_Processor for the template and recursively calls $this->process_directives().
  • This recursive call forces the WP_HTML_Tag_Processor to scan the template from the beginning to find all directive tags (let's say there are M tags to process).

This results in a time complexity of approximately O(N⋅M), where the expensive work of finding all directive tags within the template is repeated for every single item in the source array. For a list with hundreds of items and a template with several directives, this becomes computationally prohibitive.
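The O(N⋅M) behavior described above can be illustrated with a small, self-contained model. This is only a sketch: the regex `TAG_RE` and the function `count_tag_visits_naive` are hypothetical stand-ins for the tag scanner, not the WordPress implementation, and the model counts opening tags only.

```python
import re

# Hypothetical stand-in for a tag scanner: matches every opening tag.
TAG_RE = re.compile(r"<([a-zA-Z][\w-]*)([^>]*)>")


def count_tag_visits_naive(template, items):
    """Re-scan the whole template once per item and count tag visits."""
    visits = 0
    for _ in items:
        # The full scan is repeated for every item, so the total work
        # is (number of items) x (number of tags in the template).
        for _match in TAG_RE.finditer(template):
            visits += 1
    return visits


template = (
    '<li data-wp-text="context.item.name">'
    '<span data-wp-class--active="context.item.on"></span>'
    "</li>"
)

for n in (10, 100, 1000):
    print(n, count_tag_visits_naive(template, list(range(n))))
```

With two opening tags in the template, the visit count in this model is always twice the item count, which is the N⋅M product in miniature.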

Steps to Reproduce

  • Register an interactive block or view script.
  • Use the data-wp-each directive on a <template> tag.
  • Provide a large array to the directive via wp_interactivity_state(), for example, an array with 500+ items.
  • Inside the <template>, include several elements with other interactivity directives (e.g., data-wp-text, data-wp-bind--aria-label, data-wp-class).
  • Load the page and profile the server response time. You will observe an extremely high execution time for the data_wp_each_processor method.

Proposed Solution: Pre-computation and Caching

The proposed solution refactors the processor to adopt a "pre-compute and cache" strategy. Instead of re-parsing the template on every iteration, we can build a "blueprint" of the template once, cache it, and then use it to efficiently process each item.

The new logic is as follows:

  • Blueprint Creation: On the first encounter of a given template HTML, create a "blueprint": instantiate a WP_HTML_Tag_Processor for the template, iterate through it once to find all tags containing data-wp- directives, and create a bookmark for each interactive tag found using $p->set_bookmark().
  • Caching: Store the array of bookmark names in a static class property (private static $template_blueprints), using an md5 hash of the template's HTML as the cache key.
  • Optimized Loop: For each item in the data array, instantiate a new WP_HTML_Tag_Processor with the template HTML. Instead of scanning, iterate through the cached bookmark names and use $p->seek( $bookmark_name ) to jump directly to the next interactive tag. Finally, call the original process_directives() on that specific tag.

I think this change would fundamentally alter the time complexity to something closer to O(M+N⋅D), where M is the one-time cost of scanning the template to build the blueprint, and D is the number of interactive directives that need processing for each of the N items. This would be a massive performance improvement.
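A minimal model of the "scan once, cache by md5 key" idea, written in Python for illustration. All names here (`get_blueprint`, `render_each`, `_template_blueprints`, `scan_count`) are hypothetical, and the sketch assumes the template text is not modified between items, which is a simplification of what the real tag processor does.

```python
import hashlib
import re

TAG_RE = re.compile(r"<([a-zA-Z][\w-]*)([^>]*)>")

# Hypothetical cache, keyed by an md5 hash of the template HTML.
_template_blueprints = {}
scan_count = 0  # number of full-template scans, for illustration only


def get_blueprint(template):
    """Return cached (start, end) spans of tags that carry data-wp- attributes."""
    global scan_count
    key = hashlib.md5(template.encode("utf-8")).hexdigest()
    if key not in _template_blueprints:
        scan_count += 1  # the one-time cost M
        _template_blueprints[key] = [
            m.span() for m in TAG_RE.finditer(template) if "data-wp-" in m.group(2)
        ]
    return _template_blueprints[key]


def render_each(template, items):
    """Visit only the cached directive positions for each item (cost N x D)."""
    visited = 0
    for _ in items:
        for _start, _end in get_blueprint(template):
            visited += 1  # a real implementation would process the tag here
    return visited


template = (
    '<li data-wp-text="context.item.name">'
    "<span>static</span>"
    '<b data-wp-text="context.item.id"></b>'
    "</li>"
)
print(render_each(template, range(500)), scan_count)
```

In this model the template is scanned once regardless of the item count, and each item only touches the two tags that carry directives.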

Attachments (4)

interactivity-api-patch.diff (7.6 KB) - added by michelleeby 4 weeks ago.
data-wp-each-test.zip (476.4 KB) - added by michelleeby 3 weeks ago.
This plugin zip can be installed to test the patch. Once installed and activated, a shortcode can be placed on a post like:
data-wp-each-test.2.zip (476.4 KB) - added by michelleeby 3 weeks ago.
This plugin zip can be installed to test the patch. Once installed and activated, a shortcode can be placed on a post like: [table_perf_test size="heavy"], [table_perf_test size="medium"], or [table_perf_test size="light"]
interactive-stress-test.zip (57.7 KB) - added by michelleeby 2 weeks ago.
A plugin for running performance tests against the Interactivity API.


Change History (8)

This ticket was mentioned in PR #10249 on WordPress/wordpress-develop by Michelleeby.


4 weeks ago
#1

Hello!

This PR introduces a patch for https://core.trac.wordpress.org/ticket/64093. It introduces a caching mechanism for data-wp-each template blueprints, improving performance by pre-computing and storing template structures. The data_wp_each_processor method has been refactored to utilize this cache.

Trac ticket: https://core.trac.wordpress.org/ticket/64093

@michelleeby
3 weeks ago

This plugin zip can be installed to test the patch. Once installed and activated, a shortcode can be placed on a post like:

@michelleeby
3 weeks ago

This plugin zip can be installed to test the patch. Once installed and activated, a shortcode can be placed on a post like: [table_perf_test size="heavy"], [table_perf_test size="medium"], or [table_perf_test size="light"]

#2 @michelleeby
3 weeks ago

Revised patch

In testing it was found that the first attempt of the patch was incomplete. It wasn’t accounting for opening and closing tags and how that traversal was managed by the state of the stack. The second attempt, commit bfc4a1c04f8ab3f808529edde74dce550d4f3d65, no longer tries to replace the core processing logic. Instead, it's an attempt to optimize the way the code navigates the HTML. The patch now replicates the traversal.

The original code used a while ( $p->next_tag() ) loop to walk through the template’s HTML, visiting every opening and closing tag in sequential order. The patched code does this once to create the “traversal blueprint,” which is simply an ordered list of bookmarks for every tag the processor visits. In the main loop, the code iterates through this cached list of bookmarks and uses $item_p->seek() to jump to each tag in the exact same sequence. This aims to mimic the original code's traversal.

Crucially, once the code seeks to a tag, it calls the original, unmodified $this->process_directives() function. This means that all the complex, necessary features, like handling enter and exit modes, checking for unbalanced tags, and respecting directive priority, are still being handled by the battle-tested core code. Finally, the patch adds a small but important check: if ( ! empty( $item_p->get_attribute_names_with_prefix( 'data-wp-' ) ) ). This ensures that the code only calls the expensive process_directives function on tags that actually have directives. For all other tags (like a simple <p> or <span>), the code still seeks to them to maintain the correct traversal order, but it skips the unnecessary processing call.
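The "visit every tag in order, but only process the ones with directives" idea can be sketched with Python's standard `html.parser`. This is a simplified model, not the patch: `TraversalRecorder` and `walk` are hypothetical names, attributes are checked on opening tags only, and enter/exit handling is not modeled.

```python
from html.parser import HTMLParser


class TraversalRecorder(HTMLParser):
    """Records the full traversal order and which tags would be processed."""

    def __init__(self):
        super().__init__()
        self.visited = []    # every opening and closing tag, in document order
        self.processed = []  # only tags carrying a data-wp- attribute

    def handle_starttag(self, tag, attrs):
        self.visited.append(("open", tag))
        # Cheap prefix check before any expensive directive processing.
        if any(name.startswith("data-wp-") for name, _value in attrs):
            self.processed.append(tag)

    def handle_endtag(self, tag):
        self.visited.append(("close", tag))


def walk(template):
    recorder = TraversalRecorder()
    recorder.feed(template)
    recorder.close()
    return recorder


template = (
    "<tr>"
    '<td data-wp-text="context.item.name"></td>'
    "<td><span>static</span></td>"
    "</tr>"
)
result = walk(template)
print(len(result.visited), result.processed)
```

Here the traversal still covers all eight opening and closing tags, while the expensive step would run for a single tag.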

Steps to test

  1. Using 2 different wp-env environments, create an environment for the patched version of WordPress and another for "vanilla" WordPress Core. Then continue to follow the steps for both.
  2. Install and activate data-wp-each-test plugin, which is attached to this ticket as a zip.
  3. Install and activate Query Monitor, https://wordpress.org/plugins/query-monitor/.
  4. Place a shortcode block on a post and set its size. The plugin supports "light", "medium", and "heavy", which respectively create an HTML table with 500, 1000, and 10,000 rows. The shortcode should look like this: [table_perf_test size="heavy"].
  5. Go to the page on the frontend and view the page generation time in Query Monitor after hitting refresh.

Repeat step 5 for each environment and compare the results.

Test results

| Test Case | Light | Medium | Heavy |
|---|---|---|---|
| Core Run 1 | 0.1496s | 0.2201s | 1.7510s |
| Core Run 2 | 0.1700s | 0.2323s | 1.7361s |
| Core Run 3 | 0.1452s | 0.2222s | 1.6932s |
| Core Run 4 | 0.1465s | 0.2556s | 1.9496s |
| Core Run 5 | 0.1478s | 0.2369s | 1.9446s |
| Core Average | 0.15182s | 0.23342s | 1.8149s |
| Patched Run 1 | 0.0599s | 0.0694s | 0.0832s |
| Patched Run 2 | 0.0677s | 0.0657s | 0.0627s |
| Patched Run 3 | 0.1190s | 0.0686s | 0.0784s |
| Patched Run 4 | 0.0601s | 0.0698s | 0.0789s |
| Patched Run 5 | 0.0707s | 0.0701s | 0.2161s |
| Patched Average | 0.07548s | 0.06872s | 0.10386s |
Last edited 3 weeks ago by michelleeby
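For reference, the speedup implied by the averages reported in this comment can be computed directly. The helper name `speedups` is just for illustration; the input numbers are the Core and Patched averages from the results above.

```python
# Average page generation times in seconds, copied from the test results.
core_avg = {"light": 0.15182, "medium": 0.23342, "heavy": 1.8149}
patched_avg = {"light": 0.07548, "medium": 0.06872, "heavy": 0.10386}


def speedups(before, after):
    """Return before/after ratios per test size."""
    return {size: before[size] / after[size] for size in before}


for size, ratio in speedups(core_avg, patched_avg).items():
    print(f"{size}: {ratio:.2f}x faster")
```

On these figures the ratios come out to roughly 2x, 3.4x, and 17.5x for the light, medium, and heavy cases.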

@michelleeby
2 weeks ago

A plugin for running performance tests against the Interactivity API.

#3 @michelleeby
2 weeks ago

Revised Patch 2: Directive Caching

Previous attempts to optimize WP_Interactivity_API::data_wp_each_processor used a "pre-compute and cache" strategy: instead of re-parsing the template on every iteration, build a "blueprint" of the template once, cache it, and then use it to efficiently process each item. However, this solution did not consider how the WP_HTML_Tag_Processor is designed.

As far as I can tell, the WP_HTML_Tag_Processor architecture works by:

  • Sequentially scanning HTML to find tags (via strpos, strcspn, etc.)
  • Tracking positions as byte offsets
  • Modifying HTML through WP_HTML_Text_Replacement objects

It seems there's no way to "jump to position X" without scanning from the beginning. Byte offsets shift as HTML is modified, and bookmarks are designed for single-pass processing, not cross-rendering reuse.

Looking more carefully at the bottleneck, the real issue is that for each item in the array, the code:

  1. Creates a new WP_Interactivity_API_Directives_Processor
  2. Calls $p->next_tag() repeatedly (which does expensive string scanning)
  3. Calls get_attribute_names_with_prefix('data-wp-') for each tag
  4. Parses directive names and extracts values
  5. Evaluates directives

The optimization opportunity is steps 2-4, not step 5. The code can't avoid re-scanning the HTML (that's how the Tag Processor works), but it can cache what it learned about where directives are and what they are.
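A rough model of this idea: the template is still re-scanned for every item, but the parsed directive list for each tag position is cached after the first pass. Everything here (`parse_directives`, `directives_for`, `_directive_cache`, the regexes) is a hypothetical illustration in Python, not the patch itself.

```python
import re

TAG_RE = re.compile(r"<([a-zA-Z][\w-]*)([^>]*)>")
ATTR_RE = re.compile(r'(data-wp-[\w:.-]+)="([^"]*)"')

# Hypothetical cache: (template, tag index) -> list of (directive, value).
_directive_cache = {}
parse_calls = 0  # how often directive parsing actually ran


def parse_directives(attr_text):
    """Stand-in for steps 3-4: find directive names and extract values."""
    global parse_calls
    parse_calls += 1
    return ATTR_RE.findall(attr_text)


def directives_for(template, tag_index, attr_text):
    key = (template, tag_index)
    if key not in _directive_cache:
        _directive_cache[key] = parse_directives(attr_text)
    return _directive_cache[key]


def render_each(template, items):
    evaluated = 0
    for _ in items:
        # The scan itself is repeated for every item (step 2 is unavoidable).
        for tag_index, m in enumerate(TAG_RE.finditer(template)):
            for _name, _value in directives_for(template, tag_index, m.group(2)):
                evaluated += 1  # step 5: evaluation still runs per item
    return evaluated


template = (
    '<li data-wp-text="context.item.name" data-wp-class--active="context.item.on">'
    "<span>x</span>"
    '<b data-wp-text="context.item.id"></b>'
    "</li>"
)
print(render_each(template, range(500)), parse_calls)
```

In this model, directive parsing runs once per tag position rather than once per tag per item, while evaluation still runs for every item.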

Steps to Reproduce the Bottleneck

  1. Download, install and activate "interactive-stress-test.zip" which is attached to the ticket.
  2. The plugin creates 3 posts, "Stress Test – Light", "Stress Test – Medium", "Stress Test – Heavy". View "Stress Test – Light" on the frontend.
  3. Navigate to the Medium and Heavy posts on the frontend. In the browser console, observe the time reported after "Interactivity API processed in".

Steps to Run Experiments

  1. Navigate to the "Stress Test – Light" post on the frontend.
  2. Click the "Run Batch Test (10x)" button.
  3. Follow the pop-up prompts. They will guide you through the medium and heavy tests and ask to save the results as a CSV. The results can also be seen in the browser console.

Results

Core WordPress

Summary Statistics

| Variation | Runs | Avg Time (ms) | Min Time (ms) | Max Time (ms) |
|---|---|---|---|---|
| light | 10 | 231.68 | 196.20 | 275.80 |
| medium | 10 | 562.00 | 511.10 | 698.40 |
| heavy | 10 | 1081.62 | 1051.30 | 1148.80 |

Individual Run Data

| Variation | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| light | 215.00 | 196.20 | 226.20 | 263.10 | 218.70 | 272.20 | 207.00 | 211.60 | 275.80 | 231.00 |
| medium | 514.70 | 546.30 | 646.90 | 514.00 | 554.70 | 591.10 | 698.40 | 524.10 | 511.10 | 518.70 |
| heavy | 1051.30 | 1053.60 | 1062.00 | 1053.60 | 1072.20 | 1069.30 | 1148.80 | 1101.50 | 1098.00 | 1105.90 |

Patched WordPress

Summary Statistics

| Variation | Runs | Avg Time (ms) | Min Time (ms) | Max Time (ms) |
|---|---|---|---|---|
| light | 10 | 201.12 | 152.80 | 362.50 |
| medium | 10 | 392.73 | 366.00 | 437.40 |
| heavy | 10 | 796.62 | 756.80 | 838.30 |

Individual Run Data

| Variation | Run 1 | Run 2 | Run 3 | Run 4 | Run 5 | Run 6 | Run 7 | Run 8 | Run 9 | Run 10 |
|---|---|---|---|---|---|---|---|---|---|---|
| light | 165.20 | 168.10 | 362.50 | 193.80 | 158.10 | 152.80 | 157.20 | 175.10 | 254.10 | 224.30 |
| medium | 384.50 | 369.60 | 411.60 | 366.00 | 405.30 | 400.30 | 437.40 | 372.30 | 398.70 | 381.60 |
| heavy | 756.80 | 791.20 | 799.30 | 776.90 | 787.20 | 772.20 | 818.70 | 838.30 | 803.60 | 822.00 |
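As a quick check on the summary statistics in this comment, the relative reduction in average processing time can be derived from the reported averages. The function name `percent_reduction` is illustrative only; the inputs are the Core and Patched "Avg Time (ms)" values above.

```python
# Average "Interactivity API processed in" times (ms) from the summaries above.
core_avg_ms = {"light": 231.68, "medium": 562.00, "heavy": 1081.62}
patched_avg_ms = {"light": 201.12, "medium": 392.73, "heavy": 796.62}


def percent_reduction(before, after):
    """Return the percentage by which `after` is lower than `before`."""
    return {k: 100.0 * (1.0 - after[k] / before[k]) for k in before}


for variation, pct in percent_reduction(core_avg_ms, patched_avg_ms).items():
    print(f"{variation}: {pct:.1f}% lower average time")
```

On these averages the reduction is about 13% for light, 30% for medium, and 26% for heavy.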

@darerodz commented on PR #10249:


11 days ago
#4

Hello, @Michelleeby! 👋

I've been testing this PR following the steps you provided. Below are the results I got, and there doesn't seem to be much difference between them. What results do you obtain, @Michelleeby? Asking just in case I did something wrong while running the tests. 😄

WordPress:trunk e0558c2df1b47215ea2657c8fcd3b97abbc5895e

| Variation | Runs | Avg Time (ms) | Min Time (ms) | Max Time (ms) |
|---|---|---|---|---|
| light | 10 | 291.82 | 285.40 | 302.20 |
| medium | 10 | 798.53 | 770.70 | 836.60 |
| heavy | 10 | 1518.23 | 1476.90 | 1550.70 |

Michelleeby:trunk ceae14b6a6cd90d44d7884cf0625eb15b10f045c

| Variation | Runs | Avg Time (ms) | Min Time (ms) | Max Time (ms) |
|---|---|---|---|---|
| light | 10 | 294.18 | 283.70 | 305.20 |
| medium | 10 | 780.25 | 765.50 | 807.20 |
| heavy | 10 | 1502.56 | 1452.50 | 1543.40 |

Michelleeby:ticket/64093-interactivity... 7280cc9ec99c7c94c24383f443d9c943bce42d80

| Variation | Runs | Avg Time (ms) | Min Time (ms) | Max Time (ms) |
|---|---|---|---|---|
| light | 10 | 298.30 | 292.60 | 308.70 |
| medium | 10 | 788.31 | 773.50 | 818.30 |
| heavy | 10 | 1508.38 | 1480.90 | 1592.30 |