Make WordPress Core

Opened 7 months ago

Last modified 21 hours ago

#57789 assigned enhancement

Make theme.json related caches persistent

Reported by: flixos90's profile flixos90 Owned by: joemcgill's profile joemcgill
Milestone: 6.4 Priority: normal
Severity: normal Version:
Component: Themes Keywords: gutenberg-merge has-patch
Focuses: performance Cc:

Description (last modified by flixos90)

The theme_json cache group is currently non-persistent, which means that, even when a site uses a persistent object cache, the data in this cache group will only be cached within a single request. In other words, there is only a benefit to this caching when a function is called several times, at least more than once, in a single request.

Needless to say, we have noticed in the past that parsing theme.json and running related logic can be expensive, and the preferable outcome would be to make those caches persistent so that they also can be cached beyond requests.

The biggest challenge to this is to implement proper cache invalidation. Originally, within Gutenberg, the cache group was persistent, but it led to problems due to:

  • cache invalidation hard to figure out (what are all the events can lead to the cached values becoming stale?)
  • caching logic, rather than database values, is not yet a common practice in WordPress, and one of the quirks is that e.g. action/filter hooks may no longer be run when the logic that contains them is cached, which can lead to a clunky developer experience

See related comment

Change History (32)

#1 @flixos90
7 months ago

  • Description modified (diff)
  • Milestone changed from Awaiting Review to 6.3

This tracking ticket will likely involve several issues and PRs in the Gutenberg repository to identify the best approach for the various usages of caching related to theme.json and the theme_json cache group.

I'm hopeful we can at least make some notable progress here for WordPress 6.3, so I'll milestone it for that release for now, acknowledging though that most of the work will need to happen in the Gutenberg repository, so it'll probably be a while until we get a PR here.

cc @oandregal @spacedmonkey

#2 @iCaleb
7 months ago

One approach could be to mainly just utilize the persistent cache only during frontend requests (where it'd get the most performance benefit).

Each logged-in non-ajax admin requests could, at least once, always fetch the theme.json. If there are changes from what is in the persistent cache then it can update. As a backup (in case a theme is updated but nobody logs in), could also allow the same invalidation-checks during cron as well.

Worst case scenario: no cron is running and no admin-side log-ins are occurring. In which case it'd just have to rely on the TTL expiring and a frontend request would have to do the work to populate the cache again.

This gives the benefits of persistent cache, while solving for invalidation in a pretty reasonable way.

Last edited 7 months ago by iCaleb (previous) (diff)

#3 @oandregal
7 months ago

It's worth noting that these caches were implemented as persistent in Gutenberg, and they were removed to unblock this work from landing into core. See for details, and links to the conversations. The PR can also serve to bring everything back quickly.

The cache invalidation covered all the following scenarios:

  • upgrader_process_complete: for WordPress, plugin, and theme updates.
  • switch_theme: the active theme is changed.
  • start_previewing_theme: previews for customizer and theme directory.
  • save_post_wp_global_styles: user updates theme.json data via the site editor.
  • activated_plugin and deactivated_plugin: any plugin can hook into the existing theme.json filters, so when they are activated/deactivated the filters may change.

The disagreement was specific to the following two points:

  • Consumers that hook into theme.json data using dynamic mechanisms (deciding whether to hook or not based on an option or user meta value). For these cases, the alternatives would be to add more cache invalidation points (option add/update/delete, etc.), or ask the consumers to clear the cache themselves.
  • Updates that are not done via the regular WordPress mechanism (FTP updates, updates managed via git repositories, etc.). The alternative is asking consumers to clear the cache themselves.

Note these two cases only if they use a persistent object cache plugin, as core, by default, doesn't persist.

#4 @spacedmonkey
7 months ago

WordPress core already has the cache group of theme. I wonder if we could use them. This is a special cache, that is an opt-in to persistent. See.

This means, for site owner that have control, their can make this group persistent, as they can cache clear when the theme is updated. Otherwise, it is a non persistent groups. This could be a interesting model to follow, as it would allow for production sites, that hosting companies know are not going to change, to persistent otherwise just be an in memory cache.

#5 @mukesh27
7 months ago

  • Keywords needs-patch added

#6 @flixos90
7 months ago

@iCaleb Your idea certainly sounds interesting, however there is still the concern about how that affects filter callback logic that runs due to the filters from the logic that generates the data which is then cached. Whenever the cached data is returned, those filters aren't run. Which maybe is fine for some, but is not an expected situation, and with the scale of WordPress I'm sure will lead to breakage.

So the problems here are partially invalidation, partially filters in the logic that is no longer executed when cached (see @oandregal's comment for more details).

@spacedmonkey Regarding the existing theme cache group, I think we should avoid intertwining with that. There are different implications for that cache group, and opting in to making it persistent can be a good idea. Whereas if you now opted in to making the theme-json cache group persistent, you would effectively opt in into broken behavior, which IMO is not something we should offer or encourage.

#7 @spacedmonkey
7 months ago

This cache could also be global cache group, like the theme group.

#8 @oglekler
3 months ago

  • Milestone changed from 6.3 to 6.4

This is an enhancement, and we are in 12 days until Beta 1 after which we will not add new enhancement to the release, so, because there is no patch, I am moving this ticket to 6.4.

#9 @spacedmonkey
2 months ago

  • Owner set to spacedmonkey
  • Status changed from new to assigned

#10 @hellofromTonya
2 months ago

  • Keywords gutenberg-merge added

As this ticket will likely involve work in Gutenberg, adding the gutenberg-merge keyword for tracking.

This ticket was mentioned in PR #5024 on WordPress/wordpress-develop by joemcgill.

5 weeks ago

  • Keywords has-patch added; needs-patch removed

This is a performance optimization of the WP_Theme_JSON_Resolver, which stores the merged WP_Theme_JSON object to a static property to avoid unnecessarily recreating this object by merging from other sources. In my testing, this cuts the execution time of this method in half.

Trac ticket:

#12 @joemcgill
5 weeks ago

  • Keywords needs-patch added; has-patch removed

After spending some time getting a better understanding of this system, and reading through the history of this issue, It seems to me like we should be focusing on a strategy for persistently caching data from WP_Theme_JSON_Resolver.

Currently, this class is responsible for loading theme.json data from four different sources (core, blocks, theme, and editor data) and then merging them together (depending on context).

All of the individual sources only get calculated once and stored to a static property of that class, which helps, but the process of creating a merged WP_Theme_JSON object is still pretty expensive.

I'd like to explore three options:

  1. Store the merged data to a static property, similar to the individual sources, to eliminate the cost of merging them.
  2. Make the entire merged result cacheable in a persistent way
  3. Make the data from each source cacheable in a persistent way, if the entire result cannot be cached persistently

Option 1, seems like a very simple optimization with very little downside. I've opened a PR as a proof of concept.

@oandregal commented on PR #5024:

5 weeks ago

Thanks for looking into this with fresh eyes.

I'd like to provide a use case of how this breaks (but works in trunk), so we can discuss ways forward.

  • Apply this patch to your environment:
  • src/wp-content/themes/twentytwentythree/theme.json

    diff --git a/src/wp-content/themes/twentytwentythree/theme.json b/src/wp-content/themes/twentytwentythree/theme.json
    index 68e17a87e9..08e2e8ac0c 100644
    a b  
    323323                               }
    324324                       },
    325325                       "core/post-content": {
     326                               "color": {
     327                                       "background": "hotpink"
     328                               },
    326329                               "elements": {
    327330                                       "link": {
    328331                                               "color": {
  • src/wp-includes/blocks/archives.php

    diff --git a/src/wp-includes/blocks/archives.php b/src/wp-includes/blocks/archives.php
    index 695affde76..3b55c0a9c0 100644
    a b function register_block_core_archives() { 
    116116       );
    118118add_action( 'init', 'register_block_core_archives' );
     121       'init',
     122       function() {
     123               WP_Theme_JSON_Resolver::get_merged_data();
     124       },
     126 No newline at end of file
  • Load the hello world post. The expected result is that the post content uses hotpink as background color, as it does in trunk:

@oandregal commented on PR #5024:

5 weeks ago

The way I see it, we can make this change for 6.4 as long as we invalidate the get_merged_cached when registered blocks are different, to avoid future issues, like we had in the past.

A longer brain dump:

theme.json data has different lifecycles and sanitization requirements:

  • templateParts and customTemplates – they are accessed before block registration (before init), and their sanitization doesn't require blocks.
  • settings and styles – they are accessed after block registration, and they require blocks to be sanitized.

Not knowing this has been a source of bugs and performance issues.

WordPress 6.4 is introducing two public methods for consumers to access templateParts and customTemplates, so it's unlikely that this happens again. However! I still think get_merged_data should not require consumers to know _when_ in the WordPress lifecycle it is safe to call it to avoid cache issues.

@oandregal commented on PR #5024:

5 weeks ago

Oh! By the way, @joemcgill I'd rather make this change in Gutenberg first, and only port it to core after it has been tested there. This gives us confidence it works and makes synchronizing the two codebases easier for everyone involved.

joemcgill commented on PR #5024:

5 weeks ago

Thanks for taking a look and giving some feedback, @oandregal. The use case that you explained in your write up is also confirmed by failing unit tests. I can see that there are several different places where theme.json data is recalculated based on the return value of the has_same_registered_blocks() method for the origin being calculated, but trying to use a similar approach for merged data doesn't seem to work (see d1319a1).

Lifecycle issues was one of my concerns with this approach, and this confirms that is indeed a problem. That also means that any function that relies on the WP_Theme_JSON_Resolver::get_merged_data is already unpredictable since its return value can change depending on when in the application lifecycle it's called, so any higher level caching at the utility function level is also likely to cause bugs like this.

I'm not sure if the unit tests failures are due to missing a place where the cache should be invalidated during the test setup, or if there is a more predictable way to know when all of the different sources of data are finalized, at which point it's safe to cache the merged object. Definitely more to dig into.

I'm happy to try to apply an iteration of this change to the Gutenberg version of this class first once we've got a working approach.

@oandregal commented on PR #5024:

5 weeks ago

I see the failing test is test_should_conditionally_include_font_sizes and it tests 1) switching to a classic theme and 2) updating the "theme supports" it declares. The get_theme_data method does not cache the full result of injecting the "theme supports": if they need to be used, the final theme data is recalculated. I don't have much time to help debug this week, I could next week. Hoping this helps anyway?

@flixos90 commented on PR #5024:

5 weeks ago

@joemcgill I like where this is going, however as @oandregal mentioned above, in order to fix the outstanding issues, we would need to somehow cache the "theme data" results to also include their data with_supports.

This was previously attempted in #3624, which however was eventually closed. Looking back at the discussion there though, while there is some complexity to figure out, I don't think there was a clear reason it was closed, I believe we just didn't prioritize it at the time. I do think the idea from that PR is worth picking up, particularly now as it would potentially unlock this greater performance enhancement here.

Maybe we can find a better solution to avoid the problem of having to clean the theme.json caches lots of times (see Potentially we can come up with something along the lines of WP_Theme_JSON_Resolver::has_same_registered_blocks(), like a method that checks whether the theme support data has changed from when the data was last calculated.

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.

4 weeks ago

#20 @joemcgill
4 weeks ago

  • Owner changed from spacedmonkey to joemcgill

Reassigning to myself to move this forward for now.

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.

3 weeks ago

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.

11 days ago

joemcgill commented on PR #5024:

5 days ago

@oandregal I've taken another pass at this PR, and it does seem like the main thing causing the types of bugs (and unit test failures) we discussed previously was the need to recalculate the merged data if theme supports data changes. I've modified the approach so that the theme supports data gets saved to a static variable so we can test whether the value has changed, much like we test whether registered blocks have changed.

This results in a big improvement in terms of reducing the number of unnecessary WP_Theme_JSON objects that get created and merged during a page lifecycle, but unfortunately doesn't seem to result in a very large performance benefit from what I can tell, but I want to do some more profiling.

In the two XHProf reports below, you can see the reduction in calls to all the getter methods for various theme.json data origins, as well as a reduction in calls to WP_Theme_JSON::merge, all of which have a positive impact on memory and cpu usage, but only a slight reduction in the overall server time.



Do you think something like this is worth pursuing?

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.

4 days ago

This ticket was mentioned in PR #5267 on WordPress/wordpress-develop by @flixos90.

3 days ago

  • Keywords has-patch added; needs-patch removed

While a complexity that has made it complicated to progress has been the additional filters firing around theme.json parsing, this PR explores an approach that is not subject to that complexity: By persistently caching _only_ the raw results of the theme.json files (i.e. none of the following parsing), it already saves a lot of processing time on every page load (for sites that use a persistent object cache).

The impact of this is already quite notable, with wp-total being 8.9% faster for TT1 (classic theme) and 3.2% faster for TT3 (block theme). See also which is another related idea that focused on the cost of _just_ the JSON decoding of the files while just by itself has a notable negative performance impact.

This PR is currently for testing & benchmarking purposes only. It should preferably be implemented in Gutenberg before being merged into core. If there is consensus on the approach though, I think this would be worth prioritizing still for WordPress 6.4 given the notable performance impact for sites using a persistent object cache.

Keep in mind that this only has an impact when using a persistent object cache. If we wanted to bring these benefits to all WordPress sites, we could consider using transients instead.

Trac ticket:

This ticket was mentioned in Slack in #core-performance by flixos90. View the logs.

3 days ago

#27 @flixos90
3 days ago

I just opened the above PR as another idea I had the other day: While we can't easily cache theme.json data persistently because filters don't fire, we can easily cache the most low-level part of it which is the JSON-decoding of the actual files.

Caching just that part itself already has a notable positive performance impact for sites that use a persistent object cache. Please see the PR description for more context.

If there is consensus on that idea, I'm happy to port this over to a Gutenberg PR. Wanted to implement it in core for simplicity first so I can benchmark it in core itself.

Last edited 3 days ago by flixos90 (previous) (diff)

#28 @joemcgill
2 days ago

Thanks @flixos90. I agree that we should consider both strategies together here:

  1. Caching the merged data non-persistently to eliminate the need to recreate the merged dataset multiple times per page load (PR #5024)
  2. Persistently caching each theme data origin (core, blocks, theme, user) based on the different invalidation rules needed by each (PR 5267)

Could you take a look at the initial PR and provide feedback from your testing? I'll do the same with yours.

@flixos90 commented on PR #5024:

2 days ago

@joemcgill In my benchmarking (similar approach that I used for #5267), this PR shows a small but consistent improvement. After rerunning a few times, it is always slightly faster with this PR, so I feel good about making this change.

For reference:

  • wp-total with TT3: 86.24ms with the PR, 86.6ms without (0.4% faster)
  • wp-total with TT1: 52.1ms with the PR, 52.24ms without (0.3% faster)

It also makes sense to me that the benefit is slightly higher for the block theme given theme.json is more heavily relied upon there, and I assume this in the past resulted in more executions of the costly get_merged_data() logic than it did for the classic theme.

joemcgill commented on PR #5024:

2 days ago

Thanks @felixarntz. I'm going to go ahead and apply these changes to a Gutenberg PR to get feedback from that team and to hopefully get this tested in the plugin prior to a core change.

joemcgill commented on PR #5024:

26 hours ago

@felixarntz I've applied your PR feedback and also opened a PR with these changes in the Gutenberg repo, here:

@flixos90 commented on PR #5267:

21 hours ago

Thanks @joemcgill for the feedback.

In my testing for #5024, I've noticed that the most expensive operations during Theme JSON processing is the core data and the user data, whereas the data for blocks and theme have less of an impact.

That's certainly possible. I only tested overall performance, but didn't profile. Could you share the data for how much the 4 different processes took? Since the profile below is only for get_theme_data()? Another question, did you test with a classic or a block theme? Asking since I'd expect the changes to get_theme_data() to be impactful for block themes only.

I ran some profiles with this PR and I'm not seeing the same degree of impact that you've reported. It also seems like most of the time in the get_theme_data method is spent constructing the WP_Theme_JSON_Data objects, which is unaffected by your changes.

I'm not sure about that. When you say "unaffected", do you mean that in terms of little performance impact? Because the changes here should certainly affect get_theme_data() as it reads a theme.json file.

We could cache the WP_Theme_JSON_Data object before it's passed to the wp_theme_json_data_theme filter, since that should not change unless the theme is updated, which could help, or further improve the performance of that class constructor.

That could work. Have you verified that there are no extension points (e.g. filters) firing within the WP_Theme_JSON_Data constructor logic (which also instantiates a WP_Theme_JSON)? I only wanted to touch the reading JSON file parts because in that one I could be sure there's no extension points.

Note: See TracTickets for help on using tickets.