Make WordPress Core

Opened 8 months ago

Closed 4 months ago

Last modified 2 months ago

#59719 closed task (blessed) (fixed)

Revise caching approach for theme files

Reported by: flixos90's profile flixos90 Owned by: joemcgill's profile joemcgill
Milestone: 6.5 Priority: high
Severity: normal Version:
Component: Themes Keywords: 2nd-opinion has-patch
Focuses: performance Cc:

Description

This is a follow up related to #59490, #59633, and #59600.

Specifically, the purpose of this ticket is to decide the path forward on the temporary approach of using a global cache group in #59633, which will also determine the path forward to take on #59600.

Questions to answer:

  1. Should we continue to use wp_cache_*() functions, or should we switch to transients to bring these benefits to sites without a persistent object cache too?
    • When relying on WP_DEVELOPMENT_MODE during theme development, using transients would be perfectly fine, as the caching gets (temporarily) disabled as long as the constant is set.
    • However, for e.g. agencies or freelancers that continuously deploy new theme versions, it may be more common to rely on flushing the cache when an update is made. This is arguably not about developing the theme in production, but rather continuous deployment without bumping the version number (the latter of which would automatically invalidate the cache).
    • If we want to maintain the latter workflow of flushing the cache to refresh theme file caches, we either need to stick to wp_cache_*(), or we need to implement a way to "flush" certain transients even when not using a persistent object cache. (Flushing a persistent object cache would by definition flush transients too as in that case they're part of the object cache.)
  2. If we continue to rely on wp_cache_*() functions, which cache group should we use?
    • We should not use the themes cache group for this as it is non persistent by default, even for sites with a persistent object cache.
    • We should not use the global cache group as that is not encouraged.
    • We could consider introducing a new cache group for this purpose, e.g. theme_files, or theme_templates, or similar.

Change History (39)

#1 @flixos90
8 months ago

  • Priority changed from normal to high

Marking this as a high priority for the 6.5 release as it is partly a follow up, partly a blocker to other potential high impact tickets #59600 and #59314.

@joemcgill Let's continue the discussion here that was started as part of #59633.

Last edited 8 months ago by flixos90 (previous) (diff)

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


8 months ago

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


8 months ago

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


8 months ago

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


7 months ago

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


7 months ago

#7 @flixos90
7 months ago

I just had a good chat with @joemcgill about this, where we ended up taking a step back. While determining how to best cache this data continues to be relevant, it's worth considering other ways to improve the performance of the relevant logic regardless of caching. If there are opportunities for that, we could afterwards reassess how the caching here should behave to achieve a good compromise between reliability, consistency, and performance gain.

A few observations on existing theme caching first:

  • The themes cache group (see WP_Theme class) is by default not persistent, but can be made persistent via the wp_cache_themes_persistently filter. This filter additionally allows to define the cache expiration to use (in seconds). If no specific expiration is defined, it's set to 30 minutes.
  • This same filter is used to control whether to persistently cache the list of available themes (see search_theme_directories() function). The main difference is that here the WP Cache API isn't used, instead it uses a transient, which means even sites without a persistent object cache will receive these caching benefits. This in itself is inconsistent.
  • There is no invalidation logic for this cache group at all. This could be okay as the data naturally expires even if persistent, but it's worth highlighting that when you for example install a new theme, it would not show up in WP Admin until the cached data expired.
  • Eventually, we could either use this cache group for all theme data or introduce a new more specific theme cache group depending on what is being cached. This decision depends on a few factors, including whether or not we think the approach of "non-persistent by default, persistent by opt-in" makes sense for this data.

Before we can decide that though, we identified another potential opportunity with block templates, block template parts, and block patterns (i.e. relevant for all tickets blocked by this ticket): Currently, when registering the data for them, that data includes the full contents of these files. Getting the file content accounts for some of the most expensive work, but the file contents actually only matter if the respective template/part/pattern is actually used on the current URL. There may be other data that isn't needed either unless the template/part/pattern is actually used. So we should consider reworking the template/part/pattern registration logic to only register the bare minimum data on every page load.

If we can find a solution to that, we can probably reduce the performance overhead significantly for themes that include a large number of templates/parts/patterns, without involving any caching.

An additional benefit would be that, even if we then cached this data, it wouldn't involve caching the actual file contents, which would mean that changes to those files would still be reflected immediately, plus it would keep cache data for individual keys from growing too large.

We should probably explore that further:

  • Can we rework each of the block template/part/pattern registration logic to only register the bare minimum data needed by default?
  • The relevant functions would still need to optionally support registering the full data, or there would need to be some "lazy-loading" mechanism to only run if any of those data fields is being accessed. This would be relevant for example in the block editor where templates/parts/patterns need to be previewable.
  • For regular page loads though, only the bare minimum data would be relevant, and the extra data (such as the actual file content) would only be relevant for the few templates/parts/patterns actually used on the current URL.

It's worth noting that there is also a little bit of potential overlap with #57789. The two tickets are about caching different aspects of theme data, and while the relevant logic considerations are completely different, in terms of caching it may make sense to follow similar principles. It's worth noting that theme.json data caching has the additional complexity of firing lots of filters which means the filtered data specifically cannot be cached as it could be dynamically adjusted by code from anywhere (not only the current theme).

Last edited 7 months ago by flixos90 (previous) (diff)

#8 @joemcgill
7 months ago

Thanks for summarizing the points we discussed, @flixos90. The couple of observations that stand out to me as the most relevant to theme caching strategy is that currently, the only place where we're using caching in the WP_Theme class that isn't part of the 'themes' cache group is the new 'wp_theme_patterns_{$stylesheet} key that was moved to the class in [56978].

Instead, that key is currently being cached to the global group so it is persistent in environments that support a persistent object cache. It would be more consistent to save that key to the themes cache group, but not until we're able to see if can avoid the most expensive parts of getting all of the pattern data so that moving this to a non-persistent (by default) group is feasible.

If not, then I think it's worth exploring a similar approach to what search_theme_directories() function does, which is to store that data in a transient with a default expiration time, and make use of the context argument on the wp_cache_themes_persistently hook to enable more fine grained control over whether that value is persisted by default or not. For example, we could choose to have persistence set to true by default for just this value and use the filter for people to opt out if needed (hypothetically—I'm not actually proposing this approach without more exploration).

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


7 months ago

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


6 months ago

This ticket was mentioned in Slack in #core-performance by thekt12. View the logs.


6 months ago

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


6 months ago

#13 @flixos90
6 months ago

  • Owner changed from flixos90 to joemcgill

We had a long discussion about this today on Slack, involving mostly @joemcgill, @spacedmonkey and myself.

Some context summary:

  • We cannot use the themes group as it is non-persistent by default.
  • We cannot use the theme_json group as it is non-persistent and also includes site-specific considerations (e.g. it cannot be globally cached).
  • We need to consider the differences between:
    • Sites with vs without persistent object cache
    • Multisite networks vs regular sites
  • We need to consider that people may edit theme files (specifically of custom themes) in ways WordPress cannot detect.

Taking into account the ideas and perspectives shared in that discussion, here's what I would propose:

  • For sites that don't use a persistent object cache (see wp_using_ext_object_cache()), cache theme files in network transients (*_site_transient()).
    • On non-multisite, this automatically falls back to storing them in the options table.
    • Regardless of multisite or not, these transients are never autoloaded.
  • For sites that use a persistent object cache, introduce a new global cache group theme_files.
  • Use the transients / the cache group to cache theme files across all themes (i.e. transient/cache keys would include the theme's slug).
  • Bypass any of these caches if WP_DEVELOPMENT_MODE is set to "theme" or "all".
  • If using transients, provide an expiration (time TBD, likely anything between 30 minutes to 1 day).
    • This means when someone edits theme files in unsupported ways, at least they don't have to wait too long for the changes to become active.
    • For those that need changes to be reflected immediately, they need to either use WP_DEVELOPMENT_MODE or flush the transient with a plugin.
    • This is similar for sites using an object cache, though the difference in reality is that flushing the object cache is a commonly supported functionality in plugins, so it's a more intuitive workflow than having to flush transients (which isn't something that WordPress has historically required or encouraged anywhere).
  • If not using multisite, delete the previous theme's cached data since there's only that one site with the one active theme anyway (maybe regardless of whether object cache or transient, but certainly at least when using object cache).

@joemcgill @spacedmonkey Please feel free to add anything I missed, and let me know if this proposal is in line with what you're thinking.

Open questions if we go with the above:

  • What should be the expiration time?
  • How would we deal with theme versioning? Could be either part of the cache key, or stored within the transient/cache (in which case a cache value with outdated version would be considered stale and thus ignored). The latter is already used for the theme pattern caches today.
Version 1, edited 6 months ago by flixos90 (previous) (next) (diff)

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


6 months ago

#15 @spacedmonkey
6 months ago

Cache invalidation with themes is extremely hard. Theme files of all types can be updated at any time. Theme files can be updated in the following ways.

  • Files are editted on file system, via FTP or SSH. There is no PHP event / action for this. File edit time (may be) updated.
  • Files are overwritten using a tool like git/rsync or cp. There is no PHP edit for this. File edit time (may be) updated.
  • Files are editted in the CMS using the theme editor. There is PHP event or action here. This is unpopular way of editing files and is disabled on many hosts.
  • Updating theme via plugin repo. There is a currently cache invalidation and a PHP event there.

In instances where files are directly edited, the absence of a PHP event complicates cache invalidation, leading to potential issues such as incorrectly cached files. Fixed time cache settings are also deemed unsuitable, as immediate theme updates are expected without delay.

The use of functions like filemtime is hindered on cloud servers. For example, in a scenario with multiple web servers deployed simultaneously, the last update date would be identical. However, introducing or replacing a server could result in differing last update dates, causing cache invalidation even when no files were modified. This situation may further lead to inconsistent cache values.

To address these challenges, a proposed solution includes the introduction of a WP CLI command for theme cache clearance, allowing hosts and developers to force cache clears upon code deployment:

wp theme cache twentyten clear

This command empowers hosts and developers to force cache clears upon code deployment, ensuring timely updates across web servers. Additionally, incorporating a theme cache clear button in the CMS interface offers a user-friendly alternative.

To address the unique workflow of hosts like VIP GO, adhering to a git-based deployment, an optional constant can be introduced. This constant, when set, signals that theme files are immutable on the file system. Large hosts capable of supporting this feature can opt in, simplifying cache invalidation steps post-code deployment, as exemplified by the WP CLI command mentioned earlier.

#16 @spacedmonkey
6 months ago

Side note, it has been clear to be there are different groups of data that need to cached.

  • Theme json files.
  • Theme files data
  • Theme block patterns.

I think that caching all these in one place would be mistake. There are different types of data and have different cache invalidation needs. Keep these and other data cache apart, may seem like a problem, but will result in improve cache and cache invalidation.

#17 @spacedmonkey
6 months ago

Had some thoughts about cache invalidation.

We could add a scheduled event that runs every say 15 minutes and does the following.

One first run, loops through all of the installed themes, getting the current version number and last modified time of theme directory. Create an array of this data and store it in a network options or transient.

One second run, loop through all installed themes, comparing against the stored version and last modified times. If either of these values do not match, call the delete_cache method on theme object. If there is also a mismatch, update the network option or transient.

Themes are not updated that much, so this scheduled event way of cache clear could work well. It would mean invalidate caches until these run, but this is still better than what we have now.

Running in scheduled event, means that this expensive check could be done in a background process or not on every page request.

Thoughts?

#18 follow-up: @joemcgill
6 months ago

Thanks for adding your thoughts, @spacedmonkey. I agree that invalidation is one of the bigger challenges to consider when caching theme files due to the various non-standard ways that theme files can be edited. This is one of the reasons that I feel pretty strongly that any caching needs to include a TTL expiration so that sites don't get stuck in a broken state due to a stale cache indefinitely.

In addition, it's helpful to list the cases where we can assume that files have been updated and explicitly clear caches in those cases as well. Off the top of my head, those include:

  1. When the theme version number has changed
  2. When someone has edited the theme from the Theme File Editor

If we can find additional indicators that the files have been modified in order to invalidate caches, I think we should do so, but that shouldn't block us from implementing the strategy that @flixos90 outlined above, IMO.

I also agree that it is good to think about designing the caching strategy so that it is easy for external tools like WP CLI to make use of it, but those ideas should be discussed and handled separately from this particular ticket.

Lastly, I want to offer a slight modification to the cache groups you've defined above:

  • Theme data (themes): Theme data that has traditionally been cached by Core, non-persistent by default.
  • Theme JSON data (theme_json): Expensive data that is derived from the theme.json structure, which includes loading and parsing options and settings across core, themes, blocks, and user data (e.g., wp_get_global_settings()).
  • Block theme template files (theme_files): This currently includes block patterns (#60120), and potentially block templates/template-parts (#59600).

The reason the last ones are worth handling differently than the prior two is that there is extra overhead involved in loading and parsing these types of files when compared with traditional PHP template parts, which are loaded an included at runtime. For patterns specifically, it's even more sever due to the fact that all of the patterns get registered on every request (as you've previously identified). Since this data is generally only used only once per request, we want to avoid this cost to each request. These files rarely change between requests so putting some persistence in place makes sense. If we find better ways of reducing the overhead of these types of files (e.g., lazily loading block patterns), the new theme_files cache group may become unnecessary entirely, and these could all move to the default themes cache, where a site could opt-in to persistence for the whole group.

#19 in reply to: ↑ 18 @spacedmonkey
6 months ago

Replying to joemcgill:

This is one of the reasons that I feel pretty strongly that any caching needs to include a TTL expiration so that sites don't get stuck in a broken state due to a stale cache indefinitely.

It is worth noting that theme cache already has a TTL on it. The cache_expiration is 1800. I would like to see this filterable.

  1. When someone has edited the theme from the Theme File Editor

Also worth noting that theme caches are already invalidated when using Theme File Editor. See wp_edit_theme_plugin_file and these lines. Caches are also invalidated when themes are deleted, see delete_theme and these lines. Caches are already invalidated core / plugins / themes are updated when wp_clean_themes_cache is called.

We should make sure that all cache invalidation happens in the WP_Theme::cache_delete() method. This is called throughout core and by third party plugins.

Until we "solved" the cache invalidation problem, we CANNOT use site_transient functions to cache this data. I have outlined a fix here, but I am open to other ideas.

Last edited 6 months ago by spacedmonkey (previous) (diff)

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


6 months ago

#21 follow-up: @flixos90
5 months ago

@joemcgill @spacedmonkey I concur that there should be a TTL on these caches. We could use the 1800s (30 minutes) that is already used on the themes cache (if configured to be persistent), that would keep the default consistent. We could then introduce a filter for it.

The TTL will help work around the limitations of cache invalidation, which is impossible to reliably solve when editing theme files in non-standard ways. With a 30 minute TTL, changes will still reflect after a short time. If immediate invalidation is needed, this can be achieved in one of two ways:

  • Either flush the cache, or the specific cache group, or flush transients. This would be most relevant when deploying/uploading a new version of the theme, and you don't want to wait for 30 minutes.
  • Or use WP_DEVELOPMENT_MODE to bypass caching entirely. This would be most relevant when actually developing the theme (e.g. locally). There you definitely don't want to wait for changes to be reflected, and that's specifically what that constant is for.

I agree with the proposed group names themes (already present anyway), theme_json, and theme_files. There are at least two crucial ways in which theme_json and theme_files differ (and thus, why they need to be distinct groups):

  • theme_json includes potentially user-generated / user-modified data, so it cannot be global. theme_files on the other hand only includes data that comes entirely from the filesystem.
  • theme_json uses cache keys that are always for the current theme, where for theme_files we envision the keys to include theme slugs and be global.

#22 in reply to: ↑ 21 @spacedmonkey
5 months ago

Replying to flixos90:

@joemcgill @spacedmonkey I concur that there should be a TTL on these caches. We could use the 1800s (30 minutes) that is already used on the themes cache (if configured to be persistent), that would keep the default consistent. We could then introduce a filter for it.

It is worth noting that this value is already filtable. See these lines. If you use wp_cache_themes_persistently filter and pass a int, it will change this value.

add_filter('wp_cache_themes_persistently', function(){
  return DAY_IN_SECONDS;
});

This will set self::$cache_expiration to a day. This functionality should be documented better.

@flixos90 I outlined using cron to do cache invalidation in this comment. What do you think?

I have created a issue on WP CLI, for a cache clear command for themes.

#23 @flixos90
5 months ago

@spacedmonkey While the wp_cache_themes_persistently allows filtering the expiration, it primarily allows making the themes cache group persistent, so we'll probably have to decouple this and use a new filter, as this one does something else at the same time.

Regarding cron cache invalidation, I don't believe this would be useful, given we'll put an expiration of 30 minutes on those caches anyway.

On another note: This discussion prompted me to build a small and simple plugin to flush transients (most relevant for sites that don't use a persistent object cache): https://github.com/felixarntz/flush-transients

This would be a simple solution for sites without a persistent object cache that would want to have changes immediately reflected after a deployment of a theme.

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


5 months ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


4 months ago

#26 @joemcgill
4 months ago

  • Type changed from enhancement to task (blessed)

This ticket was opened as a way to consolidate conversation about the caching strategy that should be applied to tickets like #59600 and #60120. Since, I'm not expecting that this ticket would explicitly fixed with a commit, I'm changing to a task to allow conversation to continue through the beta period.

#27 @spacedmonkey
4 months ago

Should this ticket be punted, consider beta 1 is around the corner.

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


4 months ago

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


4 months ago

This ticket was mentioned in PR #6137 on WordPress/wordpress-develop by @thekt12.


4 months ago
#30

  • Keywords has-patch added

This ticket was mentioned in Slack in #core-performance by thekt12. View the logs.


4 months ago

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


4 months ago

#33 @joemcgill
4 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

The original purpose of this issue, which was to consolidate conversation about the caching strategy for block theme files that was happening across various issues like #59600, #59633, and #60120 has been fulfilled by the discussion summarized by this comment. For that reason, I'm closing this as fixed and we can continue any implementation detail discussion on #59600, which I've modified to include block patterns in addition to the template files. I've also update the PR description of this PR to point to #59600.

To summarize the main results of this issue:

  • Block theme template related files will now be cached in a new theme_files cache group.
  • We will explore the use of persistent caching for this group in #59600.
  • All caches in this group should respect WP_DEVELOPMENT_MODE.
  • Persistent caching strategies should be resilient to files being directly modified on production servers (e.g., transients must use expiration times).

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


4 months ago

@flixos90 commented on PR #6137:


3 months ago
#35

@kt-12 @joemcgill I'm not sure https://core.trac.wordpress.org/ticket/59600 is the correct ticket this PR is addressing. That ticket is about block template files, while this PR addresses block patterns. I thought there was a dedicated ticket for that one too. If not, let's create one and point this PR to that one.

Caching block template files, while very similar in the approach, is a separate problem.

@joemcgill commented on PR #6137:


3 months ago
#36

@felixarntz

I'm not sure https://core.trac.wordpress.org/ticket/59600 is the correct ticket this PR is addressing

The tickets related to theme_file caching began to get very hard to follow during the end of the last release, so I've updated the issue to consolidate all template file caching (including patterns) into the same ticket. See this comment.

@flixos90 commented on PR #6137:


3 months ago
#37

@joemcgill

I'm not sure https://core.trac.wordpress.org/ticket/59600 is the correct ticket this PR is addressing

The tickets related to theme_file caching began to get very hard to follow during the end of the last release, so I've updated the issue to consolidate all template file caching (including patterns) into the same ticket. See this comment.

Thanks for clarifying. However, this PR still doesn't solve the original purpose of that ticket, it only addresses part of it (block patterns). So I don't think that ticket can be closed from only caching block patterns when it was originally about block templates. I'm only saying that because typically commits close tickets. I'm happy to keep this association as long as we're aware committing this will not _fix_ the ticket.

@flixos90 commented on PR #6137:


3 months ago
#38

@joemcgill Regarding cache invalidation, I think we had discussed those points already on the ticket. Trying to summarize the conclusion based on my understanding:

  • For users that directly edit theme files (without bumping the version number):
    • If they edit using the WordPress built-in theme file editor, cache invalidation is automatically taken care of by WordPress.
    • If they edit elsewhere and upload or deploy somehow, they have three options:
      • Either use WP_DEVELOPMENT_MODE (e.g. set to "theme") to disable this caching entirely, permanently or temporarily.
      • Or, wait for the cache expiration (by default 30 minutes) for changes to take effect.
      • Or, if they want to have the changes reflected immediately, flush the respective cache (when using a persistent cache), or transients (when using a non-persistent cache).
        • Several plugins have a "Flush cache" button to do the first. For more fine grained control, they could only flush the theme_files group, which may be better for high-traffic sites, to not flush all other data too.
        • For flushing transients, I recently published https://wordpress.org/plugins/flush-transients/, which is essentially a "Flush cache" button, but for transients.
  • It should be fair to assume that at scale, relatively speaking few users edit theme files directly. Those that do are often maintained by developers that often are aware of cache flushing. Though, even for when they're not, we're catering for that by using a short cache expiration time by default.

I believe that with all of these considerations we are catering for the majority of sites, while providing reasonable options for other sites. WordPress doesn't provide any opt-out filters for caching to my knowledge, and I'm not convinced we need to introduce one here. Last but not least, we're trying to optimize performance out of the box, even for sites without a persistent object cache, to make those benefits available to a majority of sites.

Note: See TracTickets for help on using tickets.