Opened 7 months ago
Last modified 3 months ago
#61296 reopened enhancement
Add "force" parameter to get_option and get_transient
Reported by: | juvodesign | Owned by: | |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | |
Component: | Options, Meta APIs | Keywords: | has-patch reporter-feedback |
Focuses: | performance | Cc: |
Description
Since both get_transient and get_option already have built in ways to handle object caching, i think it would be reasonable to add the force parameter and pass it through to wp_cache_get.
I imagine having the default parameter set to false. This shouldn't break anything but give developers better control in cases where the cache is being updated while in a request lifecycle.
I stumbled across this when using persistent object caching:
https://github.com/rhubarbgroup/redis-cache/issues/523
Other users are also missing this option:
https://wordpress.stackexchange.com/questions/100040/can-i-force-get-option-to-go-back-to-the-db-instead-of-cache
If this is something worth considering, I would happily provide a patch.
Change History (15)
This ticket was mentioned in PR #6691 on WordPress/wordpress-develop by @snehapatil02.
6 months ago
#1
- Keywords has-patch added
#2
@
4 months ago
- Resolution set to invalid
- Status changed from new to closed
Hey @juvodesign, I hope you’re doing well. I wanted to follow up on this ticket (#61296) regarding the new enhancement to add the "force" parameter to get_option and get_transient. I have submitted a PR with the proposed changes.
Could you please review the PR when you have a moment? I believe this enhancement will provide developers with better control over object caching during the request lifecycle.
Thank you for your time and consideration. I look forward to your feedback.
#4
@
4 months ago
Should this apply to get_site_transient()
and get_{site,network}_option()
as well?
This ticket was mentioned in PR #7299 on WordPress/wordpress-develop by @helias10.
3 months ago
#5
This PR introduces a new optional parameter to get_transient and get_option as well as get_site_transient and get_{site,network}_option. However, it should be fully backwards compatible.
The force parameter is passed through all of these functions. In case of transients, the parameter is also passed to the timeout transient.
Trac ticket: [](https://core.trac.wordpress.org/ticket/61296)
#6
@
3 months ago
I just submitted another PR @snehapatil02 i think i caught some edge cases could you maybe take a look? Considering you also submitted a PR i took inspiration by that.
#7
follow-up:
↓ 9
@
3 months ago
i think it would be reasonable to add the force parameter and pass it through to wp_cache_get
Sorry bit I don't see how that would be particularly useful. Could you explain the use case(s) for it? Also thinking there may be a chances to run into some DB write/read problems for options that have been just set, especially when the DB is replicated?
#9
in reply to:
↑ 7
@
3 months ago
Replying to azaozz:
Could you explain the use case(s) for it?
Not the OP, but I can chime in.
The database is the sole source of truth, while cache is not reliable: it may get unintentionally flushed by something else, causing time-of-check/time-of-use issues that in some cases may build up to undefined behavior.
Usually it's not a big concern, but sometimes it is:
- handling locks (OP's example)
- doing many operations in a loop on a list of transient items: the operations may trigger hooks that flush the cache for the items the loop iterates over
This way, developers would have an option to ensure the data is as correct as possible, at the cost of the option/transient read speed.
Also thinking there may be a chances to run into some DB write/read problems for options that have been just set, especially when the DB is replicated?
Database changes that do not propagate immediately is not really something that can be supported anyways. Not to mention that the cache is not something that is present everywhere, so if there were issues with that, they would have already surfaced in cache-less installs.
In normal HA setups, the replication delay does not affect the master's ability to mutate the state immediately.
#10
@
3 months ago
Exactly as @alexeykorotkov described. I ran into issues where the default behavior caused the in-memory caching mechanism of WordPress to contain old data.
I noticed the issue when heavily using the action scheduler of WooCommerce. When process A and B start running at the same time and A changes data of a transient while B also already read that same transient, which caused to have it in memory already, there is no way with get_transient to get the latest transient data modified by A.
Example scenario:
Time 0: Action Scheduler initiates Process A and Process B asynchronously.
Time 1: Process B reads transient 'X' with value '100' and caches it in memory.
Time 2: Process A modifies transient 'X' to value '200'.
Time 3: Process B continues execution, using its cached value of '100' for transient 'X'.
Time 4: Process B calls get_transient('X'), which returns the cached value '100' instead of the updated value '200'.
In this scenario, Process B operates on outdated data, potentially leading to inconsistencies or errors in the application logic.
#12
follow-up:
↓ 13
@
3 months ago
The database is the sole source of truth, while cache is not reliable
That may be true in many cases however retrieval of options in WP works a bit differently. All (autoload) options are loaded early and are kept in memory. When used, all option values are read from that memory cache. When an option is changed or deleted, it is updated in the memory cache at the same time. There is also a nooptions cache for options that don't exist or have been deleted. So the memory cache works as the "trusted source".
Database changes that do not propagate immediately is not really something that can be supported anyways.
Right. Using memory cache is a workaround for that limitation?
In normal HA setups, the replication delay does not affect the master's ability to mutate the state immediately.
I'm not an expert but thinking this my have something to do with (internet) re-routing requests to different data centers?
When process A and B start running at the same time and A changes data of a transient while B also already read that same transient
Frankly I'm not sure if there can ever be 100% guarantee that B will get the data set by A. My guess would be that if both A and B are part of the same WP install, using external memory cache (memcache?) would be the best option and would get as close to the 100% as possible.
In all other cases there will likely be some lag: DB replication, internet trunk routers re-routing some request, busy/slow server somewhere, etc.
Consider the following example scenario:
Time 0: Action Scheduler initiates Process A and Process B asynchronously.
Time 1: Process B reads transient 'X' with value '100' and caches it in memory.
Time 2: Process A modifies transient 'X' to value '200'. The transient is saved to the DB in the Huston datacenter.
Time 3: Process B continues execution, and requests the DB value for transient 'X' from the Tokyo datacenter.
Time 4: Process B calls get_transient('X'), which returns the value '100' instead of the updated value '200' as the DB value in Tokyo was not yet updated from the Huston datacenter when it was retrieved.
Think the same scenario would have worked properly if options/transients were stored in memcache (and would have been way faster).
Also see the changeset and ticket @peterwilsoncc linked above, and especially this comment: https://core.trac.wordpress.org/ticket/31245#comment:99.
the default behavior caused the in-memory caching mechanism of WordPress to contain old data.
This sounds like a bug. If you can reproduce it reliably please post the exact steps.
#13
in reply to:
↑ 12
;
follow-up:
↓ 15
@
3 months ago
This sounds like a bug. If you can reproduce it reliably please post the exact steps.
I am not sure if this is a bug, but the same race time issue we are discussing here.
In my scenario B has already fetched the transients at the start of the action, which means the result is fetched and stored by the options usage of wp_cache_add: https://github.com/JUVOJustin/wordpress-develop/blob/769f0170f87acc22dca0289a042ae55148c3a35a/src/wp-includes/option.php#L202
This situation gets worse when you have really long-running processes like an importer. The in-memory cache without directly calling wp_cache_get("", "", true) will very likely be stale at some point. This happened to me, and I had to create an ugly wrapper around get_transient that basically just deletes the object cache before querying the transient:
<?php if (!wp_using_ext_object_cache()) { // Delete transient cache $deletion_key = '_transient_' . $key; wp_cache_delete($deletion_key, 'options'); // Delete timeout cache $deletion_key = '_transient_timeout_' . $key; wp_cache_delete($deletion_key, 'options'); // At this point object cache is cleared and can be requested again $data = get_transient($key); } else { $data = wp_cache_get($key, "transient", true); } return $data;
Frankly I'm not sure if there can ever be 100% guarantee that B will get the data set by A. My guess would be that if both A and B are part of the same WP install, using external memory cache (memcache?) would be the best option and would get as close to the 100% as possible.
I have included my issue link regarding Redis as an external cache in the Trac ticket. This was actually my initial motivation for creating the ticket. I agree that an external cache can mitigate this issue to some extent. However, particularly for transients, the absence of a 'force' parameter leads developers to use wp_cache_get($key, "transient", true) instead of the transient API.
My personal opinion is that developers using the transient API shouldn't need to worry about whether an external cache is being used or not. The API should serve as a cache-agnostic way to interact with this specific part of the WordPress core.
Since transients are described as "using the wp_options database table to temporarily store cached information." in the docs, I think there should be a way to actually enforce getting the latest version of the "cached information" backed into its API.
I could definitely see cases as described in the other tickets, where the same is happening with options itself. To fix this for transients, there is no need to tackle alloptions as transients are never stored as autoload options if I am not mistaken. Nonetheless, the provided patch looks good to me, and maybe we could incorporate it? Is this something it should or could do? Will the initial reporter of https://core.trac.wordpress.org/changeset/46780 still be acknowledged?
#15
in reply to:
↑ 13
@
3 months ago
- Focuses performance added
- Milestone changed from Awaiting Review to Future Release
Replying to juvodesign:
In my scenario B has already fetched the transients at the start of the action
...
This situation gets worse when you have really long-running processes like an importer. The in-memory cache without directly calling wp_cache_get("", "", true) will very likely be stale at some point.
Right, when not using external cache, and the site is large/busy, the chance for an option (or transient) to get updated while some slow process is running increase. Thinking that would be a pretty rare edge case though, large/busy sites generally always use external cache.
This happened to me, and I had to create an ugly wrapper
I don't think a wrapper is such a bad idea, especially if forcing a trip to the DB will be only for transients. Whether such wrapper should be in WP core or only in plugins that may need it is another question.
My personal opinion is that developers using the transient API shouldn't need to worry about whether an external cache is being used or not.
Right. However there is always a chance, albeit very small, of a race condition when two processes try to simultaneously read/write the same data or the same row in the DB. This is unavoidable afaik, and the developers should be aware of it and account for such cases if they use very slow running processes.
To fix this for transients, there is no need to tackle alloptions as transients are never stored as autoload options if I am not mistaken.
Not exactly. If transient expiration is not set, and they are stored in the options table, they seem to be autoloaded: https://core.trac.wordpress.org/browser/tags/6.6.1/src/wp-includes/option.php#L1516.
Nonetheless, the provided patch looks good to me..
At this point I'm about 50/50 about adding another param to get_option()
and friends.
- On one side there is a legitimate use case for transients, and maybe for all options, when external cache is not used. Seems that use case is extremely rare though, and will not be solved completely by forcing another trip to the DB.
- On the other side there is a chance that many plugins would switch to forcing these trips to the DB "just in case" which will result in (possibly significant) slow-down in some cases. Also I tend to agree with @spacedmonkey's comment here: https://core.trac.wordpress.org/ticket/37178#comment:7. Adding a
force
param toget_option()
would also mean it may need acache_only
param too? (Thanks @SergeyBiryukov for the link).
Going to set this to "future release" pending a review by the performance team.
### Ticket: https://core.trac.wordpress.org/ticket/61296
## Description
This PR enhances the
get_transient
andget_option
functions in WordPress by introducing a new optional$force
parameter. The$force
parameter allows developers to bypass the object cache and retrieve the latest transient value directly from the database.### Changes Made
$force
parameter to the function signature with a default value offalse
.$force
parameter is set totrue
. If true, the function bypasses the cache and retrieves the transient value directly from the database.### Why This Change is Necessary