Opened 3 years ago
Last modified 7 weeks ago
#14485 new defect (bug)
Taxonomy hierarchy cache isn't properly refreshed
| Reported by: |
|
Owned by: | |
|---|---|---|---|
| Priority: | high | Milestone: | Future Release |
| Component: | Cache | Version: | 3.0 |
| Severity: | normal | Keywords: | needs-testing needs-unit-tests needs-patch |
| Cc: | garyc40@…, aaroncampbell, liping.ong@…, mail@…, yakatz, amattie@…, cpkwebsolutions, andyadamscp@…, kenton.jacobsen@…, chriscct7@… |
Description
I've developed a plugin that can create parent and children categories at the same time. It works well at 2.8.6 but it doesn't work as what I expect at 3.0.
When I create new parent and children categories at the same time, it only shows parent category in the Categories dashboard, but the children category is actually created. If I create another category or delete one category, the children category shows up. I think it should be a problem of wordpress cache, but I have no idea where to start tracing.
I use wp_create_category at wp-admin/includes/taxonomy.php to create categories. Besides, I have tried clean_term_cache but it didn't help. And I didn't activate other plugins when I tested this plugin.
It happened exactly when creating NEW parent and children categories.
Attachments (9)
Change History (75)
comment:1
nacin
— 3 years ago
- Milestone changed from Awaiting Review to 3.0.2
- Priority changed from normal to high
I'm moving this to 3.0.2 because this is about the fifth report I've seen of funny stuff happening on the edit-tags screen.
We need to still track down the problem, but I've seen enough reports to at least warrant a higher priority.
comment:2
dd32
— 3 years ago
I believe this is caused by the category parent => child cache option becoming out of date, That cache should be being updated correctly if the correct API calls are being made in the correct order (I havn't looked at the attached plugin)
comment:3
follow-up:
↓ 4
hakre
— 3 years ago
I smell cache as well. It would be good to know if the reporter can specify if making use of multisite or not.
Simple steps how to reproduce are welcome as well.
comment:4
in reply to:
↑ 3
mfields
— 3 years ago
I was able to reproduce this as reported on a single site installation. I found that this issue does not really spring from the object cache rather from an option: "{$taxonomy}_children" which does not seem to get updated when new child categories are added.
I believe that this is caused by a bug in the _get_term_hierarchy() function line 2297 of /wp-includes/taxonomy.php. This function will always return the value of the "{$taxonomy}_children" option if that value is an array. Basically if there are children for any category, this function will not update the option. A quich fix would be to remove this conditional from the function:
if ( is_array($children) ) return $children;
Removing these lines kills the benefit of storing the result in the options table though. Perhaps moving it to object cache might be a better solution? I'm not really comfortable creating a patch for this, but thought I would help by sharing my experience.
comment:5
nacin
— 3 years ago
- Milestone changed from 3.0.3 to 3.1
Handled via mdawaffe's work in 3.1?
comment:6
mfields
— 3 years ago
Just tested this in 3.1-beta1-16642 and it appears that the bug had been squashed. I now see the new created parent + all of its children in edit tags.php.
comment:7
garyc40
— 3 years ago
- Cc garyc40@… added
- Keywords has-patch needs-testing added; category cache removed
No, the bug hasn't been squashed yet.
Patch attached.
Related: #14399
Root of the problem:
Unique cache key for get_terms() is not created properly. It is a hash of term query concatenated with "last_changed" time. "last_changed" is currently created simply by generating the current timestamp time().
However, when we insert many terms at once (with parent & child relationship), this "last_changed" key is still the same (all these operations are often within a fraction of a second). As a result, when get_terms() is called inside _get_term_hierarchy(), it always returns the same cache even though many new terms have been inserted.
Another problem is with the static variable $cleaned in clean_term_cache(). When the parent term is created, $cleaned[$taxonomy] is set to true. When the child terms are created, $cleaned[$taxonomy] is already true, so the option "{$taxonomy}_children" is not properly regenerated.
Solution is easy. I attached the ids of the created or changed terms inside last_changed key. clean_term_cache() is also made to accept another argument called $force_clean_taxonomy, when specified "true" will regenerate {$taxonomy}_children no matter what. When a children term is created, clean_term_cache($ids, $taxonomy, true, true) is called.
comment:8
scribu
— 3 years ago
- Keywords needs-patch added; has-patch needs-testing removed
wp_cache_set('last_changed'...) is called from get_terms() too, so I'm sure there's more to be done here.
comment:9
scribu
— 3 years ago
It weird though that although the time is stored, it's not compared to another time but just to see if it exists, so I don't see why adding the changed term ids would help.
comment:10
garyc40
— 3 years ago
Well, here's what I am able to trace out:
- First, create a term called "Parent". wp_insert_term() will call clean_term_cache($term_id, $taxonomy). clean_term_cache() in turn will create a new last_changed key based on current timestamp, let's say 1234567890.
- Second, create a term called "Child", which is a child to "Parent". The same process happens. last_changed is still 1234567890 (could jump another second, but that rarely happens).
- Third, let's say you want to get_term_children() of "Parent". It'll call _get_term_hierarchy(). _get_term_hierarchy() will check if {taxonomy}_children option exists. If it does, return, otherwise, it'll call get_terms(). Now here's where the problem lies. If before "Child" or "Parent" is created, _get_term_hierarchy() is called, then a cache of get_terms before Child or Parent is already there. This time when we call get_term_children(), the cache key is still the same (same last_changed), so get_terms will return the cache value from before "Child" or "Parent" is created. Right?
If you want to test, see #14399, the code snippet in the ticket description, only change get_child_term() to get_term_children() with the same arguments.
Quite twisted and confusing indeed.
I'll include a revised patch asap.
comment:11
follow-up:
↓ 12
scribu
— 3 years ago
The cache key is 'last_changed'. The timestamp is the cache value.
comment:12
in reply to:
↑ 11
garyc40
— 3 years ago
Replying to scribu:
The cache key is 'last_changed'. The timestamp is the cache value.
Yes, sorry I kind of mixed up the terms. Here's the revised edition of my explanation:
Well, here's what I am able to trace out:
- First, create a term called "Parent". wp_insert_term() will call clean_term_cache($term_id, $taxonomy). clean_term_cache() in turn will create a new
last_changed key based on current timestampcache: key => 'last_changed', value => current timestamp, let's say 1234567890.
- Second, create a term called "Child", which is a child to "Parent". The same process happens. last_changed is
stillupdated with the same value 1234567890 (could jump another second, but that rarely happens).
- Third, let's say you want to get_term_children() of "Parent". It'll call _get_term_hierarchy(). _get_term_hierarchy() will check if {taxonomy}_children option exists. If it does, return, otherwise, it'll call get_terms(). Now here's where the problem lies. If before "Child" or "Parent" is created, _get_term_hierarchy() is called, then a cache of get_terms before Child or Parent is already there. This time when we call get_term_children(),
the cache keylast_changed value is still the same (same last_changed1234567890), so get_terms will return the cache value from before "Child" or "Parent" is created. Get_terms will return the cache at "get_terms::{hashed term queries}::1234567890 which was created before "Child" or "Parent" is inserted. Right?
comment:13
garyc40
— 3 years ago
- Keywords has-patch needs-testing added; needs-patch removed
I looked at "wp_cache_set('last_changed')" inside get_terms(). There's no need to make any changes there.
This is because this line is only called when there's no "last_changed" yet. Leaving it be time() there doesn't affect anything.
As a result, I'm changing the keywords back to has-patch, needs-testing .
comment:14
follow-up:
↓ 17
scribu
— 3 years ago
Ok, but then how do you invalidate "get_terms:$key:$last_changed" ?
comment:15
scribu
— 3 years ago
- Keywords changed from has-patch, needs-testing to has-patch needs-testing
comment:16
scribu
— 3 years ago
Actually, the question isn't relevant to the patch, since a timestamp isn't easily reproducible anyway. Still, it would be good to know.
comment:17
in reply to:
↑ 14
;
follow-up:
↓ 35
garyc40
— 3 years ago
Replying to scribu:
Ok, but then how do you invalidate "get_terms:$key:$last_changed" ?
I also don't see cache invalidation for get_terms anywhere, so that's something I'm thinking about right now. If there's a way to invalidate get_terms cache, then there's no need for $last_changed.
Actually, attaching $last_changed to the cache key is still flawed, even when I include other factors such as post ids in it. Imagine this: a term gets created, then modified or deleted all in the same page load. The upon deletion, the value of $last_changed is still $ids:$time . However there's almost no chance of something like that happening.
What I have in mind right now, is rewriting the cache API to provide a "wp_cache_delete_group()" which wipes out cached values of a whole group. Whenever something gets updated, we can just call wp_cache_delete_group( "terms" ); This would eliminate the need to attach $last_changed to the cache key.
What do you think?
comment:18
follow-up:
↓ 19
scribu
— 3 years ago
Not a good idea, since most persistent caching backends (memcache) wouldn't support it.
comment:19
in reply to:
↑ 18
garyc40
— 3 years ago
Replying to scribu:
Not a good idea, since most persistent caching backends (memcache) wouldn't support it.
Oh right, got it.
comment:20
scribu
— 3 years ago
Anyway, I think this patch is an improvement over current behaviour, so it should go in 3.1.
One little thing: I would md5() the last_changed value, so that it doesn't get too long, if there are a lot of ids changed.
comment:21
scribu
— 3 years ago
Great. Just document the $force_clean_taxonomy parameter in clean_term_cache() and it's good to go.
comment:22
scribu
— 3 years ago
If you have persistent caching enabled, it's as if everything is done in a single page load.
test-term-cache.php provides a test case for this.
Unfortunately, it fails with garyc40-14485-documented.patch (no children are found).
comment:23
scribu
— 3 years ago
So, yeah, as you said, adding the post ids to the cache value doesn't help.
comment:24
garyc40
— 3 years ago
Hmm, interesting. I thought including the timestamp along with the post ids would make sense if persistent caching is enabled, but I'll do some more testing with that.
comment:25
scribu
— 3 years ago
("post id" meaning of course "term id")
comment:26
garyc40
— 3 years ago
In your test case, line 10:
$child = wp_insert_term('Child', 'category', $parent_id);
should be:
$child = wp_insert_term( 'Child', 'category', array( 'parent' => $parent_id ) );
Otherwise the child term is not created under the parent term.
garyc40
— 3 years ago
removed $last_changed, added "cache_keys", properly invalidated get_terms cache
comment:27
garyc40
— 3 years ago
So, once I corrected line 10 in your test case, things worked out fine with my original documented patch.
But I want to take things one step further. So there's another patch attached (rev 4), in which I eliminated the need for a "last_changed" key.
I just simply store all the generated cache keys into "get_terms:cache_keys". Then whenever term structures are modified, all of the generated get_terms cache values are invalidated properly.
Is there any potential problem with that approach? If there is, the original documented patch works fine anyways.
comment:28
scribu
— 3 years ago
In 'get_terms:cache_keys', instead of storing "get_terms:$key" each time, you only need to store $key and then prepend 'get_terms:' afterwards.
comment:29
follow-up:
↓ 30
scribu
— 3 years ago
In clean_term_cache(), you should check $clean_taxonomy before going through all the $cache_keys, no?
Also, instead of:
$force_clean_taxonomy = ( $parent ) ? true : false; clean_term_cache($term_id, $taxonomy, true, $force_clean_taxonomy);
it's better to do:
clean_term_cache($term_id, $taxonomy, true, (bool) $parent);
comment:30
in reply to:
↑ 29
garyc40
— 3 years ago
In clean_term_cache(), you should check $clean_taxonomy before going through all the $cache_keys, no?
Right, that makes sense. I'll revise the patch again. The original code changes $last_changed regardless of $clean_taxonomy though, so I've been wondering if there's any particular reason why it skips checking $clean_taxonomy.
One thing to note though. With both my patch and the original code, if you clean_term_cache( $term, $taxonomy, true ), all get_terms cache of all taxonomies will be outdated anyways (instead of only $taxonomy's get_terms cache). So there's still that flaw of over-invalidating. However, get_terms can accept a list of taxonomies, and allocating get_terms cache for each taxonomy is just too cumbersome if not impossible.
Also, instead of:
$force_clean_taxonomy = ( $parent ) ? true : false; clean_term_cache($term_id, $taxonomy, true, $force_clean_taxonomy);it's better to do:
clean_term_cache($term_id, $taxonomy, true, (bool) $parent);
Sure, good suggestion.
comment:31
scribu
— 3 years ago
One thing to note though. With both my patch and the original code, if you clean_term_cache( $term, $taxonomy, true ), all get_terms cache of all taxonomies will be outdated anyways (instead of only $taxonomy's get_terms cache). So there's still that flaw of over-invalidating. However, get_terms can accept a list of taxonomies, and allocating get_terms cache for each taxonomy is just too cumbersome if not impossible.
Yes, I saw that too. However, for hierarchical taxonomies, it does only work with the first taxonomy.
Maybe you could key it like this: get_terms:$taxonomy[0]:$key and get_terms:mixed:$key for the rare case when more than one taxonomy is passed.
comment:32
scribu
— 3 years ago
The only edge case would be if a user has a taxonomy named 'mixed'.
comment:33
follow-up:
↓ 34
garyc40
— 3 years ago
This time get_terms cache values of specific taxonomies are properly invalidated. Passed your test case.
One thing though, the $cache_keys now is an associative array ($key => $taxonomies). That kind of data structure is supported by persistent cache backends right?
Anything else that we need to improve?
comment:34
in reply to:
↑ 33
scribu
— 3 years ago
Replying to garyc40:
One thing though, the $cache_keys now is an associative array ($key => $taxonomies). That kind of data structure is supported by persistent cache backends right?
comment:35
in reply to:
↑ 17
scribu
— 3 years ago
Replying to garyc40:
What I have in mind right now, is rewriting the cache API to provide a "wp_cache_delete_group()" which wipes out cached values of a whole group. Whenever something gets updated, we can just call wp_cache_delete_group( "terms" ); This would eliminate the need to attach $last_changed to the cache key.
It turns out that the Memcache class, at least, does have a flush() method and since each group is a an instance of that class, making a wp_cache_delete_group() function would be trivial.
For reference: http://plugins.trac.wordpress.org/changeset/182802/memcached
comment:36
scribu
— 3 years ago
Actually, it wouldn't be so easy:
In a multisite environment, keys are prefixed with the current blog id, so calling wp_cache_flush_group() would flush cache entries for all blogs in the network, which clearly isn't a good idea.
In conclusion, I think the current patch is still the best way to go.
comment:37
garyc40
— 3 years ago
Related to #14704
comment:38
ryan
— 3 years ago
- Milestone changed from 3.1 to Future Release
comment:39
garyc40
— 3 years ago
- Keywords 3.2-early added
comment:40
aaroncampbell
— 2 years ago
- Cc aaroncampbell added
comment:41
scribu
— 2 years ago
Related: #16182
comment:42
mdawaffe
— 2 years ago
#14399 may be a dupe.
comment:43
liping
— 2 years ago
- Cc liping.ong@… added
comment:44
scottbasgaard
— 2 years ago
- Cc mail@… added
comment:45
scribu
— 2 years ago
- Summary changed from Didn't show up children category when created parent and children categories at the same time. to Taxonomy hierarchy cache isn't properly refreshed
comment:46
Archaniel
— 2 years ago
http://core.trac.wordpress.org/ticket/17356#comment:3
my ticket has been marked as duplicate of this bug
Not sure wether or not is it possible to simply crawl through the existing categories after inserting a new one and update the wp_options tag with properly serialized data. Because that's all that seems problematic to me. Not sure if there has to be something serialized in some specific way - array of taxonomy group ID'd arrays or something, but for me it worked really simply by automatic recreation
when I find myself some time I try to submit a bugfix for this.
comment:47
dd32
— 2 years ago
comment:48
christian_gnoth
— 23 months ago
- Severity changed from normal to major
- Version changed from 3.0 to 3.2.1
I wrote in ticket
http://core.trac.wordpress.org/ticket/14669?replyto=4#comment:4
already same issue I have.
comment:49
scribu
— 23 months ago
- Keywords 3.2-early removed
- Milestone changed from Future Release to 3.3
Please don't change the Version; it's used to track when the problem was first spotted.
We need to fix this already.
comment:50
scribu
— 23 months ago
- Version changed from 3.2.1 to 3.0
comment:51
ryan
— 23 months ago
Some background. Invalidation is implicit, not explicit. Timestamped cache that is no longer used will eventually get forced out of the cache. Selectively updating arrays and saving them to cache is fraught with peril. The problems with updating arrays of keys are why the timestamped key was introduced.
comment:52
scribu
— 22 months ago
Related to what Ryan said, we now have some new functions for implicit invalidation: #18494
comment:53
yakatz
— 22 months ago
- Cc yakatz added
comment:54
amattie
— 21 months ago
- Cc amattie@… added
comment:55
ryan
— 20 months ago
- Milestone changed from 3.3 to Future Release
comment:56
scribu
— 11 months ago
Related: #18628
comment:57
scribu
— 11 months ago
Worth reposting duck_'s explanation:
The problem you're facing is caching. wp_insert_term() will call functions to clear out the cache that stops the hierarchy displaying properly (clean_term_cache() -> _get_term_hierarchy()). So if you were to only add one term all would be okay. However, you're inserting multiple terms on a single page load...
- clean_term_cache() contains some code, in the form of a static variable, to stop a taxonomy's hierarchy being rebuilt multiple times on a single page load.
- because the code executes so fast the last_changed cache item in the terms group will not actually be changed on every term insert. Therefore, even if the above didn't occur, the result of get_terms() would be cached and only include the first term to be inserted (and any which existing prior) so _get_term_hierarchy() will return 'incorrect' information.
comment:58
lightningspirit
— 9 months ago
- Keywords needs-unit-tests added
comment:59
christian_gnoth
— 9 months ago
I am woking now with wp 3.4.2 and I have still the same problem. I created with a plugin a lot of child categories and only after I have added one single category they are shown up properly in the admin posts->categories backend .
I think there should be set a flag if a change to the categories is made and after 15 minutes a refresh should be done and ´the falg should be set back.
comment:60
cpkwebsolutions
— 6 months ago
- Cc cpkwebsolutions added
- Keywords needs-patch added; has-patch removed
- Severity changed from major to normal
For me, it seems to be choice between:
wp_cache_flush();
and a more targeted approach:
function force_flush_term_cache( $taxonomy = 'category' ) {
if ( !taxonomy_exists( $taxonomy ) ) return FALSE;
wp_cache_set( 'last_changed', time( ) - 1800, 'terms' );
wp_cache_delete( 'all_ids', $taxonomy );
wp_cache_delete( 'get', $taxonomy );
delete_option( "{$taxonomy}_children" );
_get_term_hierarchy( $taxonomy );
return TRUE;
}
comment:61
markoheijnen
— 4 months ago
This bug really irritates be. I'm using the taxonomy hidden to control things and when the option "{$taxonomy}_children" is empty there isn't really anything to do. Except manually delete the option. What is kind of ugly.
Is there a change to get this in 3.6? and what needs to happen to get this in core.
comment:62
scribu
— 4 months ago
what needs to happen to get this in core
It needs a patch and it needs unit tests to cover it.
comment:63
andyadams
— 4 months ago
- Cc andyadamscp@… added
comment:64
brokentone
— 3 months ago
- Cc kenton.jacobsen@… added
comment:65
tarasm
— 3 months ago
#22102 was marked as a duplicate.
comment:66
chriscct7
— 7 weeks ago
- Cc chriscct7@… added
plugin