#14485 closed defect (bug) (fixed)
Taxonomy hierarchy cache isn't properly refreshed
Reported by: | thealien | Owned by: | wonderboymusic |
---|---|---|---|
Milestone: | 3.9 | Priority: | high |
Severity: | normal | Version: | 2.7 |
Component: | Taxonomy | Keywords: | has-patch commit |
Focuses: | Cc: | mainpart@… |
Description
I've developed a plugin that can create parent and children categories at the same time. It works well at 2.8.6 but it doesn't work as what I expect at 3.0.
When I create new parent and children categories at the same time, it only shows parent category in the Categories dashboard, but the children category is actually created. If I create another category or delete one category, the children category shows up. I think it should be a problem of wordpress cache, but I have no idea where to start tracing.
I use wp_create_category at wp-admin/includes/taxonomy.php to create categories. Besides, I have tried clean_term_cache but it didn't help. And I didn't activate other plugins when I tested this plugin.
It happened exactly when creating NEW parent and children categories.
Attachments (14)
Change History (113)
#1
@
14 years ago
- Milestone changed from Awaiting Review to 3.0.2
- Priority changed from normal to high
I'm moving this to 3.0.2 because this is about the fifth report I've seen of funny stuff happening on the edit-tags screen.
We need to still track down the problem, but I've seen enough reports to at least warrant a higher priority.
#2
@
14 years ago
I believe this is caused by the category parent => child cache option becoming out of date, That cache should be being updated correctly if the correct API calls are being made in the correct order (I havn't looked at the attached plugin)
#3
follow-up:
↓ 4
@
14 years ago
I smell cache as well. It would be good to know if the reporter can specify if making use of multisite or not.
Simple steps how to reproduce are welcome as well.
#4
in reply to:
↑ 3
@
14 years ago
I was able to reproduce this as reported on a single site installation. I found that this issue does not really spring from the object cache rather from an option: "{$taxonomy}_children" which does not seem to get updated when new child categories are added.
I believe that this is caused by a bug in the _get_term_hierarchy() function line 2297 of /wp-includes/taxonomy.php. This function will always return the value of the "{$taxonomy}_children" option if that value is an array. Basically if there are children for any category, this function will not update the option. A quich fix would be to remove this conditional from the function:
if ( is_array($children) ) return $children;
Removing these lines kills the benefit of storing the result in the options table though. Perhaps moving it to object cache might be a better solution? I'm not really comfortable creating a patch for this, but thought I would help by sharing my experience.
#6
@
14 years ago
Just tested this in 3.1-beta1-16642 and it appears that the bug had been squashed. I now see the new created parent + all of its children in edit tags.php.
#7
@
14 years ago
- Cc garyc40@… added
- Keywords has-patch needs-testing added; category cache removed
No, the bug hasn't been squashed yet.
Patch attached.
Related: #14399
Root of the problem:
Unique cache key for get_terms() is not created properly. It is a hash of term query concatenated with "last_changed" time. "last_changed" is currently created simply by generating the current timestamp time().
However, when we insert many terms at once (with parent & child relationship), this "last_changed" key is still the same (all these operations are often within a fraction of a second). As a result, when get_terms() is called inside _get_term_hierarchy(), it always returns the same cache even though many new terms have been inserted.
Another problem is with the static variable $cleaned in clean_term_cache(). When the parent term is created, $cleaned[$taxonomy] is set to true. When the child terms are created, $cleaned[$taxonomy] is already true, so the option "{$taxonomy}_children" is not properly regenerated.
Solution is easy. I attached the ids of the created or changed terms inside last_changed key. clean_term_cache() is also made to accept another argument called $force_clean_taxonomy, when specified "true" will regenerate {$taxonomy}_children no matter what. When a children term is created, clean_term_cache($ids, $taxonomy, true, true) is called.
#8
@
14 years ago
- Keywords needs-patch added; has-patch needs-testing removed
wp_cache_set('last_changed'...) is called from get_terms() too, so I'm sure there's more to be done here.
#9
@
14 years ago
It weird though that although the time is stored, it's not compared to another time but just to see if it exists, so I don't see why adding the changed term ids would help.
#10
@
14 years ago
Well, here's what I am able to trace out:
- First, create a term called "Parent". wp_insert_term() will call clean_term_cache($term_id, $taxonomy). clean_term_cache() in turn will create a new last_changed key based on current timestamp, let's say 1234567890.
- Second, create a term called "Child", which is a child to "Parent". The same process happens. last_changed is still 1234567890 (could jump another second, but that rarely happens).
- Third, let's say you want to get_term_children() of "Parent". It'll call _get_term_hierarchy(). _get_term_hierarchy() will check if {taxonomy}_children option exists. If it does, return, otherwise, it'll call get_terms(). Now here's where the problem lies. If before "Child" or "Parent" is created, _get_term_hierarchy() is called, then a cache of get_terms before Child or Parent is already there. This time when we call get_term_children(), the cache key is still the same (same last_changed), so get_terms will return the cache value from before "Child" or "Parent" is created. Right?
If you want to test, see #14399, the code snippet in the ticket description, only change get_child_term() to get_term_children() with the same arguments.
Quite twisted and confusing indeed.
I'll include a revised patch asap.
#11
follow-up:
↓ 12
@
14 years ago
The cache key is 'last_changed'. The timestamp is the cache value.
#12
in reply to:
↑ 11
@
14 years ago
Replying to scribu:
The cache key is 'last_changed'. The timestamp is the cache value.
Yes, sorry I kind of mixed up the terms. Here's the revised edition of my explanation:
Well, here's what I am able to trace out:
- First, create a term called "Parent". wp_insert_term() will call clean_term_cache($term_id, $taxonomy). clean_term_cache() in turn will create a new
last_changed key based on current timestampcache: key => 'last_changed', value => current timestamp, let's say 1234567890.
- Second, create a term called "Child", which is a child to "Parent". The same process happens. last_changed is
stillupdated with the same value 1234567890 (could jump another second, but that rarely happens).
- Third, let's say you want to get_term_children() of "Parent". It'll call _get_term_hierarchy(). _get_term_hierarchy() will check if {taxonomy}_children option exists. If it does, return, otherwise, it'll call get_terms(). Now here's where the problem lies. If before "Child" or "Parent" is created, _get_term_hierarchy() is called, then a cache of get_terms before Child or Parent is already there. This time when we call get_term_children(),
the cache keylast_changed value is still the same (same last_changed1234567890), so get_terms will return the cache value from before "Child" or "Parent" is created. Get_terms will return the cache at "get_terms::{hashed term queries}::1234567890 which was created before "Child" or "Parent" is inserted. Right?
#13
@
14 years ago
- Keywords has-patch needs-testing added; needs-patch removed
I looked at "wp_cache_set('last_changed')" inside get_terms(). There's no need to make any changes there.
This is because this line is only called when there's no "last_changed" yet. Leaving it be time() there doesn't affect anything.
As a result, I'm changing the keywords back to has-patch, needs-testing .
#14
follow-up:
↓ 17
@
14 years ago
Ok, but then how do you invalidate "get_terms:$key:$last_changed"
?
#16
@
14 years ago
Actually, the question isn't relevant to the patch, since a timestamp isn't easily reproducible anyway. Still, it would be good to know.
#17
in reply to:
↑ 14
;
follow-up:
↓ 35
@
14 years ago
Replying to scribu:
Ok, but then how do you invalidate
"get_terms:$key:$last_changed"
?
I also don't see cache invalidation for get_terms anywhere, so that's something I'm thinking about right now. If there's a way to invalidate get_terms cache, then there's no need for $last_changed.
Actually, attaching $last_changed to the cache key is still flawed, even when I include other factors such as post ids in it. Imagine this: a term gets created, then modified or deleted all in the same page load. The upon deletion, the value of $last_changed is still $ids:$time . However there's almost no chance of something like that happening.
What I have in mind right now, is rewriting the cache API to provide a "wp_cache_delete_group()" which wipes out cached values of a whole group. Whenever something gets updated, we can just call wp_cache_delete_group( "terms" ); This would eliminate the need to attach $last_changed to the cache key.
What do you think?
#18
follow-up:
↓ 19
@
14 years ago
Not a good idea, since most persistent caching backends (memcache) wouldn't support it.
#19
in reply to:
↑ 18
@
14 years ago
Replying to scribu:
Not a good idea, since most persistent caching backends (memcache) wouldn't support it.
Oh right, got it.
#20
@
14 years ago
Anyway, I think this patch is an improvement over current behaviour, so it should go in 3.1.
One little thing: I would md5() the last_changed value, so that it doesn't get too long, if there are a lot of ids changed.
#21
@
14 years ago
Great. Just document the $force_clean_taxonomy parameter in clean_term_cache() and it's good to go.
#22
@
14 years ago
If you have persistent caching enabled, it's as if everything is done in a single page load.
test-term-cache.php provides a test case for this.
Unfortunately, it fails with garyc40-14485-documented.patch (no children are found).
#24
@
14 years ago
Hmm, interesting. I thought including the timestamp along with the post ids would make sense if persistent caching is enabled, but I'll do some more testing with that.
#26
@
14 years ago
In your test case, line 10:
$child = wp_insert_term('Child', 'category', $parent_id);
should be:
$child = wp_insert_term( 'Child', 'category', array( 'parent' => $parent_id ) );
Otherwise the child term is not created under the parent term.
#27
@
14 years ago
So, once I corrected line 10 in your test case, things worked out fine with my original documented patch.
But I want to take things one step further. So there's another patch attached (rev 4), in which I eliminated the need for a "last_changed" key.
I just simply store all the generated cache keys into "get_terms:cache_keys". Then whenever term structures are modified, all of the generated get_terms cache values are invalidated properly.
Is there any potential problem with that approach? If there is, the original documented patch works fine anyways.
#28
@
14 years ago
In 'get_terms:cache_keys', instead of storing "get_terms:$key" each time, you only need to store $key and then prepend 'get_terms:' afterwards.
#29
follow-up:
↓ 30
@
14 years ago
In clean_term_cache(), you should check $clean_taxonomy before going through all the $cache_keys, no?
Also, instead of:
$force_clean_taxonomy = ( $parent ) ? true : false; clean_term_cache($term_id, $taxonomy, true, $force_clean_taxonomy);
it's better to do:
clean_term_cache($term_id, $taxonomy, true, (bool) $parent);
#30
in reply to:
↑ 29
@
14 years ago
In clean_term_cache(), you should check $clean_taxonomy before going through all the $cache_keys, no?
Right, that makes sense. I'll revise the patch again. The original code changes $last_changed regardless of $clean_taxonomy though, so I've been wondering if there's any particular reason why it skips checking $clean_taxonomy.
One thing to note though. With both my patch and the original code, if you clean_term_cache( $term, $taxonomy, true ), all get_terms cache of all taxonomies will be outdated anyways (instead of only $taxonomy's get_terms cache). So there's still that flaw of over-invalidating. However, get_terms can accept a list of taxonomies, and allocating get_terms cache for each taxonomy is just too cumbersome if not impossible.
Also, instead of:
$force_clean_taxonomy = ( $parent ) ? true : false; clean_term_cache($term_id, $taxonomy, true, $force_clean_taxonomy);it's better to do:
clean_term_cache($term_id, $taxonomy, true, (bool) $parent);
Sure, good suggestion.
#31
@
14 years ago
One thing to note though. With both my patch and the original code, if you clean_term_cache( $term, $taxonomy, true ), all get_terms cache of all taxonomies will be outdated anyways (instead of only $taxonomy's get_terms cache). So there's still that flaw of over-invalidating. However, get_terms can accept a list of taxonomies, and allocating get_terms cache for each taxonomy is just too cumbersome if not impossible.
Yes, I saw that too. However, for hierarchical taxonomies, it does only work with the first taxonomy.
Maybe you could key it like this: get_terms:$taxonomy[0]:$key
and get_terms:mixed:$key
for the rare case when more than one taxonomy is passed.
#33
follow-up:
↓ 34
@
14 years ago
This time get_terms cache values of specific taxonomies are properly invalidated. Passed your test case.
One thing though, the $cache_keys now is an associative array ($key => $taxonomies). That kind of data structure is supported by persistent cache backends right?
Anything else that we need to improve?
#34
in reply to:
↑ 33
@
14 years ago
Replying to garyc40:
One thing though, the $cache_keys now is an associative array ($key => $taxonomies). That kind of data structure is supported by persistent cache backends right?
#35
in reply to:
↑ 17
@
14 years ago
Replying to garyc40:
What I have in mind right now, is rewriting the cache API to provide a "wp_cache_delete_group()" which wipes out cached values of a whole group. Whenever something gets updated, we can just call wp_cache_delete_group( "terms" ); This would eliminate the need to attach $last_changed to the cache key.
It turns out that the Memcache class, at least, does have a flush() method and since each group is a an instance of that class, making a wp_cache_delete_group() function would be trivial.
For reference: http://plugins.trac.wordpress.org/changeset/182802/memcached
#36
@
14 years ago
Actually, it wouldn't be so easy:
In a multisite environment, keys are prefixed with the current blog id, so calling wp_cache_flush_group() would flush cache entries for all blogs in the network, which clearly isn't a good idea.
In conclusion, I think the current patch is still the best way to go.
#45
@
14 years ago
- Summary changed from Didn't show up children category when created parent and children categories at the same time. to Taxonomy hierarchy cache isn't properly refreshed
#46
@
14 years ago
http://core.trac.wordpress.org/ticket/17356#comment:3
my ticket has been marked as duplicate of this bug
Not sure wether or not is it possible to simply crawl through the existing categories after inserting a new one and update the wp_options tag with properly serialized data. Because that's all that seems problematic to me. Not sure if there has to be something serialized in some specific way - array of taxonomy group ID'd arrays or something, but for me it worked really simply by automatic recreation
when I find myself some time I try to submit a bugfix for this.
#48
@
13 years ago
- Severity changed from normal to major
- Version changed from 3.0 to 3.2.1
I wrote in ticket
http://core.trac.wordpress.org/ticket/14669?replyto=4#comment:4
already same issue I have.
#49
@
13 years ago
- Keywords 3.2-early removed
- Milestone changed from Future Release to 3.3
Please don't change the Version; it's used to track when the problem was first spotted.
We need to fix this already.
#51
@
13 years ago
Some background. Invalidation is implicit, not explicit. Timestamped cache that is no longer used will eventually get forced out of the cache. Selectively updating arrays and saving them to cache is fraught with peril. The problems with updating arrays of keys are why the timestamped key was introduced.
#52
@
13 years ago
Related to what Ryan said, we now have some new functions for implicit invalidation: #18494
#57
@
12 years ago
Worth reposting duck_'s explanation:
The problem you're facing is caching. wp_insert_term() will call functions to clear out the cache that stops the hierarchy displaying properly (clean_term_cache() -> _get_term_hierarchy()). So if you were to only add one term all would be okay. However, you're inserting multiple terms on a single page load...
- clean_term_cache() contains some code, in the form of a static variable, to stop a taxonomy's hierarchy being rebuilt multiple times on a single page load.
- because the code executes so fast the last_changed cache item in the terms group will not actually be changed on every term insert. Therefore, even if the above didn't occur, the result of get_terms() would be cached and only include the first term to be inserted (and any which existing prior) so _get_term_hierarchy() will return 'incorrect' information.
#59
@
12 years ago
I am woking now with wp 3.4.2 and I have still the same problem. I created with a plugin a lot of child categories and only after I have added one single category they are shown up properly in the admin posts->categories backend .
I think there should be set a flag if a change to the categories is made and after 15 minutes a refresh should be done and ´the falg should be set back.
#60
@
12 years ago
- Cc cpkwebsolutions added
- Keywords needs-patch added; has-patch removed
- Severity changed from major to normal
For me, it seems to be choice between:
wp_cache_flush();
and a more targeted approach:
function force_flush_term_cache( $taxonomy = 'category' ) { if ( !taxonomy_exists( $taxonomy ) ) return FALSE; wp_cache_set( 'last_changed', time( ) - 1800, 'terms' ); wp_cache_delete( 'all_ids', $taxonomy ); wp_cache_delete( 'get', $taxonomy ); delete_option( "{$taxonomy}_children" ); _get_term_hierarchy( $taxonomy ); return TRUE; }
#61
@
12 years ago
This bug really irritates be. I'm using the taxonomy hidden to control things and when the option "{$taxonomy}_children" is empty there isn't really anything to do. Except manually delete the option. What is kind of ugly.
Is there a change to get this in 3.6? and what needs to happen to get this in core.
#62
follow-up:
↓ 73
@
12 years ago
what needs to happen to get this in core
It needs a patch and it needs unit tests to cover it.
#70
@
11 years ago
- Cc luke.gedeon@… added
Just a thought. I can dig into it further if you guys like the idea. Inside wp_insert_term, could we register a hook against shutdown that will refresh the {$taxonomy}_children option there? Subsequent calls to wp_insert_term could check if the hook is set and not re-add.
This way the cache only gets reset once without the complex keys.
To be honest, I have not dug deep enough into the other cases to know if it would fix all of them, but I know of a few cases it would solve.
#73
in reply to:
↑ 62
@
11 years ago
- Milestone changed from 3.7 to Future Release
Replying to scribu:
what needs to happen to get this in core
It needs a patch and it needs unit tests to cover it.
Aye.
#75
@
11 years ago
- Cc mainpart@… added
- Keywords needs-refresh added
it is still buggy even in 3.7. today i've got out of sync category_hierarchy option - some categories were missing from wp_option 'category_children' entry
Current code:
$children = get_option("{$taxonomy}_children");
if ( is_array($children) ) return $children;
will not refresh hierarchy if it not empty. even with last patch.
i suggest to switch to some caching solution implemented in wp core or get rid off this strange piece of code.
#76
@
11 years ago
- Keywords has-patch commit added; needs-testing needs-patch needs-refresh removed
- Milestone changed from Future Release to 3.9
This almost melted my brain.
14485.diff fixes this by piggybacking on the patch from #22526. The "{$taxonomy}_children"
option has to the be in sync with wp_cache_get( 'last_changed', $taxonomy );
. This is accomplished by also setting wp_cache_set( 'hierarchy_last_changed', $last_changed, $taxonomy );
.
Introduces taxonomy_hierarchy_is_fresh( $taxonomy )
.
By doing this, we only need to check taxonomy_hierarchy_is_fresh()
in _get_term_hierarchy()
before setting $children
.
Added some unit tests. Bugs like these make me want to work on nothing but Cache, there are probably other problems like these in the wild.
#77
@
11 years ago
- Owner set to wonderboymusic
- Resolution set to fixed
- Status changed from new to closed
In 27102:
#78
@
11 years ago
- Keywords needs-docs added; needs-unit-tests commit removed
Sorry to reopen, but the docs on taxonomy_hierarchy_is_fresh()
are incomplete. Looks like the parameter and return descriptions were missed.
#83
@
11 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
I proposed to revert this is #22526 in favor of a simpler fix.
The cache invalidation with static was introduced in r9102 with version 2.7. Multisite wasn't in core back then, so something like switch_to_blog()
wasn't a concern, but now it breaks if you switch the blog in between calls to clean_term_cache
.
14485.2.diff removes the static array. It only affects core bulk actions and plugins doing multiple calls to wp_term_* in a single request.
#86
follow-up:
↓ 87
@
11 years ago
- Keywords has-patch needs-docs removed
- Resolution fixed deleted
- Status changed from closed to reopened
Not all unit tests pass with the last commit:
Tests_Term_Cache::test_hierachy_invalidation Failed asserting that an array is not empty. /vagrant/wordpress/tests/phpunit/tests/term/cache.php:83
This fails on PHP 5.2 (but I can confirm it appears to pass fine on PHP 5.3+).
#87
in reply to:
↑ 86
@
11 years ago
Replying to bpetty: that's strange. Any more info would be helpful, I get the green light on 5.2.14:
$ /opt/php/5.2.14/bin/php /usr/bin/phpunit tests/phpunit/tests/term/cache.php OK (3 tests, 300 assertions)
#88
follow-ups:
↓ 93
↓ 94
@
11 years ago
You might need to run it a few times (it's not a consistent failure), Travis CI occasionally fails on other taxonomy tests related to caching too, and I can occasionally reproduce those locally as well:
For example, running phpunit --group taxonomy
four times in a row on the same trunk revision:
https://gist.github.com/tierra/8980745
Notably:
1) Tests_Term_getTerms::test_get_terms_seven_levels_deep Failed asserting that an array is not empty. /vagrant/wordpress/tests/phpunit/tests/term/getTerms.php:311
and
1) Tests_Term_getTerms::test_get_terms_grandparent_zero Failed asserting that an array is not empty. /vagrant/wordpress/tests/phpunit/tests/term/getTerms.php:290
FWIW, this is PHP 5.2.17
#89
@
11 years ago
Just for good measure, here's a number of Travis CI PHP 5.2 builds since this commit landed that show the same sort of random test failures throughout term-specific unit tests:
#90
@
11 years ago
I can get test_get_terms_grandparent_zero to fail in isolation, and consistently. It passes for me when the get_terms() there receives hide_empty => false.
Let's take the time to get this right (with patches for review) before another commit.
#93
in reply to:
↑ 88
@
11 years ago
Replying to bpetty:
You might need to run it a few times (it's not a consistent failure), Travis CI occasionally fails on other taxonomy tests related to caching too, and I can occasionally reproduce those locally as well:
For example, running
phpunit --group taxonomy
four times in a row on the same trunk revision:
https://gist.github.com/tierra/8980745
Confirmed. Investigating.
#94
in reply to:
↑ 88
;
follow-up:
↓ 95
@
11 years ago
Couldn't confirm with PHP 5.2.14 :( I did get this test to fail consistently though:
1) Tests_Term::test_update_shared_term Failed asserting that '61' is not equal to 61.
#95
in reply to:
↑ 94
@
11 years ago
- Keywords has-patch commit added
So, [27115] introduced a race condition here.
After adding a new term, it's important to clear the {$taxonomy}_children
option in order for get_terms()
to work properly with hierarchical taxonomies (as _get_term_hierarchy()
relies on that option).
Normally, this happens via clean_term_cache()
call in wp_insert_term()
:
tags/3.8.1/src/wp-includes/taxonomy.php#L2223
However, when rapidly creating new terms and calling get_terms()
right away, _get_term_hierarchy()
sometimes returns and caches stale data.
Since [23401], we use microtime()
for last_changed
cache value. Before [27115], it used to be a string, now it's a float.
Let's see what last_changed
looks like when calling get_terms()
in test_get_terms_grandparent_zero()
. I've added logs to lines 1312 and 1214: tags/3.8.1/src/wp-includes/taxonomy.php#L1312.
Before [27115]:
last_changed set: 0.66727500 1392952644 last_changed get: 0.66862600 1392952644 last_changed get: 0.67553500 1392952644 last_changed get: 0.70352200 1392952644
After [27115]:
last_changed set: 1392953072.29 last_changed get: 1392953072.29 last_changed get: 1392953072.29 last_changed get: 1392953072.32
As you can see, there are no identical values in the first set, which means the term cache is properly refreshed after creating each new term.
In the second set, however, there are three identical values, which means get_terms()
returns stale data from cache, and the test fails. With only two decimal digits, the chances of a collision have increased.
According to ticket:27000:6, [27115] was introduced as a fix for an old object-cache.php
drop-in, so we can probably revert it.
If we still want to keep it, we need to call clean_term_cache()
manually before get_terms()
in our tests. delete_option( "{$taxonomy}_children" )
would work too, but clean_term_cache()
seems less cryptic.
See 14485.3.diff.
Tests for non-hierarchical taxonomies appear to be unaffected, as they don't involve _get_term_hierarchy()
.
Replying to kovshenin:
I did get this test to fail consistently though:
1) Tests_Term::test_update_shared_term Failed asserting that '61' is not equal to 61.
That one should be skipped, #5809 is not fixed yet :)
This ticket was mentioned in IRC in #wordpress-dev by wonderboymusic. View the logs.
11 years ago
#98
@
11 years ago
- Resolution set to fixed
- Status changed from reopened to closed
Someone reopen the second one of those unit tests fails again, but I think we are good since [27300].
plugin