WordPress.org

Make WordPress Core

Opened 22 months ago

Last modified 5 days ago

#38280 new enhancement

Make term count for a specific object type available

Reported by: desrosj Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 4.7
Component: Taxonomy Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

Term count is the total number of relationships with no context of the object types the relationships belong to.

Currently, the best way to determine the count of a term for a specific object type is to run a WP_Query with a WP_Tax_Query for that term.

$term_count_query = new WP_Query( array(
        'post_type' => 'some-post-type',
        'tax_query' => array(
                array(
                        'taxonomy' => 'some-taxonomy',
                        'field' => 'slug',
                        'terms' => array( 'term-slug' )
                ),
        ),
        'posts_per_page' => 1,
) );
$term_count_for_post_type = $term_count_query->found_posts;

Now that term meta is in core, it would be great if these counts were stored in term meta for easy access.

To start, these counts could be available through a function. To go a step further, an argument could be added to get_terms() and other functions to filter the term counts and reflect the object types terms are being grabbed for.

Attachments (18)

38280.diff (3.6 KB) - added by desrosj 22 months ago.
Stores object type specific counts in term meta when term counts are updated. Adds wp_get_term_count_for_object_type() to retrieve term counts.
38280.2.diff (6.1 KB) - added by desrosj 20 months ago.
Updated ticket adds unit tests, casts get_term_meta() call into an array to prevent PHP notice, improved function Docblock.
38280.3.diff (8.0 KB) - added by desrosj 20 months ago.
38280.4.diff (9.8 KB) - added by desrosj 20 months ago.
38280.5.diff (11.7 KB) - added by desrosj 17 months ago.
Updated patch adds another meta key for tracking the object types that were counted. Also removes all previously counted object types each time to prevent abandoned meta keys.
38280.6.diff (12.1 KB) - added by boonebgorges 13 months ago.
38280.7.diff (10.8 KB) - added by ivankristianto 5 weeks ago.
Update patch to apply cleanly
38280.8.diff (11.4 KB) - added by ivankristianto 5 weeks ago.
term_taxonomy_id could be different than term_id in term_relationships table. this patch fix it
38280.9.diff (14.2 KB) - added by desrosj 5 weeks ago.
38280.10.diff (14.8 KB) - added by boonebgorges 5 weeks ago.
38280.11.diff (5.3 KB) - added by ivankristianto 5 weeks ago.
38280.12.diff (16.2 KB) - added by boonebgorges 5 weeks ago.
38280.13.diff (16.0 KB) - added by desrosj 5 weeks ago.
38280.14.diff (17.5 KB) - added by boonebgorges 5 weeks ago.
38280.15.diff (20.9 KB) - added by boonebgorges 4 weeks ago.
38380.diff (9.7 KB) - added by desrosj 13 days ago.
38280.16.diff (9.7 KB) - added by desrosj 13 days ago.
38280.17.diff (20.3 KB) - added by ivankristianto 5 days ago.
Added back wpGetTermObjectCount test

Download all attachments as: .zip

Change History (43)

@desrosj
22 months ago

Stores object type specific counts in term meta when term counts are updated. Adds wp_get_term_count_for_object_type() to retrieve term counts.

@desrosj
20 months ago

Updated ticket adds unit tests, casts get_term_meta() call into an array to prevent PHP notice, improved function Docblock.

@desrosj
20 months ago

#1 @desrosj
20 months ago

Talked this over with @johnbillion at WordCamp US. He suggested single meta entries for each object type. This would unlock the potential for sorting terms by object type counts.

The newest patch reflects this change, and also adds unit tests.

The patch works and I can verify that through testing it and checking the database/displaying counts and function results. The tests all pass when run directly (using phpunit -group 38280 or phpunit -group taxonomy), but for some reason they fail when running all unit tests. A second set of eyes would be appreciated.

#2 @desrosj
20 months ago

Also, the attached patch follows the following methodology:

  • Do not create meta entry if one object type is registered on a taxonomy.
  • If two or more object types are registered, only store a meta entry for object types who's counts are greater than 0.
  • The term count remains the cumulative count for all relationships.

#3 @johnbillion
20 months ago

This is looking good. Some feedback:

  • This function will return 0 for terms that haven't yet had their post count cache populated (ie. all terms that already exist when the site gets upgraded to WP 4.8, for example). Should the term count its posts if the term meta cache doesn't yet exist?
  • The $wpdb->prepare() call added on line 3427 of src/wp-includes/taxonomy.php should properly prepare $type instead of concatenating it into the SQL.
  • There should be a test that asserts that the count is correctly updated when terms are removed from an object with wp_remove_object_terms().

@desrosj
20 months ago

#4 @desrosj
20 months ago

Updated patch to address all three items.

Bullet point one made me think of a couple scenarios that could result in some abandoned meta caches:

  • What if a taxonomy is removed from a post type?
  • What if a post type is removed?

Should we check for these some how?

#5 @desrosj
18 months ago

  • Keywords needs-refresh has-patch has-unit-tests added

#7 @desrosj
18 months ago

  • Keywords needs-refresh removed

Patch actually does not need a refresh. Still having the issue of tests passing when the taxonomy or 38280 group is run, but failing when all tests are run though.

#8 @boonebgorges
17 months ago

@desrosj Nice job on this patch so far.

The comment // When a count for an object type does not exist, if any other meta caches exist we can assume 0. makes me nervous. I guess this is because you're not storing 0s, and the presence a count for another object type means that WP has recently done a relevant count. I worry that this is going to be subject to weird race conditions or data integrity problems. (For example: what if a taxonomy originally assigned to 'post' becomes assigned to array( 'post', 'page' )?) How about some combination of your original array-based approach and the one-entry-per-object-type approach? There'd be a single key containing a list of the object types for which a count exists, but the counts themselves would be stored in their own rows. This way, if $type appears in the array, and if get_term_meta( $term_id, '_wp_object_count_' . $type, true ) returned false for a $type, we can be confident that $type has indeed been counted.

Bullet point one made me think of a couple scenarios that could result in some abandoned meta caches

These situations are rare enough that I wouldn't worry about it too much. As a precaution, _update_post_term_count() could delete_post_meta() for all of a post's existing counts before generating and saving new ones (another argument for having a separate list of the object types that have been counted). This way, the stale data will be removed during the next count.

@desrosj
17 months ago

Updated patch adds another meta key for tracking the object types that were counted. Also removes all previously counted object types each time to prevent abandoned meta keys.

This ticket was mentioned in Slack in #core by desrosj. View the logs.


16 months ago

#10 @boonebgorges
13 months ago

38280.6.diff does some formatting cleanup, including some slight reorganization of return statements in wp_get_term_count_for_object_type() to reduce the amount of if indentation.

This reorganization shows that there are some cases in which wp_get_term_count_for_object_type() doesn't return a value at all. Should we be returning 0 as a default value? The fact that these cases didn't come up makes me think we're missing some cases in the unit tests.

Can you explain this logic to me?

 	* If the object type has been counted, and other counts exist in meta, we can 
 	* assume this term has 0 relationships for this object type. 

Why is it necessary to check for *other* counts? Shouldn't it be enough for the type to be listed in _wp_counted_object_types? If there's a scenario where these values might get out of whack, we should (a) have a test that describes it, and/or (b) defend against that scenario :)

The invalidation part is looking pretty good to me.

This ticket was mentioned in Slack in #core-multisite by desrosj. View the logs.


4 months ago

#12 @ivankristianto
5 weeks ago

  • Keywords needs-refresh added

I failed to merge this patch

@ivankristianto
5 weeks ago

Update patch to apply cleanly

#13 @ivankristianto
5 weeks ago

  • Keywords needs-refresh removed

@ivankristianto
5 weeks ago

term_taxonomy_id could be different than term_id in term_relationships table. this patch fix it

@desrosj
5 weeks ago

#14 @desrosj
5 weeks ago

Thanks for the patch refresh, @ivankristianto!

Looks like the changes in 38280.8.diff are breaking the unit tests. After looking into it further, I think those changes should be reverted (which I have done in 38280.9.diff). While the term_id could be different than term_taxonomy_id, the function's inline documentation explicitly specifies List of Term taxonomy IDs.. I also looked at where _update_post_term_count() is called in core and the only function that calls it is wp_update_term_count_now(). This function's documentation also specifies The term_taxonomy_id of terms to update., so I think we can avoid that additional query.

A term ID check may be needed to ensure the correct term ID for the meta function calls, but I need to look into this further. @boonebgorges does anything stick out to you there?

Also in 38280.9.diff:

  • Change version number on new functions to 5.0.0.
  • Added a @since tag to _update_post_term_count() noting the new meta information.
  • If a WP_Error is returned by get_term() return from wp_get_term_count_for_object_type(). Previously, it was just assumed a term would always be returned. null can also be returned for "miscellaneous failures". I was looking to better understand when null would be returned so I could write tests to catch it before adding a return statement for this condition.
  • Removed the else statement at the bottom of _update_post_term_count(). The meta entries deleted in this else are already deleted earlier on in the function.
  • Moved tests for wp_get_term_count_for_object_type() into a separate test class.

Replying to boonebgorges:

Can you explain this logic to me?

 	* If the object type has been counted, and other counts exist in meta, we can 
 	* assume this term has 0 relationships for this object type. 

Why is it necessary to check for *other* counts? Shouldn't it be enough for the type to be listed in _wp_counted_object_types? If there's a scenario where these values might get out of whack, we should (a) have a test that describes it, and/or (b) defend against that scenario :)

I think this was something I missed when previously when adding the _wp_counted_object_types meta cache. I have removed it.

This reorganization shows that there are some cases in which wp_get_term_count_for_object_type() doesn't return a value > at all.

Looking at this further, it seems the only time we would not return a value would be if wp_update_term_count_now() returned a "falsey" value. But, that function always returns true. Wondering if this could ever cause an endless loop in the last conditional of the function (if the meta does not successfully get stored, it may call itself recursively).

I think this is in a great spot now pending the items above. I would like to cut the test methods into smaller ones (and maybe use some fixtures) as well. I couldn't find any tests for wp_update_term_count_now() or _update_post_term_count() either. If we don't add them here, a new ticket may be good for tracking.

#15 @ivankristianto
5 weeks ago

@desrosj Thank you for catch that error. since we need to pass term_taxonomy_id for wp_update_term_count_now You need to convert term_id in this line:

if ( wp_update_term_count_now( array( $term_id ), $taxonomy ) ) {


Because it will count different terms if the term_id is not equal term_taxonomy_id

Last edited 5 weeks ago by ivankristianto (previous) (diff)

#16 @boonebgorges
5 weeks ago

38280.10.diff makes the change pointed out by @ivankristianto, which I was in the process of patching when he posted his comment :)

Fixing this error uncovers some bug that I haven't grasped yet. The logic of wp_get_term_count_for_object_type() is basically: see if we have a cached count, but if not, call wp_update_term_count_now(), and then re-call wp_get_term_count_for_object_type(), the idea being that calling wp_update_term_count_now() should always populate the cached count. But this doesn't appear to be happening in all cases, so that this recursive call ends up with an infinite regress. This appears to happen when the taxonomy is associated with more than one object type, and when term_id and term_taxonomy_id are out of sync, which suggests that there's still some inconsistency there. Try running 38280.10.diff with the entire --group taxonomy and you'll see what I mean.

I made a couple other changes as well:

  • In _update_post_term_count(), you were double-escaping the $object_types (esc_sql() + $wpdb->prepare()). Your rearrangement means that the esc_sql() call is no longer needed, so I've removed it.
  • Rearranged the logic at the end of wp_get_term_count_for_object_type() to make it clearer what's being returned in the last case.
  • In the complex tests, use a custom taxonomy rather than category. Category has all sorts of special cases ('Uncategorized', etc) and, at first, I thought it might be the cause of the bug described above, though perhaps I'm wrong about that.
  • Don't return false from wp_get_term_couont_for_object_type() in case of a bad object type. Seems more informative and more internally consistent to return a WP_Error.

Perhaps @ivankristianto or @desrosj will be able to look at the errors and find what's causing the infinite regress. To me, it's sort of a code smell - ideally, the function wouldn't need to call itself again to get the cached value, though I realize the underlying count functions don't return friendly values.

#17 @ivankristianto
5 weeks ago

@boonebgorges it's because we use term_taxonomy_id in _update_post_term_count argument. while we do update_term_meta and delete_term_meta which accept term_id
So back to the case where term_id is not equal to term_taxonomy_id issue.

I upload patch 38280.11.diff to query back to get the term_id for this purpose.
But it cause 1 test failed from test_termmeta_cache_should_not_be_primed_when_update_term_meta_cache_is_false which I don't understand why. Can you please help?

#18 @boonebgorges
5 weeks ago

@ivankristianto Good call, thanks!

The failing test is because _update_post_term_count() now calls get_term_meta(), which populates the (entire) term meta cache for each term. I don't think this is a bug, but it means that the test needs some additional logic to ensure an empty termmeta cache before running its assertion. I've made the change in 38280.12.diff.

I also swapped out your database query with get_term_by( 'term_taxonomy_id' ), and changed the variable names to make it clear where you're referencing a term_taxonomy_id vs a term object/term_id.

@desrosj
5 weeks ago

#19 follow-up: @desrosj
5 weeks ago

38280.13.diff removes bool from the wp_get_etrm_count_for_object_type() inline documentation and adds the taxonomy to the first WP_Error that is returned when a taxonomy does not exist. Also fixes a 3 minor alignment and spacing issues in the unit tests to adhere to the coding standards.

@boonebgorges Do you think adding a return value to _update_post_term_count() is realistic? It is a pretty old function (2.3), but adding a return value (or at least a false return when it fails) would solve the infinite loop issue.

#20 in reply to: ↑ 19 @boonebgorges
5 weeks ago

Replying to desrosj:

38280.13.diff removes bool from the wp_get_etrm_count_for_object_type() inline documentation and adds the taxonomy to the first WP_Error that is returned when a taxonomy does not exist. Also fixes a 3 minor alignment and spacing issues in the unit tests to adhere to the coding standards.

@boonebgorges Do you think adding a return value to _update_post_term_count() is realistic? It is a pretty old function (2.3), but adding a return value (or at least a false return when it fails) would solve the infinite loop issue.

Thank you sir. I don't think it's feasible to change the return value for the update functions, for a few reasons. One is backward compatibility. Another is that taxonomy terms can be attached to things that aren't posts, which means that the count fallback needs to be more resiliant. 38280.14.diff has a test that demonstrates this - the test, along with 38280.13.diff, will trigger the infinite regress.

38280.14.diff has one proof-of-concept way to avoid the regress. Basically: break the call to termmeta into its own reusable function. The function returns false if no meta is found, indicating a count should take place. If it returns false, then we trigger wp_update_term_count_now(), and then fetch from meta again. If it's *still* false, we just return 0 from the function. Upside: No more regress, even for non-post-type objects.

Downside: wp_get_term_count_for_object_type( $term_id, $non_post_type ) will *always* trigger a recount, except when a plugin updates the term-object counts itself. This seems dangerous. So, an alternative technique might be to simply bail out of wp_get_term_count_for_object_type() if the object_type is (a) not found in _wp_counted_object_types (this covers cases where the plugin properly handles term counts), and (b) is not a post type. This 0 return value for non-post-types could be filtered, in case a plugin wants to do something fancy.

I'm thinking that this latter technique might be the simplest and safest, but I'd welcome feedback before writing yet another patch :-D

#21 @ivankristianto
4 weeks ago

@boonebgorges I think it's a good idea to bail it early and also the safest.

And I found a potential issue, if the taxonomy has custom update_count_callback , the _update_post_term_count() will never get called. Thus function wp_get_term_count_for_object_type() will never return the correct count.

I propose to move the logic to count terms per object types from _update_post_term_count into reusable function and call it from wp_update_term_count_now instead.

#22 follow-up: @boonebgorges
4 weeks ago

38280.15.diff is a rethink.

In the case of post types, term-object counts will be regenerated whenever wp_update_term_count_now() is run. This, in turn, is triggered by wp_set_object_terms(), etc. In other words, the data should always exist for terms that are touched after the 5.0 release - that is, those that are created/updated/attached to posts/removed from posts after 5.0. The time when we need to worry about "backfilling" the data is for terms that have *not* been touched since 5.0.

In the case of non-post-types (or post types that have their own custom update_count_callback), there is nothing that core can do when the count is requested, except to fail gracefully.

To accommodate both scenarios with a single approach, I introduced a new action 'wp_no_object_term_count_found'. Core will hook into it for post types, and will run wp_update_term_count_now(). (This covers "legacy" terms.) For non-post-types or post types with custom update_count_callback, plugins can hook to it and populate the object-term counts, and/or use the main filter I've added at the end of the function.

How does this general approach feel to others?

Moving forward, can we think about new names for the function/functionality? term_count_for_object_type() sounds backward - the thing being counted is "objects of a given (object/post) type that have a given term", not "terms that belong to a given (object/post) type". How about wp_get_term_object_count()? I can't think of anything more melodious :)

I also have questions about the "if" conditions where I put the comment are these right?. I know that these are meant to be shortcuts, but I worry that they have the potential to produce corrupted data. For example, the 1 >= count( $taxonomy_object->object_type ) call assumes that plugins have properly registered their taxonomies with object types - but do we enforce this elsewhere? What if the $taxonomy_object->object_type is not the same as $object_type? It seems better to return no data in these cases than to return something potentially incorrect. Can anyone help to think through the possibilities here?

This ticket was mentioned in Slack in #core by davecpage. View the logs.


2 weeks ago

#24 in reply to: ↑ 22 ; follow-up: @desrosj
13 days ago

Replying to boonebgorges:

How does this general approach feel to others?

I like this approach a lot. It feels a lot more natural. 38280.16.diff adds tests for the new meta retrieval function.

Some questions.

  • Should the priority of core's wp_no_object_term_count_found hook be higher than the default of 10 to ensure it runs before custom hooks?
  • Should wp_get_term_count_for_object_type() have a dynamic hook that allows a filter to only be added for a specific taxonomy? "wp_term_count_for_object_type_$taxonomy", for example.
  • If the second attempt to get the object count after the "no count found" hook fails, should a WP_Error be returned?

Also, I think wp_get_term_count_for_object_type_from_meta() could also return false when an object does not belong to a taxonomy. Object types can be removed from a taxonomy and not have their counts updated for quite some time. Returning false here would trigger a recount and flush the removed object type from the meta cache.

How about wp_get_term_object_count()?

Sounds good to me. I made name changes and tweaked some docblocks to more accurately indicate what is counted in 38280.16.diff.

I also have questions about the "if" conditions where I put the comment are these right?. I know that these are meant to be shortcuts, but I worry that they have the potential to produce corrupted data. For example, the 1 >= count( $taxonomy_object->object_type ) call assumes that plugins have properly registered their taxonomies with object types - but do we enforce this elsewhere? What if the $taxonomy_object->object_type is not the same as $object_type? It seems better to return no data in these cases than to return something potentially incorrect. Can anyone help to think through the possibilities here?

When a taxonomy is registered, no object type validation is performed. So, you could register a foo object type for a taxonomy even if foo objects are not registered. Maybe 1 >= count( $taxonomy_object->object_type ) could be changed to count the number of object types that currently exist instead. This would add a need to compare the number of object types stored in _wp_counted_object_types with the number of active in order to trigger recounting if an object type is removed, but could prevent inaccurate counts when an object type is removed.

I also created a branch on my WordPress Develop fork to show the tests passing. This branch reflects 38280.16.diff.

Edit: Changed Travis build link to correct passing build.

Last edited 13 days ago by desrosj (previous) (diff)

@desrosj
13 days ago

@desrosj
13 days ago

@ivankristianto
5 days ago

Added back wpGetTermObjectCount test

#25 in reply to: ↑ 24 @ivankristianto
5 days ago

Replying to desrosj:

Replying to boonebgorges:

I also have questions about the "if" conditions where I put the comment are these right?. I know that these are meant to be shortcuts, but I worry that they have the potential to produce corrupted data. For example, the 1 >= count( $taxonomy_object->object_type ) call assumes that plugins have properly registered their taxonomies with object types - but do we enforce this elsewhere? What if the $taxonomy_object->object_type is not the same as $object_type? It seems better to return no data in these cases than to return something potentially incorrect. Can anyone help to think through the possibilities here?

When a taxonomy is registered, no object type validation is performed. So, you could register a foo object type for a taxonomy even if foo objects are not registered. Maybe 1 >= count( $taxonomy_object->object_type ) could be changed to count the number of object types that currently exist instead. This would add a need to compare the number of object types stored in _wp_counted_object_types with the number of active in order to trigger recounting if an object type is removed, but could prevent inaccurate counts when an object type is removed.

I agree with @desrosj here, $taxonomy_object->object_type could be different and we didn't enforced that. I prefer we remove that condition and go straight get the count from the meta / recount it. It's more safe rather than return false data.

I did add back the unit testing that missing in previous patch 38280.17.diff. And also rename the function name as well.

Note: See TracTickets for help on using tickets.