Opened 8 years ago
Last modified 6 months ago
#40351 reopened enhancement
Term post re-counts scale poorly, are common and difficult to avoid
Reported by: | mattoperry | Owned by: | whyisjake |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | 4.8 |
Component: | Taxonomy | Keywords: | has-patch has-unit-tests early |
Focuses: | performance | Cc: |
Description
Under normal conditions whenever a post status transition occurs, _update_term_count_on_transition_post_status
attempts to completely recalculate the post counts for each term related to the post by re-counting each term's total number of post relationships.. For sites with large term relationship tables, large numbers of total terms and high numbers of terms per post, this recalculation does not scale well and can lead to wp-admin lag (while saving a post for example) or failed queries.
A typical bad scenario looks like this:
Consider a site with a large term_relationship table and a post with (say) 30 terms assigned to it. When that post is updated, an eventual call to _update_post_term_count
will cause that function to execute 60 total queries on some very large tables. 30 are SELECTs:
SELECT COUNT(*) FROM $wpdb->term_relationships, $wpdb->posts WHERE $wpdb->posts.ID = $wpdb->term_relationships.object_id AND post_status = 'publish' AND post_type IN ('" . implode("', '", $object_types ) . "') AND term_taxonomy_id = %d
(This is actually the hopeful case, since if we need to count attachments too, those are added to the above query via a subselect which probably makes things worse.)
Interspersed among the 30 SELECTs are 30 UPDATEs to the tt table, generated by:
$wpdb->update( $wpdb->term_taxonomy, compact( 'count' ), array( 'term_taxonomy_id' => $term ) );
One result of all of this can be failed queries -- typically the SELECTs -- and incorrect term post counts. Even more frequently the issue manifests as slow post updating behavior (the lag a user feels while waiting for a post to save or publish.)
We currently provide two mechanisms (besides just unhooking _update_term_count_on_transition_post_status
entirely and not updating term post counts at all) to do something about this problem. The first is to defer post counts, but this just delays the badness until a later call of wp_update_term_count
. The second is to define a per-taxonomy custom update callback using update_count_callback
.
Neither of these mechanisms is intended to improve performance, and both are obscure. It would be better to by default increment or decrement post counts directly in the tt table in response to posts entering or leaving published (or other countable) stati, and reserve complete recalculations for special occasions (such as when a term is edited.) Using this strategy, the 60 total queries in the example above could -- on publish -- be replaced by a a single query that would look something like:
UPDATE {$wpdb->term_taxonomy} AS tt SET tt.count = tt.count + 1 WHERE tt.term_taxonomy_id IN ( .... )
(where the final list is a list of tt_ids.)
This solution is implemented now as a plugin here: https://github.com/Automattic/lightweight-term-count-update.
Patch based on this on the way soon.
Attachments (10)
Change History (70)
This ticket was mentioned in Slack in #core by mattoperry. View the logs.
7 years ago
#4
@
5 years ago
I've also seen the Lightweight Term Count Update plugin reduce processing time from literally minutes when there's a large number of terms, to fractions of a second.
#5
@
4 years ago
- Milestone changed from Awaiting Review to 5.5
- Owner set to whyisjake
- Status changed from new to assigned
This ticket was mentioned in Slack in #core by whyisjake. View the logs.
4 years ago
This ticket was mentioned in Slack in #core by david.baumwald. View the logs.
4 years ago
#8
@
4 years ago
@whyisjake Is this one landable with 5.5 Beta 1 due to be released in just a few days?
#10
@
4 years ago
- Keywords early added; dev-feedback removed
- Milestone changed from 5.5 to 5.6
I think this should go in early in a cycle (unless Jake wants to run with it for 5.5). I'm willing to shepherd this for 5.6 as it has a huge performance impact on large sites.
#13
@
4 years ago
Would it be possible to have the filter countable_status to include $taxonomy as a parameter as it is available with both usages.
Some custom post types plugins wish to count using a different set of statuses than the default and providing the taxonomy being counted will make it easy to implement this requirement with the filter.
Thanks in advance,
Neil James
#14
in reply to:
↑ 12
@
4 years ago
Replying to whyisjake:
I'd love to see this land in 5.5, getting some more eyes on it.
I'd rather put it in early in 5.6, it's a substantial change of approach.
In 40351.diff, the existing query is used on the edit terms screen. I can't remember if that is in the original plugin but it would be good to avoid that in core if possible.
@
4 years ago
Add the filter to the original count code in _update_post_term_count (used in edit-{taxonomy} screen).
This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.
4 years ago
#16
follow-up:
↓ 30
@
4 years ago
- Keywords needs-dev-note added
As discussed during <bug-scrub>
, this one will get a Misc Dev Note for the new filter.
This ticket was mentioned in PR #515 on WordPress/wordpress-develop by peterwilsoncc.
4 years ago
#17
Trac ticket: https://core.trac.wordpress.org/ticket/40351
#18
@
4 years ago
- Keywords needs-refresh added
I've started a pull request based on 40351.4.patch so the existing tests begin running on the screen.
It's still a WIP but so far I have made a few minor changes:
- Documented
countable_status
filter, I suspect it will need a rename to the plural - Some tidy up for coding standards
- Modified
maybe_recount_posts_for_term()
to allow for developers deferring post counts by switching towp_update_term_count()
- There were a few errors in the patch that I am slowly tidying up and the unit tests alert me to them, as best I can tell mostly type casting that needs fixing up
#19
@
4 years ago
Further notes on what needs to be added for this to be ready
- Shortly after
deleted_term_relationships
andadded_term_relationship
fire, calls towp_update_term_count()
need to be removed/modified as the proposed patch doesn't defer term counting like the original plugin. - Including a term count on
edit_term
adds a new location in which terms are counted and doesn't seem like it's needed countable_status
is new, very useful but unrelated to this feature. It's probably best left for another ticket.
I'm leaving my PR as is at moment (in some need of change) so I can get some others' feedback on it before updating. As many eyes as possible would be dandy, thanks.
This ticket was mentioned in PR #519 on WordPress/wordpress-develop by peterwilsoncc.
4 years ago
#20
- Keywords has-unit-tests added; needs-unit-tests needs-refresh removed
Completely revised approach to scaling term counting.
- respects deferring of term counting
- replaces calls to the current functions with calls to new functions
- does not introduce new full count points
- terms added in
wp_insert_post()
can be double counted (this was happening with the earlier approach too) - has a single unit test to demonstrate the bug & prevent passing.
Trac ticket: https://core.trac.wordpress.org/ticket/40351
#21
@
4 years ago
I've created a new PR with a different slightly different approach to account for how term counting currently works in Core.
Both approaches are currently a little broken, off-by-one errors, but I think the revised approach is an improvement in terms of maintaining backward compatibility. The new PR is at https://github.com/WordPress/wordpress-develop/pull/519, comments on it would be dandy.
This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.
4 years ago
#23
@
4 years ago
In updated pull request #519:
- unit tests for term counts, they pass on both trunk and the branch now :)
register_taxonomy()
accepts a new argumentupdate_count_by_callback
for incrementing and decrementing term count. For backcompat it callsupdate_count_callback
if it's defined by "count_by" is not- A new function
_wp_prevent_term_counting()
used to prevent double counting when adding terms withinwp_insert_post()
andwp_publish_post()
- DB queries are deferred until the final numbers are known when transitioning post counts to reduce the number of
UPDATE
calls wp_set_object_terms()
andwp_remove_object_terms()
now check the taxonomies object type to determine whether the counts should be updated immediately- All of Core's calls to the older functions have been removed
Questions:
- Has anything been missed?
- Should I be checking trashing and deleting posts too in the unit tests?
- Is
_wp_prevent_term_counting()
too much of a hack and using a reflection preferable? - Has anything been missed (yes, asking this question twice)?
As the definition of early 5.6
is beginning to be stretched some additional eyes on this would be superb.
peterwilsoncc commented on PR #519:
4 years ago
#24
@NeilWJames Thanks for reviewing.
As update_count_callback()
is expecting two arguments in current releases of WP, leaving $modify_by
out of the call is intentional.
If a developer has defined a custom counting callback, calling that will ensure their counts are correct until they update how they are registering their taxonomy.
This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.
4 years ago
#26
@
4 years ago
- Keywords dev-feedback added
Per discussion in today's bug scrub, adding needs dev-feedback
.
This ticket was mentioned in Slack in #core by peterwilsoncc. View the logs.
4 years ago
#28
@
4 years ago
Following the discussion in last week's 0500 UTC dev chat I have done further testing.
PR updates since:
- further tests: basically putting in to code the tests cases I thought of.
- one minor change to actual code to prevent off-by-X errors.
- typo fix in Taxonomy class
I reached out to Boone Gorges via Slack for additional feedback: while happy with the code he does have some concern that off-by-one error will remain in the database in the event of race conditions due to simultaneous updates or propagation delays across database servers.
I have noticed an issue with taxonomies applying to users or comments: deleting the object does not delete the object/term relationship, thus causing term counts to be out. I have tested this on both trunk and this PR's branch and the problem occurs on both indicating term counts for non-post objects may already be out and can not be handled by this ticket.
I think this is good to go and aim to commit just prior to this week's dev chat in two days. If anything else if found between now and then, it can be removed from the milestone.
#30
in reply to:
↑ 16
;
follow-up:
↓ 31
@
4 years ago
- Keywords needs-dev-note removed
Replying to hellofromTonya:
As discussed during
<bug-scrub>
, this one will get a Misc Dev Note for the new filter.
As noted in comment 19, I removed the new filter as despite been handy it's off-purpose for this ticket.
I have opened #51517 for the purpose of including additional statuses in term counts.
#31
in reply to:
↑ 30
@
4 years ago
Replying to peterwilsoncc:
Replying to hellofromTonya:
As discussed during
<bug-scrub>
, this one will get a Misc Dev Note for the new filter.
As noted in comment 19, I removed the new filter as despite been handy it's off-purpose for this ticket.
I have opened #51517 for the purpose of including additional statuses in term counts.
@peterwilsoncc Do you think the new functions added by this ticket need a dev note?
#32
@
4 years ago
Also, are there any considerations for sites that are running the liteweight term counts plugin?
#33
@
4 years ago
- Keywords needs-dev-note added
@hellofromTonya Good point, the new functions and methodology will need a dedicated dev note.
@whyisjake I've opened an issue on the LWTC repository for disabling the plugin in 5.6. It would be dandy if some Automatticians could check for any problems for sites that don't update. As the plugin disables term counts, I don't think there will be.
#34
@
4 years ago
- Keywords needs-patch needs-unit-tests added; has-patch needs-testing early has-unit-tests dev-feedback removed
- Resolution fixed deleted
- Status changed from closed to reopened
@peterwilsoncc A closure has no access to variables of the parent scope thus the fallback for update_count_by_callback
is producing PHP warnings because $args['update_count_callback']
is not set.
#35
@
4 years ago
The PHP Warning is specifically:
wp-includes/class-wp-taxonomy.php:430 - call_user_func() expects parameter 1 to be a valid callback, no array or string given
Adding a use( $args )
to the closure callback will fix it, if that's acceptable for core usage, otherwise the callback will need to run something like $args = get_taxonomy( $taxonomy )
.
#36
@
4 years ago
The use( $args )
should be allowed, as it's used in other parts of core already. For just one example, see https://core.trac.wordpress.org/browser/trunk/src/wp-includes/user.php#L3242
This ticket was mentioned in PR #602 on WordPress/wordpress-develop by peterwilsoncc.
4 years ago
#37
- Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed
Trac ticket: https://core.trac.wordpress.org/ticket/40351
#38
@
4 years ago
@ocean90 Good catch, thanks.
I've set up a pull request to fix this (using use
) and added some tests for the callbacks to the register_taxonomy
tests.
#39
@
4 years ago
About to commit pull request 602 after discussing with @dd32 to confirm it would fix the warnings on meta properties.
Something that has occurred to me is that it's not possible to use a custom callback for full counts (the update_count_callback
argument) and the default callback for partial counts (the update_count_by_callback
argument).
A developer defining either argument with the default callback will cause an infinite loop. This has been the case for as long as update_count_callback
has existed and I am unable to see a situation in which a developer would need a custom callback for full but not partial counts.
@
4 years ago
Fixes the $taxonomy parameter for the actions edit_term_taxonomy and edited_term_taxonomy
#41
@
4 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
The new function wp_modify_term_count_by_now()
fires the actions edit_term_taxonomy
and edited_term_taxonomy
with a wrong value for the $taxonomy
parameter. It sends the WP_Taxonomy
object instead of the taxonomy name, as documented for the actions.
I propose 40351.4.diff to fix the issue.
#42
@
4 years ago
In Polylang we heavily rely on taxonomies, including taxonomy for terms to create languages and translation groups for terms. However [49141] introduces an issue in counting for these taxonomies.
I wrote 2 simple tests to explain what we are doing.
In this first test, we automatically create a term in a taxomomy "term_taxonomy" when we create a tag. This test passes with WordPress 5.6 beta 1:
<?php function test_wp_set_object_terms_hooked_to_create_term() { $args['update_count_callback'] = '_update_generic_term_count'; register_taxonomy( 'term_taxonomy', 'term', $args ); add_action( 'create_term', function( $term_id, $tt_id, $taxonomy ) { if ( 'post_tag' === $taxonomy ) { $term = wp_insert_term( uniqid(), 'term_taxonomy' ); $lang = $term['term_id']; wp_set_object_terms( $term_id, $lang, 'term_taxonomy' ); } }, 10, 3 ); $tag_id = $this->factory->tag->create(); $terms = wp_get_object_terms( $tag_id, 'term_taxonomy' ); $this->assertCount( 1, $terms ); // Check that a term has been assigned to the tag. $term = reset( $terms ); $this->assertEquals( 1, $term->count ); // Check the term count. }
In this second test, we do the same but the tag is created by wp_insert_post()
. This test passes with WordPress 5.5 but fails wiht 5.6 beta 1:
<?php function test_wp_set_object_terms_hooked_to_create_term_in_wp_insert_post() { $args['update_count_callback'] = '_update_generic_term_count'; register_taxonomy( 'term_taxonomy', 'term', $args ); add_action( 'create_term', function( $term_id, $tt_id, $taxonomy ) { if ( 'post_tag' === $taxonomy ) { $term = wp_insert_term( uniqid(), 'term_taxonomy' ); $lang = $term['term_id']; wp_set_object_terms( $term_id, $lang, 'term_taxonomy' ); } }, 10, 3 ); $post_id = $this->factory->post->create( array( 'tags_input' => array( 'my_tag' ) ) ); $tags = wp_get_post_tags( $post_id ); $tag = reset( $tags ); $terms = wp_get_object_terms( $tag->term_id, 'term_taxonomy' ); $this->assertCount( 1, $terms ); // Check that a term has been assigned to the tag. $term = reset( $terms ); $this->assertEquals( 1, $term->count ); // Check the term count, fails in WP 5.6. }
The issue is caused by the call to _wp_prevent_term_counting()
by wp_insert_post()
. It's ok for all the terms assigned directly by this function, but it breaks the counting of terms assigned in functions hooked to actions fired while the counting is prevented.
I am not sure it would fix all issues but one possible solution for Polylang would be to make _wp_prevent_term_counting()
taxonomy dependent. That the term counting would be prevented only for taxonomies handled directly by wp_insert_post()
instead of all taxonomies at the same time.
@
4 years ago
Alternative fix proposed. Seems problem occurs as parameter $taxonomy is changed to an object a few lines earlier. This does as in wp_set_object_terms and wp_remove_object_terms.
#43
@
4 years ago
@Chouby @lcyh78 Thanks for the patches re: sending the taxonomy object instead of the name, I will put that in early next week. I will probably use 40351.4.diff as it matches the code used for full term counts in _update_post_term_count()
.
@Chouby For the terms created and assigned using the filter, I will create a follow-up ticket as the patch might need a few iterations and unit tests so I think it will be confusing to continue it here. Again, I will action this next week and cc you on the ticket when I do so.
#44
@
4 years ago
@peterwilsoncc Thanks. Let me know, I will propose there a new test without extra taxonomy but only categories, which tends to demonstrate that preventing counting should be term dependent.
#45
@
4 years ago
#51630 created for followup to the issue @Chouby reported in comment:42 above.
#47
@
4 years ago
Hopefully nothing further will come up on this but if it does, please open another ticket on the Taxonomy component and cc me in the description for followup.
It's a fairly major change in approach so will be good to track things separately, as is happening in #51630.
This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.
4 years ago
This ticket was mentioned in PR #675 on WordPress/wordpress-develop by peterwilsoncc.
4 years ago
#49
Trac ticket: https://core.trac.wordpress.org/ticket/40351
inherit
status checks when adding/removing term counts.
This ticket was mentioned in PR #676 on WordPress/wordpress-develop by peterwilsoncc.
4 years ago
#50
Trac ticket: https://core.trac.wordpress.org/ticket/40351
Sad Clown.
#51
@
4 years ago
- Keywords needs-dev-note removed
- Milestone changed from 5.6 to Future Release
- Resolution fixed deleted
- Status changed from closed to reopened
Reopening with intent to revert, the fix on #51630 is leaving me a little nervous and without it the enhancement is broken. PR to revert is up and will be committed in the next few hours.
#53
@
3 years ago
- Keywords needs-patch added; has-patch has-unit-tests removed
- Milestone changed from Future Release to 5.9
Putting this on 5.9 for wrapping up, it was very close in 5.6 but didn't quite make it.
During the 5.6 I started on a pull request with one potential approach but ended up reverting as I wasn't happy with it. Preventing double counts needs to be more specific than the original code that was reverted but there are neater was to do it than in the PR.
This ticket was mentioned in PR #1601 on WordPress/wordpress-develop by peterwilsoncc.
3 years ago
#54
- Keywords has-patch has-unit-tests added; needs-patch removed
This reverts commit 088fd3cd390a8006a88810ab43073c6a0ca25e07.
Trac ticket: TBA (leaving blank until ready to link).
#55
follow-up:
↓ 56
@
3 years ago
@peterwilsoncc Would you be so kind as to summarize the outstanding issues with this change please? Do any of the issues identified affect LTCU (and can therefore be fixed and tested in the plugin first) or only the port for core?
#56
in reply to:
↑ 55
@
3 years ago
Replying to johnbillion:
@peterwilsoncc Would you be so kind as to summarize the outstanding issues with this change please?
There's only one issue as far as I am aware.
_wp_prevent_term_counting()
was designed to prevent term counting when adding/removing terms during wp_insert_post()
and defer the changes to the transition_post_status
hook. This is to prevent double counting a change (ie, adding a count of two for a single item).
Unfortunately it proved too blunt as terms added to another object (either a different post, or another object type) went uncounted too. There are some examples in #51630.
The LTCU plugin tracks the individual object IDs for the equivalent functionality. I considered adding something like that but as WP is functional, the function's signature became rather unwieldy.
This ticket was mentioned in Slack in #core by audrasjb. View the logs.
3 years ago
#58
@
3 years ago
- Keywords early added
- Milestone changed from 5.9 to Future Release
As per today's bug scrub and since tomorrow is WP 5.9 feature freeze, I'm moving this ticket to Future release
. Also adding early
workflow keyword since this enhancement needs to be properly checked to avoid backwards compatibility issues.
cpu-usage.png shows the room for improvement on large sites.
Work (a large news organisation) enabled light-weight term counting on March 7 due to CPU alerts firing daily during the
wp_scheduled_auto_draft_delete
andwp_scheduled_delete
cron jobs. CPU use immediately dropped 30%+ during those tasks.Note: for various reasons such as the number of users, we generate a lot of auto-drafts so these results aren't typical on a standard WordPress site, this is very specific to a high usage site.
At the very least, I think we can introduce some pre-flight hooks to the term counting functions to allow customisation of the database query. I'd suggest preflight filters for
wp_update_term_count_now()
,_update_post_term_count()
and_update_generic_term_count()
.