Make WordPress Core

Opened 8 years ago

Closed 3 days ago

#39242 closed enhancement (fixed)

Add caching to count_user_posts()

Reported by: johnjamesjacoby's profile johnjamesjacoby Owned by: spacedmonkey's profile spacedmonkey
Milestone: 6.8 Priority: normal
Severity: normal Version: 3.0
Component: Users Keywords: has-patch has-unit-tests needs-testing-info
Focuses: rest-api, performance Cc:

Description

The count_user_posts() function does not cache the results of its query.

On the surface, this seems easy enough, but since it accepts multiple parameters and intersects with page/post editing, it starts to get complex quickly as authors change, posts are transitioned from public to private, and custom post types do more elaborate things.

Since some themes use this directly (Twenty Fourteen, et al) and the REST API exposes this data too, there's at least a small win from not hitting the database for each and every check.

Eventually, count_many_users_posts() may be able to check for cached counts for existing users, and only query for (and prime the caches of) users who do not already have cached counts.

Attachments (5)

39242.diff (1.4 KB) - added by milindmore22 7 years ago.
Cache for count_user_posts
39242.2.diff (1.3 KB) - added by desrosj 5 years ago.
39242.3.diff (3.4 KB) - added by johnregan3 3 years ago.
Not fully functional. Unit tests added, but others are failing.
39242.4.diff (4.7 KB) - added by wpgurudev 2 years ago.
Refactor count_user_posts function
39242-unit-tests.diff (8.9 KB) - added by wpgurudev 2 years ago.
Refactored function with unit tests

Download all attachments as: .zip

Change History (56)

#1 @DJPaul
8 years ago

We're seeing this uncached function be an issue on a A8C VIP hosted site when querying the REST API users endpoint.

@milindmore22
7 years ago

Cache for count_user_posts

#2 @milindmore22
7 years ago

  • Focuses rest-api performance added

Yes, performances of websites using REST is highly affected.

I have added a patch based on WP VIP function below.

<?php

/**
 * Cached version of count_user_posts, which is uncached but doesn't always need to hit the db
 *
 * count_user_posts is generally fast, but it can be easy to end up with many redundant queries
 * if it's called several times per request. This allows bypassing the db queries in favor of
 * the cache
 */
function wpcom_vip_count_user_posts( $user_id ) {
    if ( ! is_numeric( $user_id ) ) {
        return 0;
    }

    $cache_key = 'vip_' . (int) $user_id;
    $cache_group = 'user_posts_count';

    if ( false === ( $count = wp_cache_get( $cache_key, $cache_group ) ) ) {
        $count = count_user_posts( $user_id );

        wp_cache_set( $cache_key, $count, $cache_group, 5 * MINUTE_IN_SECONDS );
    }

    return $count;
}

Do we need proper cache invalidation or 5 Min expiration timeout will do ?

I think we should handle this in 5.0 thoughts ?

#3 @RogerTheriault
5 years ago

Just came here to chime in that this (or a similar) improvement is still needed for sites that have many rows in the posts table and use the REST API.

#4 @whyisjake
5 years ago

  • Keywords has-patch dev-feedback added; needs-patch 2nd-opinion removed
  • Milestone changed from Awaiting Review to 5.6
  • Owner set to whyisjake
  • Status changed from new to accepted
  • Version set to trunk

@desrosj
5 years ago

#5 @desrosj
5 years ago

  • Version changed from 5.5 to 3.0

Changing the version to 3.0 because that's the first version the function existed without caching.

Looking at 39242.diff, it doesn't take into account the $post_type or $public_only parameters.

Also, looking at other cached values in core, it seems that not specifying an expiration in favor of invalidating the cache when appropriate is preferred. In this case, when a post is updated (published/unpublished, or visibility is changed).

39242.2.diff adds the post type to the cache key, and introduces an additional group for when $public_only is true.

I toyed around with where to invalidate caches for a while. Somewhere inside clean_user_cache() feels like the right spot. But there is no way to pass the two additional parameters. Inside wp_insert_post() should be more feasible.

#6 @johnbillion
5 years ago

  • Keywords needs-unit-tests needs-testing added

This would benefit greatly from tests that cover repeated calls, cache invalidation, etc.

This ticket was mentioned in Slack in #core by metalandcoffee. View the logs.


4 years ago

#8 @hellofromTonya
4 years ago

Hey @whyisjake, With beta 1 coming fast, is this ticket still on your radar for 5.6 milestone? If yes, what's the next step for it?

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


4 years ago

#10 @hellofromTonya
4 years ago

  • Milestone changed from 5.6 to Future Release

In today's core scrub, decision made to punt this ticket to Future Release as 5.6 Beta 1 is 5 days (next Tuesday 20th Oct) and this ticket needs tests.

If any maintainer or committer feels this can be resolved in time, or wishes to assume ownership during a specific cycle, feel free to update the milestone accordingly.

#11 @TimothyBlynJacobs
4 years ago

#46249 was marked as a duplicate.

#12 @Krstarica
3 years ago

REST API is still terribly slow, is it possible to fix this issue? Thanks.

@johnregan3
3 years ago

Not fully functional. Unit tests added, but others are failing.

#13 @johnregan3
3 years ago

Modified the code so that if an array is passed as the $post_type param, the cache key name will be generated correctly.

I also composed initial Unit Tests.

However, the tests currently come back as such:

There were 3 failures:

1) Tests_Media::test_wp_get_attachment_image_should_use_wp_get_attachment_metadata
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'<img width="999" height="999" src="http://example.org/wp-content/uploads/2022/07/test-image-testsize-999x999.jpg" class="attachment-testsize size-testsize" alt="" decoding="async" loading="lazy" srcset="http://example.org/wp-content/uploads/2022/07/test-image-testsize-999x999.jpg 999w, http://example.org/wp-content/uploads/2022/07/test-image-large-150x150.jpg 150w" sizes="(max-width: 999px) 100vw, 999px" />'
+'<img width="999" height="999" src="http://example.org/wp-content/uploads/2022/07/test-image-testsize-999x999.jpg" class="attachment-testsize size-testsize" alt="" decoding="async" loading="lazy" srcset="http://example.org/wp-content/uploads/2022/07/test-image-testsize-999x999.jpg 999w, http://example.org/wp-content/uploads/2022/07/test-image-large-1-150x150.jpg 150w" sizes="(max-width: 999px) 100vw, 999px" />'

/var/www/tests/phpunit/tests/media.php:2646

2) WP_Test_REST_Users_Controller::test_get_item_published_author_pages
Failed asserting that 401 is identical to 200.

/var/www/tests/phpunit/tests/rest-api/rest-users-controller.php:3128
/var/www/tests/phpunit/tests/rest-api/rest-users-controller.php:1152

3) Tests_User::test_count_many_users_posts
Failed asserting that '1' matches expected 2.

/var/www/tests/phpunit/tests/user.php:573

I'll continue tracking down these issues, but feel free to jump in and add feedback or your own ideas.

#14 @johnregan3
3 years ago

The reason unit test # 2 above is failing is because a post is being created and the value returned by count_user_posts() is returning the cached (former) value.

Looks like this will need an action hooked into (probably) 'save_post' that clears this cache. However, as the cache key is dependent upon the post type(s) are requested, it's trickier to guess the precise cache key we need to delete when a post is saved.

I'd love some input on the best way to approach this.

#15 @Krstarica
2 years ago

We have $cache_key = "count_user_{$post_type_label}_{$userid}"; plus taking into account $public_only param.

On save_post we actually don't know if post_type or user_is has changed, so the safest would be to delete cache for all count_user_% keys, although wp_cache_delete function does not support wildcards.

#16 @wpgurudev
2 years ago

  1. I applied the patch from @johnregan3 , I think the overall approach is fine. The only issue seems that it is not handling certain events which may need to invalidate this cache and recalculate the count.
  1. IMO, we can introduce an additional parameter $force = false to count_user_posts function as the last parameter with default value false to be backward compatible.
  1. If $force is set to true, function would recalculate the user posts count and set it as a new entry in the cache.
  1. This will be useful because we can simply call count_user_posts function with $force set to true on several hooks like save_post, deleted_post etc. to recalculate user posts count. This will further abstract the entire logic of calculating the user posts count without need to know the cache key and group names.

Further Challenges

  1. The main issue which I can observe is when we have multiple post types. Consider a situation where we pass
array( 'post', 'page', 'product' )

as $post_type to count_user_posts function.

  1. Now, consider if we delete a product then we need to update the cache count_user_post_page_product, but the challenge is how do we identify this combination, because we can easily update count_user_product, but we cannot be sure about post_page_product combination as it can be anything like post_product, page_product etc. as well.

@wpgurudev
2 years ago

Refactor count_user_posts function

#17 @wpgurudev
2 years ago

I have refactored the count_user_posts function to two smaller functions

  1. count_user_posts_for_single_type: checks count of posts for a single post type. This function leverages the wp cache.
  1. count_user_posts_for_multiple_types to check count of posts for multiple post types. This iterates over each post type and calls count_user_posts_for_single_type function in turn.

In addition to this, I have also added clear_count_user_posts_cache function which clears the cache. This function can be called on several conditions like author change of posts, visibility change etc.

I have added one such instance where author change will clear the cache. Some other instances can include

  1. Post visibility change from public to private or vice-versa.
  2. User deletion.
  3. User role change.

and so on..

We can keep adding functions and hook them to appropriate actions to clear the cache using clear_count_user_posts_cache function.

As of now, I'm clearing both the public and non-public posts count cache to keep things simplified.

@wpgurudev
2 years ago

Refactored function with unit tests

This ticket was mentioned in PR #7993 on WordPress/wordpress-develop by @swissspidy.


4 months ago
#18

  • Keywords has-unit-tests added; needs-unit-tests removed

This is just a refresh of the previous patches & tests on the ticket. More work & testing is needed though.

Trac ticket: https://core.trac.wordpress.org/ticket/39242

#19 @swissspidy
3 months ago

#62774 was marked as a duplicate.

#20 @spacedmonkey
2 months ago

@swissspidy What is the status of this PR? I feel like we are nearly there with this PR, but need to align on the cache invalidation strategy.

#21 @swissspidy
2 months ago

Haven't had time to look into it further unfortunately. Yes the cache invalidation needs to be resolved still.

This ticket was mentioned in PR #8233 on WordPress/wordpress-develop by @spacedmonkey.


8 weeks ago
#22

Simple caching based on last updated values changing.

Trac ticket: https://core.trac.wordpress.org/ticket/39242

#23 @spacedmonkey
8 weeks ago

I have created a simplier PR in #8233. This aligns more closely with other query caches, that use the last update value of the post group to invalidate caches.

#24 @peterwilsoncc
7 weeks ago

  • Milestone changed from Future Release to 6.8

I've approved the linked pull request #8233 for commit.

I'm happy it addresses the issues highlighted earlier as it uses a hash of the SQL query to generate the cache key and includes the posts' last-changed value to break the cache as authors are potentially reassigned or posts marked private.

I've confirmed that the users' last-changed value isn't needed as the function doesn't take in to account whether or not the user ID exists. Strange but true. When a user is deleted and their posts reassigned the post last-change value is bumped.

@swissspidy commented on PR #7993:


7 weeks ago
#25

Closing in favor of #8233

@spacedmonkey commented on PR #8233:


7 weeks ago
#26

@JJJ @swissspidy Are you interested in reviewing PR?

#27 @spacedmonkey
7 weeks ago

@johnjamesjacoby I believe #8233 is getting close. As the original reporter, do you want to take a look at this PR?

@swissspidy commented on PR #8233:


7 weeks ago
#28

@spacedmonkey Had a quick glance earlier today and I think it's a neat way to add this enhancement without having to make too many changes like in the other PRs. You have plenty of approvals already though :-)

@spacedmonkey commented on PR #8233:


7 weeks ago
#29

@spacedmonkey Had a quick glance earlier today and I think it's a neat way to add this enhancement without having to make too many changes like in the other PRs. You have plenty of approvals already though :-)

It more to check to see if you are strongly against it. I know you were worried about this cache being invalidated a lot, but honestly I dont see a way around it.

#30 @spacedmonkey
7 weeks ago

  • Keywords commit added; dev-feedback removed
  • Owner changed from whyisjake to spacedmonkey
  • Status changed from accepted to assigned

This has been approved by multiple core committers. Marking this as ready to commit.

@johnjamesjacoby commented on PR #8233:


7 weeks ago
#31

@JJJ @swissspidy Are you interested in reviewing PR?

Happy to!

Incrementally, I like it. Specifically:

  • Using last-changed is right
  • Tests look right

### Tangential thoughts while reviewing...

  • Do we validate $post_type in any other similar queries?
  • Does it matter if the post types are: not registered, private, internal, etc...
  • Would adding a filter on $post_type be useful, or expected?

Long term, I also like that @swissspidy included count_many_users_posts() in #7993, because it has potential - I think – to speed up the Users list table.

Perhaps that gets its own issue/ticket?

@peterwilsoncc commented on PR #8233:


7 weeks ago
#32

  • Do we validate $post_type in any other similar queries?

Sorry, I'm unclear what you are asking here but I possibly answer your question next...

  • Does it matter if the post types are: not registered, private, internal, etc...

It doesn't within count_user_posts() as the data is used by get_posts_by_author_sql() which validates the post type exists.

  • Would adding a filter on $post_type be useful, or expected?

Possibly useful but as a follow up ticket.

@logicrays commented on PR #8233:


7 weeks ago
#33

we can get the data using the bellow code and implement on it.

$user_id = get_current_user_id(); 
$post_count = count_user_posts($user_id);

@dilip2615 commented on PR #8233:


7 weeks ago
#34

<?php

/

  • Tests for count_user_posts function. *
  • @group user *
  • @ticket 39242 */

class Tests_User_CountUserPosts extends WP_UnitTestCase {

/

  • User count should work for users that don't exist but have posts assigned. */

public function test_count_user_posts_for_non_existent_user() {

$next_user_id = self::$user_id + 1;

Create a post assigned to a non-existent user.
self::factory()->post->create(

array(

'post_author' => $next_user_id,
'post_type' => 'post',

)

);

$this->assertEquals( 1, count_user_posts( $next_user_id ) );

}

/

  • Cached user count value should be accurate after user is created. */

public function test_count_user_posts_for_user_created_after_being_assigned_posts() {

$next_user_id = self::$user_id + 1;

Assign a post to a non-existent user.
self::factory()->post->create(

array(

'post_author' => $next_user_id,
'post_type' => 'post',

)

);

Cache the user count before creating the user.
$cached_count = count_user_posts( $next_user_id );

Now create the user.
$real_next_user_id = self::factory()->user->create(

array(

'role' => 'author',

)

);

$this->assertEquals( $next_user_id, $real_next_user_id );
$this->assertEquals( $cached_count, count_user_posts( $next_user_id ) );

}

/

  • User count cache should be hit regardless of post type order. */

public function test_cache_should_be_hit_regardless_of_post_type_order() {

Prime the cache by running the function once.
count_user_posts( self::$user_id, array( 'wptests_pt', 'post' ) );

Check database query count before running again.
global $wpdb;
$start_queries = $wpdb->num_queries;

Query with different order of post types.
count_user_posts( self::$user_id, array( 'post', 'wptests_pt' ) );

$end_queries = $wpdb->num_queries;

Expecting no new queries, since cache should be hit.
$this->assertEquals( $start_queries, $end_queries );

}

/

  • User count cache should be hit for string and array of post types. */

public function test_cache_should_be_hit_for_string_and_array_equivalent_queries() {

Prime the cache.
count_user_posts( self::$user_id, 'post' );

global $wpdb;
$start_queries = $wpdb->num_queries;

Query using an array instead of a string.
count_user_posts( self::$user_id, array( 'post' ) );

$end_queries = $wpdb->num_queries;

Expecting no new queries due to cache.
$this->assertEquals( $start_queries, $end_queries );

}

}

We can also use in tests/phpunit/tests/user/countUserPosts.php

I refactored the test methods for better clarity while keeping the functionality intact. I also used assertEquals() where type coercion is acceptable and replaced unnecessary string comparisons to make the tests more robust. Additionally, I added $wpdb->num_queries to track database queries before and after calls, ensuring that the cache is being utilized properly. Let me know if you need any further refinements!

@peterwilsoncc commented on PR #8233:


7 weeks ago
#35

@dilipom13 Each of those test cases are currently covered by the test suite.

WordPress aims to maintain the return type of functions, so the tests need to use assertSame(). ::test_count_user_posts_for_user_created_after_being_assigned_posts() needs to use the auto_increment value from the database as the value can change as users are created and subsequently deleted in the test suite.

@logicrays commented on PR #8233:


7 weeks ago
#36

we can get the data using the bellow code and implement on it.

$user_id = get_current_user_id(); 
$post_count = count_user_posts($user_id);

This ticket was mentioned in Slack in #core-performance by spacedmonkey. View the logs.


6 weeks ago

#38 @spacedmonkey
6 weeks ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 59817:

Users: Add caching to count_user_posts function

Introduced caching for the count_user_posts function to reduce redundant database queries. This ensures better performance by storing and reusing query results when possible. Additionally, sanitized and sorted the $post_type array to avoid invalid queries.

Props spacedmonkey, peterwilsoncc, mamaduka, flixos90, johnjamesjacoby, swissspidy, dilip2615, johnregan3, wpgurudev, desrosj, milindmore22, Krstarica, dilipom13.
Fixes #39242.

#39 @peterwilsoncc
4 weeks ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

@spacedmonkey @johnjamesjacoby Reviewing #63004, I noticed that count_many_users_posts() doesn't use the cache. That's probaly fine for now it occurs to me that we might need to change the cache key to allow for easier generation between the two functions if it's changed in the future. (Checks notes, and yes I was encouraging the use of the query for the cache key earlier :)

#40 @peterwilsoncc
4 weeks ago

The following is used to generate the query string

  • $post_type
  • $userid
  • $public_only
  • if public only
    • current user's post caps for each post type (after passing through the pub_priv_sql_capability deprecated filter)
    • current user id

So it would be a mess to keep them in sync... maybe it's fine and we can think about counting many users later. Jo(h)n, I'll leave you two to decide.

#41 @spacedmonkey
4 weeks ago

I think we should create another for handling caching count_many_users_posts . Maybe these could use the count cache group.

I don’t think that is related to this ticket.

#42 @flixos90
4 weeks ago

  • Resolution set to fixed
  • Status changed from reopened to closed

I agree this makes more sense to implement as a separate enhancement, so I opened #63045 as follow up ticket for it.

Closing this as fixed.

#43 @spacedmonkey
8 days ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

While conducting some testing on the count_user_posts function, I observed that the cache is frequently being invalidated. I believe it would be beneficial to implement a mechanism that reduces the frequency of cache invalidation, thereby preventing the object cache from being filled with these invalid cache keys.

Reopenning so we can work on a fix. CC @peterwilsoncc @joemcgill

This ticket was mentioned in PR #8549 on WordPress/wordpress-develop by jonnynews.


8 days ago
#44

Updated caching logic to use user-specific cache groups for post queries, improving granularity and efficiency. Ensured proper cache invalidation for post_author when relevant user data changes. Adjusted author-based SQL checks to ensure compatibility with cache updates.

Added tests for null, zero, false and float values, as was not tested before.

Trac ticket: https://core.trac.wordpress.org/ticket/39242

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


8 days ago

This ticket was mentioned in Slack in #core by spacedmonkey. View the logs.


7 days ago

#47 @SirLouen
7 days ago

  • Keywords needs-testing-info added; needs-testing removed

@spacedmonkey since this patch has been bouncing for several years, could you add some testing instructions to refresh the current patch iteration?

#48 @peterwilsoncc
5 days ago

  • Keywords commit removed

Removing keyword for report management due to follow up PR.

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


4 days ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 days ago

#51 @audrasjb
3 days ago

  • Resolution set to fixed
  • Status changed from reopened to closed

As discussed during today's pre-RC1 bug scrub, we are closing it as fixed in 6.8. Please open a new ticket referring to this one to address follow-up enhancements in 6.9./

Note: See TracTickets for help on using tickets.