Make WordPress Core

Opened 8 years ago

Closed 20 months ago

Last modified 17 months ago

#40613 closed enhancement (fixed)

Add query cache to WP_User_Query class

Reported by: johnjamesjacoby's profile johnjamesjacoby Owned by: spacedmonkey's profile spacedmonkey
Milestone: 6.3 Priority: normal
Severity: normal Version: 3.1
Component: Users Keywords: has-unit-tests has-patch has-dev-note
Focuses: performance Cc:

Description (last modified by johnjamesjacoby)

When calling get_users() the query itself is not cached. If multiple calls to get_users() with the same $args array occur, each will result in a database hit.

Note that each individual user object is cached, and that query caches will need to be invalidated when users are added/removed/edited.

Theoretically, on busy sites with many users who are constantly changing passwords and things, caching user queries may not yield a significant performance improvement. For the majority of installations (where users are not shifted around nearly as much) this could result in several additional cache hits per page load (especially with persistent caches in place.)

Functions like wp_dropdown_users() are called multiple times when viewing the Posts list-table, and many plugins also call get_users() knowing that user objects are cached (but not noticing the additional database hits for each call itself.)

Membership plugins like BuddyPress would have inherent benefits, and bbPress could operate more efficiently when querying for the intersection of many topics with many users who are engaged in them.

Patch imminent.

Attachments (2)

40613.patch (3.1 KB) - added by johnjamesjacoby 8 years ago.
40613.diff (6.9 KB) - added by spacedmonkey 7 years ago.

Download all attachments as: .zip

Change History (35)

#1 @johnjamesjacoby
8 years ago

  • Description modified (diff)

#2 @spacedmonkey
7 years ago

#41847 was marked as a duplicate.

@spacedmonkey
7 years ago

#3 @spacedmonkey
7 years ago

  • Keywords has-patch needs-testing added
  • Version set to 3.1

My first patch adds basic caching. It add cache invalidation for user meta, adding users to blogs and CRUD functions for users.

This changes all queries to be ids, get the whole user object. Then for fields queries, just pick off the fields out of the WP_User object that you want.

Other big change is to the cache_users function. This function is very similar to _prime_<type>_cache function. I have added new param to cache_users of $update_meta_cache to optionally prime user meta caches.

#5 @westi
7 years ago

Some thoughts on the latest patch:

  • It uses two different last_changes values - users and comments - I guess some copy paste bug.
  • As mentioned in the original ticket adding caching here is probably not going to help at all on busy multi-site installs because the cache will continually be invalidated - potentially just adding useless cache set traffic both for individual query results and updates to last_changed.
  • I wonder if a better solution to the described problems would be to introduce in-memory non-persistent caching for query results so that re-running queries in the same page generation will be neutered.
  • Ideally we should add caching only when we can demostrate a functional performance improvement not just because we can.

#6 @dd32
7 years ago

I have some serious concerns about adding caching here - it seems like it's something that for the majority of sites will fall into one of the following buckets

  • It's a small site, and get_users() is called with the same set of simple arguments, resulting in cache hits
  • It's a large site, where get_users() is seldom called but when it is it's a few simple arguments, resulting in cache hits
  • It's a large site, where get_users() is called often, with varied arguments (often complex) resulting in many cache misses due to varying keys

For the first two cases, a cache here would be useful, however, I don't think they would really need the cache at all.
The 3rd case is where caching would be thought to be needed, but if the query arguments change often then caching wouldn't actually be of benefit.
If the reasoning for this change is simply to reduce queries, that seems like the incorrect way to go, and I'd suggest that looking at what is calling get_users() often is a better case of a location for the caching.

While we could make the caching for get_users() in-memory non-persistent (like we do with say comments IIRC), I'm not convinced that it'll actually help site performance and instead just make this part of the code more complex.

To tack on to @westi's comment though - in-memory non-persistent query caching sounds like an interesting approach, and one that i believe some DB caching dropins may already do. I'd like to see that built up in a plugin myself to see exactly what impact it has on a regular site (and then a site which would actually benefit from it) past memory usage - but i suspect it wouldn't be of great benefit to the majority of installs (and probably a detriment to a lot of them).

#7 @strategio
7 years ago

We are running a big site with more than 100k users and this is causing an important performance hit for us. I think an in-memory caching would already help. For instance, a user drop-down is displayed twice in "Posts -> All Posts" and for each call, get_users is called with the same arguments. One unique query takes ~3s, so this is an important place of improvement for us.

If you still prefer not to add caching in get_users, let me know and I will open a ticket to fix it at least in WP_Posts_List_Table::inline_edit where wp_dropdown_users is called twice.

#8 @spacedmonkey
6 years ago

As caching requirements will be different for different projects, it may be best to keep this caching as a plugin. To that end, I have created a plugin that will do the caching. You can find the code here. This plugin requires a number of new filters to added in core.

I would love feedback on this plugin.

#9 @srikanthmeenakshi
3 years ago

#55223 was marked as a duplicate.

#10 @OllieJones
3 years ago

A similar issue crops up in the Gutenberg editor's REST request to populate the author dropdown.

The REST request looks like this

https://giantsite.example.com/wp-json/wp/v2/users?context=view&who=authors&per_page=50&_fields=id,name&_locale=user

Enhancing its performance requires judicious filtering at the 'rest_user_query' hook.

(Just like handling the Posts and Pages panels requires judicious filtering on 'wp_dropdown_users_args'.)

Why are these slow? They need to examine each user's wp_capabilities metadata, which holds a serialized php object. It's expensive for MariaDB / MySQL to filter on this sort of predicate.

WHERE ... meta_key = 'wp_capabilities' AND meta_value LIKE '%"author"%'

for two reasons. One is the notorious LIKE '%something' dbms performance killer. The other is the fact that meta_value columns are CLOBs, which take extra work for the dbms to retrieve. And retrieve them it must, one for every user. In a megauser installation, that is *very* slow. It also hammers the dbms internal cache, potentially slowing down everything else.

#11 follow-up: @khoipro
2 years ago

When running wp_dropdown_users with 90k users (from my project), even if we set 'role' is only administrator, the user query still slow. The alternative we must fix is providing a input text and a custom REST API to find a single user by user_login and replace input value on save.

This ticket was mentioned in PR #3544 on WordPress/wordpress-develop by @spacedmonkey.


2 years ago
#12

This is a first draft of PR for adding query caching to WP_User_Query cache. Tests need to be added and lots of manual testing.

Trac ticket: https://core.trac.wordpress.org/ticket/40613

#13 @spacedmonkey
2 years ago

  • Milestone changed from Awaiting Review to 6.2
  • Owner set to spacedmonkey
  • Status changed from new to assigned

Updated the patch to be on github.

I think core is in a place where we could introduce this without breaking anything.

#14 @OllieJones
2 years ago

For what it's worth, I have published a plugin to address many of these many-registered users issues, here. https://wordpress.org/plugins/index-wp-users-for-speed/advanced/

The core fix is still a good idea, of course.

#15 @rjasdfiii
2 years ago

Please add samples of the SQL statements necessary for this user cache. We should optimize the queries and/or improve the indexes at the same time.

#16 @spacedmonkey
2 years ago

@rjasdfiii Change databases is out of scope of this change. Let's do that in a follow up ticket.

@spacedmonkey commented on PR #3544:


2 years ago
#17

  • if users have modified the select clause

There is already a unit test for this, see test_users_pre_query_filter_should_bypass_database_query.

  • random in orderby if possible

There is no random orderby that can see.

@peterwilsoncc I pushed a lot of changes, I think this is ready for review now.

#18 @flixos90
22 months ago

  • Milestone changed from 6.2 to 6.3
  • Type changed from defect (bug) to enhancement

This is not a bug, but an enhancement. Since there have been no updates in 2 months, I'll move this to the 6.3 milestone, since the 6.2 Beta 1 deadline is tomorrow. If the PR can be finalized and approved tomorrow in time, we can move it back, but at this point it seems unrealistic given there are a few other query related enhancements already that need to be finalized by then.

#19 @spacedmonkey
22 months ago

@flixos90 For context, no update is because of a lack of feedback / code review.

Pinging @johnbillion @dd32 @peterwilsoncc for code review.

#20 in reply to: ↑ 11 @rjasdfiii
22 months ago

Replying to khoipro:

When running wp_dropdown_users with 90k users (from my project), even if we set 'role' is only administrator, the user query still slow. The alternative we must fix is providing a input text and a custom REST API to find a single user by user_login and replace input value on save.

Does that mean fetching 90K entire user objects? Or names 90K Names? And is it for displaying? Or for auto-complete?

I can think of better ways to handle auto-complete than starting with 90K strings.

#21 @mukesh27
22 months ago

This ticket was discussed in the recent performance bug scrub.

The PR 3544 need manual testing so can anyone from testing team review this. @robinwpdeveloper

Props to @spacedmonkey

#22 @mukesh27
20 months ago

  • Keywords changes-requested added

Hi @spacedmonkey, I reviewed PR and left some feedback on it.

#23 @spacedmonkey
20 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 55657:

Users: Cache database queries within WP_User_Query class.

Cache the results of database queries within WP_User_Query class. Only cache queries that are requesting 3 or less fields so that caches are not storing full user objects. Cache results are stored in a new global cache group named users-queries. Add a new parameter to WP_User_Query called cache_results to allow developers to opt out of a receiving cached results. cache_results parameter defaults to true. Also add a new helper function called wp_cache_set_users_last_changed, similar to wp_cache_set_posts_last_changed that incroments last changed value in cache group users. Ensure that wp_cache_set_users_last_changed is called whenever user / user meta is modified for proper cache invalidation.

Props johnjamesjacoby, spacedmonkey, westi, dd32, strategio, srikanthmeenakshi, OllieJones, khoipro, rjasdfiii, flixos90, mukesh27, peterwilsoncc.
Fixes #40613.

#25 @spacedmonkey
20 months ago

  • Keywords has-unit-tests needs-patch needs-dev-note added; has-patch needs-testing changes-requested removed

I think this change need it's own dev note.

#26 follow-up: @SergeyBiryukov
20 months ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Thanks for the commit!

Minor nitpick, could the users-queries group be renamed to user-queries, for consistency with other groups:

  • post-queries
  • comment-queries
  • term-queries
  • network-queries
  • site-queries

This ticket was mentioned in PR #4359 on WordPress/wordpress-develop by @spacedmonkey.


20 months ago
#27

  • Keywords has-patch added; needs-patch removed

#28 in reply to: ↑ 26 @spacedmonkey
20 months ago

Replying to SergeyBiryukov:

Thanks for the commit!

Minor nitpick, could the users-queries group be renamed to user-queries

Good call. I have created a PR to change this group. https://github.com/WordPress/wordpress-develop/pull/4359

#29 @spacedmonkey
20 months ago

In 55680:

Users: Change cache group from users-queries to user-queries.

The cache group users-queries was added in [55657]. This global cache group, was named to be inline with cache groups added in [55526]. However, the naming of the group does not match, as other cache groups, do not end with s at the end. This change fix this naming.

Props spacedmonkey, SergeyBiryukov, peterwilsoncc.
See #40613.

@spacedmonkey commented on PR #4359:


20 months ago
#30

Committed.

#31 @spacedmonkey
20 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

#33 @flixos90
17 months ago

  • Keywords has-dev-note added; needs-dev-note removed
Note: See TracTickets for help on using tickets.