Make WordPress Core

Opened 5 years ago

Last modified 3 years ago

#40613 new defect (bug)

Add query cache to WP_User_Query class

Reported by: johnjamesjacoby Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 3.1
Component: Users Keywords: has-patch needs-testing
Focuses: performance Cc:

Description (last modified by johnjamesjacoby)

When calling get_users() the query itself is not cached. If multiple calls to get_users() with the same $args array occur, each will result in a database hit.

Note that each individual user object is cached, and that query caches will need to be invalidated when users are added/removed/edited.

Theoretically, on busy sites with many users who are constantly changing passwords and things, caching user queries may not yield a significant performance improvement. For the majority of installations (where users are not shifted around nearly as much) this could result in several additional cache hits per page load (especially with persistent caches in place.)

Functions like wp_dropdown_users() are called multiple times when viewing the Posts list-table, and many plugins also call get_users() knowing that user objects are cached (but not noticing the additional database hits for each call itself.)

Membership plugins like BuddyPress would have inherent benefits, and bbPress could operate more efficiently when querying for the intersection of many topics with many users who are engaged in them.

Patch imminent.

Attachments (2)

40613.patch (3.1 KB) - added by johnjamesjacoby 5 years ago.
40613.diff (6.9 KB) - added by spacedmonkey 4 years ago.

Download all attachments as: .zip

Change History (10)

#1 @johnjamesjacoby
5 years ago

  • Description modified (diff)

#2 @spacedmonkey
4 years ago

#41847 was marked as a duplicate.

4 years ago

#3 @spacedmonkey
4 years ago

  • Keywords has-patch needs-testing added
  • Version set to 3.1

My first patch adds basic caching. It add cache invalidation for user meta, adding users to blogs and CRUD functions for users.

This changes all queries to be ids, get the whole user object. Then for fields queries, just pick off the fields out of the WP_User object that you want.

Other big change is to the cache_users function. This function is very similar to _prime_<type>_cache function. I have added new param to cache_users of $update_meta_cache to optionally prime user meta caches.

#5 @westi
4 years ago

Some thoughts on the latest patch:

  • It uses two different last_changes values - users and comments - I guess some copy paste bug.
  • As mentioned in the original ticket adding caching here is probably not going to help at all on busy multi-site installs because the cache will continually be invalidated - potentially just adding useless cache set traffic both for individual query results and updates to last_changed.
  • I wonder if a better solution to the described problems would be to introduce in-memory non-persistent caching for query results so that re-running queries in the same page generation will be neutered.
  • Ideally we should add caching only when we can demostrate a functional performance improvement not just because we can.

#6 @dd32
4 years ago

I have some serious concerns about adding caching here - it seems like it's something that for the majority of sites will fall into one of the following buckets

  • It's a small site, and get_users() is called with the same set of simple arguments, resulting in cache hits
  • It's a large site, where get_users() is seldom called but when it is it's a few simple arguments, resulting in cache hits
  • It's a large site, where get_users() is called often, with varied arguments (often complex) resulting in many cache misses due to varying keys

For the first two cases, a cache here would be useful, however, I don't think they would really need the cache at all.
The 3rd case is where caching would be thought to be needed, but if the query arguments change often then caching wouldn't actually be of benefit.
If the reasoning for this change is simply to reduce queries, that seems like the incorrect way to go, and I'd suggest that looking at what is calling get_users() often is a better case of a location for the caching.

While we could make the caching for get_users() in-memory non-persistent (like we do with say comments IIRC), I'm not convinced that it'll actually help site performance and instead just make this part of the code more complex.

To tack on to @westi's comment though - in-memory non-persistent query caching sounds like an interesting approach, and one that i believe some DB caching dropins may already do. I'd like to see that built up in a plugin myself to see exactly what impact it has on a regular site (and then a site which would actually benefit from it) past memory usage - but i suspect it wouldn't be of great benefit to the majority of installs (and probably a detriment to a lot of them).

#7 @strategio
4 years ago

We are running a big site with more than 100k users and this is causing an important performance hit for us. I think an in-memory caching would already help. For instance, a user drop-down is displayed twice in "Posts -> All Posts" and for each call, get_users is called with the same arguments. One unique query takes ~3s, so this is an important place of improvement for us.

If you still prefer not to add caching in get_users, let me know and I will open a ticket to fix it at least in WP_Posts_List_Table::inline_edit where wp_dropdown_users is called twice.

#8 @spacedmonkey
3 years ago

As caching requirements will be different for different projects, it may be best to keep this caching as a plugin. To that end, I have created a plugin that will do the caching. You can find the code here. This plugin requires a number of new filters to added in core.

I would love feedback on this plugin.

Note: See TracTickets for help on using tickets.