WordPress.org

Make WordPress Core

Opened 20 months ago

Last modified 9 days ago

#38741 accepted enhancement

Introduce the concept of a large site in order to speed up the Users screen when there are many users

Reported by: johnbillion Owned by: johnbillion
Milestone: 5.0 Priority: normal
Severity: normal Version:
Component: Users Keywords: early has-patch dev-feedback needs-testing
Focuses: administration, multisite, performance Cc:

Description

Given a single site install with a high number of users, the Users screen can become very slow due to the nasty query performed by count_users(). The result of this query is used to populate the count of users with each role in the role filtering links across the top.

SELECT
	COUNT(NULLIF(`meta_value` LIKE '%\"administrator\"%', false)),
	COUNT(NULLIF(`meta_value` LIKE '%\"editor\"%', false)),
	COUNT(NULLIF(`meta_value` LIKE '%\"author\"%', false)),
	COUNT(NULLIF(`meta_value` LIKE '%\"contributor\"%', false)),
	COUNT(NULLIF(`meta_value` LIKE '%\"subscriber\"%', false)),
	COUNT(NULLIF(`meta_value` = 'a:0:{}', false)), COUNT(*)
FROM
	wp_usermeta
WHERE
	meta_key = 'wp_capabilities'

On a site with tens of thousands of users, this SQL query can easily take tens of seconds.

Multisite has the concept of a large network (by default, one with > 10,000 users or > 10,000 sites), but single site doesn't have this concept.

The query performed by count_users() can be skipped entirely. If it is skipped, the list of roles will simply display all available roles and won't display a count next to each role. I think this is a completely acceptable trade-off for such a performance gain on sites with a high number of users.

Attachments (9)

38741.diff (3.9 KB) - added by johnbillion 20 months ago.
38741.2.diff (5.5 KB) - added by tharsheblows 19 months ago.
with wp_is_large_site()
38741.3.diff (6.3 KB) - added by tharsheblows 14 months ago.
with tests and cleaned up a bit
38741.4.diff (8.5 KB) - added by tharsheblows 14 months ago.
renamed functions and apply filter before ms one
38741.5.diff (8.9 KB) - added by johnbillion 9 months ago.
38741.6.diff (8.4 KB) - added by johnbillion 9 months ago.
38741.7.diff (9.1 KB) - added by johnbillion 9 months ago.
38741.8.diff (12.2 KB) - added by spacedmonkey 12 days ago.
38741.9.diff (15.6 KB) - added by spacedmonkey 9 days ago.

Download all attachments as: .zip

Change History (51)

@johnbillion
20 months ago

#1 @johnbillion
20 months ago

  • Keywords has-patch needs-testing added

38741.diff is a partial patch which implements the performance improvement on the Users screen, but only implements a no-oped version of a wp_is_large_site() function.

To test the patch, change the return value of wp_is_large_site() to true.

TODO: Implement logic in wp_is_large_site() which determines if the site has a large number of users or not. This will probably need to be converted to a cached value in the same way that Multisite uses a cached value for wp_is_large_network().

#2 @Mista-Flo
20 months ago

+1 on this, we should focus on this performance issue.

#4 @tharsheblows
19 months ago

There's also an instance in wp-includes/update.php so I've updated that one too. This makes the count a transient with a max expiration of 12 hours and takes the query down from about 2.5s to 50.3ms with user table of 184k.

I'm not sure if:

  1. the value should be a transient
  2. the expiration time should be filterable

Still needs testing and tests.

@tharsheblows
19 months ago

with wp_is_large_site()

@tharsheblows
14 months ago

with tests and cleaned up a bit

#5 @tharsheblows
14 months ago

With tests and cleaned up a bit

#6 @johnbillion
14 months ago

  • Keywords has-unit-tests added
  • Milestone changed from Awaiting Review to 4.8

#7 @tharsheblows
14 months ago

Updated function names and applies before the 'wp_is_large_network' too.

@tharsheblows
14 months ago

renamed functions and apply filter before ms one

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


14 months ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


13 months ago

This ticket was mentioned in Slack in #core by obenland. View the logs.


13 months ago

#11 @obenland
13 months ago

  • Milestone changed from 4.8 to Future Release

#12 @johnbillion
13 months ago

Needs some more testing and review.

#13 @johnbillion
10 months ago

@tharsheblows I just remembered that we already have get_user_count() which stores a count of all the users on the network. This can be used in place of the new user count functions that you've added in your patches.

Do you want to update your patch? I'd like to see about getting this into 4.9.

#14 @tharsheblows
10 months ago

@johnbillion that's multisite only, isn't it? In ms-functions.php.

I'd love to get this in 4.9! :)

@johnbillion
9 months ago

@johnbillion
9 months ago

@johnbillion
9 months ago

#15 @miss_jwo
9 months ago

Looks good to me. Thansk for changing the function name to wp_is_large_user_count() as the original suggestion would have been misleading in many ways.

#16 @johnbillion
9 months ago

  • Keywords needs-testing removed
  • Milestone changed from Future Release to 4.9
  • Owner set to johnbillion
  • Status changed from new to reviewing

#17 @johnbillion
9 months ago

  • Resolution set to fixed
  • Status changed from reviewing to closed

In 41613:

Users: Introduce the concept of a large site in order to speed up the Users screen when there are many users.

Calling the count_users() function is expensive, regardless of the counting strategy that's used, and it gets
slower the more users there are on a site. In order to speed up the Users screen in the admin area, calling
count_users() can be avoided entirely while still displaying the total count for users.

This introduces some new functions:

  • wp_is_large_user_count()
  • wp_get_active_user_count()
  • wp_update_active_user_count()

A corresponding wp_is_large_user_count filter is also introduced.

Props tharsheblows, johnbillion

Fixes #38741

#18 @johnbillion
9 months ago

In 41614:

Users: There is not, in fact, 12345 users on every WordPress installation.

Props spacedmonkey

See #38741

#19 @johnbillion
9 months ago

  • Keywords needs-patch needs-unit-tests added; has-patch has-unit-tests removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

Some issues have arisen:

  • The function names need improving as nothing here relates to whether a user is "active" or not.
  • COUNT(ID) should be used to better utilise the table index.
  • The tests are failing ¯\_(ツ)_/¯

#20 @johnbillion
9 months ago

In 41623:

Users: Remove the failing tests added in [41613] while they're investigated.

See #38741

#21 @jrchamp
9 months ago

If you don't want to worry about the right field to use to match the available indexes, COUNT(1) might be a simpler, consistent option that the database can optimize easily.

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


9 months ago

#23 @johnbillion
9 months ago

  • Focuses multisite added
  • Status changed from reopened to accepted

I'm going back to the idea of moving get_user_count() out of the Multisite files and making it available for single site installations too. Otherwise we have separate functions (get_user_count() and the soon-to-be-renamed wp_get_active_user_count()) for essentially doing the same thing.

Patch coming up.

Adding the multisite focus even though this functionality doesn't actually affect Multisite.

#24 @johnbillion
9 months ago

  • Keywords needs-revert added; needs-patch needs-unit-tests removed

Regrettably I think this needs some more work and I don't want to rush it in today. I'm going to revert this and get it into 5.0 early.

#25 @johnbillion
9 months ago

In 41753:

Users: Revert [41613], [41614], and [41623] as this feature needs some more work.

See #38741

#26 @johnbillion
9 months ago

  • Keywords early needs-patch added; needs-revert removed
  • Milestone changed from 4.9 to Future Release

#27 @johnbillion
8 months ago

  • Milestone changed from Future Release to 5.0

This ticket was mentioned in Slack in #core-multisite by flixos90. View the logs.


7 months ago

#29 @tharsheblows
7 months ago

I'm currently using 38741.7.diff(ish — with 4.8.3) on a production site and will let you know if it all goes horribly wrong. (It hasn't yet.)

#30 @flixos90
2 months ago

@johnbillion I like the idea of tying in more with the existing multisite functionality, and I actually prefer the initial name wp_is_large_site() as it would align well with the existing wp_is_large_network(). It could have a parameter defaulting to users, which determines what criteria to use to determine whether the site is considered large. This would make it future-proof for other similar ideas (maybe wp_is_large_site( 'posts' ) could come in handy at some point too. I don't like the wp_is_large_user_count() as it's unclear how that works (it counts all users, in multisite that's the entire setup). I also don't think we should need a separate additional filter to wp_is_large_network() just for that purpose. The new function should be there for a site. It should be available for a single site, but just as useful when having multisite enabled.

I propose the following:

  • Introduce wp_is_large_site( $using = 'users', $site_id = null ). The second parameter defaults to the current site, but allows to do the check for another site in multisite.
  • The option where the number is stored should be called user_count. I don't see why the 'active' is needed here at all.
  • The SQL query to count the users can remain as is for single site (single all users = site users here), but in multisite it should have a meta table join and do WHERE umeta.meta_key = 'wp_X_capabilities' (depending on the site checked).
  • wp_update_active_user_count() should be wp_update_site_user_counts( $site_id = null ) instead, to align with wp_update_network_user_counts( $network_id = null ).
  • I really like the idea of leveraging get_user_count() for sites as well, however I'm not sure how to sanely do so as it already has a $network_id as first parameter. We could create get_site_user_count() but that'd be kind of incoherent (unless we'd also have get_network_user_count() - but maybe we could consider refactoring this).

#31 @tharsheblows
7 weeks ago

@flixos90 It'd be useful for me to take a step back and clarify what wp_is_large_site('users') is meant to do – in my mind it's a kill switch for functions that are too expensive to run when a table is large, eg count_users() on a single site install. In this context only the size of the table matters. For example, if a site in a multisite network only has 10 users but the network has 20,000 then it's not a good idea to run a slow query on it.

My worry is that people will use wp_is_large_site( 'users' ) as a check before running expensive queries on both single site and multisite installs when this might not be what they wanted. One possible workaround is to use wp_is_large_network() as the multisite check for wp_is_large_site('users') rather than the query on usermeta when it's used on multisite and keep the rest the same because those are useful functions. Extending to posts and other tables could then be done on a site by site basis.

This ticket was mentioned in Slack in #core-multisite by tharsheblows. View the logs.


4 weeks ago

This ticket was mentioned in Slack in #core-multisite by spacedmonkey. View the logs.


3 weeks ago

#34 @flixos90
3 weeks ago

We reviewed and discussed this in today's multisite office-hours (see https://wordpress.slack.com/archives/C03BVB47S/p1527609680000818 for the chat log). Here are the decisions made during the conversation:

  • Multisite's get_user_count() has historically contained the global user count already. While it's stored on every network, it is actually a global count. Let's make this function available for single site.
    • The function supports an optional $network_id parameter. This was just recently introduced (by myself, and I regret it given these circumstances), and it doesn't affect the outcome of the function, so we can deprecate it again. This should no longer be considered a network-specific function.
    • On single-site, the user_count should be stored as a regular option. When setting up multisite, it should be migrated to become a network setting.
  • wp_is_large_user_count() should be introduced as a globally available function to check the user count. It should call get_user_count() to get its value and use 10000 as a maximum border by default. A filter should be present to modify the result of this function.
  • The multisite function wp_is_large_network() should be updated to call wp_is_large_user_count() if called with 'users' as the metric parameter. It won't have any functional change, just get rid of some logic that would otherwise be duplicated.
  • wp_is_large_user_count() should be used in count_users() to short-circuit as needed.
Last edited 3 weeks ago by flixos90 (previous) (diff)

@spacedmonkey
12 days ago

#35 @spacedmonkey
12 days ago

  • Keywords has-patch dev-feedback needs-testing added; needs-patch removed

38741.8.diff is a completely new patch based on the feedback from @flixos90 and @jeremyfelt in multisite chat.

What this does, it makes get_user_count (and related functions) available in single site. This also means that there is a scheduled event that runs on single site twice a day now to get this count and store in options. This is a better implementation than doing a count on every site on a multisite, as for large networks, running this on every site is expensive.

If the site is seen as large, it now doesn't include counts and all the registered roles are displayed. This is a functional change and may require some discussion.

#36 @tharsheblows
12 days ago

Moving get_user_count to wp-includes/functions.php so it's available for single sites might cause issues with plugins using the function name to do the same thing. I vaguely remember discussing this with @johnbillion at a contributor day which is why the initial patches didn't do that. Sorry that discussion wasn't included here!

#37 @spacedmonkey
12 days ago

Well I simple solution to that problem would be calling the function in wp_get_global_user_count, which is a better name, deprecate get_user_count and get it call wp_get_global_user_count inside it.

#38 @flixos90
12 days ago

Even more than with registered post types, metadata, options and anything else, declaring functions without a prefix is an absolute no-go unless you are a 100% sure you know what you're doing. I don't think we should change the name of get_user_count() just because someone might have declared it. We should only change the name if we actually think the name is not correct or precise enough.

This change will be documented as well, so if someone is in fact declaring get_user_count() on their own, they should at least be aware of what they're doing is critical and be responsible of following core development in that regard closely enough.

#39 @tharsheblows
11 days ago

:) Y'all are harsh.

Having "network" in the function names for single sites feels strange to me, it's contextless and a bit confusing. I would think it's probably worth doing what you suggested @spacedmonkey to have clearer names in all those functions.

I'm not sure that single site needs a scheduled twice daily user count update on single sites either, it's fine to run it on every registration and deletion, it's a quick query. For multisite, the spam and deleted columns aren't indexed so it probably makes sense there, I think that's how it currently works. (But if this function is to be used in a kill switch for expensive queries on the users table, then it needs to be the same query as single site, ie SELECT COUNT(ID) FROM $wpdb->users – it's the size of the table that matters then)

In general, I love that WordPress is sympathetic to new developers and those who might have done it wrong in the past and never fixed it. If avoiding a function name collision so that we don't brick their sites is easy, then why not do it? Using something like wp_get_user_count has the added benefit of fitting in with other WP function names.

#40 @spacedmonkey
11 days ago

I agree. Here is what I think we should do.

We should deprecate get_user_count as it currently allows for network_id to be passed. It is confusing in a single site context. We should remove all references (there is only 4 in code).

Introduce wp_get_user_count function, that doesn't have any params on it. This fixes the naming issues and is a valid reason to get rid of the get_user_count.

When I get a minute, I will update the patch.

#41 @flixos90
11 days ago

That is an argument I agree with. :)

Before we deprecate, I'd like to know though how often this function is used by plugins. We probably wanna be kind if tons of plugins use it. And one aside, if we deprecate get_user_count(), we should as well consider deprecating get_blog_count() as this should then rather be named get_network_site_count() (contrary to get_user_count(), the current get_blog_count() actually works per network). This would not be part of this ticket, but something to be aware of and think about.

@spacedmonkey
9 days ago

#42 @spacedmonkey
9 days ago

As of 38741.9.diff there are the following changes

  • Added wp_get_user_count function, with the logic contained in get_user_count but no network id param
  • Deprecated get_user_count
  • Deprecated $network_id in wp_is_large_network as unable to pass to get_user_count as wp_get_user_count now in use.
  • Removed all references to get_user_count in code.
Note: See TracTickets for help on using tickets.