WordPress.org

Make WordPress Core

Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#11914 closed defect (bug) (fixed)

"Users" admin list slow to generate on large sites

Reported by: rowanbeentje Owned by:
Milestone: 3.0 Priority: high
Severity: critical Version: 2.9.1
Component: Optimization Keywords: has-patch needs-testing
Focuses: Cc:

Description

On a site I help administer, the "Users" admin page (wp-admin/users.php) can take over 20 seconds to generate, when the server has low load (and normally serves pages within a second or two - ie this page is very much an outlier).

The queries getting the post counts for each user are definitely at fault, eg in get_usernumposts (although I see there's one or two other instances):

$count = $wpdb->get_var( $wpdb->prepare("SELECT COUNT(*) FROM $wpdb->posts WHERE post_author = %d AND post_type = 'post' AND ", $userid) . get_private_posts_cap_sql('post'));

There's currently no index on post_author; adding one makes the queries in question 5x-10x faster :)

Other queries using post authors, like viewing an author index, don't seem too slow without the index.

HOWEVER… I should mention this particular site has >30,000 rows in wp_posts, and a wp_posts data size of >135MB. So filing this as "minor" issue, and it may not be seen as worth implementing...

Attachments (8)

11914.diff (7.3 KB) - added by miqrogroove 4 years ago.
schema.diff (488 bytes) - added by miqrogroove 4 years ago.
11914.2.diff (10.2 KB) - added by miqrogroove 4 years ago.
Replaces get_posts_by_author_sql() with more robust logic.
11914.3.diff (11.3 KB) - added by miqrogroove 4 years ago.
Adds user object caching to minimize query count.
11914.4.diff (11.8 KB) - added by miqrogroove 4 years ago.
Added missing term || !$full in get_posts_by_author_sql
11914.6.diff (15.9 KB) - added by miqrogroove 4 years ago.
Adds usermeta caching, improvements suggested by ryan.
11914.7.diff (15.9 KB) - added by miqrogroove 4 years ago.
Refreshed
11914-extra.diff (337 bytes) - added by miqrogroove 4 years ago.
Schema bump

Download all attachments as: .zip

Change History (47)

comment:2 Denis-de-Bernardy4 years ago

  • Milestone changed from Unassigned to Future Release

comment:5 rowanbeentje4 years ago

#11213 does look the most relevant and covers this nicely - the other two issues are I *think* tangential (unless posts will no longer be linked to an author, but instead to an author-role combination)

comment:6 Denis-de-Bernardy4 years ago

they're all quite related, actually. but the key issue for those particular queries is #11375, and its little brother #11213.

comment:7 miqrogroove4 years ago

Good lord, what is going on with the slashing, string concatenation, and SQL injection vulnerability in get_private_posts_cap_sql() ?? /facepalm

comment:8 rowanbeentje4 years ago

On this table, the slow part of the query is the post_author part - seen when adding an index on post_author makes the query 5x-10x faster; so I think it's slightly different from #11375.

SQL injection vuln probably isn't too bad on get_private_posts_cap_sql as it's using hardcoded inputs, but always worth fixing to avoid future issues…!

comment:9 miqrogroove4 years ago

Guys, I'm really confused by this query. This would be the pseudo code:

Count all posts, WHERE one specified user is the author, AND current user is the author OR has private reading caps.

Problem #1: get_usernumposts() is being called from inside of user_row(), which is the worst possible flow of control for query optimization.

foreach ( $wp_user_search->get_results() as $userid ) {
	echo "\n\t" . user_row($user_object, $style, $role);
}

Problem #2: Why would the current user's capabilities ever affect the number of posts that have been created by someone else?

comment:10 miqrogroove4 years ago

  • Keywords users mysql slow query removed
  • Milestone changed from Future Release to 3.0
  • Priority changed from low to normal
  • Severity changed from minor to normal

Let's get the flow of control refactored for version 3.0 or 3.1. OP says the index would make this 5x-10x faster, but correcting the flow of control would make it 20x faster if there are 20 users per page. Do both, and it should be 100x faster.

comment:11 filosofo4 years ago

Replying to miqrogroove:

Good lord, what is going on with the slashing, string concatenation, and SQL injection vulnerability in get_private_posts_cap_sql() ?? /facepalm

Good question. It appeared in r5189.

comment:12 miqrogroove4 years ago

lol filosofo, did you notice the syntax error in there?

comment:14 Denis-de-Bernardy4 years ago

Replying to miqrogroove:

Problem #2: Why would the current user's capabilities ever affect the number of posts that have been created by someone else?

the cap check, in practice, is really equivalent to adding a filtered:

post_type IN ('publish', 'private')

we might as well count drafts and pending posts. the issue (i.e. the OR or IN statement) won't easily go away, however, since we'll need to filter trashed posts no matter what.

re the flow control, fixing it could be a possibility. something like this, for instance, could do the trick:

SELECT post_author, count(ID) FROM $wpdb->posts WHERE post_type = 'post' AND post_status <> 'trash' AND post_author IN ( $id_list ) GROUP BY post_author

but then, the above-mentioned filtered string no longer is around. if plugins introduce any kind of custom post statuses, the count would become completely irrelevant.

comment:15 miqrogroove4 years ago

Needs to be COUNT(*), and I really don't think post revisions were intended to show up in there. Otherwise, we're on the same page.

comment:16 miqrogroove4 years ago

er... revisions are a type not a status? Just as long as they are excluded I'm happy.

comment:17 Denis-de-Bernardy4 years ago

count(ID) works too. revisions are a post type.

comment:18 rowanbeentje4 years ago

FWIW, I can confirm that running a flow-control'd WHERE IN/GROUP BY query as suggested above does essentially take the same time as a single iteration of the current query (called in the loop), so that'd be an awesome improvement (as there's several pages of users!)

comment:19 follow-up: miqrogroove4 years ago

count(ID) adds null-checking, which is not what we want to do in an optimization patch. It's contrary to the schema:

CREATE TABLE $wpdb->posts (
  ID bigint(20) unsigned NOT NULL auto_increment

that'd be an awesome improvement

It's an easy one, too.

Denis, are you willing to mention this bug in one of your blog posts?

comment:20 in reply to: ↑ 19 Denis-de-Bernardy4 years ago

Replying to miqrogroove:

Denis, are you willing to mention this bug in one of your blog posts?

Sure. Along with a couple more tickets that relate to performance.

comment:21 voyagerfan57614 years ago

  • Cc WordPress@… added

comment:22 miqrogroove4 years ago

  • Priority changed from normal to high
  • Severity changed from normal to critical

See also #11151 and #11726. Put them all together and there's a critical case for refactoring users.php. I'll see where this fits into my new patch priorities.

comment:23 prettyboymp4 years ago

  • Cc mpretty@… added

comment:24 miqrogroove4 years ago

  • Keywords has-patch needs-testing added

miqrogroove4 years ago

comment:25 miqrogroove4 years ago

A few notes on this:

The schema patch is still in the works.

There is a leftover array initialization that can be dropped from users.php.

I still have some uncertainty about the new WHERE conditions for post counting. I want to keep it simple, but we might need a new function similar to posts_cap_sql(). It would eliminates code duplication but also needs to make sense for these queries.

miqrogroove4 years ago

miqrogroove4 years ago

Replaces get_posts_by_author_sql() with more robust logic.

comment:26 miqrogroove4 years ago

The situation with the WHERE conditions should look a lot better now.

comment:27 miqrogroove4 years ago

nacin and I are staring at this to see if there's any optimization cherry to put on top.

$user_object = new WP_User($userid);

miqrogroove4 years ago

Adds user object caching to minimize query count.

miqrogroove4 years ago

Added missing term
!$full in get_posts_by_author_sql

comment:28 miqrogroove4 years ago

note user_row @since 2.1.0

comment:29 follow-up: ryan4 years ago

Does the call to user_row() in admin-ajax.php need to change to accommodate the new arg?

cache_users() doesn't cache user meta. I think meta is expected to be cached as in _fill_user().

cache_users() could be called update_user_caches() to fit in with update_post_caches() and other.

Maybe the existing get_usernumposts() func could be called count_users_posts() to bring it in line with other count funcs and call get_users_numposts() something like count_all_users_posts().

And maybe give count_blog_users_by_role_inmem() and count_blog_users_by_role_intime() less unwieldy names. count_users() with some args?

comment:30 in reply to: ↑ 29 miqrogroove4 years ago

Replying to ryan:

Thanks for the great feedback. :)

Does the call to user_row() in admin-ajax.php need to change to accommodate the new arg?

My understanding is admin-ajax only calls user_row() for new user creation, and the new arg has an appropriate default. It's future-proof in the sense that any new behavior in admin-ajax should be able to use that arg when necessary.

cache_users() doesn't cache user meta. I think meta is expected to be cached as in _fill_user().

cache_users() could be called update_user_caches() to fit in with update_post_caches() and other.

Okay, I will research that some more.

Maybe the existing get_usernumposts() func could be called count_users_posts() to bring it in line with other count funcs and call get_users_numposts() something like count_all_users_posts().

Is there a concern about breaking plugins with that? Should I add an alias function too?

And maybe give count_blog_users_by_role_inmem() and count_blog_users_by_role_intime() less unwieldy names. count_users() with some args?

The overall intention was to present two separate strategies I worked on, and ultimately dump the _inmem version unless it is valuable to keep it in the API. count_users() with both strategies under one roof is fine with me. :)

miqrogroove4 years ago

Adds usermeta caching, improvements suggested by ryan.

comment:33 scribu4 years ago

Related: #12441

comment:34 scribu4 years ago

Also related: #10329

miqrogroove4 years ago

Refreshed

comment:35 ryan4 years ago

(In [13576]) Improve user listing performance. Props miqrogroove. see #11914

comment:36 miqrogroove4 years ago

  • Resolution set to fixed
  • Status changed from new to closed

Need to bump schema version.

miqrogroove4 years ago

Schema bump

comment:37 dd324 years ago

(In [13581]) Bump the schema after [13576]. Props miqrogroove. Fixes #11914

comment:39 hakre4 years ago

Patch partially has some issues with PHP4, I opened a new ticket: #13319

Note: See TracTickets for help on using tickets.