Make WordPress Core

Opened 3 years ago

Last modified 22 months ago

#55749 new defect (bug)

Large-site inefficiency in REST users endpoint used by Gutenberg editor

Reported by: olliejones's profile OllieJones Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: Users Keywords:
Focuses: rest-api, performance Cc:

Description (last modified by SergeyBiryukov)

(In 6.0-RC2 and 5.9.3) The Gutenberg editor populates its dropdown list of authors by hitting this REST endpoint.

/wp-json/wp/v2/users?context=view&who=authors&per_page=50&_fields=id,name&_locale=user

This results in the following SQL statement.

SELECT SQL_CALC_FOUND_ROWS wp_users.*
  FROM wp_users 
 INNER JOIN wp_usermeta ON ( wp_users.ID = wp_usermeta.user_id )
 WHERE 1=1 
    AND (( wp_usermeta.meta_key = 'wp_user_level'
    AND wp_usermeta.meta_value != '0' ))
ORDER BY display_name ASC
LIMIT 0, 50

There are some serious inefficiencies here.

First, the notorious and deprecated performance killer SQL_CALC_FOUND_ROWS is present in the query. If the results of the query were going to be used for pagination (that is, a UI element like << < Page [1] of 250 pages > >> then SQL_CALC_FOUND_ROWS would serve a purpose. But it doesn't in this usage of REST.

Second, the filter

(( wp_usermeta.meta_key = 'wp_user_level' AND wp_usermeta.meta_value != '0' ))

correctly applies an index to get the right meta_keys from usermeta, but then it must scan the meta_values (they are CLOBs so the best we can do is prefix indexes).

Third, ORDER BY display_name LIMIT 50 means the DBMS must retrieve all the eligible users, sort them, and then discard all but 50.

Similarly, when displaying the Posts or Pages dashboard panel this query prepopulates the dropdown list to use for authors in Quick Edit.

SELECT wp_users.ID,wp_users.user_login,wp_users.display_name
  FROM wp_users
 INNER JOIN wp_usermeta ON ( wp_users.ID = wp_usermeta.user_id )
 WHERE 1=1 
   AND (((
           ( wp_usermeta.meta_key = 'wp_capabilities'
              AND wp_usermeta.meta_value LIKE '%\"edit\\_posts\"%' )
        OR ( wp_usermeta.meta_key = 'wp_capabilities'
             AND wp_usermeta.meta_value LIKE '%\"administrator\"%' ) 
        OR ( wp_usermeta.meta_key = 'wp_capabilities'
             AND wp_usermeta.meta_value LIKE '%\"editor\"%' )
        OR ( wp_usermeta.meta_key = 'wp_capabilities'
             AND wp_usermeta.meta_value LIKE '%\"author\"%' )
        OR ( wp_usermeta.meta_key = 'wp_capabilities'
             AND wp_usermeta.meta_value LIKE '%\"contributor\"%' )
    )) )
ORDER BY display_name ASC

As we say in New England USA, this is wicked slow. And each render of the Pages or Posts panel requires it to run.

#38741 provided some improvements to the handling of users on large many-user sites. But there's more to do.

Change History (6)

#1 in reply to: ↑ description @OllieJones
3 years ago

Replying to OllieJones:

For what it's worth these performance problems exist even when a site has only a handful of Administrators / Editors / Authors / Contributors, and most of the registered users are Subscribers.

#3 @SergeyBiryukov
3 years ago

  • Description modified (diff)
  • Focuses rest-api performance added

#4 @rjasdfiii
3 years ago

Please measure and report the time spent in REST processing versus the time spent running the SELECT. (We should not waste time chasing the wrong thing.)

Ollie gave an in-depth discussion of why that SELECT is inefficient. Let me continue that discussion. Here are some steps to take to speed up a query such as that one:

  1. A better INDEX -- That will help some, but there are structural issues with the table that prevent a significant improvement. Ollie and I provide a Plugin that updates the schemas to have better indexes.
  2. Reformulate the query. -- Perhaps the only easy fix is to get rid of the SQL_CALC_FOUND_ROWS. By the way, that feature is being removed in MySQL 8.0. In the future, WP will have to either abandon getting the count, or run a separate query to get the count.
  3. Redesign the schema. -- WP is based on the simple and flexible "Entity-Attribute-Value" schema pattern ("EAV"). It is great for small datasets. But for large tables, it performs miserably.

EAV...

  • Entity: post_id, user_id, etc (depending on which set of tables)
  • Attribute: xx_meta_key
  • Value: xx_meta_value -- Note when this holds a number and you want to test a range of numeric values, this LONGTEXT needs to be converted to numeric for the test. This is another case where EAV performs miserably.

We would like our indexes to be folded into WP (and Woo) Core.

Unfortunately, the next level of performance improvement means abandoning EAV. There is no "simple and straightforward" replacement for EAV.

#5 @johnbillion
3 years ago

  • Version trunk deleted

See #47280 for the SQL_CALC_FOUND_ROWS issue.

#6 @spacedmonkey
22 months ago

Hopefully now #40613 is committed, that might help.

Note: See TracTickets for help on using tickets.