Opened 6 years ago
Closed 4 years ago
#46249 closed defect (bug) (duplicate)
REST API Performance Issues: wp/v2/posts + _embed / count_user_posts()
Reported by: | iCaleb | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 4.7 |
Component: | REST API | Keywords: | needs-patch |
Focuses: | rest-api | Cc: |
Description
tl;dr: A REST API request such as wp/v2/posts?per_page=100&_embed=1
will result in 200 calls to count_user_posts()
which results in 200 uncached queries being ran.
Issue Description
As a quick recap, the _embed
parameter tells the api request to include the response of all linked resources. Can read about this here: https://developer.wordpress.org/rest-api/using-the-rest-api/linking-and-embedding/
So when making a request like wpdev.test/wp-json/wp/v2/posts?per_page=10&_embed=1
, you end up with 10 post objects and you also have 10 author objects embedded within each post. The author objects are essentially embedded via an extra internal rest api request and it's treated mostly like it would be if it were requested individually.
The performance issue is related to the author object being included in the response. More specifically, WP_REST_Users_Controller::get_item_permissions_check
is run for every item. This calls count_user_posts()
to ensure the user has posts before making the profile publicly accessible via the API. This function is an uncached query in WP, and results in a query that looks something like this:
SELECT COUNT(*) FROM wp_posts WHERE ( ( post_type = ? AND ( post_status = ? ) ) OR ( post_type = ? AND ( post_status = ? ) ) OR ( post_type = ? AND ( post_status = ? ) ) OR ( post_type = ? AND ( post_status = ? ) ) OR ( post_type = ? AND ( post_status = ? ) ) OR ( post_type = ? AND ( post_status = ? ) ) OR (post_type = ? AND ( post_status = ? ) ) ) AND post_author = ? ?
On the site this become a problem, this query was taking near 1 second to complete. Alone, that's not a huge issue. But it does become a big problem due to how many times this query is run per request. This is because for every 1 post in the feed , there are actually 2 calls to count_user_posts()
when using _embed=1
. So the 10 per_page request above will result in 20 of these queries. 100 posts will result in 200 of this query (!!!).
So why is this function called twice per one post in the response? The first call is standard due to making an internal request for the resource, such as wpdev.test/wp-json/wp/v2/users/1
. The second call is due to the rest_send_allow_header
method hooked onto rest_post_dispatch
. This is [apparently by design](https://github.com/WP-API/WP-API/issues/2400).
Relevant WP core locations:
- WP_REST_Users_Controller::get_item_permissions_check https://github.com/WordPress/WordPress/blob/56bb62543d485f491f7f614c827638e426019d93/wp-includes/rest-api/endpoints/class-wp-rest-users-controller.php#L387-L406
- count_user_posts() uncached query: https://github.com/WordPress/WordPress/blob/50ddffbac37fdae9298d1f11fe1dd7b8535fc839/wp-includes/user.php#L379-L399
- rest_send_allow_header() extra permission check: https://github.com/WordPress/WordPress/blob/2785313bf2a8e46e52863d476d68951814a6fbbf/wp-includes/rest-api.php#L635-L669
Possible Solutions
1) Improve WP_REST_Users_Controller::get_item_permissions_check
.
- When the request is being embedded with a posts request, do we even need to run the
count_user_posts()
query since they are clearly an author on a post? - Could maybe have a cache per author, even if it's just per request.
- Simplify the query to do some quicker queries first. (for example, check if that user even has a post in the DB, which should use the post_author index and be quick).
- Perhaps utilize user meta and query for that?
2) Add a filter to allow for overriding this logic
If it isn't possible to address this in core (though I feel there should be some options for us), could we get a filter added to this method to allow for some overriding of the logic? Namely replacing the uncached count_user_posts()
function with a cached version, though there are likely even better enhancements available.
I don’t think checking if it is an embedded post will work, it's too risky IMO.
However, I think caching the results of
count_user_posts
is workable. Even if it it just per-request. Perhaps something for the Query component to look at as well. cc: @boonebgorges.