Make WordPress Core

Opened 5 weeks ago

Last modified 3 days ago

#59188 assigned defect (bug)

WP_Posts_List_Table causes post, postmeta, and term caches to be primed for all pages in the DB on load

Reported by: kevinfodness's profile kevinfodness Owned by: spacedmonkey's profile spacedmonkey
Milestone: 6.4 Priority: normal
Severity: normal Version: 6.1
Component: Query Keywords: has-patch dev-feedback has-unit-tests
Focuses: administration, performance Cc:

Description

This behavior primarily triggers if a user is using an external object cache on the second load of /wp-admin/edit.php?post_type=page once the query cache has been populated for id=>parent relationships for pages. It is reproducible on the Pages listing, but will also apply to any custom post type declared as hierarchical.

The WP_Posts_List_Table class has a method called prepare_items which is called when building the table view at /wp-admin/edit.php?post_type=page. This function calls wp_edit_posts_query():

https://github.com/WordPress/WordPress/blob/master/wp-admin/includes/class-wp-posts-list-table.php#L165

This function checks to see if the post type is hierarchical, and if it is, sets posts_per_page to -1. This is done because it wouldn't be possible to build a hierarchical view of posts with pagination without understanding the relationship between all of the IDs in the database for that post type:

https://github.com/WordPress/WordPress/blob/master/wp-admin/includes/post.php#L1274

Once the query parameters are built, they are passed to the wp function for execution:

https://github.com/WordPress/WordPress/blob/master/wp-admin/includes/post.php#L1283

The wp function then calls the main method on the global $wp object, which runs query_posts, which calls $wp_the_query->query, which invokes get_posts on a WP_Query object.

Once the query is built, WP_Query attempts to see if there are cached results for that query:

https://github.com/WordPress/WordPress/blob/master/wp-includes/class-wp-query.php#L3173

If cached results are found (which would only happen after the query was run once and cached, hence my comment above about why this only triggers on the second load of the pages view with a persistent object cache in play like memcached) then this logic block flows down to where it forks based on what was requested in 'fields', which in this case is not just IDs, it's id=>parent relationships, so it flows into the else and triggers _prime_post_caches for all pages (or any other hierarchical post type) including all post meta and all terms attached to those posts. Although the prime post caches function will only prime caches for posts that aren't already in the cache, if the cache was recently flushed or cache salts or versions rotated, this can be a very expensive operation that can crash a site if it's loading in a significant number of hierarchical posts and/or if those posts have a lot of metadata.

I would recommend marking a query that is looking for id=>parent with a posts_per_page of -1 as not cacheable to prevent this behavior.

Note: I created this ticket during WordCamp US Contributor Day, and was having trouble with Core Trac being overloaded with requests, which inhibited my ability to search for related tickets. If this is a duplicate, I apologize, and will be happy to contribute to related tickets.

Change History (34)

#1 @kevinfodness
5 weeks ago

  • Summary changed from WP_Posts_List_Table causes post, page, and term caches to be primed for all pages in the DB on load to WP_Posts_List_Table causes post, postmeta, and term caches to be primed for all pages in the DB on load

#2 @joemcgill
5 weeks ago

  • Keywords needs-patch added
  • Milestone changed from Awaiting Review to Future Release

This ticket was mentioned in PR #5091 on WordPress/wordpress-develop by @kevinfodness.


5 weeks ago
#3

  • Keywords has-patch added; needs-patch removed

Fixes a bug whereby the Pages list view in the admin (or any other hierarchical post type) performs a query for id=>parent with a limit of -1 which results in priming post caches for all posts of that post type, including postmeta and terms, that aren't already in the cache, which can crash sites with an empty cache and a lot of posts/postmeta.

#4 @joemcgill
4 weeks ago

  • Milestone changed from Future Release to 6.4

Thanks @kevinfodness. I'm pulling this into the 6.4 milestone for review

#5 @spacedmonkey
3 weeks ago

  • Owner set to spacedmonkey
  • Status changed from new to assigned

#6 @spacedmonkey
3 weeks ago

#59319 was marked as a duplicate.

#8 @spacedmonkey
3 weeks ago

  • Keywords dev-feedback added

I put together a quick and simple PR here. I think this will fix the issue.

@spacedmonkey commented on PR #5193:


3 weeks ago
#9

Can you review / test this patch see if fixes the issue @kevinfodness?

This ticket was mentioned in Slack in #core-performance by spacedmonkey. View the logs.


3 weeks ago

#11 @peterwilsoncc
2 weeks ago

I've testedPR 5193 on a server running a persistent cache.

It appears the pages table has two items causing an excessive amount of caching:

  • the list table itself (per the ticket)
  • the wp_dropdown_pages() call when generating the inline edit interface

When using a persistent cache, the meta data cache is primed on the second request but not the first (possibly a bug in WP_Query. This is happening both with and without trunk checked out.

I'll keep investigating and try to figure out what is happening.

#12 @spacedmonkey
2 weeks ago

I think the issue here. Per [53941], when fields id / id=>parents were passed, term and meta caches were not primed, even if update_post_term_cache or update_post_meta_cache is passed. See this code snippet. Now in WP 6.1+, on the second run term and meta are primed. See this line.

We could consider doing something like this.

                if ( ! isset( $q['update_post_term_cache'] ) ) {
                       if( 'id=>parent' === $q['fields'] ){
                         $q['update_post_term_cache'] = false;
                       }else{
                          $q['update_post_term_cache'] = true;
                       }
                }

                if ( ! isset( $q['update_post_meta_cache'] ) ) {
                        if( 'id=>parent' === $q['fields'] ){
                         $q['update_post_meta_cache'] = false;
                       }else{
                          $q['update_post_meta_cache'] = true;
                       }
                }


This ticket was mentioned in PR #5211 on WordPress/wordpress-develop by @spacedmonkey.


2 weeks ago
#13

Trac ticket:

@peterwilsoncc commented on PR #5211:


2 weeks ago
#16

I think it's a good idea to change the defaults for these but I put together some tests based on this branch and am seeing that the options are not being respected so it might need a little work.

I've attached the diff (I had to zip it as GH doesn't accept diff's as uploads).

59188-a-tests.diff.zip

This ticket was mentioned in PR #5219 on WordPress/wordpress-develop by @peterwilsoncc.


2 weeks ago
#17

  • Keywords has-unit-tests added

Trac ticket:

@peterwilsoncc commented on PR #5211:


2 weeks ago
#18

@spacedmonkey I've built up on this in PR #5219.

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.


12 days ago

This ticket was mentioned in PR #5245 on WordPress/wordpress-develop by @spacedmonkey.


12 days ago
#20

Trac ticket:

@spacedmonkey commented on PR #5219:


12 days ago
#21

@peterwilsoncc @joemcgill I have been wonder if a number of years if we should just make update_post_term_cache and update_post_meta_cache work consistently. At the moment, these do not work consistently and there are code paths where do and do not work. I have put together a PR based on this to make this the case.

https://github.com/WordPress/wordpress-develop/pull/5245

Do we think this is a good idea or be a problem?

This ticket was mentioned in Slack in #core-performance by spacedmonkey. View the logs.


12 days ago

This ticket was mentioned in Slack in #core-performance by spacedmonkey. View the logs.


11 days ago

#25 @spacedmonkey
9 days ago

@kevinfodness Is there any chance you could review / test my PR.

#26 @kevinfodness
9 days ago

@spacedmonkey yep, will do!

This ticket was mentioned in Slack in #core-performance by spacedmonkey. View the logs.


6 days ago

This ticket was mentioned in Slack in #core-performance by spacedmonkey. View the logs.


5 days ago

#29 @LinSoftware
5 days ago

I attempted to reproduce Kevin's experience while on trunk, but didn't see the reported issue on the second page load of /wp-admin/edit.php?post_type=page

However, I was able to reproduce the issue described in #59319 where the page crashes due to out of memory issue. This was reproducible with a data set of 60,000 pages. While @spacedmonkey's PR to not cache the terms and meta is an improvement, the root of the issue appears to stem from the posts_per_page being set to -1, which causes memory exhaustion with a large data set.

#30 @spacedmonkey
5 days ago

@LinSoftware Before, _prime_post_caches was called, meaning for a site with 60k worth of pages, that would load all the post objects ( including post content ) and post meta and terms. But this code avoids using _prime_post_caches and just caches the post parents, which is an int.

60k work of ints in an array is large, but I do not believe we would be hitting memory limits.

I believe it maybe impossible to remove the posts_per_page being set to -1, as this might be a breaking change.

Out of interest, are you using a object caching plugin that supports wp_cache_get_multiple?

#31 @kevinfodness
4 days ago

I tested this and it works well for me. I tested it using a facsimile of WordPress VIP's environment, which uses memcached, and supports wp_cache_get_multiple. I did not test it using a huge number of pages, though.

#32 @kevinfodness
4 days ago

I just tested the patch with a sandbox site with 60,000 pages and it worked fine.

#33 @LinSoftware
4 days ago

I am using the same object cache dropin as Kevin which does support wp_cache_get_multiple. I'm not sure why we're getting different results. I'll continue to investigate.

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


3 days ago

Note: See TracTickets for help on using tickets.