#15459 closed task (blessed) (fixed)
Need Better Page Hierarchy Display Algorithm
Reported by: | truthmedia | Owned by: | johnbillion |
---|---|---|---|
Milestone: | 4.2 | Priority: | normal |
Severity: | normal | Version: | 3.0.1 |
Component: | Posts, Post Types | Keywords: | has-patch commit |
Focuses: | performance, administration | Cc: |
Description
WordPress 3.0.1
File: /wp-admin/includes/post.php
Lines: 904-912
Code:
// Hierarchical types require special args. if ( is_post_type_hierarchical( $post_type ) ) { $query['orderby'] = 'menu_order title'; $query['order'] = 'asc'; $query['posts_per_page'] = -1; $query['posts_per_archive_page'] = -1; } wp( $query );
The code in the area mentioned above causes our site to use over 170mb of memory on each load of the Pages index. This really should be improved to accommodate sites that have large amounts of hierarchical data. Looks to me like it's pulling all the data for all the pages on the site. Perhaps a solution where unnecessary data is not included might be better here (such as content or excerpt).
Attachments (6)
Change History (57)
#1
@
14 years ago
- Keywords 3.2-early added; memory pages hierarchy removed
- Milestone changed from Awaiting Review to Future Release
- Type changed from defect (bug) to enhancement
#2
@
14 years ago
- Keywords 3.2-early removed
This would require quite a bit of refactoring; no patch in sight.
#6
@
13 years ago
Here's a quick patch that breaks all over the place but could be a good starting point for someone wishing to solve this.
Essentially, here's what we currently do:
- Query all posts
- Group them based on ID and post_parent
- Begin rendering posts until we've reached the total to display per page
What we should do instead:
- Query all IDs and post parents
- Group them
- Figure out what posts we need to grab to display the right number on a page
- Use _prime_post_caches() to pull those specific posts into cache
- Render those posts
Querying all IDs and parents still isn't ideal, but it's a step in the right direction. Without MPTT this is not easy.
#7
@
13 years ago
Here's a quick patch that breaks all over the place but could be a good starting point for someone wishing to solve this.
A note, it's a total duck punch. The idea is a good starting point. The code is not.
#8
@
13 years ago
Certainly quite dated, but I wonder if anything ever happened here or if anything can be salvaged -
#17
@
11 years ago
Attempt to implement Nacin's suggestions. I'm not familiarized with core so I'm sure there is room for improvements. Suggestions are welcome.
I run a site with around 4000 hierarchical posts of a custom type. After applying this patch, memory consumption to load wp-admin/edit.php dropped from 321MB to 63MB.
#20
@
11 years ago
- Component changed from Administration to Posts, Post Types
- Focuses administration added
#21
@
11 years ago
Hi there,
I'm super interested in this thread as I've got a few clients really suffering with 30 second load times with about 5000 hierarchical posts. The patch improves this, but it is still pretty slow. Maybe 10 seconds.
Wouldn't a better algorithm be:
- Do a normal query (limited by page size) but only return posts with no parent
- Do another query to find all the children of just these top level posts
- Merge them together
In the case with a ridiculous hierarchy (say a parent post with 1000s or child posts) this won't help too much. But in the cases when each post only has a couple of children it'll make a massive difference I think.
Admittedly things get messy if you're looking at deeper pages (say page 500 not page 1) of results, but in practice no-one really does this, right?
I guess the real correct way to do this is to store a flattened hierarchy table in the database and keep this up to date on post create/update/delete, and then query on that.
Note both Oracle and SQL Server support hierarchy queries easily. MySql doesn't, but you could fake it at the database level:
http://explainextended.com/2009/03/17/hierarchical-queries-in-mysql/
Just my thoughts. If only I could still code I'd try to do it myself ...
Jon
#22
@
11 years ago
Hi!
I tried the patch suggested by rodrigosprimo and it works very well. I think it will be good if WordPress adopt this change in its core.
Regards.
#23
@
11 years ago
- Keywords dev-feedback added
Thanks for your comment McBoof but I can't think of an easy way to implement your suggestion for pages other than the first. When building the second page how could we tell which posts have already been used to build the first page? Maybe I'm missing something.
Anyway, at least in my case, the solution proposed by Nacin and implemented in the patch I sent is good enough and is a huge improvement compared with the current algorithm.
Just to give an example, in one of the sites I run with about 4000 hierarchical posts, without this patch the page takes around 25 seconds to load and uses 320MB of memory. After applying the patch the page loads in 9 seconds and uses 68MB of memory.
This ticket was mentioned in IRC in #wordpress-dev by rodrigoprimo1. View the logs.
10 years ago
#29
@
10 years ago
No news? I'm plannig to start a big project using hierarchical custom post type and i'd like to know if i'm gonna have problems...
#30
@
10 years ago
The last patch version doesn't work for pages after the first one, which breaks the default WordPress functionality. I believe that we could build the tree of IDs, offset based on the page number * posts per page (from the screen options) and query only the posts that are needed. I'll give it some thought, based on your patch which looks elegant.
#31
@
10 years ago
Even when applied properly (small fixes), there are some records that are missing. For example:
Items number 4, 5, 6 are children of item 3 (imagine 3 child pages of a parent page). Like the following structure:
Parent1
Parent2
Parent3
-Child1
-Child2
-Child3
Parent4
If you set the "posts per page" to 5, with the original implementation you'll see the following results.
First page from pagination:
Parent1
Parent2
Parent3
-Child1
-Child2
Second page from pagination:
Parent3
-Child3
Parent4
The patch skips the Child3 display (which is tricky itself to get the right number of entries transferred to the second page).
That said, the second page would actually display 6 items instead of 5 due to Parent3 being duplicated in both pages 1 and 2.
The other problem is that there are different types of ordering - by name, date etc., so I'm not sure that the patch would handle all of them gracefully even if we delegate the search queries to SQL and fetch the IDs only as we do now.
#32
@
10 years ago
- Keywords needs-unit-tests added
All of that sounds like a good reason to be writing tests for this. :)
This ticket was mentioned in IRC in #wordpress-dev by nofearinc. View the logs.
10 years ago
#35
@
10 years ago
@helen I'm not sure how to test this. AFAIK, WP doesn't have functional tests and the methods we are changing output HTML directly so it is not easy to create unit tests for them. The only way I can think of is to use PHPUnit_Framework_TestCase::expectOutputString() or PHPUnit_Framework_TestCase::expectOutputRegex() to check the HTML output. Any other suggestion? I couldn't find tests for WP_List_Table or any of its child classes. Are there other related tests I could use as a reference?
#36
@
10 years ago
@nofearinc thanks for testing this patch and reporting the issues you found. Just to make sure we are on the same page, in your example, with the patch applied, on the second page Parent3 is missing and not Child3, right?
#37
@
10 years ago
@rodrigosprimo in fact Child3 is also missing - so the second page doesn't list children belonging to parents already listed on the previous page, so building the tree isn't doing it right so far.
#38
@
10 years ago
- Keywords needs-unit-tests removed
I worked today with @nofearinc and we fixed the issues he found. We also added basic test coverage for this ticket.
#39
@
10 years ago
Some performance benchmarking on this patch collected using query-monitor plugin and a WordPress install with 6000 posts of a hierarchical custom type:
Without the patch:
Page generation time: 55.3375
Peak memory usage: 466 MB
Database query time: 2.2530
Database queries: SELECT 161
After applying the patch:
Page generation time: 27.6313
Peak memory usage: 87MB
Database query time: 0.4134
Database queries: SELECT 141
This ticket was mentioned in Slack in #core by rodrigoprimo. View the logs.
10 years ago
#41
@
10 years ago
- Keywords needs-unit-tests needs-testing 4.2-early added; dev-feedback removed
I like this, and it's effectively quite a simple change, but I think we need some more unit tests to make sure we don't miss anything. You can never have too many tests.
- Displaying pages filtered by date or search terms (behaviour should be the same as it is currently)
- Displaying child pages where the parent page is in the trash
- Displaying pages with two or more levels of child pages (grandchildren)
We also need to make sure that Quick Edit continues to function as expected after saving a page (with and without changing its parent). Won't be unit-testable, but needs some manual verification.
#42
@
10 years ago
- Summary changed from Need Better Page Hierarchy Display Algo to Need Better Page Hierarchy Display Algorithm
#43
@
10 years ago
- Milestone changed from Future Release to 4.2
has-patch 4.2-early so moving to 4.2.
This ticket was mentioned in Slack in #core by drew. View the logs.
10 years ago
#45
@
10 years ago
- Owner set to johnbillion
- Status changed from new to accepted
I'm going to add some more tests for this and see about getting it in early next week.
This ticket was mentioned in Slack in #core by rodrigoprimo. View the logs.
10 years ago
This ticket was mentioned in Slack in #core by santagada. View the logs.
10 years ago
#48
@
10 years ago
- Keywords commit added; needs-unit-tests needs-testing 4.2-early removed
- Type changed from enhancement to task (blessed)
15459.6.diff introduces better and more tests, all of which are passing. I'm very happy with this and it should go into 4.2 beta.
For taxonomies, we store the tree of IDs and then query based on that. We could do the same for hierarchical post types.