#5303 closed defect (bug) (fixed)
"Manage > Pages" becomes very slow with hundreds of pages
Reported by: | MartijnD | Owned by: | |
---|---|---|---|
Milestone: | 2.5 | Priority: | normal |
Severity: | normal | Version: | 2.5 |
Component: | General | Keywords: | has-patch |
Focuses: | Cc: |
Description
Apologies if this has been mentioned before, I couldn't find a relevant related bug.
I am currently building a prototype blog / content site that has several hundred "Pages", and things have slowed down a lot. With just over 250 pages, generating "Manage > Pages" takes over 30 seconds, which causes PHP timeout errors on the server.
Some profiling shows that most of the time is spent in:
\wp-admin\includes\template.php
166: function page_rows()
The code for hierarchical display is very inefficient, as it loops over the full set of posts again and again to check whether a post and a parent are related.
180: if ( $hierarchy && ($post->post_parent != $parent) )
181: continue;
Forcing $hierarchy to false reduces the page creation time to 7-8 seconds. That is still not great; it should be possible to do this in < 2 seconds.
Please take this as a suggestion for improvement.
If I have time, I will look into finding a more efficient piece of code -- for now I still have a couple more pages to add ;-)
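To make the cost of the rescanning loop concrete, here is a minimal sketch in Python rather than WordPress's PHP. It only mimics the shape of the check quoted above (skip every post whose post_parent is not the current parent); it is not the page_rows() source. The Page class, the function name and the comparison counter are hypothetical, added for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    id: int
    post_parent: int


def rows_by_rescanning(pages, parent=0, level=0, out=None, counter=None):
    """Emit (id, level) rows in hierarchical order by rescanning all pages per parent.

    Hypothetical illustration of the pattern described in the ticket,
    not the actual WordPress code.
    """
    if out is None:
        out = []
    if counter is None:
        counter = [0]  # one-element list so recursive calls share the count
    for page in pages:
        counter[0] += 1  # one parent comparison per page, per call
        if page.post_parent != parent:
            continue
        out.append((page.id, level))
        # Every emitted page triggers another full pass over `pages`.
        rows_by_rescanning(pages, page.id, level + 1, out, counter)
    return out, counter[0]
```

With N reachable pages the function runs N + 1 times (once for the root and once per emitted page) and scans all N pages each time, so it makes N * (N + 1) comparisons. For 250 top-level pages that is 62,750 comparisons before any row is rendered.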
Attachments (2)
Change History (21)
#5
@
17 years ago
http://bitsinashortbit.wordpress.com/2007/09/03/first-showcase-of-page-sorting-in-wordpress/
A GSoC student looked at paginating hierarchical content with this problem in mind.
#6
@
17 years ago
- Resolution set to fixed
- Status changed from new to closed
Let N = number of pages
The original algorithm takes O(N^x) time where x >= 2. That is why it is very slow to display pages when N is large (over 200).
The proposed new algorithm takes as input an array of pages sorted by post_parent, ID ASC.
It then splits the array into two trunks:
trunk#1 contains pages whose post_parent is 0
trunk#2 contains pages whose post_parent > 0
For every page in trunk#1, we look into trunk#2 to see if there is a child.
If yes, we display the child page, remove the child page, then recursively examine trunk#2 again to see if there are nested pages to be displayed.
We take advantage of the fact that child pages in trunk#2 are sorted by post_parent,
and use heuristics to keep the number of lookups to a minimum.
The time complexity is optimal for this problem: it is O(N).
Note that this algorithm depends on the assumption that the input pages are sorted on post_parent, ID. I just discovered that current wporg core has another bug which can cause cached pages in query results to come back out of order.
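One way to exploit the sort order described in this comment can be sketched as follows, in Python rather than PHP. This is an assumption-laden illustration, not the attached patch: instead of removing children from trunk#2, it records where each parent's contiguous run of children starts and ends, then walks those runs. The function name and the (id, post_parent) tuple layout are hypothetical.

```python
def rows_from_sorted(pages):
    """Return (id, level) rows in display order.

    `pages` is a list of (id, post_parent) tuples that MUST already be
    sorted by (post_parent, id); that precondition is the assumption the
    comment above calls out.
    """
    # One linear pass: because the input is sorted by post_parent, each
    # parent's children form one contiguous run [start, end).
    runs = {}
    for i, (_, parent) in enumerate(pages):
        start, _ = runs.get(parent, (i, i))
        runs[parent] = (start, i + 1)

    out = []
    root_start, root_end = runs.get(0, (0, 0))
    # Explicit stack of [next position, end of run, nesting level].
    stack = [[root_start, root_end, 0]]
    while stack:
        frame = stack[-1]
        pos, end, level = frame
        if pos == end:
            stack.pop()  # this run is exhausted
            continue
        frame[0] += 1
        page_id, _ = pages[pos]
        out.append((page_id, level))
        if page_id in runs:
            child_start, child_end = runs[page_id]
            stack.append([child_start, child_end, level + 1])
    return out
```

Each page is visited once while building the runs and at most once while emitting rows, so the work stays linear in N. Pages whose ancestors never reach a top-level page are simply not emitted in this sketch.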
#8
@
17 years ago
- Keywords has-patch added
- Milestone changed from 2.5 to 2.4
- Priority changed from low to normal
- Type changed from task to defect
- Version set to 2.4
#9
@
17 years ago
The new patch does not depend on pages being sorted by post_parent,ID.
It is a little slower, but still within O(N) complexity.
Reviewed by Ryan, Matt.
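For comparison, an order-independent variant can be sketched by grouping pages under their parent first and then walking the groups. This is again a Python illustration under assumptions, not the reviewed patch; the function name and the (id, post_parent) tuples are hypothetical.

```python
from collections import defaultdict


def rows_from_unsorted(pages):
    """Return (id, level) rows in display order for pages in any input order."""
    # Extra pass compared with the sorted variant: bucket every page
    # under its parent, keeping siblings in input order.
    children = defaultdict(list)
    for page_id, parent in pages:
        children[parent].append(page_id)

    out = []
    # Depth-first walk; push in reverse so siblings pop in input order.
    stack = [(page_id, 0) for page_id in reversed(children.get(0, []))]
    while stack:
        page_id, level = stack.pop()
        out.append((page_id, level))
        for child in reversed(children.get(page_id, [])):
            stack.append((child, level + 1))
    return out
```

The grouping pass costs one more trip over the input and some extra memory, which fits the "a little slower, but still within O(N)" description: every page is bucketed once and emitted at most once.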
#12
@
17 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
re-opening, leaving for ryan to close.
#13
@
17 years ago
- Resolution set to fixed
- Status changed from reopened to closed
As good as it gets for now. Resolving as fixed.
#14
follow-up: ↓ 17
@
15 years ago
- Cc mihai added
- Milestone changed from 2.5 to 2.8.5
- Resolution fixed deleted
- Status changed from closed to reopened
- Version changed from 2.5 to 2.8.4
That patch may have improved things, but with over 7000 pages (most of them children of one page) it takes almost 3 minutes to load the page admin.
If no one has a better way of doing this, WordPress should offer a non-hierarchical display if that's faster (maybe include an option in the screen options for this).
#15
@
15 years ago
After using xdebug I found out that the main problem was the function get_page_children. I modified this function to something similar to the idea from http://core.trac.wordpress.org/changeset/6380, which decreased the time to 1 minute.
Now it seems that WordPress spends the rest of the time updating the cache with all those 7000 pages. If I just remove the call to update_post_caches in get_posts, the load time goes down to 15 seconds. This is finally acceptable. Why does it need to update the cache for all pages when they are displayed in admin?
#16
@
15 years ago
I agree that get_page_children needs some improvement. Currently it's O(N^2).
A better way to improve it is to use an algorithm similar to the one used in the walker class.
I will take a shot at this.
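As a rough idea of what a linear-time descendant lookup could look like, here is a Python sketch. It is not the WordPress get_page_children implementation and not the walker class; the function name, the (id, post_parent) tuples and the choice to return bare IDs are assumptions made for illustration.

```python
from collections import defaultdict, deque


def page_descendants(page_id, pages):
    """Return the IDs of all descendants of `page_id`, nearest generations first.

    Hypothetical stand-in for a get_page_children-style lookup.
    """
    # Index children by parent once, instead of rescanning the full
    # page list for every page that is found.
    children = defaultdict(list)
    for pid, parent in pages:
        children[parent].append(pid)

    found = []
    seen = {page_id}  # guards against a malformed parent cycle
    queue = deque([page_id])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in seen:
                seen.add(child)
                found.append(child)
                queue.append(child)
    return found
```

For the case reported above, with thousands of pages hanging off a single parent, this does one pass to build the index and one visit per descendant, rather than one full scan of the page list per descendant.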
#17
in reply to: ↑ 14
@
15 years ago
- Milestone changed from 2.8.5 to 2.5
- Resolution set to fixed
- Status changed from reopened to closed
- Version changed from 2.8.4 to 2.5
mihai & hailin really good info! Please open a new ticket referencing this one, as this ticket resulted in code being checked in to a specific release, and things get messy if reopening for anything but a regression.
Yes, that code is awful. Improvements are definitely welcome.