Make WordPress Core

Opened 17 years ago

Closed 15 years ago

Last modified 15 years ago

#5303 closed defect (bug) (fixed)

"Manage > Pages" becomes very slow with hundreds of pages

Reported by: MartijnD Owned by:
Milestone: 2.5 Priority: normal
Severity: normal Version: 2.5
Component: General Keywords: has-patch
Focuses: Cc:

Description

Apologies if this has been mentioned before, I couldn't find a relevant related bug.

I am currently building a prototype blog / content site that has several hundred "Pages", and things have slowed down a lot. With just over 250 pages, generating "Manage > Pages" takes over 30 seconds, which causes PHP timeout errors on the server.

Some profiling shows that most of the time is spent in:

\wp-admin\includes\template.php
166: function page_rows()

The code for hierarchical display is very inefficient, as it loops over the full set of posts again and again to check whether a post and a parent are related.

180: if ( $hierarchy && ($post->post_parent != $parent) )
181: continue;

Forcing $hierarchy to false reduces the page creation time to 7-8 seconds. That is still not great; it should be possible to do this in < 2 seconds.

Please take this as a suggestion for improvement.

If I have time, I will look into finding a more efficient piece of code -- for now I still have a couple more pages to add ;-)

Attachments (2)

5303_pages.diff (5.7 KB) - added by hailin 17 years ago.
patch file
post.diff (1.1 KB) - added by mihai 15 years ago.
fast get_page_children


Change History (21)

#1 @ryan
17 years ago

Yes, that code is awful. Improvements are definitely welcome.

#2 @foolswisdom
17 years ago

Relates to #5303?

#3 @foolswisdom
17 years ago

Gah, relates to 3614?

#4 @besonen
17 years ago

  • Cc davidb@… added

#5 @mdawaffe
17 years ago

http://bitsinashortbit.wordpress.com/2007/09/03/first-showcase-of-page-sorting-in-wordpress/

A GSoC student looked at paginating hierarchical content with this problem in mind.

#6 @hailin
17 years ago

  • Resolution set to fixed
  • Status changed from new to closed

Let N = number of pages
The original algorithm takes O(N^x) time, where x >= 2. That is why it is very slow to display pages when N is large (over 200).

The proposed new algorithm accepts as input pages array sorted by post_parent, ID ASC.

Then split the array into two trunks:
trunk#1 contains pages whose post_parent is 0
trunk#2 contains pages whose post_parent > 0

For every page in trunk#1, we look into trunk#2 to see if there is a child.
If yes, we display the child page, remove the child page, then recursively examine trunk#2 again to see if there are nested pages to be displayed.

We take advantage of the fact that the child pages in trunk#2 are sorted by post_parent,
and use heuristics to keep the number of lookups to a minimum.

The time complexity is optimal for this problem – it is O(N).

Note that this algorithm depends on the assumption that the input pages are sorted on post_parent, ID. I just discovered that current wporg core has another bug which can cause cached pages in query results to come back out of order.
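
The two-trunk idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from the attached patch: the function names sketch_page_rows and sketch_emit_page are hypothetical, pages are assumed to be plain arrays with ID, post_parent and post_title keys, and the children are grouped by post_parent in an array lookup instead of relying on sorted input plus heuristics. Child pages whose parent is missing from the input are simply not emitted in this sketch.

```php
<?php
// Hypothetical sketch: build indented page rows in roughly O(N) time.
// Assumes each page is an array with 'ID', 'post_parent' and 'post_title'.
function sketch_page_rows(array $pages): array {
    $top_level = [];   // "trunk #1": pages whose post_parent is 0
    $children_of = []; // "trunk #2": child pages, grouped by post_parent

    // One pass over the input splits it into the two trunks.
    foreach ($pages as $page) {
        if ($page['post_parent'] == 0) {
            $top_level[] = $page;
        } else {
            $children_of[$page['post_parent']][] = $page;
        }
    }

    $rows = [];
    foreach ($top_level as $page) {
        sketch_emit_page($page, 0, $children_of, $rows);
    }
    return $rows;
}

// Emit one page, then recurse into its children (if any).
function sketch_emit_page(array $page, int $level, array &$children_of, array &$rows): void {
    $rows[] = str_repeat('- ', $level) . $page['post_title'];

    $id = $page['ID'];
    if (!isset($children_of[$id])) {
        return; // leaf page: nothing nested below it
    }

    // Remove the children from trunk #2 before displaying them,
    // so each child page is visited exactly once.
    $children = $children_of[$id];
    unset($children_of[$id]);

    foreach ($children as $child) {
        sketch_emit_page($child, $level + 1, $children_of, $rows);
    }
}
```

Because every page is placed in a trunk once and removed once, the total work grows linearly with the number of pages instead of rescanning the full list for every parent.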

#7 @hailin
17 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

#8 @lloydbudd
17 years ago

  • Keywords has-patch added
  • Milestone changed from 2.5 to 2.4
  • Priority changed from low to normal
  • Type changed from task to defect
  • Version set to 2.4

#9 @hailin
17 years ago

The new patch does not depend on pages being sorted by post_parent,ID.
It is a little slower, but still within O(N) complexity.
Reviewed by Ryan, Matt.

@hailin
17 years ago

patch file

#10 @ryan
17 years ago

(In [6380]) Faster page_rows() from hailin. see #5303

#11 @thee17
17 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

#12 @lloydbudd
17 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

re-opening, leaving for ryan to close.

#13 @ryan
17 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

As good as it gets for now. Resolving as fixed.

#14 follow-up: @mihai
15 years ago

  • Cc mihai added
  • Milestone changed from 2.5 to 2.8.5
  • Resolution fixed deleted
  • Status changed from closed to reopened
  • Version changed from 2.5 to 2.8.4

That patch may have improved things, but with over 7000 pages (most of them children of one page) it takes almost 3 minutes to load the page admin.

If no one has a better way of doing this, WordPress should offer a non-hierarchical display if that's faster (maybe include an option for this in the screen options).

#15 @mihai
15 years ago

After using xdebug I found out that the main problem was the function get_page_children. I modified this function along the lines of the idea from http://core.trac.wordpress.org/changeset/6380, which decreased the time to 1 minute.
Now it seems that the rest of the time WordPress is trying to update the cache with all those 7000 pages. If I just remove the call to update_post_caches in get_posts, the load time goes down to 15 seconds. This is finally acceptable. Why does it need to update the cache for all pages when they are displayed in admin?

@mihai
15 years ago

fast get_page_children

#16 @hailin
15 years ago

I agree that get_page_children needs some improvement. Currently it's O(N^2).
A better way to improve it is to use an algorithm similar to the one used in the walker class.
I will take a shot at this.
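
For illustration, one way to collect all descendants of a page in a single pass plus a traversal is sketched below. It is not the attached post.diff nor the patch from the follow-up ticket: sketch_get_page_children is a hypothetical name, pages are assumed to be plain arrays with ID and post_parent keys, and the descendants are returned in traversal order rather than any particular display order.

```php
<?php
// Hypothetical sketch: collect every descendant of $page_id in roughly O(N).
// Assumes each page is an array with at least 'ID' and 'post_parent'.
function sketch_get_page_children(int $page_id, array $pages): array {
    // One pass: index the pages by their parent ID.
    $children_of = [];
    foreach ($pages as $page) {
        $children_of[$page['post_parent']][] = $page;
    }

    $descendants = [];
    $seen = [$page_id => true]; // guards against accidental parent loops
    $to_visit = [$page_id];     // IDs whose children still need collecting

    while ($to_visit) {
        $parent_id = array_pop($to_visit);
        foreach ($children_of[$parent_id] ?? [] as $child) {
            if (isset($seen[$child['ID']])) {
                continue; // already collected, skip to avoid cycling
            }
            $seen[$child['ID']] = true;
            $descendants[] = $child;
            $to_visit[] = $child['ID'];
        }
    }
    return $descendants;
}
```

Each page is indexed once and collected at most once, so a parent with thousands of direct children costs one lookup rather than one full scan of the page list per child.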

#17 in reply to: ↑ 14 @lloydbudd
15 years ago

  • Milestone changed from 2.8.5 to 2.5
  • Resolution set to fixed
  • Status changed from reopened to closed
  • Version changed from 2.8.4 to 2.5

mihai & hailin really good info! Please open a new ticket referencing this one, as this ticket resulted in code being checked in to a specific release, and things get messy if reopening for anything but a regression.

#18 @hailin
15 years ago

new ticket created in #10852

#19 @hailin
15 years ago

mihai:
You are welcome to run your 7000 pages example again with the patch from #10852.
