Make WordPress Core

Opened 15 years ago

Closed 17 months ago

Last modified 17 months ago

#9864 closed defect (bug) (invalid)

Performance issues with large number of pages

Reported by: pp19dd's profile pp19dd Owned by:
Milestone: Priority: normal
Severity: normal Version: 2.7.1
Component: Posts, Post Types Keywords: 2nd-opinion has-patch
Focuses: performance, administration Cc:

Description

Environment:

  • Default install, default theme, no plugins, reasonable server configuration (typical execution, memory limits). In my particular case, that was set to 30 seconds, 384 MB.

To reproduce:

  • Create a large number of pages (5,000+)
  • Try to edit a page

What happens:

  • Maximum execution time error
Fatal error: Maximum execution time of 30 seconds exceeded in \wp-includes\post.php on line 1998

Workaround to prevent error:

  • set_time_limit to 0 as server admin

My diagnosis:

Suspect that the immediate problem is that get_pages() inside wp-admin/post.php, line 2173, performs a query that selects EVERYTHING:

$query = "SELECT * FROM $wpdb->posts $join WHERE (post_type = 'page' AND post_status = 'publish') $where ";

Now, in this case this query (AFAIK) is just supposed to pull up the page list (with IDs, titles and parents). What it actually does is it pulls up ALL of the post data. This works fine for low number of pages, but on a larger data set this causes a chokepoint. After all, the end result is just supposed to be a dropdown with a list of pages.

Workaround to improve performance:

  • comment out line 273 in wp-admin\edit-page-form.php and skip displaying the page parent in the edit screen

The bigger picture:

Well, the ENTIRE list of pages is queried in this 'edit page' screen (see screenshot # 2 - with a dropdown that contains 5,000+ entries.) I think this is a design limitation and it looks clumsy once the list gets larger. Rather than patch the display routines, I think this thing needs to be replaced with something that can search / paginate piecemeal server-side.

Attachments (4)

page_edit_error.png (8.4 KB) - added by pp19dd 15 years ago.
page_edit_eyesore.png (8.9 KB) - added by pp19dd 15 years ago.
9864.diff (8.4 KB) - added by oso96_2000 10 years ago.
9864.2.diff (8.3 KB) - added by oso96_2000 10 years ago.
Thanks for the feedback, SergeyBiryukov

Download all attachments as: .zip

Change History (59)

#1 @mrmist
15 years ago

Fairly sure that the scalability with lots of pages has already been reported.

#2 @Denis-de-Bernardy
15 years ago

  • Component changed from Administration to Performance
  • Milestone changed from Unassigned to Future Release
  • Owner set to Denis-de-Bernardy

yep. quite a few times, even.

#3 @Denis-de-Bernardy
15 years ago

the query is:

SELECT * FROM wp_posts WHERE (post_type = 'page' AND post_status = 'publish') ORDER BY menu_order, post_title ASC

please alter your database, and add an index on (post_type, post_status). see whether that makes any difference.

#4 @josephscott
15 years ago

That query appears to never use any of the current indexes. It's the ORDER BY clause that is causing the indexes to not be usable. Perhaps it's worth leaving the ORDER BY off and sorting the results in PHP, outside of the DB.

#5 follow-up: @Denis-de-Bernardy
15 years ago

php won't beat an index if one is used. especially if we're talking 3k page lists. that'll consume huge amounts of memory.

an index on post_type, menu_order, post_title may do the trick, but at the end of the day it goes down to the amount of clutter that is present in the database. if the number of posts, revisions and attachments is small, and the query planner decides it's faster to seq scan the mess and sort it without an index, it will...

anyway, see #9642

#6 follow-up: @pp19dd
15 years ago

Thanks. I was unable to find related tickets, but thanks to the references I see this is being looked at. I added an index to the suggested fields and in this case it really made no difference.

The question that I have is -- why would we want to list all 5,000+ pages in a dropdown on the 'edit page' screen? It would seem prudent to put up a paginated search widget there. I mean, its purpose is to assign a parent to the page.

I don't think this is a question of optimization as much as "why are we querying this at all?" (page_edit_eyesore.png shows a dropdown that's had its scroll bar pushed off the screen - but - more importantly it shows the entire 5,000 page listing.)

At any rate, I think that you might have covered this under another patch. I will keep looking around for an upcoming solution - otherwise - I'll manually disable that admin feature.

#7 in reply to: ↑ 6 @Denis-de-Bernardy
15 years ago

Replying to pp19dd:

The question that I have is -- why would we want to list all 5,000+ pages in a dropdown on the 'edit page' screen? It would seem prudent to put up a paginated search widget there. I mean, its purpose is to assign a parent to the page.

Simple: because there is no patch to do things differently, and that would be the simplest way to proceed in order to assign a page parent.

Imo, it would be better if a simple text field existed. As the user types a title in it, we'd ajax query page titles and return a small selection of pages to choose from. That would be great, and it would scale admirably. If you're familiar with php and js, go right ahead and write the patch. If not, it may be ages before it gets fixed if your DB is so that the indexes are not used.

#8 @pp19dd
15 years ago

Denis- thanks for the suggestion. I wonder if I get it done quickly, will make it into 2.8?

#9 @Denis-de-Bernardy
15 years ago

Quite honestly I doubt it. We're in beta, which is another word for feature freeze. It would surely go into an early 2.9 if a patch is around, however.

Then again, I've no commit access, so this would really be a question for one of Ryan, Andrew, Mark or Peter.

#10 @mrmist
15 years ago

As I thought, I raised similar issues a while back as #8468 and #8492

Essentially, this problem exists whereever we have a drop down of potentially infinite length. Some sort of logic should be used to limit records that are returned to a pull-down menu. Whether that means that for large numbers of records they have to be LIMITed and split by some criteria, I am not sure.

#11 @mrmist
15 years ago

#8648 rather.

#12 @Denis-de-Bernardy
15 years ago

@mrmist: the thing to do, I think, would be the following:

A simple text field. As the user types a title in it, we'd ajax query page titles and return a small selection of pages to choose from.

Now someone needs to volunteer the needed ajax in a generic enough way. ;-)

#13 @pp19dd
15 years ago

I'll do some wishful thinking (that the fix would magically end up in 2.8) and give it a try in the next day. Think that I can do a fairly unobtrusive patch that would give us that capability. Question is whether to work from 2.7.1 or the trunk. What do you guys think? (Word had it that the trunk had updated "optimized" queries.)

#14 @Denis-de-Bernardy
15 years ago

Add this index to your wp_posts table:

(post_type, menu_order, post_title)

And then use trunk. Not 2.7.1.

Intuitively, there is a need for:

  • a new case in wp-admin/admin-ajax.php (e.g. quickfind) that tosses an empty array into a filter with a parameter (pages, in our case), so $results = apply_filters(quickfind, array(), pages)
  • some kind of function that returns the needed array using a db query
  • the needed UI functions in wp-admin/includes/template.php
  • possibly some edits to wp-admin/js/pages.dev.js (use the SCRIPT_DEBUG define to use that one)

#15 @Denis-de-Bernardy
15 years ago

another potential option could be a simple index on (post_type, post_title) and an unordered query:

select * from $wpdb->posts where post_type = page and post_title like foo%

that *should* use the index as well, and it might be faster than the previous index. (string comparison functions conveniently happen to be case insensitive in mysql.)

#16 @Denis-de-Bernardy
15 years ago

  • Milestone changed from Future Release to 2.9

#17 @Denis-de-Bernardy
15 years ago

  • Keywords 2nd-opinion removed

#18 @Denis-de-Bernardy
15 years ago

  • Owner Denis-de-Bernardy deleted
  • Status changed from new to assigned

#19 @prettyboymp
14 years ago

What if the functionality of this metabox just degraded nicer? We could add some handling that said if the number of pages were over say 200, then instead of doing the normal drop down box, we use pop up style selector similar to the way the media_upload browser is used for the post image handling.

This would allow smaller sites to continue to make quick and easy selection like they do now while keeping it from breaking completely for larger sites.

#20 @prettyboymp
14 years ago

  • Cc mpretty@… added

#21 @janeforshort
14 years ago

  • Milestone changed from 2.9 to Future Release

Punting b/c we're in beta 2 and this needs an updated patch.

#22 @hakre
13 years ago

Duplicate: #16851

#23 @jkudish
12 years ago

  • Cc joachim.kudish@… added

#24 @husobj
11 years ago

  • Cc ben@… added

#25 in reply to: ↑ 5 ; follow-up: @husobj
11 years ago

Replying to Denis-de-Bernardy:

an index on post_type, menu_order, post_title may do the trick, but at the end of the day it goes down to the amount of clutter that is present in the database. if the number of posts, revisions and attachments is small, and the query planner decides it's faster to seq scan the mess and sort it without an index, it will...

Is there any reason why 'menu_order' doesn't have an index?
It strikes me the primary use for 'menu_order' would be to sort results so surely having an index on this would help?
I'll caveat this by saying I'm no database expert :)

#26 in reply to: ↑ 25 ; follow-up: @Denis-de-Bernardy
11 years ago

Replying to husobj:

Replying to Denis-de-Bernardy:

an index on post_type, menu_order, post_title may do the trick (...)

Is there any reason why 'menu_order' doesn't have an index?
It strikes me the primary use for 'menu_order' would be to sort results so surely having an index on this would help?
I'll caveat this by saying I'm no database expert :)

An index on menu_order just by itself would never get used. The only case where such an index would get considered is when it's 1) selective enough (but here, an index on type/status will likely beat it, since it would be an index where most values are 0) and/or 2) it returns rows in the correct order (which it doesn't, since we order by order/title).

On a normal blog with lots of posts and only a handful of pages, the query engine would likely plan to grab published pages using an index on type/status, sort them, and return the result. The index I suggested would, in this case, strip out the sorting step on most normal sites. If, in contrast, a site has lots of pages and little to no posts, the optimal query plan would be to avoid the index entirely, read the whole table followed by a sort -- because in memory sorting the whole table will trounce random disk reads.

In so far as I remember from two years ago, though, this seemed to be primarily a throughput issue, rather than a DB indexing issue. Things time out either because it takes too long to move the pages from the query results to PHP from the DB, or because it takes too long for WP to do its things once they're retrieved, or because it take too long to output said things once it's done, or something else. Someone would need to go in there and measure, to identify where time is actually being spent.

#27 in reply to: ↑ 26 @husobj
11 years ago

Replying to Denis-de-Bernardy:

Thanks for your in-depth reply.

I suppose the use case I am thinking of is when you have many pages (or custom post types) and you use a plugin like CMS Tree Page View or other plugin which allows you to set the order of posts by drag-n-drop. General these tend to set a different menu_order value for every post - is that what you mean above when you say "selective enough"?

So that's fine if you have an average quantity of pages, but say you have loads of pages or an e-commerce store with thousands of products which users can order in a custom way by menu_order. In that instance would an index on menu_order be beneficial?

#28 follow-up: @Denis-de-Bernardy
11 years ago

Potentially beneficial for selectivity

where post_status = 'published' and menu_order > 0 -- eliminates posts, which have menu_order set to zero


Potentially beneficial for ordering:

order by menu_order

Not beneficial for ordering, unless the index is on menu_order, post_title:

order by menu_order, post_title

But in the latter case it needs to be selective as well, else, the query planner may decide its more efficient to grab all rows using a smaller, more selective index (on post_type or post_type, post_status), hence my initial suggesting to try one on post_type, menu_order, post_title -- potentiall selective and in the right order.

Caveat: if using the index means randomly moving back and forth over disk pages all over the place, the planner might still (correctly) prefer a sequential scan with an in-memory sort. Using a clustered index would work around this, but you wouldn't want to go there, since the current clustered index (the id) more or less matches the order we use to run blog-related queries.

Anyway... I hope it clarifies your thoughts. At the end of the day, this is really an issue at the php level imho, rather than an index problem.

#29 in reply to: ↑ 28 @husobj
11 years ago

Replying to Denis-de-Bernardy:

Thanks for taking the time to explain.

#30 @nacin
10 years ago

  • Component changed from Performance to Administration
  • Focuses performance added
  • Milestone changed from Future Release to 3.9

Would love to fix this by moving to an auto-complete solution, akin to #19867 for users.

#31 @helen
10 years ago

Thinking through this a bit - since autocomplete results are flattened out do we need to / how should we show any hierarchy? Add parent page name somewhere in the result? Something else?

#32 @nacin
10 years ago

#8492 was marked as a duplicate.

This ticket was mentioned in IRC in #wordpress-dev by helen. View the logs.


10 years ago

#34 @nacin
10 years ago

  • Component changed from Administration to Posts, Post Types
  • Focuses administration added

This ticket was mentioned in IRC in #wordpress-dev by nacin. View the logs.


10 years ago

This ticket was mentioned in IRC in #wordpress-dev by helen. View the logs.


10 years ago

This ticket was mentioned in IRC in #wordpress-dev by JustinSainton_. View the logs.


10 years ago

This ticket was mentioned in IRC in #wordpress-dev by DH-Shredder. View the logs.


10 years ago

@oso96_2000
10 years ago

#40 @oso96_2000
10 years ago

  • Keywords 2nd-opinion has-patch added

Attached is a first try with this. Had something done, but when I saw the last patch from @helen at #19867 I changed it to also use autocomplete only with page number > 100. We still need to do something with indexes or selecting just some fields, because get_pages is still taking long time when there are lots of pages.

#41 @oso96_2000
10 years ago

Btw, almost forgot. For displaying the parent/child pages I was thinking on something like this example: http://jqueryui.com/autocomplete/#categories

#42 @SergeyBiryukov
10 years ago

Haven't tested 9864.diff yet, but it has some i18n issues.

/* translators: 1: post_title */
'id'    => $page->{$field},
'label' => sprintf( __('%1$s'), $page->post_title ),
'value' => sprintf( __('%1$s'), $page->post_title )

Dynamic strings like these are not localizable, as they cannot be extracted by gettext:
http://ottopress.com/2012/internationalization-youre-probably-doing-it-wrong/

Why would we need to translate a post title?

We do, however need to use esc_attr__() instead of just esc_attr() here:

$output = "<input name='{$name}' type='text' ... title='" . esc_attr( 'Page Title' ) . "' ... />\n";
Last edited 10 years ago by SergeyBiryukov (previous) (diff)

This ticket was mentioned in IRC in #wordpress-dev by oso96_2000. View the logs.


10 years ago

@oso96_2000
10 years ago

Thanks for the feedback, SergeyBiryukov

#44 @helen
10 years ago

  • Milestone changed from 3.9 to Future Release

See: #19867. Sad, because I hate punting. But we'll get there.

This ticket was mentioned in IRC in #wordpress-dev by helen. View the logs.


9 years ago

This ticket was mentioned in Slack in #design by helen. View the logs.


9 years ago

#47 @lukecavanagh
7 years ago

@pp19dd

Seems like this would be a good performance change in core for WP 4.8.

#48 @SergeyBiryukov
6 years ago

#42369 was marked as a duplicate.

#49 follow-up: @Takahashi_Fumiki
6 years ago

I have 3 questions before making patches.

UI enhancement

WordPress uses select element to "select parent" UI. But it may raises performance issue if target option elements increase.

  • Parent page(also, hierarchical post types)
  • Parent term(also, hierarchical taxonomies)

SQL optimization doesn't solve these, because parent candidates keep increasing.

So I recommend incremental select, but is it beyond scope of this ticket?

Library to use

I think we should choose some library to Candidates.

  • Select2
  • SelectWoo
  • jQuery UI autocomplete

I prefer select2 https://github.com/select2/select2 but core team tried to include and nothing updated https://github.com/select2/select2/issues/3744
WooCommerce team made SelectWoo and I think this might be good https://github.com/woocommerce/selectwoo
jQuery UI autocomplete is included in core but not similar to original select element.

What should we choose?

Resulting UI

Current select element displays option hierarchically, but select2 or something like autocomplete don't display "non-match parent.
Considering its purpose of autocomplete UI, non-matching elements should be displayed, but someone may think its worse UX than before.

Should we show non-match parent elements?

Thanks in advance.

#50 in reply to: ↑ 49 @SergeyBiryukov
6 years ago

Replying to Takahashi_Fumiki:

So I recommend incremental select, but is it beyond scope of this ticket?

There are also some considerations in comment:32:ticket:19867.

#51 @birgire
6 years ago

#43487 was marked as a duplicate.

#52 @SergeyBiryukov
4 years ago

#49928 was marked as a duplicate.

#53 @SergeyBiryukov
4 years ago

#50017 was marked as a duplicate.

#54 @adamsilverstein
17 months ago

  • Milestone Future Release deleted
  • Resolution set to invalid
  • Status changed from assigned to closed

The parent page selector was replaced with an accessible autocomplete search selector in Gutenberg, see https://github.com/WordPress/gutenberg/pull/25267. Closing this ticket as invalid. Please re-open if I have missed something.

#55 @spacedmonkey
17 months ago

I believe once #12821 is commit, it might help this problem.

Note: See TracTickets for help on using tickets.