WordPress.org

Make WordPress Core

Opened 17 months ago

Last modified 3 months ago

#34560 assigned defect (bug)

High memory usage ( and possible server error) editing post with many/large revisions

Reported by: pdfernhout Owned by: adamsilverstein
Milestone: Future Release Priority: normal
Severity: major Version: 3.6
Component: Revisions Keywords: needs-unit-tests has-patch dev-feedback needs-testing
Focuses: administration, performance Cc:

Description

Version: This issue was seen under WordPress 4.3.1, but it likely applies to any modern version of WordPress, even as far back as 2.9 or maybe earlier.

Expected Behavior: A WordPress user should be able to happily create thousands of revisions of pages (or blog posts) containing megabytes of HTML text, limited only by the amount of server memory needed to contain the specific edited revision of the page or post (minus baseline memory needs). So, for example, if you have a 256MB PHP memory limit, you should be able in theory to edit pages of 100MB or more for thousands of revisions (ignoring non-memory constraints like CPU usage, database configuration, communication timeouts, or browser-side limitations). Certainly you should be able to edit a page or post of under 1MB for hundreds of revisions given 256MB of allowed PHP memory (or even just with a 32MB memory limit given few plugins). While editing in WordPress will have limits based on available memory or CPU use or communications timeouts and the memory demand of the current revision being edited, those editing limits should not be substantially affected by the number of revisions of a post previously saved or the sizes of previously saved revisions.

Exhibited Behavior: Right now, you will unhappily see "500 server error" and "white screen of death" results eventually if you, say, edit a 400K page for hundreds of revisions, even with a 256MB server memory limit. Here is an example error from a server with 256MB memory limit resulting from trying to edit a page after saving about 350 revisions of a page that grew to be 437,042 bytes in size:

[02-Nov-2015 14:40:42 UTC] PHP Fatal error:  Allowed memory size of 268435456 bytes exhausted (tried to allocate 437043 bytes) in /usr/home/narrafirma/public_html/narrafirma.com/wp-includes/wp-db.php on line 1389

Steps to reproduce: Create a new WordPress page. Put 500K or so of text in it (or any other large amount, the smaller the amount, the more revisions needed to be saved). Save the page for hundreds of revisions. Failure will occur when the number of revisions times the post size exceeds the PHP available memory minus the core memory need of WordPress+plugins. To speed this failure process, reduce the memory available to PHP.

Likely cause: The file edit-form-advanced.php has the following section of code which retrieves all revisions for a post using wp_get_post_revisions and then counts the result (twice) to display a count of revisions on the page. While the current approach has a certain elegant simplicity and is easy to understand, the approach is unfortunately computationally inefficient because retrieving all the revisions of a post just to count them potentially consumes a lot of memory, a lot of CPU time, and a lot of database bandwidth when there are many revisions of a post if some are of significant size.

From edit-form-advanced.php (4.3.1):

$publish_callback_args = null;
if ( post_type_supports($post_type, 'revisions') && 'auto-draft' != $post->post_status ) {
        $revisions = wp_get_post_revisions( $post_ID );
 
        // We should aim to show the revisions metabox only when there are revisions.
        if ( count( $revisions ) > 1 ) {
                reset( $revisions ); // Reset pointer for key()
                $publish_callback_args = array( 'revisions_count' => count( $revisions ), 'revision_id' => key( $revisions ) );
                add_meta_box('revisionsdiv', __('Revisions'), 'post_revisions_meta_box', null, 'normal', 'core');
        }
}

Suggested fix: Retrieve the count of revisions in some other way with custom SQL. Alternatively, add an option in when retrieving revisions via "wp_get_post_revisions" (in "wp-includes/revision.php") to only retrieve the metadata for the revision (like size, timestamp, user, and so on) and use that option in this code. Incidentally, another minor optimization for this code fragment may be to store the count of revisions in the code above so the counting process is not done twice (depending perhaps on how count is implemented versus variable allocation costs in PHP).

Caveats: I have only tested this error condition for a page, but I am assuming this issue will apply to blog posts as well (or any kind of posts with revisions). I am not sure what other implications there would be from such a change, like if the revisions' contents were needed elsewhere later for comparisons -- although such a situation could potentially be dealt with by selectively loading specific revision's contents as needed for diff comparisons, even if all the metadata might be needed up front.

Other potential benefits of the proposed fix: Beyond avoiding server errors and white screens, making the proposed change may speed up opening long posts and pages for editing by reducing server load substantially. When editing even small posts with a few revisions, this change will still likely speed up the initial opening of the editor at least slightly.

Use cases: An example situations of a large page with multiple revisions is where a team uses a WordPress page to track component version deployment status of a large system. Another use case is when a WordPress page grows into a book-length document through repeated editing by someone :-) who is also used to saving a lot to avoid losing work from browser crashes.

Additional suggestion for further code review: All uses of "wp_get_post_revisions" and any similar functions, as well as all uses of "count", could be reviewed in the WordPress codebase looking for similar computational inefficiencies.

Attachments (14)

34560.diff (2.8 KB) - added by adamsilverstein 17 months ago.
34560.2.diff (4.7 KB) - added by adamsilverstein 17 months ago.
34560.3.diff (4.7 KB) - added by adamsilverstein 17 months ago.
WordPress Issue #34560 Screenshot 2015-11-04 of three calls to get all revisions when editing.png (203.8 KB) - added by pdfernhout 17 months ago.
Screenshot from WordPress 4.3.1 with Query Monitor showing apparently three queries to get all post revisions when editing a post
34560.4.diff (5.7 KB) - added by adamsilverstein 17 months ago.
34560.5.diff (5.0 KB) - added by adamsilverstein 17 months ago.
34560.6.diff (6.4 KB) - added by adamsilverstein 17 months ago.
34560.7.diff (6.1 KB) - added by adamsilverstein 17 months ago.
34560.8.diff (5.7 KB) - added by adamsilverstein 15 months ago.
34560.9.diff (5.9 KB) - added by DBrumbaugh10Up 14 months ago.
Adam, I took your code and added a bit more "oopness"
34560.10.diff (6.4 KB) - added by DBrumbaugh10Up 14 months ago.
Added static getInstance method.
34560.11.diff (4.2 KB) - added by adamsilverstein 11 months ago.
Refresh of .8 patch against trunk
34560.12.diff (4.2 KB) - added by adamsilverstein 9 months ago.
34560.13.diff (6.1 KB) - added by adamsilverstein 9 months ago.

Download all attachments as: .zip

Change History (78)

#1 @pdfernhout
17 months ago

The snippet of code from edit-form-advanced.php (4.3.1) mentioned in the original report appears to still be the same in WordPress 4.4 Beta 2 (the latest available as of this report).

#2 @johnnyb
17 months ago

Additional Use Case, (from a real-world site): A post/page is used as a scratch pad for creating new posts, and the real new post is only created by duplicating the the scratchpad post when it's "done." I know this isn't the workflow that the WP designers had in mind, but I found an organization using it this year.

Exacerbating Circumstances: If the user doing the editor is not an Administrator account WP uses a lower memory limit, so this happens even sooner. It also means that an Administrator can edit a post and push it past the memory threshold for an account with fewer privileges.

#3 @miqrogroove
17 months ago

  • Milestone changed from Awaiting Review to Future Release
  • Severity changed from normal to major
  • Type changed from enhancement to defect (bug)
  • Version changed from 4.3.1 to 3.6

#4 @miqrogroove
17 months ago

Yeah, this is garbage code. The revisions API doesn't even have a function for counting revisions. Relying on WP_Query for this is wrong in so many ways.

#5 @miqrogroove
17 months ago

  • Keywords needs-patch needs-unit-tests added; efficiency scalability removed

#6 @miqrogroove
17 months ago

  • Summary changed from Server error after saving many revisions of a large page in editor & possible fix to Server error after saving many revisions

#7 follow-up: @pdfernhout
17 months ago

@miqrogroove Thank you for looking into this. Issue #24958 from about two years ago is also related to this one, although apparently somewhat different because it seems more about the revisions screen. Testing code discussed in that issue could be applicable to this one includes this ocean90 script. The issue also mentions: "The reason this was surfaced is because we have a test suite for the WordPress.com REST API that edits the same post over and over - therefor it quickly got to 1000+ revisions".

The original issue ticket submitter @jshreve reported failures in wp_update_post called from a plugin and when using the revisions API but also said "I can seem to update from the post edit screen." Is it possible that those other systems failed sooner due to using more memory for other reasons and so the edit screen still had some memory remaining but would have failed soon?

That other issue discusses near the end the desirability of having a way to retrieve post metadata, but the patch diffs there seem to be about limiting the number of revisions retrieved for comparison (like to 100).

There is a reference at the current last comment (15) from eight months ago by @adamsilverstein to wp_prepare_revisions_for_js, but the code shown on that page from includes/revision.php still calls wp_get_post_revisions and then just reformats the results without the post content. So, developing a general way to address this issue may also improve performance of that function as well (and by extension, other JavaScript consumers of such data).

#8 in reply to: ↑ 7 @adamsilverstein
17 months ago

  • Owner set to adamsilverstein
  • Status changed from new to assigned

Replying to pdfernhout:

Good summary of the issues, thanks for opening this ticket.

I will work on a patch to address this issue and hopefully fix #24958 at the same time.

#9 @adamsilverstein
17 months ago

  • Keywords has-patch dev-feedback added; needs-patch removed

I modified used Ocean90's script to create a post with hundreds of large revisions and the high memory usage is pretty apparent.

34560.diff Is an initial attempt at a fix, updating the count and link to last revision in the publish box in edit-form-advanced.php referenced in the ticket description. This does help memory usage significantly, however I think I can improve it even further by tackling the revisions list meta box.

In 34560.diff:

  • Add two new helper methods wp_get_post_revision_count and wp_get_last_revision_id that do what you would expect.
  • Use new helpers in edit-form-advanced.php

Testing & feedback appreciated.

#10 follow-up: @miqrogroove
17 months ago

Looks like a great improvement. Just a couple style ideas:

SELECT COUNT(1) should be SELECT COUNT(*) for performance. Asterisk is optimal for row counts in MySQL.

( $revision_count && $revision_count > 1 ) should be ( $revision_count > 1 ) for readability because any boolean value will not evaluate greater than one anyway.

#11 in reply to: ↑ 10 ; follow-up: @adamsilverstein
17 months ago

  • Keywords needs-testing added

Replying to miqrogroove:

Looks like a great improvement. Just a couple style ideas:

Thanks for the feedback. I addressed these points in my latest patch.

I wasn't sure about the COUNT(1) code, I think I saw that elsewhere in core and my brief research I found one post that seemed to indicate the (1) form might be faster; however I'm more familiar with (*) and didn’t detect any change in performance with either form so sticking with your suggestion.

In 34560.3.diff

  • Add a new wp_get_post_revisions_details function so we can get only the details we need to build the revisions list for the post page without making any backwards compatible breaking changes to wp_get_post_revisions - appreciate feedback on query and approach here
  • In my testing with this patch in place, my heavy revision edit post page went from using ~108MB to using ~ 64MB of memory, a 40% reduction.

Query Monitor details before and after:

before:
http://cl.ly/image/0D1i2A3b0725/Edit_Post__This_is_nice__WordPress_2015-11-03_13-04-51.jpg

after:
http://cl.ly/image/2E133G470k2j/Edit_Post__This_is_nice__WordPress_2015-11-03_13-04-15.jpg

#12 @miqrogroove
17 months ago

@return int should be @return int|bool where appropriate.

Just out of curiosity, did you dump those 40 queries to find out if there are any other large result sets involved?

#13 in reply to: ↑ 11 ; follow-up: @pdfernhout
17 months ago

tl;dr Testing shows the patch from @adamsilverstein greatly reduces peak WordPress memory consumption when editing and thus resolves the original specific issue I reported in my specific case in production; however, there is still significant apparently-unneeded resource use (excessive memory use definitely; excessive CPU and bandwidth use likely) going on in the WordPress code either when editing or looking at revisions, so my initial suggestion for improvement was incomplete.

Replying to adamsilverstein:

@adamsilverstein Thanks! Since you went to all the trouble to make this patch, and so quickly, after looking at the code, I thought the least I could do was test it in production. :-)

So, I put the website with the issue under git source control (just in case, and a good thing to do anyway) and fired away with "patch < 34560.3.diff" (on top of 4.3.1). The site seemed to still be running OK (crossing fingers). I eventually did "git stash" the changes after about an hour or so to bring the production server back to plain 4.3.1 out of precaution and a concern for auto updates colliding with the changes -- even though everything was running even better than usual up to that point.

Here are some memory statistics after patching as reported by WP-Memory-Usage (in MB):

Viewing plugin admin panel: 23.69 of 256

Editing simple page about 1K with 17 revisions: 27.21 of 256

Editing 437K page with 338 revisions: 91.81 of 256

Looking at a revision of the big page: 88.83 of 256

It's a huge improvement in that I can edit the page again (in theory, as while the editor comes up with the page's text and editor controls instead of a white screen or server error now, I have not actually tried saving a new revision of that page yet). Also, instead of the page consuming more than 256MB during editing (causing the server error), there is about 160MB still free. Big score! Yay! Thanks!

However, looking at the numbers, there is obviously something I still don't understand here. It seems to me that the big page of 437K should still only take less than a megabyte more to edit than the tiny one. I can guess that all the extra memory must be in having the revisions list for the page, both when editing a page and when looking at the revisions list? Or could there be another reason (character representation issues or whatever)?

If it is the revisions using up all the memory, are all the retrieved items needed in the new function "wp_get_post_revisions_details"? Those are: "ID, post_author, post_date, post_date_gmt, post_title, post_status, post_parent, post_modified". I see the convenience of having them all, but are some invariant (like ID) or unneeded (like title)? Or could the list be passed into the function perhaps if only some users need them? The revision list in the editor only displays author and timestamp information (and a relative time derived from the timestamp), and also autosave status (at least, as I see the list right now). It does not display this other information (size of the content might be nice in the revision list though, if that was doable, but a different issue). If the fields are all needed, as a more substantial future change (which could be its own new issue), perhaps the revisions list in the editor could drop down and only retrieve the revision metadata then? Most of the time I'm editing something the revisions list is not needed in the editor and just adds clutter. But I'm still not thinking those extra fields of revision metadata are the issue, even if these sorts of changes might make very minor improvements in memory use (and maybe not worth the effort). ID is an int, so 338 of them is not going to take up that much space, and even title strings are unlikely to be more than 100 bytes each, so 34K more total, and so on.

Something still does not add up to me if 338 revisions are using about 63MB. That is about 186K per revision, which seems way too much for a few strings. In fact, that is suspiciously again like maybe the average amount of memory needed for the big page's content per revision, as it is about half the final size of the content, which makes sense given the page grew slowly over time. I'm wondering if maybe somehow WordPress has been retrieving more than one copy of all the post revisions all this time and maybe the patch you made (similar to my original suggestion, but better) only gets rid of most of those sets of revisions -- but one other last set of unneeded revisions is lurking in WordPress somewhere?

Or maybe there is just some other reason to explain this? How much memory should a revision's metadata typically take? I just don't expect it taking much more than 1K or so per revision even with all the fields the latest patch retrieves, or about 338K max in this case. Or am I just way off base here or missed a couple decimal points somewhere or something?

BTW, for the memory use figures you provided, it would be useful to have a baseline memory value in the "after" case for a small page with just a few revisions to compare, similar to the comparison I did above, to see if this is maybe just something weird about my particular system (maybe from plugins or something). Then we can compare the difference from the baseline with the expected memory use of all the revisions generated by the version of the script you used to generate them and see if they match up.

I've never used XDebug myself, but maybe that tool or a similar one might provide some insight here if there is not some other obvious answer as to where the memory is being allocated?

Or is this extra memory use perhaps what you meant by "tackling the revisions list meta box"?

Last edited 17 months ago by pdfernhout (previous) (diff)

#14 in reply to: ↑ 13 ; follow-up: @adamsilverstein
17 months ago

Replying to pdfernhout:

Thanks for the extensive testing!

I did already address excessive memory use in the revisions list, removing the content from the retrieved values was a big part of the memory gains, I think all the remaining fields are required.

Are you seeing issues on the revisions interface screen? I have mainly been testing on the post edit screen.

I discovered that even if I completely remove the code firing for the revisions meta box, I still see large amounts of memory consumed on the edit screen when the post contains many revisions; I'm going to try to dig into the other queries to see whats going on. I've been using @johnbillion's Query Monitor to view the memory usage.

#15 in reply to: ↑ 14 @pdfernhout
17 months ago

Replying to adamsilverstein:

Thanks for the extensive testing!

You're welcome. Thanks again for making these improvements. Let me know if you want further testing of new patches on this issue.

I did already address excessive memory use in the revisions list, removing the content from the retrieved values was a big part of the memory gains, I think all the remaining fields are required.

Yes, the editor is much better now. I don't think removing more revision metadata fields will make much of a difference in any case. But there seem to probably be two revisions lists retrieved by the editor. The first one was just to count them (which you improved). I'm guessing there is a second one retrieved as well to display a list of revisions at the bottom of the page and which may still be an issue? Hard to believe the editor page was retrieving a list of all revisions of the post twice (or maybe more), but that seems to be the case somehow, or at least, that is the most obvious hypothesis that fits the facts.

Are you seeing issues on the revisions interface screen? I have mainly been testing on the post edit screen.

As reported, from incidental testing, I see unexpectedly high memory use on the revisions screen with this big page with many revisions. It otherwise works OK. I would expect the revisions screen should use the memory needed for the revisions list (<1MB by my estimate) plus twice the biggest post (again < 1MB total in my test). So, for my test situation, I would expect about 28MB to be used, not 89MB. So, about 61MB of memory beyond the minimum possible needed seems to be used on the revisions screen.

It looks like issue #25233 is related to this one in a similar way as issue #24958 is. Again the difference is that the other issue is on the revision list itself; especially how the "revisions metabox suffers from performance degradation when ... a large amount of revisions (e.g. 600+) is achieved on a post." That is the same thing I am seeing here, assuming "revisions metabox" is referring to the one on the bottom of the editor screen. And that issue also includes a comment from you referring to issue #25123, on perhaps replacing the revisions metabox with a plugin. Although other proposed solutions are to selectively load posts as needed for comparison on the revisions screen.

Presumably, removing the revisions metabox list from the bottom of the editor screen would indeed fix the rest of the memory use issue in the editor for large numbers of big posts? Improving the revisions list to use the new wp_get_post_revisions_details would also presumably address the issue too.

So, the good news is, several related issues could be resolved eventually by these improvements. However, the rest of you comment leaves me puzzled that maybe the revisions list is not the issue?

I discovered that even if I completely remove the code firing for the revisions meta box, I still see large amounts of memory consumed on the edit screen when the post contains many revisions; I'm going to try to dig into the other queries to see whats going on. I've been using @johnbillion's Query Monitor to view the memory usage.

That seems really odd. Life is full of surprises, some bad, some good. :-) Some error_log() statements might quickly confirm what functionality is being called, especially by putting one in the wp_get_post_revisions function.

Last edited 17 months ago by pdfernhout (previous) (diff)

#16 @miqrogroove
17 months ago

Per https://developer.wordpress.org/reference/functions/wp_get_post_revisions/

We still need to look at these...

wp-admin/includes/ajax-actions.php: wp_ajax_get_revision_diffs()
wp-admin/includes/post.php: edit_post()
wp-admin/includes/revision.php: wp_prepare_revisions_for_js()
wp-includes/revision.php: wp_save_post_revision()
wp-includes/revision.php: wp_get_post_autosave()
wp-includes/class-wp-xmlrpc-server.php: wp_xmlrpc_server::wp_getRevisions()

Last edited 17 months ago by miqrogroove (previous) (diff)

#17 @pdfernhout
17 months ago

As a caveat for any fix for this issue, issue #25233 from two years ago says: "Currently WordPress loads all the revisions in order to possibly upgrade them to the 3.6 versioning of revisions. It does so on every post loaded in wp-admin. It might make sense to move this to an upgrade routine or somewhere else as it adds to the performance degradation." That issue mentions other things to consider as well.

I am not sure if that specific concern on upgrading revisions applies here or if data corruption/confusion could be a possibility somehow if that was not done and a very old post was edited. But it seemed worth mentioning in case upgrading all the post revisions was a crucial (but undocumented in the code) side-effect of counting revisions inefficiently in the original editor code?

Last edited 17 months ago by pdfernhout (previous) (diff)

@pdfernhout
17 months ago

Screenshot from WordPress 4.3.1 with Query Monitor showing apparently three queries to get all post revisions when editing a post

#18 @pdfernhout
17 months ago

@adamsilverstein Thanks for the tip about @johnbillion's Query Monitor. I am trying it now with a local WordPress site on a development machine running 4.3.1 *without* your patch.

There are apparently *three* requests for all posts versions being made by the (Page) editor.

I attached a screenshot to this issue. Look for the "SELECT wp_posts.*" items for queries 26, 28, and 32.

I'm guessing your patch changed two of those calls (using your more efficient replacement) and there is one left somewhere?

As @miqrogroove points out, other code calls wp_get_post_revisions. So, if you have commented out the revisions metabox, maybe there is a call to get all revisions in the autosave code or something like that?

Last edited 17 months ago by pdfernhout (previous) (diff)

#19 @pdfernhout
17 months ago

Yes, I'm suspicious that the autosave call in edit-form-advanced.php (4.3.1) could be contributing to this as the third call to wp_get_post_revisions; from that file:

$autosave = wp_get_post_autosave( $post_ID );

And from the [autosave code in wp-includes/revision.php:

function wp_get_post_autosave( $post_id, $user_id = 0 ) {
    $revisions = wp_get_post_revisions( $post_id, array( 'check_enabled' => false ) );
 
    foreach ( $revisions as $revision ) {
        if ( false !== strpos( $revision->post_name, "{$post_id}-autosave" ) ) {
            if ( $user_id && $user_id != $revision->post_author )
                continue;
 
            return $revision;
        }
    }

Thanks for the pointers to other things to look at @miqrogroove. There may perhaps still be other such calls to functions you list that get made under various circumstances in the editor.

It's a tribute to how much people love WordPress that this inefficiency has been lived with for so long, since this peak resource demand may increase routine server costs (like with a VPS often chosen by maximum memory) to avoid server errors when editing in edge cases. :-) Either that, or people like me who would try to write an overly-long overly-verbose under-edited under-proofread rambling book-length 64000-plus-word autobiographical essay, that essentially no one is ever going to want to read, as a single WordPress page with lots of revisions (instead of taking some other more sensible approach to effective writing) are thankfully rare. :-) Or maybe both? :-)

Last edited 17 months ago by pdfernhout (previous) (diff)

#20 @adamsilverstein
17 months ago

Thanks for the detailed feedback again @pdfernhout - I will keep digging to see what other efficiencies we can gain here.

#21 @iseulde
17 months ago

  • Component changed from Editor to Revisions

#22 @adamsilverstein
17 months ago

In 34560.4.diff

  • Rewrite wp_get_post_autosave to use direct query to find correct autosave instead of getting all revisions then using PHP to loop thru and find the correct autosave.

Not sure this helped memory much, it did reduce the largest query.

http://cl.ly/image/1c2P2O2H3a11/Image%202015-11-09%20at%207.33.57%20PM.png

#23 follow-up: @miqrogroove
17 months ago

756 queries?

#24 follow-up: @miqrogroove
17 months ago

This looks weird too:

$post_id + '-autosave' 

#25 in reply to: ↑ 24 @adamsilverstein
17 months ago

oops, that got left in by mistake. correcting.

Replying to miqrogroove:

This looks weird too:

$post_id + '-autosave' 

#26 in reply to: ↑ 23 @adamsilverstein
17 months ago

Replying to miqrogroove:

756 queries?

Hahaha! that will teach me to upload a patch when I am in a hurry. New patch incoming.

#27 follow-up: @adamsilverstein
17 months ago

in 34560.5.diff :

  • cleanup patch

memory use now down to ~6.3MB (!!!), the revisions are no longer causing high memory consumption on the post edit page.

before:
http://cl.ly/image/1L1c1D120C31/Edit_Post__This_is_nice__WordPress_2015-11-10_00-28-23.jpg

after:
http://cl.ly/image/3t3z3P1D0f1y/Edit_Post__This_is_nice__WordPress_2015-11-10_00-41-34.jpg

Note: the issue is most notable when there are many large revisions, however it is present on much smaller data sizes as well.

#28 @adamsilverstein
17 months ago

  • Focuses administration added
  • Summary changed from Server error after saving many revisions to High memory usage ( and possible server error) editing post with many/large revisions

#29 follow-up: @miqrogroove
17 months ago

Needs cleanup on usage of $post->ID

Looking much better though :)

#30 in reply to: ↑ 27 @pdfernhout
17 months ago

Replying to adamsilverstein:

in 34560.5.diff :

  • cleanup patch

memory use now down to ~6.3MB (!!!), the revisions are no longer causing high memory consumption on the post edit page.

@adamsilverstein Thanks for the great patch, and @miqrogroove thanks for helping with it too.

I am testing it now with the production server with the issue. Previously, the editor failed to open a ~437K page due to a server error from exceeding 256MB. Now the editor page opens, and memory use according to WP-Memory-Usage is 27.96MB of 256MB. So, huge improvement! As a baseline, 23.68MB of 256MB is used in the admin plugin panel on that server, and 26.55MB of 256MB is used by the editor when creating a new page. So, only about 1.41MB is being used by the editor above the new page baseline for the really big page. Wow!!!

In all other ways I spot-tested after applying the patch, WordPress seems to be running OK. I'll probably git stash that patch soon though, just in case. I still have not tried saving a post on that server, and I have not tried anything related to autosaves. Those would be better tested on a test setup in case data gets munged somehow (which I think highly unlikely given the code changes).

So, it seems that WordPress was indeed loading all the revisions of a post *three* times when editing it with the edit_form_advanced: once to count the revisions, once to list the timestamps and other metadata of the revisions, and once to check for the latest autosaved revision (if any).

As far as the issue I originally reported, this patch seems to totally fix it (subject to further testing). Thanks.

It would not surprise me if there were other problematical uses of "wp_get_post_revisions", but code review for that could perhaps become a new issue.

One remaining concern is the one in comment 17 referencing issue #25233 by @jkudish (the second concern under "Possible bottlenecks") about WordPress needing to load all revisions to upgrade them. Does that concern apply here? I do not see such code in the 4.3 version of that function. There is a separate "_wp_upgrade_revisions_of_post" function in that file. It seems to be called separately when the edit process is started via edit_post (code below). Based on inspecting the code for WordPress 4.3 on GitHub, I think that concern does *not* apply here far as the patch itself goes -- but someone else should review that just in case to ensure there is no risk of data corruption.

See this code in includes/post.php to understand what the concern may be:

function edit_post( $post_data = null ) {
        ...
	if ( post_type_supports( $ptype->name, 'revisions' ) ) {
		$revisions = wp_get_post_revisions( $post_ID, array( 'order' => 'ASC', 'posts_per_page' => 1 ) );
		$revision = current( $revisions );
		// Check if the revisions have been upgraded
		if ( $revisions && _wp_get_post_revision_version( $revision ) < 1 )
			_wp_upgrade_revisions_of_post( $post, wp_get_post_revisions( $post_ID ) );
	}

That fragment of code seems to call "wp_get_post_revisions" twice, btw. Update: Because the arguments are different in the first call (posts_per_page of 1), perhaps the second call is definitely needed?

Still, what are the differences between the edit_post function in post.php and the code in edit_form_advanced.php? Perhaps there is some other bug which could become new issues issue related to differences between those two functions regarding upgrading revisions? Or maybe there are two separate new issues -- one possible issue on memory use for edit_post which calls wp_get_post_revisions twice (but maybe needs to, or maybe the first call should be different and more specialized), and another issue perhaps on _wp_upgrade_revisions_of_post not being called in edit-form-advanced.php when it should?

Last edited 17 months ago by pdfernhout (previous) (diff)

#31 @samuelsidler
17 months ago

  • Focuses performance added

#32 in reply to: ↑ 29 @adamsilverstein
17 months ago

Replying to miqrogroove:

Needs cleanup on usage of $post->ID

Ok, I see that - thanks for the feedback. Also need to update the doc blocks. Updated patch coming soon.

#33 follow-up: @adamsilverstein
17 months ago

In 34560.6.diff

  • cleanup $post_id - only pass around ID, not post objects
  • fix spots that were expecting post object or contracting post object with get_post if not passed, resulting in many queries

#34 follow-up: @wonderboymusic
17 months ago

only pass around ID, not post objects

why? if you pass the post, it skips the lookups in get_post()

#35 in reply to: ↑ 33 ; follow-up: @ocean90
17 months ago

Replying to adamsilverstein:

In 34560.6.diff

  • cleanup $post_id - only pass around ID, not post objects

Since the ID is required it shouldn't default to 0 IMO.

#36 @pdfernhout
17 months ago

I have tested the 34560.5 patch when saving edits to a long page and the server seems to continue to work OK and the page updates OK. However, 34560.5 is now not the very latest patch.

#37 in reply to: ↑ 34 @adamsilverstein
17 months ago

why? if you pass the post, it skips the lookups in get_post()

because I am no longer getting every revision as revision objects to generate the list. instead I'm getting only the fields needed for the revisions list, avoiding loading the post content, title and excerpt data into memory. we don't really need complete post objects to generate the list, and getting them causes memory use bloat, especially for large/many revisions.

#38 in reply to: ↑ 35 @adamsilverstein
17 months ago

Since the ID is required it shouldn't default to 0 IMO.

Good point.

#39 @adamsilverstein
17 months ago

34560.7.diff :

  • More cleanup
  • better doc blocks
  • remove default 0 for required post ids

#40 @adamsilverstein
17 months ago

Note:

To test this fix, try creating a post with many large revisions; this script can help. I had to create 100 or so revisions at a time and hard coded a post id and refresh several times to get my huge test post with > 700 large revisions.

Last edited 17 months ago by adamsilverstein (previous) (diff)

#41 follow-up: @pdfernhout
17 months ago

Replying to adamsilverstein:

34560.7.diff :

@adamsilverstein Thanks for the continued improvements. I "smoke tested" this patch by editing that pre-existing 461K post (with 381 revisions of increasing size) and then saving it again and it worked OK. Memory use was 29.17MB of 256MB. I did not try creating a new page.

However, when I browsed revisions, memory use was 109.01MB of 256MB. While not unexpected, there remains an unnecessary limit here that a user could bump into related to loading unneeded revisions, as otherwise I'd guess memory use could be closer to 20MB for that action.

I'm not saying that revision browser memory issue should necessarily be fixed as part of this issue (which was related to the editor). There is a big difference in functionality between suddenly not being able to edit a long page at all (which seems essential) and not being able to compare revisions (which is just really nice to have). Improving memory use of the revisions browser would probably require re-engineering to just pull in specific revisions as needed to compare differences on demand via AJAX. It could perhaps just be done either as a new issue or under one of the other related issues previously mentioned like issue #24958 or issue #25233.

#42 in reply to: ↑ 41 @adamsilverstein
17 months ago

Replying to pdfernhout:

Thanks for the feedback and testing, lets keep working on this!

The large memory usage when browsing/comparing revisions is covered in #24958. Fortunately the JS already includes a mechanism for lazy loading the data (we do this for the compare any two revisions mode). I agree this ticket is far more important to get resolved.

#43 follow-up: @senatorman
16 months ago

You are doing great work.

Here the same problem. Very high server load and extreem long page loading.

I using the last Nightly build beta version.

The problem error has related with:

wp_load_alloptions()
 wp-includes/option.php:181-
is_blog_installed()
 wp-includes/functions.php:1355
wp_not_installed()
 wp-includes/load.php:496

I have a huge woocommerce (online)shop, and this problem is unworkable.
My php knowhow is minimal, but when i can help you, let me know.
Hopely you can find the error soon.

Greetings Senatorman

Last edited 16 months ago by senatorman (previous) (diff)

#44 in reply to: ↑ 43 @pdfernhout
16 months ago

@senatorman Thanks for taking the time to report an issue. It's not clear to me yet if what you post on is exactly the "same problem". What you outline sounds more like a general slowdown for all pages rather than a slowdown (or whitescreen or server error) only when opening an editor page? Unless you see a slowdown or errors when editing posts with many revisions, the issue you outline sounds like it is probably different than this one. If it is a different issue, you will get better troubleshooting help by posting elsewhere (to help determine if the issue is too many page views for the server capacity, misbehaving plugins, core WordPress, or something else). Or, alternatively, you might get the issue fixed sooner by finding a WordPress consultant to help you troubleshoot it quickly if it is more about configuration issue than a bug in core WordPress (what this particular Trac system is for).

It looks like from your other posts that you site has more than 40,000 products and others might be having a similar issue to what you describe. If there was any broad similarity to this issue, I would expect it could be from a similar sort of inefficiency somewhere where WordPress might load more data than it has to when loading all options. However, conceptually, that is what wp_load_alloptions should be doing -- loading all options. :-) So, the issue you report could result from from some plugin (or even core code) abusing the options system to store more data there than it should (like setting a unique option for every product or post or something). But without detailed troubleshooting done elsewhere, it's not possible to know. So, I'd encourage you (or someone you hire) to collect some more debugging information on what is stored in the options data for your specific site, and if it then indeed looks like a bug in core WordPress, post a separate Trac issue on it (if needed) -- or otherwise report the issue to the appropriate plugin maintainer.

Replying to senatorman:

Here the same problem. Very high server load and extreem long page loading.

I using the last Nightly build beta version.

The problem error has related with:

wp_load_alloptions()
 wp-includes/option.php:181-
is_blog_installed()
 wp-includes/functions.php:1355
wp_not_installed()
 wp-includes/load.php:496

I have a huge woocommerce (online)shop, and this problem is unworkable.
My php knowhow is minimal, but when i can help you, let me know.
Hopely you can find the error soon.

Last edited 16 months ago by pdfernhout (previous) (diff)

#45 @senatorman
16 months ago

Thx Paul,for your reply.

When i read this topic, it looks exactly the reason for my problem with the load.
But in your comment i can read that you think my problem is different and not related too the problem discribed in this topic.

Our site is the first site with woocommerce and in the last months i've install en uninstall many plugins for testing. Maybe this messed up my code and database.

The site is slower since installing version 4.3 of wordpress and when i google, i found more users with this problem. I've not enough php knowhow, so maybe for a best result i must rebuild the complete shop, and only install the plugins i need. and use wordpress version 3.9

Greets Ed


#46 @adamsilverstein
15 months ago

34560.8.diff is an alternate approach, further reducing the query:

  • Construct a WP_Revisions class when revisions are used
  • Runs a single query on creation, storing revisions details, revision count and last revision id (all needed for the post edit screen)

@pdfernhout Can you try running your tests again with the latest patch? Thanks!

@DBrumbaugh10Up
14 months ago

Adam, I took your code and added a bit more "oopness"

#47 follow-up: @DBrumbaugh10Up
14 months ago

Things I did to make it more "oop":

  1. Renamed the class to make it more clear WHAT we were tracking revisions of. In theory things besides posts could have revisions.
  2. Made it less dependent on globals - which were unnecessary in this case - by passing the post id to the constructor.
  3. Removed the pseudo singleton pattern and tied the instantiation clearly to the post in question.

#48 in reply to: ↑ 47 @adamsilverstein
14 months ago

Replying to DBrumbaugh10Up:

Things I did to make it more "oop":

Great, thanks for helping out with this @DBrumbaugh10Up!

Question: won't doing it this way result in the query running twice? Note that both wp-admin/edit-form-advanced.php and wp-includes/post-template.php load on the post edit page, so I think this means two objects get created? Trying to avoid that! Maybe I am reading it wrong, I can test as well.

#49 @DBrumbaugh10Up
14 months ago

Yeah, you are right, let me fix that.

@DBrumbaugh10Up
14 months ago

Added static getInstance method.

#50 @DBrumbaugh10Up
14 months ago

OK, Adam:

I made the constructor protected so we could not accidentally create an instance when we did not intend to.

I added two static data members to the class. The instance we are using and a "last post id". That way, as long as the helper is dealing with the same post id, all the initialization will only happen once.

The static getInstance method will only reinitialize the object if the post ID changes for any reason.

I realize that in this case we are 99.999% likely to be dealing with one and only one post, however, the principal of loose coupling would suggest that we not count on it.

#51 @martijn-van-der-kooij
13 months ago

I'm really glad this issue is detected and solved!

Thanks Paul, Adam and David!

I'm currently using a workaround in our plugin to improve the speed of retrieving the last autosave.

(I copied the wp_get_post_autosave function and adjusted it to use the "$post[ID]-autosave-v1" as post_name in the args of wp_get_post_revisions.)

@adamsilverstein
11 months ago

Refresh of .8 patch against trunk

#52 @adamsilverstein
11 months ago

34560.11.diff refreshes .8 against trunk

#53 @adamsilverstein
9 months ago

  • Milestone changed from Future Release to 4.6

34560.12.diff: Refresh against trunk.

#54 follow-up: @adamsilverstein
9 months ago

@ocean90 any further feedback here?

I started working on some unit tests, somewhat blocked trying to figure out how to load the post edit screen successfully from the unit test side, i'd like to demonstrate memory usage isn't growing dramatically/continually as many revisions are added.

#55 @adamsilverstein
9 months ago

34560.13.diff refresh of correct patch. going to look at the possibility of adding support for additional fields to WP_Query fields argument so we could use WP_Query and benefit from a cached query for wp_get_post_revisions_details. I'd also love to write some unit tests for this, although it may not really be possible with or current testing setup.

Appreciate any feedback on the efficiency of the queries here, cc: @pento

#56 follow-up: @martijn-van-der-kooij
9 months ago

@adamsilverstein the new version of wp_get_post_autosave is getting the current user when the provided $user_id is 0, this is different behaviour from the original version which simply gets the latest autosave.

It will break my use case which simply uses the lasest autosave on purpose.

(Sorry for not noticing this earlier, but it was very recently that I had to change my own solution and figered out that it depends on the last autosave without user_id)

Last edited 9 months ago by martijn-van-der-kooij (previous) (diff)

#57 in reply to: ↑ 56 ; follow-up: @adamsilverstein
9 months ago

Replying to martijn-van-der-kooij:

It will break my use case which simply uses the lasest autosave on purpose.

@martijn-van-der-kooij thanks for the feedback; so are you calling wp_get_post_autosave with no user id? and this returns the last autosave, right? I want to verify so I can fix the rewritten function; also, I can add some unit test verifying this behavior so we don't break it in the future.

#58 in reply to: ↑ 57 ; follow-up: @martijn-van-der-kooij
9 months ago

Replying to adamsilverstein:

@adamsilverstein yes, we are calling wp_get_post_autosave with no user id and this returns the last autosave that is made by any user.

A test could perform a autosave for several users and while being the first user use the function with no userid and get the last saved one.

I've adjusted your code here:

    function swifty_get_post_autosave( $post_id, $user_id = 0 ) {
        global $wpdb;

        // Construct the autosave query.
        $autosave_query = "
                SELECT *
                FROM $wpdb->posts
                WHERE post_parent = %d
                AND   post_type   = 'revision'
                AND   post_status = 'inherit'
                AND   post_name   = %s" .
            ( ( 0 !== $user_id ) ?
                "AND post_author = %d" : "" ) . "
                ORDER BY post_date DESC, ID DESC
                LIMIT 1";

        $autosave_details = $wpdb->get_results(
            $wpdb->prepare(
                $autosave_query,
                $post_id,
                $post_id . '-autosave-v1',
                $user_id
            )
        );

        if( empty( $autosave_details ) ) {
            return false;
        }

        return $autosave_details[ 0 ];
    }
Last edited 9 months ago by martijn-van-der-kooij (previous) (diff)

#59 in reply to: ↑ 58 @adamsilverstein
9 months ago

Replying to martijn-van-der-kooij:

Replying to adamsilverstein:

@adamsilverstein yes, we are calling wp_get_post_autosave with no user id and this returns the last autosave that is made by any user.

Great, thanks for the details and updated query. I will include that in my next patch, I am currently investigating whether I can extend WP_Query to alloy for additional 'fields' to enable using it to get better query caching (especially for the list of revisions query).

#60 in reply to: ↑ 54 ; follow-up: @ocean90
9 months ago

  • Keywords punt added

Replying to adamsilverstein:

I started working on some unit tests, somewhat blocked trying to figure out how to load the post edit screen successfully from the unit test side, i'd like to demonstrate memory usage isn't growing dramatically/continually as many revisions are added.

Why would you need to load the edit screen here? Unit tests for each of the functions should be fine.

Are you able to get a patch ready by end of the week? The latest one still looks like it requires a lot of work (unit tests, docs, why does it use raw queries?, why are some current_user_can() removed?) I have the feeling that this needs to land earlier in a cycle.

#61 in reply to: ↑ 60 @adamsilverstein
9 months ago

Replying to ocean90:

Why would you need to load the edit screen here? Unit tests for each of the functions should be fine.

You are right except that I am introducing some new functions here, still I can test the before/after for each change.

Are you able to get a patch ready by end of the week? The latest one still looks like it requires a lot of work (unit tests, docs, why does it use raw queries?, why are some current_user_can() removed?) I have the feeling that this needs to land earlier in a cycle.

I agree this should probably be punted as I haven't been able to give it the attention it needs.

The removed current_user_can is reintroduced in a different spot, I will double check that change is required.

The raw queries are used in part because I am trying to perform queries in the most performant, least memory consumptive way - that are currently not supported by WP_Query. (for example selecting a group of fields, not including 'content' and 'excerpt'). I reviewed these queries with @nacin and @pento and we agreed that although they are fine, its worth exploring extending WP_Query to support these cases (especially the fields query which would be quite useful). If we add that, we may be able to use WP_Query entirely and benefit from caching. I'd like to work on this before proceeding further and should have some time next week at WCNY contributor day; if you need to punt before then I understand and will keep working towards early 4.7.

#62 @adamsilverstein
9 months ago

I reviewed the queries here and approaches for using a more cacheable query using WP_Query with @boonebgorges at WCNYC and he pointed out ticket #19866 which has an in depth discussion of altering the fields variable to accept more options. tldr; its not going to fly, the queries won't be cacheable anyway. The best option would be considering a field that excludes the (potentially heavy) content - however this makes the post objects essentially uncacheable as later cache calls would get incomplete post objects.

Instead I'll keep working on refining the queries as I have them and also consider some other optimizations to improve the queries.

#63 @rachelbaker
8 months ago

  • Keywords punt removed
  • Milestone changed from 4.6 to Future Release

@adamsilverstein Looks like you are still working through this issue and how to best solve it. I am punting this ticket out of the 4.6 milestone, it can be reconsidered for 4.7.

#64 @pavelevap
3 months ago

I noticed similar behaviour after updating client website from WP 3.1.x to latest 4.7. Homepage had more than 900 revisions and editing page was without TinyMCE (and not responsive), but I could compare revisions without problem.

I found following error in logs: PHP Fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 8789 bytes) in /home/users/.../wp-includes/wp-db.php on line 1846

Cleaning revisions to latest 300 was the trick. But it can happen for other users and it is a little bit tricky to find the cause (it looks more like JS error).

Note: See TracTickets for help on using tickets.