Make WordPress Core

Opened 4 years ago

Closed 3 years ago

Last modified 3 years ago

#50567 closed enhancement (fixed)

Set $post filter in update_post_cache()

Reported by: cybr's profile Cybr Owned by: peterwilsoncc's profile peterwilsoncc
Milestone: 6.0 Priority: normal
Severity: normal Version:
Component: Posts, Post Types Keywords: has-patch has-unit-tests
Focuses: performance Cc:

Description

update_post_cache() stores each post provided in the cache, the same cache WP_Post::get_instance() uses. But in doing so, it doesn't necessarily add a $filter to the post.

Because of this, WP_Post::get_instance() will always re-sanitize the post whenever a post stored prior via update_post_cache() is requested--since WP_Post::get_instance() tests for empty( $_post->filter ) after retrieving the post back from cache (but doesn't repopulate the post cache).

This re-sanitization most prominently happens at wp-admin/edit.php, where each post displayed goes through this process about 24 times (on a clean WP installation).

To address this issue, we should prime the post cache with a $filter. The file attached will already improve the aforementioned page's load time by 62%--it's quite substantial.

Attachments (3)

post.php.patch (379 bytes) - added by Cybr 4 years ago.
Sanitize post before storing it in cache.
cg.zip (2.8 MB) - added by Cybr 4 years ago.
Cachegrind outputs
50567.png (196.2 KB) - added by Cybr 4 years ago.

Change History (36)

@Cybr
4 years ago

Sanitize post before storing it in cache.

This ticket was mentioned in Slack in #core by cybr. View the logs.


4 years ago

#2 @SergeyBiryukov
4 years ago

  • Milestone changed from Awaiting Review to 5.6

@Cybr
4 years ago

Cachegrind outputs

@Cybr
4 years ago

#3 follow-up: @Cybr
4 years ago

@peterwilsoncc asked me in Slack if I could post fuller screenshots (50567.png) pertaining to this ticket.

Attached (cg.zip) are before and after CacheGrind outputs generated by xDebug on PHP 7.4.7 via XAMPP on a VM with generous headroom (16GB/4x3.8GHz).

I've used Twenty Twenty with no plugins active on WP 5.4.2 en_US, displaying 20 posts (duplicated default post "Hello World!") at /wp-admin/edit.php, and 10 posts at / (homepage).

To learn how to inspect the files, see https://xdebug.org/docs/profiler. I used QCacheGrind to generate the attached screenshot.

1 unit of Incl./Self/Time is equal to 1 microsecond (0.001ms).

The attached screenshot shows the grind's output in this order:
Home Before / Home After
Admin Before / Admin After

https://core.trac.wordpress.org/raw-attachment/ticket/50567/50567.png

At the bottom-left of each window in the screenshots, you can see the Total Time Cost to generate the page. Since we can't use opcode caching during debugging, file inclusion takes up a generous amount of time. So, you can't calculate the relative performance impact of the improvements to the total page generation time. Nevertheless, you can measure a significant boost there, too.

sanitize_post_field Called Microtime Seconds Improvement
Location Before After Before After Before After =-1*((Af-Be)/Be)
(logged out) / 4 554 276 100 911 6 110 0.101 0.006 94%
(logged in) /wp-admin/edit.php 12 443 483 272 566 10 372 0.273 0.010 96%

So, we can assume almost up to 0.3s improved response times using PHP 7.4, implying a total of 62% of the total page load time (which I tested with OpCache enabled elsewhere).

You'll notice from the same CacheGrind files, sanitize_term_field() is also affected by this issue, where category ID 1 is resanitized over and over again. To accommodate for that, see #50568.

Last edited 4 years ago by Cybr (previous) (diff)

#4 in reply to: ↑ 3 @peterwilsoncc
4 years ago

  • Keywords has-patch needs-unit-tests added

Replying to Cybr:

Attached (cg.zip) are before and after CacheGrind outputs generated by xDebug on PHP 7.4.7 via XAMPP on a VM with generous headroom (16GB/4x3.8GHz).

I've used Twenty Twenty with no plugins active on WP 5.4.2 en_US, displaying 20 posts (duplicated default post "Hello World!") at /wp-admin/edit.php, and 10 posts at / (homepage).

Thanks for attaching the screen shots and additional summary, it's very much appreciated.

I can see this happening with a very simple test plugin running on the /wp-admin/edit.php screen:

<?php
add_action( 'admin_enqueue_scripts', function() {
        /*
         * Post 72 is off the page (ie, this is a new query).
         * 
         * The wp_cache_get dump includes `filter: raw`
         */
        var_dump( get_post( 72, OBJECT, 'display' ) );
        var_dump( wp_cache_get( 72, 'posts' ) );

        /*
         * Post 38 is on the page and has been returned by WP_Query.
         * 
         * The wp_cache_get dump does not include `filter: raw`
         */
        var_dump( wp_cache_get( 38, 'posts' ) );
        var_dump( get_post( 38 ) );
        exit;
} );

The one question I have about the patch is is there a situation in which $post->filter can be a value other than raw?

If the cache is expected to have raw data, then it would be dandy to be certain that the data is indeed stored that way. Maybe it would be better to call empty( $post->filter ) || $post->filter !== raw.

#5 @Cybr
4 years ago

The one question I have about the patch is is there a situation in which $post->filter can be a value other than raw?

I looked at the default behavior from the WP_Post class constructor. I don't think another value (like display) would benefit most sites. You can test this, of course. Keep in mind that plugins and themes will (significantly) affect this behavior. We learned to escape as late as possible, so using raw values should be fine for most cases.

Maybe it would be better to call empty( $post->filter ) || $post->filter !== raw.

Yes, I believe that'd be better, too. But, please keep in mind that WP_Post does not check against that, so this change adds a discrepancy. We may want to improve things in the WP_Post class, as well.

This ticket was mentioned in Slack in #core by sergey. View the logs.


4 years ago

#7 @hellofromTonya
4 years ago

  • Keywords early added
  • Milestone changed from 5.6 to 5.7

From today's 5.6 core scrub, Sergey:

This could probably use some unit tests. Both #50567 and this one #50568 are nice performance enhancements, but also seem like early tickets, should probably be moved to 5.7.

Moving this ticket to early 5.7 milestone along with #50568.

If any maintainer or committer feels this can be resolved in time, or wishes to assume ownership during a specific cycle, feel free to update the milestone accordingly.

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


4 years ago

This ticket was mentioned in Slack in #core by metalandcoffee. View the logs.


4 years ago

#10 @hellofromTonya
4 years ago

  • Milestone changed from 5.7 to Future Release

This ticket is marked for early in the alpha cycle. We are now in the Beta cycle and long past early. Moving this ticket and #50568 to Future Release. However, would be great to get both committed early in 5.8 alpha cycle if possible.

If any maintainer or committer feels this can be resolved in time, or wishes to assume ownership during a specific cycle, feel free to update the milestone accordingly.

#11 @Cybr
4 years ago

The patch provided has not caused issues with any installation I manage. It's only two code lines that make WordPress over 60% more performant, let alone the climate impact it could have.

What's holding it back from WP 5.8? Practically, unit tests are already in place.

#12 @peterwilsoncc
4 years ago

  • Milestone changed from Future Release to 5.8

@Cybr I've put this on 5.8, you make an excellent point.

#13 @hellofromTonya
4 years ago

  • Milestone changed from 5.8 to 5.9
  • Owner set to hellofromTonya
  • Status changed from new to assigned

Today is 5.8 Beta 1. Unfortunately ran out of time to do performance checks and code review.

Punting to 5.9 and assigning myself as owner to finish it early and get it committed.

This ticket was mentioned in Slack in #core-test by hellofromtonya. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

#17 @hellofromTonya
3 years ago

  • Keywords early removed

Removing this from early as it is not a requirement for considering this enhancement.

#18 @peterwilsoncc
3 years ago

I've just caught up on this, I'd like to get it in pre feature freeze.

I still think the cached value will need to be raw sanitization as that's the most common use case (almost all calls to get post are simply get_post( $post_id )).

@Cybr would you mind creating a PR on github.com/wordpress/wordpress-develop? That will allow @hellofromTonya and I to collaborate on the PR, whereas if either of us create one you will be unable to push to it.

#20 @Cybr
3 years ago

I still think the cached value will need to be raw sanitization as that's the most common use case (almost all calls to get post are simply get_post( $post_id )).

This skipped my mind, so I forwarded the PR as-is-proposed in the patch attached to this ticket.

Still, another check against 'raw' !== $post->filter wouldn't hurt. Please note the discrepancy with WP_Post::get_instance(), which could be addressed as described below.

This...:

<?php
// ..
} elseif ( empty( $_post->filter ) ) {
        $_post = sanitize_post( $_post, 'raw' );
}
// ..

...would then become:

<?php
// ..
} elseif ( empty( $_post->filter ) || 'raw' !== $_post->filter ) {
        $_post = sanitize_post( $_post, 'raw' );
}
// ..

@peterwilsoncc, do you wish for me to forward that in the PR?

sybrew commented on PR #1806:


3 years ago
#21

Not being used to PRs (I mostly push to master like a real pro 👀), I didn't know my commit push instantly merged with the PR, forgoing courtesy awaiting a response on my Core ticket reply.

peterwilsoncc commented on PR #1806:


3 years ago
#22

Not being used to PRs (I mostly push to master like a real pro 👀), I didn't know my commit push instantly merged with the PR, forgoing courtesy awaiting a response on my Core ticket reply.

Don't worry, always feel free to push modifications to your own PRs. You need to be happy with your contributions as much as anyone else. If a committer catches something, they won't be shy making suggestions. As we both know ;)

The logic check to ensure raw sanitization makes sense to me.

peterwilsoncc commented on PR #1806:


3 years ago
#23

@sybrew Do you have any experience writing unit tests in PHPUnit?

Re-reading the ticket, I noticed it mentioned that it would be good to have some tests in place to make sure committing this code doesn't break anything. I think it's a wise suggestion.

sybrew commented on PR #1806:


3 years ago
#24

@peterwilsoncc I do not have experience with PHPUnit. However, since the posts/query system is fundamental to WordPress, I believe all necessary query tests exist. It is why I brought it up in comment:11.

A few applicable examples I could find (and all of which passed), are:

  1. Tests_Post_Query::test_the_posts_filter()
  2. Tests_Post_GetPages::test_get_pages_cache() et al. (why didn't this one catch the bug this ticket addresses?)
  3. Everything in Tests_Post_GetPosts.

Is there a scenario you'd like to have explicitly tested, which isn't covered already in many other tests?

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

#26 @hellofromTonya
3 years ago

  • Milestone changed from 5.9 to 6.0

Today is 5.9 Beta 1 (in a couple of hours potentially). As the PR needs tests and review approval, moving it to 6.0. Adding to my needs unit tests queue and will get it done in 6.0.

peterwilsoncc commented on PR #1806:


3 years ago
#27

@sybrew I hope you don't mind but I have retargeted this to trunk and pushed some tests to your repo in https://github.com/WordPress/wordpress-develop/pull/1806/commits/b05ce534851e9bf7271518db3fc457ca64a4f6e3

#28 @peterwilsoncc
3 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed
  • Owner changed from hellofromTonya to peterwilsoncc
  • Type changed from defect (bug) to enhancement
  • Version 5.5 deleted

I've pushed some tests to the PR and updated against trunk. I've asked for a second set of eyes to look at the tests in the hope I can get this committed.

#29 @peterwilsoncc
3 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 53042:

Posts, Post Types: Set post filter in update_post_cache().

Ensure the post cache is primed with raw sanitized data. This resolves an inconsistency between how posts retrieved via get_post() vs WP_Query are cached.

This prevents sanitize_post( $post, 'raw' ) being run multiple times on a cached post. This can happen over 20 times per post on some page loads so avoiding this will provide a noticeable performance boost.

Props Cybr, SergeyBiryukov, peterwilsoncc, hellofromTonya, costdev.
Fixes #50567.

sybrew commented on PR #1806:


3 years ago
#31

Committed via https://github.com/WordPress/wordpress-develop/commit/b1b305b82631ab2c7d72f0388cda035c8454f1e5.

Somewhat sad, the lines I contributed directly don't have my blame attached; why was this decision made?

peterwilsoncc commented on PR #1806:


3 years ago
#32

WP is actually committed via SVN on develop.svn.wordpress.org, what's shown in GitHub is a mirror of that.

Unfortunately that means that the git blame only shows the core committer who did the act of committing, rather than the additional information GitHub provides. There's a current discussion on how to improve that. https://make.wordpress.org/core/2022/01/31/ensuring-proper-attribution-for-contributions-to-wordpress-on-github/

In terms of including you in the credits for WP, they're generated from the props line in the commit message which includes your wordpress.org username.

sybrew commented on PR #1806:


3 years ago
#33

Thank you for the clarification.

I'm glad this is finally merged into Core, and I'm eager to hear of improvements -- a patch that implemented this fix I shared among friends astounded them at how fast their sites have become on cheap hosting.

With all other resolved performance tickets combined, WP 6.0 seems to be on the right track to becoming "the update for the planet." Thank you for pushing that effort.

Note: See TracTickets for help on using tickets.