Make WordPress Core

Opened 7 years ago

Last modified 7 years ago

#43256 new enhancement

wp_delete_post() should use get_post() not wpdb->prepare

Reported by: charlestonsw's profile charlestonsw Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version: 1.0
Component: Posts, Post Types Keywords:
Focuses: Cc:

Description

wp_delete_post() uses a custom SQL query to pre-load the post info in a raw state and ends up adding one extra data I/O call per post/page being deleted.

This bypasses the WP_Post::get_instance() wp_cache seeding. Consequently later calls to get_post() such as the 'before_delete_post' action to call _reset_front_page_settings_for_post() then falls through and calls WP_Post::get_instance() calling a nearly-identical SQL query (adds LIMIT 1 to to the same query.

Is there any reason why the start of wp_delete_post() should not just call $post = get_post( $postid ) ?

Replace:

    $post = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d", $postid ) );
 
    if ( ! $post ) {
        return $post;
    }
 
    $post = get_post( $post );

With:

    $post = get_post( $postid ); 
    if ( ! $post ) {
        return $post;
    }

reference article for in-depth analysis and performance implications is online at http://lance.bio/2018/02/08/improving-wp_delete_post-wordpress-core/

If this warrants attention I can find time to spin up my VVV box on latest build and generate a patch file when time allows.

Change History (4)

#1 @swissspidy
7 years ago

  • Focuses performance removed
  • Version changed from 4.9.4 to 1.0

One minor difference is that the SQL query will check whether the post really exists (not just in the cache) whereas calling get_post() with the wrong parameter (for example 0, the default for the function's $post_id parameter) will retrieve and ultimately delete the global post. That shouldn't happen of course and could be prevented with good unit tests.

#2 @johnbillion
7 years ago

Very related: #32991 (needs somebody to check and follow up with the most recent comment).

#3 @charlestonsw
7 years ago

For a moment let's assume all of that is true, at the very least wp_delete_post will call nearly identical SELECT statements if you pass a valid post ID on any of the subsequent methods in the call stack , like _reset_front_page_settings_for_post(), that later call get_post( <id> ).

Add to this:

   $post = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d", $postid ) );
 
    if ( ! $post ) {
        return $post;
    }
 
    $post = get_post( $post );
   $post = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d", $postid ) );
 
    if ( ! $post ) {
        return $post;
    }
 
    $post = get_post( $post );

To set the cache and avoid the subsequent extra direct data I/O calls:

   $post = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d", $postid ) );
 
    if ( ! $post ) {
        return $post;
    }
 
    $post = get_post( $post );

    wp_cache_add( $post->ID, $post, 'posts' );

If you want to make those lines 100% identical to the WP_Post::get_instance() call that will ALWAYS follow this at some point in the very near future whenever wp_delete_post() is called with a valid ID, then change that SELECT as well. Since wp_delete_post is supposed to only get a single int as the post ID param, this should be done anyway:

ONE row only please...

{{{
   $post = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d LIMIT 1", $postid ) );
 
    if ( ! $post ) {
        return $post;
    }
 
    $post = get_post( $post );

    wp_cache_add( $post->ID, $post, 'posts' );
}}}

You've now avoided the next 5+ steps in wp_delete_post from having to run the same exact block of code about 100 milliseconds later.

BTW, all of this analysis is part of that article as well.

As for calling get_post( <invalid id> ) as an int -- and it can/should be cast to int to be sure in wp_delete_post, that will cause get_post to fall through to $_post = WP_Post::get_instance( $post );

WP_Post::get_instance( 0 ) -- as an example will properly cast the post ID to an int, if it is zero the ( ! post_id ) right at the top will cause get_instance to return FALSE. This comes back to get_post() and if ( ! post ) there will return... NULL.

Which, now your entire wp_delete_post is busted after the $post = get_post( $post ) cast which now has $post to NULL.

In this case you really need to add ANOTHER ! post check to wp_delete_post...

        // FIRST PATCH:: limit to 1 only please
	$post = $wpdb->get_row( $wpdb->prepare( "SELECT * FROM $wpdb->posts WHERE ID = %d LIMIT 1", $postid ) );

	if ( ! $post ) {
		return $post;
	}

	$post = get_post( $post );

       // SECOND PATCH -- catch broken things if somehow get_post() failed during casting
       if ( ! post ) {
              return $post;
        }

        // THIRD PATCH - let's prevent duplicate SELECTS direct from the DB
        wp_cache_add( $post->ID, $post, 'posts' );

	if ( ! $force_delete && ( 'post' === $post->post_type || 'page' === $post->post_type ) && 'trash' !== get_post_status( $postid ) && EMPTY_TRASH_DAYS ) {
		return wp_trash_post( $postid );
	}

And yes, #32991 is still very relevant. wp_delete_post() can return just about anything. A stdClass object, null, a WP_Post() object.

This also begs the question, if wp_delete_post() cannot trust wp_cache to be valid, can ANYTHING. Why do so many other things trust WP_Post::get_instance() and its direct pull from cache when a post is in there yet wp_delete_post() does not? If you cannot be deleting records that don't exist you cannot be updating them, shouldn't be displaying their content, etc. But that is a story for another day.

Maybe the right fix is to start by forcing the entry for the cache for this post to be DELETED first (we're deleting this post anyway) then replace the entire SELECT block here with $post = get_post( (int) $postid); if ( ! $post ) return $post; instead.

One thing I do know is that deleting a VALID custom post type unnecessarily runs two identical selects direct from the DB in a row. I'm guessing it is true for ALL post / page types.

The article I linked drills down fairly deep into all of this.

Version 0, edited 7 years ago by charlestonsw (next)

#4 @swissspidy
7 years ago

  • Milestone changed from Awaiting Review to Future Release

Let's have a closer look at this and ideally add some tests.

Note: See TracTickets for help on using tickets.