Opened 6 years ago

Closed 5 years ago

#4665 closed enhancement (fixed)

Optimize get_posts query

Reported by: Otto42 Owned by: anonymous
Priority: normal Milestone: 2.6
Component: Optimization Version: 2.3
Severity: major Keywords: has-patch
Cc:

Description

In post.php, in get_posts(), this is the main query built:

	$query  = "SELECT DISTINCT * FROM $wpdb->posts ";
	$query .= empty( $category ) ? '' : ", $wpdb->term_relationships, $wpdb->term_taxonomy  "; 
	$query .= empty( $meta_key ) ? '' : ", $wpdb->postmeta ";
	$query .= " WHERE 1=1 ";
	$query .= empty( $post_type ) ? '' : "AND post_type = '$post_type' ";
	$query .= empty( $post_status ) ? '' : "AND post_status = '$post_status' ";
	$query .= "$exclusions $inclusions " ;
	$query .= empty( $category ) ? '' : "AND ($wpdb->posts.ID = $wpdb->term_relationships.object_id AND $wpdb->term_relationships.term_taxonomy_id = $wpdb->term_taxonomy.term_taxonomy_id AND $wpdb->term_taxonomy.term_id = " . $category. ") ";
	$query .= empty( $post_parent ) ? '' : "AND $wpdb->posts.post_parent = '$post_parent' ";
	$query .= empty( $meta_key ) | empty($meta_value)  ? '' : " AND ($wpdb->posts.ID = $wpdb->postmeta.post_id AND $wpdb->postmeta.meta_key = '$meta_key' AND $wpdb->postmeta.meta_value = '$meta_value' )";
	$query .= " GROUP BY $wpdb->posts.ID ORDER BY " . $orderby . ' ' . $order;
	if ( 0 < $numberposts )
		$query .= " LIMIT " . $offset . ',' . $numberposts;

The two things I'm curious about are the DISTINCT and the GROUP BY clauses. Neither one seems to be actually necessary.

DISTINCT would eliminate duplicate rows. However, because it's selecting from $wpdb->posts (and not joining to itself), no duplicate rows are possible here.

GROUP BY the posts->ID is useless, since the ID is the primary key, and unique.

While I don't know if this has any real performance impact, it's certainly ripe for trimming and testing. I cut those out on my site a while back and have not noticed anything strange happening.

Change the query to this:

	$query  = "SELECT * FROM $wpdb->posts ";
... unchanged stuff ...
	$query .= " ORDER BY " . $orderby . ' ' . $order;
... unchanged stuff ...

Seems to work fine for me. And, in theory, should provide a speedup. The last thing we want is extra unnecessary load on the database.

Attachments (1)

4665.diff (4.2 KB) - added by Otto42 6 years ago.
Patch to eliminate distinct and group by from get_posts()

Download all attachments as: .zip

Change History (12)

  • Version changed from 2.2.1 to 2.3
  • Keywords needs-patch added

+1

This looks like a good idea.

Referring to #2604 and [3678] the DISTINCT and GROUP BY went from query.php a while ago giving some performance improvement.

Otto426 years ago

Patch to eliminate distinct and group by from get_posts()

  • Keywords has-patch added; needs-patch removed

+1. Patch looks good.

comment:5   ryan6 years ago

I think we have to retain group by for queries involving categories.

I don't follow you there, ryan. The posts->ID is unique, why would grouping by it help any?

comment:7   matt6 years ago

Because of the joins, try one of the queries without it. You'll get "dupe" posts.

matt: No, still not seeing it. You're joining to terms and to postmeta. Unless you have duplicated terms or postmeta info, you're not going to get the same post twice in that query.

comment:9   ryan6 years ago

  • Milestone changed from 2.3 to 2.4 (next)
  • Milestone changed from 2.5 to 2.6
  • Resolution set to fixed
  • Status changed from new to closed

get_posts() now uses WP_Query.

Note: See TracTickets for help on using tickets.