Make WordPress Core

Opened 15 years ago

Closed 14 years ago

#10576 closed task (blessed) (duplicate)

Use subqueries for taxonomy queries

Reported by: ryan's profile ryan Owned by:
Milestone: Priority: normal
Severity: normal Version:
Component: Taxonomy Keywords:
Focuses: Cc:

Description

Fetching posts within a given categories/tags would be faster with subqueries. 2.9 will bump the MySQL requirement to 4.1 so we can rewrite to make use of subqueries.

Change History (11)

#1 @ryan
15 years ago

Sample query:

SELECT $wpdb->posts.ID FROM  $wpdb->posts WHERE
$wpdb->posts.ID IN (
     SELECT $wpdb->term_relationships.object_id FROM $wpdb->term_relationships
     WHERE $wpdb->term_relationships.term_taxonomy_id IN (
          SELECT $wpdb->term_taxonomy.term_taxonomy_id FROM $wpdb->term_taxonomy
          WHERE $wpdb->term_taxonomy.term_id = '$cat_id' AND $wpdb->term_taxonomy.taxonomy = 'category'
          )
     )
AND $wpdb->posts.post_type = 'post'
AND $wpdb->posts.post_status = 'draft' "
ORDER BY $wpdb->posts.post_date DESC LIMIT $limit OFFSET $offset

#2 @Denis-de-Bernardy
15 years ago

actually, that particular one would be much faster with a join:

SELECT $wpdb->posts.ID FROM  $wpdb->posts
JOIN $wpdb->term_relationships
ON $wpdb->term_relationships.object_id = $wpdb->posts.ID
JOIN $wpdb->term_taxonomy
ON $wpdb->term_taxonomy.term_taxonomy_id = $wpdb->term_relationships.term_taxonomy_id
AND $wpdb->term_taxonomy.term_id = '$cat_id'
AND $wpdb->term_taxonomy.taxonomy = 'category'
WHERE $wpdb->posts.post_type = 'post'
AND $wpdb->posts.post_status = 'draft' "
ORDER BY $wpdb->posts.post_date DESC LIMIT $limit OFFSET $offset

else, you end up with a temporary table, a merge join plan, and no index whatsoever if the mysql optimizer misses the potential optimization.

the same kind of query holds for tags. for a combo of one tag and one category, using a big join will be fastest.

the issue is when we query for multiple terms. one with multiple tags and categories would become something like the following and it no longer uses any kind of meaningful index, because it's a tough one to for optimizers to get right:

SELECT $wpdb->posts.ID FROM  $wpdb->posts
JOIN $wpdb->term_relationships as cat_rel
ON cat_rel.object_id = $wpdb->posts.ID
JOIN $wpdb->term_taxonomy as cat_tax
ON cat_tax.term_taxonomy_id = cat_rel.term_taxonomy_id
AND cat_tax.term_id IN ( '$cat_ids' )  AND cat_rel.taxonomy = 'category'
JOIN $wpdb->term_relationships as tag_rel
ON tag_rel.object_id = $wpdb->posts.ID
JOIN $wpdb->term_taxonomy as tag_tax
ON tag_tax.term_taxonomy_id = tag_rel.term_taxonomy_id
AND tag_tax.term_id IN ( '$tag_ids' )  AND tag_tax.taxonomy = 'tag'
WHERE $wpdb->posts.post_type = 'post'
AND $wpdb->posts.post_status = 'draft' "
GROUP BY $wpdb->posts.ID
ORDER BY $wpdb->posts.post_date DESC LIMIT $limit OFFSET $offset

where:

$cat_ids = join("', '", array_map('intval', $cat_ids));
$tag_ids = join("', '", array_map('intval', $tag_ids));

the latter is a case where, intuitively, a subquery along the lines of your example would probably win based on the table stats because we then get rid of the group by and end up with a hash join plan. but it'll probably be the exception.

#3 @Denis-de-Bernardy
15 years ago

err, nm on the merge join mention in the first query, it's definitely looking forward for a hash join. but anyway... :-P

using the joins instead of subqueries will be faster if there is no IN() clause, and an order by limit.

#4 @dd32
15 years ago

What about multiple-taxonomy querying Denis?

Or is that as easy as?

AND (
tag_tax.term_id IN ( '$tag_ids' )  AND tag_tax.taxonomy = 'tag'
OR 
tag_tax.term_id IN ( '$cat_ids' )  AND tag_tax.taxonomy = 'cat'
)

#5 @ryan
15 years ago

I haven't profiled any of this myself, but that query I snippeted is something that one of the VIPs on wordpress.com used in place of the default query and supposedly got much, much better performance.

#6 @johnbillion
15 years ago

  • Cc johnbillion@… added

#7 @ryan
15 years ago

  • Milestone changed from 2.9 to Future Release

#8 @willmot
15 years ago

  • Cc willmot added

#9 @filosofo
15 years ago

  • Owner filosofo deleted
  • Status changed from new to assigned

#10 @kevinB
14 years ago

  • Cc kevinB added

#11 @nacin
14 years ago

  • Milestone Future Release deleted
  • Resolution set to duplicate
  • Status changed from assigned to closed
Note: See TracTickets for help on using tickets.