#12891 closed task (blessed) (fixed)
Advanced multi-taxonomy WP_Query()s
Reported by: | markjaquith | Owned by: | markjaquith |
---|---|---|---|
Milestone: | 3.1 | Priority: | normal |
Severity: | normal | Version: | 3.0.1 |
Component: | Query | Keywords: | ongoing-project |
Focuses: | Cc: |
Description
We have this great taxonomy system, but we're only able to query one of them at a time. There is no way to ask for posts that are in the 'cars' category but lack the "VW Beetle" tag. What people end up doing is crafting complex category systems to simulate multiple taxonomies. That's lame. We should make them want to use multiple taxonomies. Let's give them the tools to do it.
Proposed syntax, and half-working example implementation:
<?php class WP_Query_Tax_Or { var $contents = array(); function WP_Query_Tax_Or( $array ) { $this->contents = $array; return $this; } } class WP_Query_Tax_And { var $contents = array(); function WP_Query_Tax_And( $array ) { $this->contents = $array; return $this; } } class WP_Query_Tax { var $taxonomy; var $object; var $ID; function WP_Query_Tax( $taxonomy, $object ) { $this->taxonomy = $taxonomy; $this->object = $object; $this->ID = wp_some_function_to_get_an_ID_from_ID_or_name_or_slug( $taxonomy, $object ); return $this; } } class WP_Query_Tax_Not { var $taxonomy; var $object; var $ID; function WP_Query_Tax_Not( $taxonomy, $object ) { $this->taxonomy = $taxonomy; $this->object = $object; $this->ID = wp_some_function_to_get_an_ID_from_ID_or_name_or_slug( $taxonomy, $object ); return $this; } } function wp_tax_and() { return new WP_Query_Tax_And( func_get_args() ); } function wp_tax_or() { return new WP_Query_Tax_Or( func_get_args() ); } function wp_tax( $taxonomy, $object ) { return new WP_Query_Tax( $taxonomy, $object ); } function wp_tax_not( $taxonomy, $object ) { return new WP_Query_Tax_Not( $taxonomy, $object ); } function wp_some_function_to_get_an_ID_from_ID_or_name_or_slug( $taxonomy, $object ) { return $object; } $array = array( 'tax' => wp_tax_and( wp_tax_or( wp_tax( 'post_category', 59 ), wp_tax( 'post_tag', 'web' ) ), wp_tax_not( 'post_category', 24 ), wp_tax_not( 'post_tag', 'wordpress' ) ) ); var_dump( $array );
Simple enough. group with wp_tax_and()
or wp_tax_or()
. Select a taxonomy item using wp_tax( 'taxonomy', 'ID or name or slug' )
and omit a taxonomy item using wp_tax_not( 'taxonomy', 'ID or name or slug' )
.
If you run the example code, you'll get an array of objects that could be iterated by WP_Query code using subqueries and parenthesis to generate the precise query that was requested.
It'll translate directly into SQL. Start of AND or OR means you start parenthesis, and join all children elements with AND or OR as appropriate. Getting to the end of an AND or OR means you close parenthesis. And then just use IN() and NOT IN() for wp_tax()
and wp_tax_not()
respectively.
Attachments (28)
Change History (213)
#4
@
15 years ago
I'm trying to achieve the same thing with the Query Multiple Taxonomies plugin. It progressively filters posts using multiple queries.
I'm a bit worried about having it all in a single query: Will it scale, given all the subqueries and joins that would be required? (Of course, this could be rigorously tested once we have an implementation)
#5
follow-up:
↓ 7
@
15 years ago
scribu — it'd use subqueries, which shouldn't be slower than multiple queries. It wouldn't be as fast as querying a single taxonomy, for sure. If I were using it, I'd cache the result. But MySQL is rarely the WP bottleneck. I think it could work.
#7
in reply to:
↑ 5
@
15 years ago
- Keywords WP_Query removed
Replying to markjaquith:
I was thinking about "MySQL server went away" type of errors.
Anyway, I like the potential of this proposed API.
#8
@
14 years ago
- Keywords has-patch added
I've added two sample implementations:
- eager.php: each function generates the SQL code as soon as it's called
- lazy.php: each function creates a new object; the sql can be generated at a later point.
The lazy parsing implementation would only be useful if it would first analyze all the clauses and then output optimized SQL.
On that note, instead of wp_some_function_to_get_an_ID_from_ID_or_name_or_slug(), I think it would be better if wp_tax() accepted a third $field argument, to avoid doing extra queries.
#12
follow-up:
↓ 15
@
14 years ago
I thought of this ticket while reading this post, because basically what this proposed system attempts to do is put an ORM on top of the taxonomy and post database queries.
One (worthy) goal is obviously to provide an abstraction for complex taxonomy queries, an abstraction that's independent of the actual database structure. The problem is that it ends up forcing developers to learn a new, custom querying syntax that isn't as powerful as basic SQL.
I don't have a good solution at this point. Although we don't want to expose the actual database structure, I don't think we can get away from SQL-like syntax because it's really the most familiar and appropriate tool for the job.
Facebook Query Language (FQL) is an example of how you can expose an SQL-like API, so perhaps like it we could offer a few reduced SELECT query patterns that are trivial to parse. The idea would be that it is a syntax already known to those who need to make the complex queries in the first place, but yet a syntax not coupled to the actual, current database structure.
#13
@
14 years ago
- Milestone changed from Awaiting Triage to 3.1
- Owner changed from ryan to markjaquith
- Status changed from new to assigned
- Type changed from feature request to task (blessed)
- Version changed from 3.0 to 3.0.1
I will bless and take responsibility for this one.
#15
in reply to:
↑ 12
@
14 years ago
- Cc mikeschinkel@… added
Replying to filosofo:
I thought of this ticket while reading this post, because basically what this proposed system attempts to do is put an ORM on top of the taxonomy and post database queries.
One (worthy) goal is obviously to provide an abstraction for complex taxonomy queries, an abstraction that's independent of the actual database structure. The problem is that it ends up forcing developers to learn a new, custom querying syntax that isn't as powerful as basic SQL.
I don't have a good solution at this point. Although we don't want to expose the actual database structure, I don't think we can get away from SQL-like syntax because it's really the most familiar and appropriate tool for the job.
Facebook Query Language (FQL) is an example of how you can expose an SQL-like API, so perhaps like it we could offer a few reduced SELECT query patterns that are trivial to parse. The idea would be that it is a syntax already known to those who need to make the complex queries in the first place, but yet a syntax not coupled to the actual, current database structure.
+1. I echo filosofo's concerns and really like his suggestions.
#16
@
14 years ago
I think I have something that will satisfy 90% of the use cases. While working on #14572, I introduced a helper function: _wp_meta_sql(). It takes a list of meta queries.
A single meta query looks like this:
array( 'meta_key' => 'foo', 'meta_value' => 'bar', 'meta_compare' => '!=', )
I think we can have something similar for taxonomies, which would cover 90% of the use cases:
array( 'taxonomy' => 'post_tag', 'terms' => array('green', 'blue'), 'operator' => 'IN', )
where 'operator' can be 'IN', 'NOT IN', 'AND' etc.
WP_Query would be enhanced to recognize multiple taxonomy => value pairs.
Also, there could be a new 'taxonomy_query' var that would accept a list of taxonomy queries.
#17
follow-up:
↓ 18
@
14 years ago
So, simple usage:
query_posts( array( 'tag' => 'foobar', 'color' => 'green' ) );
Advanced usage:
query_posts( array( 'taxonomy_query' => array( 'taxonomy' => 'post_tag', 'terms' => array( 'foobar' ), ), array( 'taxonomy' => 'color', 'terms' => array( 'green', 'white' ), 'operator' => 'AND' ) ) );
#18
in reply to:
↑ 17
@
14 years ago
Replying to scribu:
So, simple usage:
query_posts( array( 'tag' => 'foobar', 'color' => 'green' ) );Advanced usage:
query_posts( array( 'taxonomy_query' => array( 'taxonomy' => 'post_tag', 'terms' => array( 'foobar' ), ), array( 'taxonomy' => 'color', 'terms' => array( 'green', 'white' ), 'operator' => 'AND' ) ) );
How to handle complex OR
, especially when post_meta
needs to be considered, i.e. in pseudo-SQL
WHERE (META team = Bulldogs AND (TERM color IN red AND black) OR (META team = Yellow Jackets AND (TERM color IN white AND gold)
?
#19
follow-up:
↓ 20
@
14 years ago
You can't, but then again you will never be able to do everything using only the API, unless the API uses ORM, which you seem to be against.
#20
in reply to:
↑ 19
@
14 years ago
Replying to scribu:
You can't, but then again you will never be able to do everything using only the API, unless the API uses ORM, which you seem to be against.
Why put words in my mouth from a simple +1? And never say never; I'm sure it's possible with a well thought-out design.
#21
@
14 years ago
Be more careful where you put your +1s next time. :P
Joking aside, I believe having a working solution in 3.1 is better than maybe having an ideal solution in a future release.
#22
@
14 years ago
I mean, this ticket has been open for 5 months, with an even older one open for 16 months. It's time we ship something.
#23
@
14 years ago
Replying to scribu:
I believe having a working solution in 3.1 is better than maybe having an ideal solution in a future release.
I mean, this ticket has been open for 5 months, with an even older one open for 16 months. It's time we ship something.
Fair point, and agreed.
But realize it will just push the limits to different point; i.e there will still be people needing more.
Something to consider for v3.2 or later; a specialized SQL-like WP-specific query language that can handle anything SQL can handle, have it's own QOM (Query Object Mode) like HTML5 has it's own DOM, and be much simpler given it recognizes the underlying patterns in WordPress and would ultimately map to MySQL (or maybe SQLite or even a NoSQL variant.) Envision queries like:
LIST posts WITH TAG featured LIST posts WITH TAXONOMY color TERMS red,green,blue LIST posts WITH TYPE event SORTED BY title LIST posts DURING 2010 REVERSE SORTED BY date LIST posts CONTAINING "WordPress Plugins" GET post IDENTIFIED BY 37 GET post ID 37 GET page IDENTIFIED BY about GET page SLUG about LIST postfields WHERE post IDENTIFIED BY 37 LIST postfields WHERE posts WITH TAG featured LIST posts WITH TAGS GROUP(featured AND issues) BUT NOT politics
The design is to be approachable by mere mortals and to avoid much of what make SQL appear to be arcane to most people.
I've already written much of the EBNF for it. And yes, it would start life as a plugin.
-Mike
#24
@
14 years ago
12891.diff is a start in abstracting away the taxonomy SQL generation out of WP_Query.
#25
follow-up:
↓ 26
@
14 years ago
Scribu, I'd really like to be able to determine the operator between taxonomies, and be able to group terms. Is there something about my proposed syntax that you objected to?
As two minor enhancements, I'd take your advice about id/slug and make explicit wrappers. wp_tax_id()
, wp_tax_slug()
and their "not" counterparts. I'd also introduce specification of hierarchy handling, for hierarchical taxonomies. Maybe _but_no_children
suffixes to avoid the default behavior of querying for all children.
filosofo's suggestion of something like FQL (WPQL?) has appeal. You could do all of things I want to do, with that. You might need something to say "_but_no_children" for hierarchical taxonomies, but otherwise, it translates well to a simplified SQL-like language. I'm just not sure we're ready for that — it's a big step.
#26
in reply to:
↑ 25
@
14 years ago
Replying to markjaquith:
filosofo's suggestion of something like FQL (WPQL?) has appeal. You could do all of things I want to do, with that. You might need something to say "_but_no_children" for hierarchical taxonomies, but otherwise, it translates well to a simplified SQL-like language. I'm just not sure we're ready for that — it's a big step.
A "WPQL" was exactly what I was showing a hypothetical for above but agree that it's a big step. It would take quite a while to get it right including having lots of people throwing advanced use-cases at it to make sure its design was robust before unleashing it on the world.
v3.5? ;-)
#27
@
14 years ago
Scribu, I'd really like to be able to determine the operator between taxonomies, and be able to group terms. Is there something about my proposed syntax that you objected to?
The main problem I have with it is that it's akward to build a query progressively. Here is some code from my Query Multiple Taxonomies plugin:
$query = array(); foreach ( $wp_query->_qmt_query as $taxname => $value ) foreach ( explode( '+', $value ) as $value ) $query[] = wp_tax( $taxname, explode( ',', $value ), 'slug' ); $post_ids = $wpdb->get_col( wp_tax_query( wp_tax_group( 'AND', $query ) ) );
You tell me if it's readable or not.
#28
@
14 years ago
Anyway, regardless if we're going to use this syntax or not, we still need to clean up WP_Query first, so I'll just continue working on 12891.diff
#29
@
14 years ago
12891.2.diff: all taxonomy combinations supported: query categories, tags and custom taxonomies in one go.
#30
@
14 years ago
I've found that making get_objects_in_term() more flexible is the organic way to extend the taxonomy API.
I've also added a wp_tax_query() function on top of it. See 12891.4.diff
#34
@
14 years ago
Went ahead with the spring cleaning in WP_Query. I expect there will be a lot of obscure side effects, so I wanted to get a head start before the beta period.
#35
@
14 years ago
What's the chance of adding a callback filter to the $args
for WP_Query()
that would be called after the posts_request
filter so that we don't have to depend on global filters when we need to modify a query for a specific call in a plugin? (Ideally this callback would work for all the filters in WP_Query()
but I'd settle for one that gets called after posts_request
vs. not having any.)
#37
follow-up:
↓ 39
@
14 years ago
Mike, that's outside the scope of this ticket. Please open a new one.
#43
follow-up:
↓ 45
@
14 years ago
Some things I've noticed
- Viewing Term pages which have no objects within will display a DB Error ( SQL IN() Can't be empty )
- Viewing parent Terms (Categories, and a Custom Taxonomy) will not include the child term's objects(posts). This was definitely the case with categories previously, but am unsure of what the situation with custom tax's were.
#45
in reply to:
↑ 43
@
14 years ago
Replying to dd32:
- Viewing parent Terms (Categories, and a Custom Taxonomy) will not include the child term's objects(posts). This was definitely the case with categories previously, but am unsure of what the situation with custom tax's were.
Not true. It's taken care of in get_objects_in_term().
#47
@
14 years ago
It's taken care of in get_objects_in_term().
You may want to thoroughly test it then.
get_objects_in_term() handles it for requests where its being queried by a term_id, In the case of pretty permalinks, where its being queried by slug, They are not.
#54
@
14 years ago
Receiving this after SVN'uping this evening:
WordPress database error: [You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ')' at line 5] SELECT object_id FROM wp_term_relationships INNER JOIN wp_term_taxonomy USING (term_taxonomy_id) WHERE taxonomy IN ('state') AND term_id IN ()
when viewing a taxonomy archive which doesnt exist (ie. the term doesnt exist, /state/404-me-please/)
#56
@
14 years ago
Let's go with something closer to what scribu suggested. Let's skip parentheticals this time around, and assume the various taxonomy queries are ANDed together. So, taxonomy, term(s) and an operator to join those terms. Scribu, want to take first shot at that, since you suggested the syntax?
#59
follow-up:
↓ 61
@
14 years ago
Done. Advanced query example updated:
get_posts( array( 'post_type' => 'event', 'tax_query' => array( array( 'taxonomy' => 'location', 'terms' => array( 'everywhere' ) ), array( 'taxonomy' => 'location', 'terms' => array( 'there' ), 'operator' => 'NOT IN' ), ) ) );
#61
in reply to:
↑ 59
@
14 years ago
Replying to scribu:
Done. Advanced query example updated:
Nice! But would 'tax_terms'
not be more consistent with other arguments than 'tax_query'
?
#66
@
14 years ago
It does, but I think we should handle it in the WP class or in WP_Rewrite. Related: #9591
#71
@
14 years ago
'cat' query var is set to a blank string on category archives page. Patch attached.
#73
@
14 years ago
The change in [15613] makes post queries such as get_posts('tag=test') no longer filter on tags if the SimpleTags plugin is enabled.
Steps to reproduce:
- Using a fresh WP trunk checkout, create a second post with a tag of "test"
- Do a print_r(get_posts('tag=test')) from index.php - only one post is displayed.
- Download and active the SimpleTags plugin from http://wordpress.org/extend/plugins/simple-tags/
- Reload index.php, both posts are now displayed
- The problem goes away if SimpleTags is deactivated or if you roll back this change, ie. it exists with "svn up -r 15613" and goes away with "svn up -r 15612"
#75
@
14 years ago
I reported the above issue to the Simple Tags author:
http://wordpress.org/support/topic/simple-tags-breaks-tag-based-post-queries-in-wp-trunk
#76
@
14 years ago
Where did the taxonomy and term query vars go?
They're not getting set properly in the case of querying a custom taxonomy page, and it is causing errors in wp_title(), in the is_tax() branch.
#78
@
14 years ago
They're not getting set properly in the case of querying a custom taxonomy page, and it is causing errors in wp_title(), in the is_tax() branch.
Steps to reproduce?
#81
@
14 years ago
They're not getting set properly in the case of querying a custom taxonomy page, and it is causing errors in wp_title(), in the is_tax() branch.
Steps to reproduce?
View a custom Taxonomy index page with WP_DEBUG enabled, Take note of the page title containing PHP Notice's.
I mentioned the lack of taxonomy/term query vars in a previous ticket 2+ weeks back, I cant find it right now, But I believe you agree'd that adding it was needed for backwards-compat at least (back-compat isn't the only issue here however).
#82
@
14 years ago
I mentioned the lack of taxonomy/term query vars in a previous ticket 2+ weeks back, I cant find it right now
Comment: http://core.trac.wordpress.org/ticket/12659#comment:25
#83
@
14 years ago
Minor error in phpdoc: #15431
Also get_tax_sql()
needs better error handling. It can fatal error out if a non-existent taxonomy name is passed. See incoming patch for fix for that...
#85
@
14 years ago
Sites that perform well with 3.0 fall over with very slow queries like the following when running 3.1:
SELECT object_id FROM wp_term_relationships WHERE 1 =1 AND object_id IN ( SELECT object_id FROM wp_term_relationships INNER JOIN wp_term_taxonomy USING ( term_taxonomy_id ) INNER JOIN wp_terms USING ( term_id ) WHERE taxonomy IN ( 'post_tag' ) AND slug IN ( 'chevy-volt', 'fisker', 'ford', 'gm', 'karma', 'leaf', 'nissan', 'roadster', 'tesla', 'v-vehicle', 'volt' ) ) AND object_id IN ( SELECT object_id FROM wp_term_relationships INNER JOIN wp_term_taxonomy USING ( term_taxonomy_id ) INNER JOIN wp_terms USING ( term_id ) WHERE taxonomy IN ( 'primary_channel' ) AND slug IN ( 'all' ) )
That takes over a minute.
#86
@
14 years ago
MySQL returned an empty result set (i.e. zero rows). (Query took 61.8451 sec)
Ouch!
#88
@
14 years ago
In this case, WP might be acting on query vars too early, before filters have had a chance to do their thing. Still investigating.
#89
@
14 years ago
Looks like the plugins is hooking onto parse_query and looking for the primary_channel query var to be set to all. If it is set to all then the primary_channel query var as well as the term and taxonomy query vars are set to null so that they won't be a part of the query later. This is breaking down somewhere in 3.1.
#91
@
14 years ago
I'm dumping the query vars and it does seem like that is fixed. Still trying to track it down. Sorry for the noise.
#92
@
14 years ago
Fixed the plugin. I'm not really sure how it was working under 3.0. I don't think there's anything here we need to defend against back compat wise.
#93
@
14 years ago
WP_Query('posts_per_page=15&cat=-21072,-37613,-4533907');
Generates a query like:
SELECT SQL_CALC_FOUND_ROWS wp_3508545_posts.* FROM wp_3508545_posts WHERE 1=1 AND wp_3508545_posts.ID IN(0, 14, 17, 25
Where the post ID list has 1666871 IDs and the resulting query is 1.2 million characters long. This causes segfaults.
#94
@
14 years ago
The analagous query in 3.0 is:
SELECT * FROM wp_3508545_posts WHERE ID IN(233413,232570,233293,232774,233148,233135,232474,233138,40895,230938,40891,225264,42561,41262,41460)
#95
follow-up:
↓ 107
@
14 years ago
get_tax_sql() does this:
SELECT object_id FROM wp_trunk_term_relationships INNER JOIN wp_trunk_term_taxonomy USING (term_taxonomy_id) WHERE taxonomy IN ('category') AND term_id NOT IN (21072,37613,4533907)
That query is unlimited despite the original request being for only 15 posts. This won't scale.
#96
@
14 years ago
get_tax_sql() has to go, it seems. It runs before any of the orderby and limit args are processed. I think we have to pare this back.
@
14 years ago
Bottom up approach started with 3.0 query.php and added changesets. Missed pieces of [15649], [15650] (did those get reverted); [15732] [15824] [15825] some of [15827] (hierarchical URLs); [15773] (urlencode); [15795] (pieces filter); [15923] then [16155] (first hunk failed); [15887] (I don't see what this is patching, most was reverted in [15888]); [15982] ('fields' arg).
#97
@
14 years ago
Would it be better if get_tax_sql() returned the raw subqueries instead of the resulting post ids?
#98
follow-up:
↓ 99
@
14 years ago
Or we could do it more similarly to get_meta_sql(), generating JOIN clauses besides WHERE.
#99
in reply to:
↑ 98
@
14 years ago
Replying to scribu:
Or we could do it more similarly to get_meta_sql(), generating JOIN clauses besides WHERE.
A big problem with get_meta_sql() is it lowers the barrier for entry. I liked that I needed to roll my own JOIN clauses because if I was capable of doing that, then I was capable of realizing the bad queries that could come out of it.
Being able to blindly specify any number of meta key queries is going to result in abuse.
#101
@
14 years ago
I think avoiding subqueries in WHERE IN clauses is considered good practice with MySQL. JOINs would probably be better.
#104
@
14 years ago
query.php.r15471.2.diff starts with r15471 as a base, builds query.php back up without and taxonomy or metadata query changes. I can compile a list of changesets that it reverts and tickets it re-opens.
The following changesets were skipped. They'll need to be rewritten as appropriate:
#105
@
14 years ago
Queries for existing term ids first.
This avoids the scaling problem when doing something like cat=-123456
, which will have no effect (and therefore no performance penalty) if that category doesn't exist.
#107
in reply to:
↑ 95
@
14 years ago
Replying to ryan:
That query is unlimited despite the original request being for only 15 posts. This won't scale.
We can't limit the IN() to 15 ids because that would give incorrect results. The first 15 matched IDs might not match the outer query, like post_type etc.
I see 3.0 was using GROUP BY to handle intersections so I think we can use that in get_tax_sql() too.
#108
@
14 years ago
Each tax_query generates a subquery. It might not be the fastest way, but at least it won't segfault.
#109
@
14 years ago
A simple foo=bar
query executes in 0.10 seconds with 22.000 posts, matching 113 of them:
SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts WHERE 1=1 AND wp_posts.ID IN ( SELECT object_id FROM wp_term_relationships INNER JOIN wp_term_taxonomy USING (term_taxonomy_id) WHERE taxonomy IN ('foo') AND term_id IN (31) ) AND wp_posts.post_type = 'post' AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'future' OR wp_posts.post_status = 'draft' OR wp_posts.post_status = 'pending' OR wp_posts.post_status = 'private') ORDER BY wp_posts.post_date DESC LIMIT 0, 100
#112
@
14 years ago
Testing that patch. Is this necessary:
SELECT term_id FROM wp_3508545_term_taxonomy WHERE taxonomy IN ('category') AND term_id IN (21072,37613,4533907)
Can we assume that the requested terms are in the request taxonomy?
#113
@
14 years ago
The blog that created that crazy cat exclusion query is much better now. It runs in about 2.5 seconds. I don't know how that compares to 3.0 yet, but it is an improvement from a 1.2 million char query string. :-)
#114
@
14 years ago
Can we assume that the requested terms are in the request taxonomy?
Only if I fetch the term_taxonomy_ids instead of the term ids. That would also remove an extra join in each subquery. I'll write it up.
#115
@
14 years ago
In 3.1 this query takes 2,417.7ms.
SELECT SQL_CALC_FOUND_ROWS wp_3508545_posts.* FROM wp_3508545_posts WHERE 1=1 AND wp_3508545_posts.ID IN ( SELECT object_id FROM wp_3508545_term_relationships INNER JOIN wp_3508545_term_taxonomy USING (term_taxonomy_id) WHERE taxonomy IN ('category') AND term_id IN (14667805,14667940,14667953,14667963,34584120,1769037) ) AND wp_3508545_posts.post_type IN ('post', 'page', 'attachment') AND (wp_3508545_posts.post_status = 'publish' OR wp_3508545_posts.post_status = 'private') ORDER BY wp_3508545_posts.post_date DESC LIMIT 0, 6; SELECT FOUND_ROWS()
In 3.0 the analogous query takes 178.5ms:
SELECT SQL_CALC_FOUND_ROWS wp_3508545_posts.* FROM wp_3508545_posts INNER JOIN wp_3508545_term_relationships ON (wp_3508545_posts.ID = wp_3508545_term_relationships.object_id) INNER JOIN wp_3508545_term_taxonomy ON (wp_3508545_term_relationships.term_taxonomy_id = wp_3508545_term_taxonomy.term_taxonomy_id) WHERE 1=1 AND wp_3508545_term_taxonomy.taxonomy = 'category' AND wp_3508545_term_taxonomy.term_id IN ('1769037', '14667805', '14667940', '14667953', '14667963', '34584120') AND wp_3508545_posts.post_type = 'post' AND (wp_3508545_posts.post_status = 'publish' OR wp_3508545_posts.post_status = 'private') GROUP BY wp_3508545_posts.ID ORDER BY wp_3508545_posts.post_date DESC LIMIT 0, 6; SELECT FOUND_ROWS()
#120
@
14 years ago
It is faster than 3.0 except for excludes, which are very slow.
SELECT SQL_CALC_FOUND_ROWS wp_3508545_posts.* FROM wp_3508545_posts INNER JOIN wp_3508545_term_relationships ON (wp_3508545_posts.ID = wp_3508545_term_relationships.object_id) WHERE 1=1 AND wp_3508545_term_relationships.term_taxonomy_id NOT IN (16,38415,45884,44849) AND wp_3508545_posts.post_type IN ('post', 'page', 'attachment') AND (wp_3508545_posts.post_status = 'publish' OR wp_3508545_posts.post_status = 'private') ORDER BY wp_3508545_posts.post_date DESC LIMIT 0, 15
That took 35 seconds.
#121
@
14 years ago
The exclusion is much faster now. 1.6s vs 35s, which is comparable to 3.0. The inclusions are slightly faster than 3.0.
#123
@
14 years ago
Commit, and we'll see how it goes. Thanks for making this happen. I was not optimistic this morning. :-)
#126
@
14 years ago
Ran into some errors with the generated query.
WordPress database error Column 'term_id' in field list is ambiguous for query made by _transform_term
Attached patch attachment:ticket:12891:taxonomy.diff should fix this.
#128
follow-up:
↓ 129
@
14 years ago
Since the decision to switch away from NOT IN to a subquery for exclusions happened in IRC, let me fill in the background. NOT IN was scanning the whole table (almost 200k rows). The same query without the "NOT" scanned only 5k rows. The unbounded subquery we switched to for exclusions is much faster, as mentioned above.
#129
in reply to:
↑ 128
@
14 years ago
The subquery for NOT IN added in r12891 is not quite right as it includes posts with the excluded terms so long as the post has other terms that are not excluded. I moved the NOT in front of the subquery and MySQL shows that either way it scans the same number of rows.
$where .= " AND $primary_table.$primary_id_column NOT IN ( SELECT object_id FROM $wpdb->term_relationships WHERE term_taxonomy_id IN ($terms) )";
#133
@
14 years ago
SELECT SQL_CALC_FOUND_ROWS wp_14578484_posts.* FROM wp_14578484_posts INNER JOIN wp_14578484_term_relationships ON (wp_14578484_posts.ID = wp_14578484_term_relationships.object_id) WHERE 1=1 AND wp_14578484_posts.ID NOT IN (104754,104534,104560,104705) AND wp_14578484_term_relationships.term_taxonomy_id IN (1194,11561,12084,1202,1203,1199,1196,1200,1197,1201,1204,5016,6745,6950,6744,6558,1195,1198,6746,13187) AND wp_14578484_posts.post_type IN ('post', 'page', 'attachment', 'calendar_event', 'listicle', 'background', 'top-list', 'guide', 'gallery', 'lead', 'personality', 'show', 'station', 'breaking') AND (wp_14578484_posts.post_status = 'publish' OR wp_14578484_posts.post_status = 'private') ORDER BY wp_14578484_posts.post_date DESC LIMIT 0, 15
That query returns duplicate posts. If a post belongs to more than one of the terms in the IN() then the post will be duplicated.
Query created with this call:
query_posts(array('category_name'=> 'a_slug', 'post__not_in'=>$exclude_ids, 'paged'=>get_query_var('paged')))
Also notice all of those post types. In 3.0 only the 'post' post_type is queried. That could be unrelated to this ticket though.
#137
@
14 years ago
I thought I committed a fix for this last night. This morning I noticed I had the svn ci sitting on the command line but I never hit enter. :-)
#138
@
14 years ago
[15649] breaks existing behavior by setting post_type to 'any' instead of defaulting to 'post' when a taxonomy is specified but post_type is not. We're already being bitten by this on VIP sites on .com.
#139
@
14 years ago
Is there no way to do an OR query with this? I can't select posts in the gallery category OR in the gallery post_format?
#140
@
14 years ago
When trying to create a query to do what I want (namely gallery-category OR gallery-post_format), I eventually came up with this:
`query_posts(array(
'tax_query' => array(
array(
'taxonomy' => 'post_format',
'terms' => array('post-format-gallery'),
'field' => 'slug',
),
array(
'taxonomy' => 'category',
'terms' => array('gallery'),
'field' => 'slug',
),
)
) );
`
But of course can find no way to make that an OR relationship. In looking at the resulting query this produces (eventually), I have this:
SELECT SQL_CALC_FOUND_ROWS wp_posts.* FROM wp_posts INNER JOIN wp_term_relationships ON (wp_posts.ID = wp_term_relationships.object_id) INNER JOIN wp_term_relationships AS tt1 ON (wp_posts.ID = tt1.object_id) WHERE 1=1 AND wp_term_relationships.term_taxonomy_id IN (110) AND tt1.term_taxonomy_id IN (11) AND wp_posts.post_type IN ('post', 'page', 'attachment') AND (wp_posts.post_status = 'publish' OR wp_posts.post_author = 1 AND wp_posts.post_status = 'private') GROUP BY wp_posts.ID ORDER BY wp_posts.post_date DESC LIMIT 0, 10
That naturally produces zero rows, because it's assuming an AND relationship. The key part of that turns out to be here:
wp_term_relationships.term_taxonomy_id IN (110) AND tt1.term_taxonomy_id IN (11)
Simply changing that AND to an OR and grouping them in parentheses gets me the results I want.
So perhaps we can add the ability of grouping to this, somehow? If I was to group the two things I wanted OR'ed into their own array with some kind of extra parameter to specify an OR relationship between them? Something like this:
`query_posts(array(
'tax_query' => array(
array('relationship'=>'OR',
array(
'taxonomy' => 'post_format',
'terms' => array('post-format-gallery'),
'field' => 'slug',
),
array(
'taxonomy' => 'category',
'terms' => array('gallery'),
'field' => 'slug',
),
);
)
) );
`
Or something? This array nesting is bloody complicated though, there has to be an easier way to allow this sort of thing.
#141
@
14 years ago
get-tax-or.diff is a working example of the proposal I made earlier. It uses get_tax_sql recursively for grouped cases that have a relationship defined between them.
I have not tested it extensively, I suggest somebody look it over and rethink what I probably missed.
#142
follow-up:
↓ 143
@
14 years ago
WP_Query( array( 'category_name' => 'A Cat', 'showposts' => '1' ) );
Produces the following assortment of queries:
SELECT wp_trunk_term_taxonomy.term_id FROM wp_trunk_term_taxonomy INNER JOIN wp_trunk_terms USING (term_id) WHERE taxonomy IN ('category') AND wp_trunk_terms.slug IN ('a') SELECT wp_trunk_term_taxonomy.term_id FROM wp_trunk_term_taxonomy INNER JOIN wp_trunk_terms USING (term_id) WHERE taxonomy IN ('category') AND wp_trunk_terms.slug IN ('cat') SELECT t.*, tt.* FROM wp_trunk_terms AS t INNER JOIN wp_trunk_term_taxonomy AS tt ON t.term_id = tt.term_id WHERE tt.taxonomy = 'category' AND t.slug = 'a' LIMIT 1 SELECT SQL_CALC_FOUND_ROWS wp_trunk_posts.* FROM wp_trunk_posts WHERE 1=1 AND wp_trunk_posts.post_type IN ('post', 'page', 'attachment') AND (wp_trunk_posts.post_status = 'publish' OR wp_trunk_posts.post_status = 'private') GROUP BY wp_trunk_posts.ID ORDER BY wp_trunk_posts.post_date DESC LIMIT 0, 1
The final query simply returns the latest post, regardless of category.
#143
in reply to:
↑ 142
;
follow-up:
↓ 145
@
14 years ago
Replying to ryan:
WP_Query( array( 'category_name' => 'A Cat', 'showposts' => '1' ) );
The category_name is supposed to be a slug, isn't it? Slugs can't have spaces in them, so it's probably splitting on the space somewhere.
#144
@
14 years ago
Replying to chrisscott:
[15649] breaks existing behavior by setting post_type to 'any' instead of defaulting to 'post' when a taxonomy is specified but post_type is not. We're already being bitten by this on VIP sites on .com.
It looks like 3.0 searched "any" for custom taxonomies but not for categories and tags. Whereas 3.1 searches any for all taxonomies.
#145
in reply to:
↑ 143
@
14 years ago
Replying to Otto42:
Replying to ryan:
WP_Query( array( 'category_name' => 'A Cat', 'showposts' => '1' ) );The category_name is supposed to be a slug, isn't it? Slugs can't have spaces in them, so it's probably splitting on the space somewhere.
Before it would sanitize_title() so could get away with putting in a name. Not the best thing to due but this is a back compat break.
#146
@
14 years ago
Plus, it reveals some bad behavior such as fetching a ttid that is never used and constructing a query that just returns the latest post.
#148
in reply to:
↑ 147
@
14 years ago
Replying to ryan:
categoryin didn't include children before. Now it does.
Hmm, it looks like it was intended to get children before. But some queries that are using categoryin in 3.0 are behaving differently than in 3.1. Still investigating.
#150
follow-up:
↓ 153
@
14 years ago
'tax_query' => array( array( 'taxonomy' => 'category', 'terms' => array(1), 'operator' => 'IN', 'field' => 'term_id', 'include_children' => false ) ),
A shortcut for excluding children from a simple single taxonomy query would be nice. And maybe if the tax_query has only one query in it we can forego requiring the outer array().
#151
@
14 years ago
[16413] causes non existent categories to be treated as a category. To reproduce, visit domain.com/category/category-exists/not-exists.xml and a redirect to domain.com/category/category-exists/some-file.xml/ takes place, returns the default posts, and is_404() is false.
#152
in reply to:
↑ 149
;
follow-up:
↓ 154
@
14 years ago
Replying to ryan:
How about a "no children" flag?
There already is one: 'include_children'. See the inline docs on get_tax_sql().
#153
in reply to:
↑ 150
@
14 years ago
Replying to ryan:
And maybe if the tax_query has only one query in it we can forego requiring the outer array().
For simple queries, there still is the 'tax' => 'tag_1,tag_2' syntax.
#154
in reply to:
↑ 152
@
14 years ago
Replying to scribu:
Replying to ryan:
How about a "no children" flag?
There already is one: 'include_children'. See the inline docs on get_tax_sql().
Yeah. See what I ended up using in the snippet above. Don't include children is one of the biggest requests so far. I think the child inclusion in 3.0 was a bit broken.
#155
@
14 years ago
The biggest back compat issues right now are category_name not accepting full names, the post_type 'any' being used by default for category and post_tag queries, and the non-existant cat handing Nick mentions above.
#157
@
14 years ago
Are you sure 'category_name' was supposed to match full names? get_category_by_path() expects category slugs.
#159
@
14 years ago
$q['category_name'] = implode('/', array_map('sanitize_title', explode('/', $q['category_name'])));
The sanitize_title() allowed full names to be passed.The intention was for only slugs to be passed, but the code allowed full names to work.
#161
@
14 years ago
That fix needs more thought, but it addresses the immediate concern of requests for non-existent cats returning posts instead of 404ing.
#165
@
14 years ago
The logic behind [16526] is that WP_Query should not care about URL decoding.
Thus,
if you go to an URL like /category/foo+bar/
it will be interpreted as foo AND bar
but if you call get_posts('category_name=Foo Bar')
, it will look for foo-bar
#167
@
14 years ago
Without pretty permalinks, doing a multi-taxonomy query for cat+tag gives me a notice for undefined index “hierarchical” in line 1506 of query.php. I attach a tentative patch.
#168
@
14 years ago
After talking to scribu and nacin, here's a patch to allow a global OR in the tax_query parameter. It works like this:
query_posts(array( 'tax_query' => array( 'relation' => 'OR', array( 'taxonomy' => 'post_format', 'terms' => array('post-format-gallery'), 'field' => 'slug', ), array( 'taxonomy' => 'category', 'terms' => array('gallery'), 'field' => 'slug', ), ), ) );
The relation=>OR changes the tax_query to use OR's instead of AND's on the relevant WHERE clauses. So the above query will show any post with category of gallery or a post format of gallery.
Patch is relation-or.diff, against current trunk.
#170
@
14 years ago
Due to ambiguous condition with empty terms, we decided to not support the XOR operator.
relation-or.3.diff reflects this.
#171
@
14 years ago
WP_Query('posts_per_page=15&cat=-21072,-37613,-4533907');
Produced:
SELECT SQL_CALC_FOUND_ROWS wp_trunk_posts.* FROM wp_trunk_posts WHERE 1=1 AND ( ) AND wp_trunk_posts.post_type = 'post' AND (wp_trunk_posts.post_status = 'publish' OR wp_trunk_posts.post_status = 'private') GROUP BY wp_trunk_posts.ID ORDER BY wp_trunk_posts.post_date DESC LIMIT 0, 15
None of those exclude categories exist.
#172
@
14 years ago
Should be fixed in relation-or.4.diff.
#173
@
14 years ago
I tested for regressions but ignored the new relation stuff. Canonical redirects, 404s, includes and excludes all seem good. Queries look sane. Let's commit and see how it handles in beta 1.
#175
follow-up:
↓ 176
@
14 years ago
- Cc maorb added
Will this new blessed feature allow us also to query multiple custom fields that are assigned to a post or a CPT?
Such as - meta_key=key1&meta_value=ValueOfKey1&meta_key=key2&meta_value=valueOfKey2
Thanks
#177
@
14 years ago
- Resolution set to fixed
- Status changed from assigned to closed
Report bugs on new tickets. Resolving this as fixed.
#178
follow-up:
↓ 179
@
14 years ago
This support "x AND y
" and "x OR y
" but not "x AND (y OR z)
" and not "x AND y AND NOT z
" right?
(I know this is closed and I'm not asking to reopen, just wanted to get clarity.)
#179
in reply to:
↑ 178
;
follow-up:
↓ 180
@
14 years ago
Replying to mikeschinkel:
This support "
x AND y
" and "x OR y
" but not "x AND (y OR z)
" and not "x AND y AND NOT z
" right?
Actually it can support all of the above, with caveats.
X AND Y and X OR Y is obviously possible for all cases.
X AND Y AND NOT Z is possible, as you can set the operator parameter to NOT IN for a single taxonomy.
X AND (Y OR Z) is possible for the special case of Y and Z being both terms in the same taxonomy. The "terms" parameter uses an operator of "IN" by default, so it would really be like "cat=X and tag IN (Y,Z)", which is basically equivalent.
See http://otto42.com/81 for detailed usage info, and some examples.
#180
in reply to:
↑ 179
;
follow-up:
↓ 181
@
14 years ago
Replying to Otto42:
Replying to mikeschinkel:
This support "
x AND y
" and "x OR y
" but not "x AND (y OR z)
" and not "x AND y AND NOT z
" right?
Actually it can support all of the above, with caveats.
...
See http://otto42.com/81 for detailed usage info, and some examples.
Thanks. I actually read that link first and wasn't clear on the above.
So I'll turn the question around; are there types of logic that it cannot handle?
#181
in reply to:
↑ 180
@
14 years ago
Replying to mikeschinkel:
Thanks. I actually read that link first and wasn't clear on the above.
So I'll turn the question around; are there types of logic that it cannot handle?
Yes. It can't do arbitrary combinations of AND's and OR's across multiple taxonomies.
For example, there is no way to specify cat=A AND (tag=B OR format=C).
"Groupings" of terms are always either "IN" or "NOT IN" and must be single taxonomies. The AND/OR is global across those groupings. This is flexible enough for almost any conceivable real-life case.
And it's not really as limiting as it seems, when you consider that A AND (B OR C) can often be rewritten as (A AND B) OR (A AND C). The idea is to get the bits in your groupings to all be of the same taxonomy, and the connecting pieces to those groupings to all be of the same type (AND or OR).
Related (less ambitious): #9951