Opened 11 years ago
Closed 10 years ago
#25775 closed defect (bug) (fixed)
WP_Date_Query table prefixing
Reported by: | ew_holmes | Owned by: | boonebgorges |
---|---|---|---|
Milestone: | 4.1 | Priority: | normal |
Severity: | major | Version: | 3.7 |
Component: | Query | Keywords: | has-patch needs-testing |
Focuses: | Cc: |
Description
I believe I've found a bug within WP_Date_Query. The lack of a $wpdb
table prefix will break queries that have any joins put into them.
If the $column
specified is within the posts table, the $column
variable is not being prefixed by $wpdb->posts
as it should be.
Alternatively, if you are comparing comment_date or comment_date_gmt, $wpdb->comments
should be prefixed.
Attachments (8)
Change History (45)
#3
@
11 years ago
I think that in order to keep WP_Date_Query consistent with the other SQL-generating components, it's a good idea to prefix each field.
#6
@
11 years ago
I've tested this on a WordPress Multisite installation, with 3.7.1 installed. Anyone want to take charge on some unit testing with this? I agree with @tomauger that this adds consistency within the other Query classes (WP_Query, Tax_Query, Meta_Query).
#9
@
11 years ago
So, I found some delightful bugs when running some of the Date Query unit tests directly.
Things:
- Some unit tests produce database errors, but when they run in certain contexts, database error suppression has been triggered in a previous test and not toggled back on
- Date Query only affects the
WHERE
clause, but allows arguments that would otherwise produce aJOIN
- there is currently no code enforcing thatJOIN
, so the queries can blow up - Prefixing raises errors for unknown tables, where previously it was only an unknown column error
- If you don't pass a
column
arg when addingdate_query
to a query for comments, the default value ispost_date
, but there was no code even attempting toJOIN
- I am not sure how some of these tests ever passed.
What I Did in 25775.diff:
- Made a unit test that makes
date_query
s with each of the valid columns and then asserts RegEx on the generated SQL - in
WP_Date_Query
, add 2 properties:$join_comments
and$joins_posts
which are toggled on when the column is being validated - In
WP_Query
, ifdate_query
is an arg, check for->joins_comments
after->get_sql()
, and then way down the road, enforce theJOIN
if the resulting$join
var doesn't contain the table - In
WP_Comment_Query
, ifdate_query
is an arg, check for->joins_posts
after->get_sql()
, and then way down the road, enforce theJOIN
if the resulting$join
var doesn't contain the table
#11
@
11 years ago
- Milestone changed from 3.8 to 3.7.2
Punting from 3.8 in prep for 3.8 RC1. This seems a bit large to go in at this point. Moving to 3.7.2 in a fit of possibly fanciful optimism.
#12
@
11 years ago
Prefixing table names seems like a good idea to avoid ambiguity but I'm confused as to how you think comments were broken. When querying for comments that match certain date parameters using WP_Comment_Query
(used by get_comments()
), the column is defaulted to the correct value:
http://core.trac.wordpress.org/browser/trunk/src/wp-includes/comment.php?rev=26492#L396
Perhaps I'm forgetting something but when is the comments table JOIN
ed with anything else?
All of the unit tests that I wrote passed when I wrote them. They shouldn't have been affected by other unit tests either as I ran only my groups.
Can we get some sample code here?
#13
follow-up:
↓ 17
@
11 years ago
I understand what wonderboymusic is tackling, but I am not sure what the original reported bug is here. This seems like a 3.9 thing at this point. WP_Date_Query is new; if something isn't quite right about it, it's not a major regression, just a minor defect we can fix in a future release.
More: http://irclogs.wordpress.org/chanlog.php?channel=wordpress-dev&day=2013-12-05&sort=asc#m744879
#15
@
11 years ago
- Keywords 3.9-early added
- Milestone changed from 3.7.2 to Future Release
A full code example would be helpful, for both general confirmation and also so we can unit test it.
#16
@
11 years ago
here you go. - code has been simplified for the sake of readability:
<?php /* basic-set consists of all published posts */ $query_params=array( 'posts_per_page' => -1, 'post_status' => 'publish' ); /* custom taxonomy */ if (!empty($_REQUEST['writer'])) $query_params['writers']=esc_sql($_REQUEST['writer']); /* start_date and end_date */ $date_query=array( 'column' => 'post_date_gmt', 'inclusive' => true ); if (!empty($_REQUEST['start_date'])) { /* start_date will be the input date at 00:00:00 */ $date_query['after']=esc_sql($_REQUEST['start_date']).' 00:00:00'; } if (!empty($_REQUEST['end_date'])) { /* end_date will be the input date at 23:59:59 */ $date_query['before']=esq_sql($_REQUEST['end_date']).' 23:59:59'; } $query_params['date_query']=array( $date_query ); /* query database */ $query=query_posts($query_params);
results in the following sql statement
SELECT wp_2_posts.* FROM wp_2_posts INNER JOIN wp_2_term_relationships ON (wp_2_posts.ID = wp_2_term_relationships.object_id) LEFT JOIN wp_2_posts AS p2 ON (wp_2_posts.post_parent = p2.ID) WHERE 1=1 AND ( ( post_date_gmt >= '2011-05-17 22:00:00' AND post_date_gmt <= '2013-12-06 22:59:59' ) ) AND ( wp_2_term_relationships.term_taxonomy_id IN (5) ) AND wp_2_posts.post_type = 'post' AND ((wp_2_posts.post_status = 'publish') OR (wp_2_posts.post_status = 'inherit' AND (p2.post_status = 'publish'))) GROUP BY wp_2_posts.ID ORDER BY wp_2_posts.post_date DESC
and causes a mysql exception
#1052 - Column 'post_date_gmt' in where clause is ambiguous
I'll attach the quick-fix which works in our multi-site setup
with the fix the sql-statement looks like this:
SELECT wp_2_posts.* FROM wp_2_posts INNER JOIN wp_2_term_relationships ON (wp_2_posts.ID = wp_2_term_relationships.object_id) LEFT JOIN wp_2_posts AS p2 ON (wp_2_posts.post_parent = p2.ID) WHERE 1=1 AND ( ( wp_2_posts.post_date_gmt >= '2011-05-17 22:00:00' AND wp_2_posts.post_date_gmt <= '2013-12-06 22:59:59' ) ) AND ( wp_2_term_relationships.term_taxonomy_id IN (5) ) AND wp_2_posts.post_type = 'post' AND ((wp_2_posts.post_status = 'publish') OR (wp_2_posts.post_status = 'inherit' AND (p2.post_status = 'publish'))) GROUP BY wp_2_posts.ID ORDER BY wp_2_posts.post_date DESC
#17
in reply to:
↑ 13
@
11 years ago
Replying to nacin:
I understand what wonderboymusic is tackling, but I am not sure what the original reported bug is here. This seems like a 3.9 thing at this point.
It feels like the original ticket has been hijacked. I think the original patch just addressed a simple oversight in the way the SQL was generated in the new WP_Date_Query - namely that the table names were not being prefixed nor aliased. The original patch addresses this.
As Viper007Bond says, it's not urgent, but then on the other hand, it's a very small fix, that won't break anything, and will increase the robustness of the SQL. I don't see a downside to pushing this through for 3.7.2 or 3.8.1?
#18
follow-up:
↓ 19
@
11 years ago
It feels like the original ticket has been hijacked.
With some basic / arbitrary unit tests for prefixing, other issues were discovered - the unit tests are a pre-req for the patch.
#19
in reply to:
↑ 18
@
11 years ago
Replying to wonderboymusic:
With some basic / arbitrary unit tests for prefixing, other issues were discovered - the unit tests are a pre-req for the patch.
I hear that. Just wondering whether at that point another ticket should have been spun off for those additional issues?
#21
@
11 years ago
- Keywords needs-refresh needs-unit-tests added; has-patch removed
Discovered this thread via trying to use a date query with a tax query. +1 to doing whatever need to get this fixed — going to fix with a filter for now. I can put time into it later this week / early next if no one else is already working on it.
#22
@
10 years ago
ew_holmes: I think your patch will work but it seems kind of hacky. I think it'd be better to make $valid_columns
an associative array of column => table so that it's explicit.
wonderboymusic: As mentioned, your patch is an enhancement rather than a bug fix. Let's open a separate ticket for that.
neoxx: Your patch won't work because the class is also used on the comments table, not just the posts table.
I just attached a new patch that is similar to ew_holmes's but explicitly defines what column belongs to what table. This allows the class to be used with other tables, such as the meta tables. Custom uses is why I included the filter.
The problem with my patch is that it breaks the existing filter. The filter is super new but that's probably still bad.
I'm not sure what to do. Rename the filter? Add a new filter and try to make the existing one work as before?
Opinions welcome.
This ticket was mentioned in IRC in #wordpress-dev by Viper007Bond. View the logs.
10 years ago
#24
@
10 years ago
Viper007Bond: Perhaps a combination of our patches will address the de-hackification of my patch, the associative array idea you have, and still keep the existing filter in tact? I'll give it a thought tonight or tomorrow.
#25
@
10 years ago
- Keywords 4.0-early added; 3.9-early removed
- Milestone changed from 3.9 to Future Release
Let's regroup on these early 4dotOh
#27
@
10 years ago
Viper007Bond: I tried sitting down to work with what you had written an expand on it, but really couldn't think of a clean way of approaching it.
The filter definitely cannot be salvaged with an associative array, without some arbitrary checks later on (as mine did).
Here's an attempt to catch any existing filters that would be adding into valid columns. The only edge case I can think of not catching is someone trying to remove valid columns.
Perhaps altering the use of a filter ASAP is acceptable? Would need some Core Contrib opinions on that matter.
#28
@
10 years ago
- Keywords 4.1-early added; 4.0-early removed
- Milestone changed from 4.0 to Future Release
Seems like the spirit of this ticket has gone off the rails. Needs a consensus and looks like updated unit tests. Maybe we can hit the ground running in 4.1. Punting.
#30
@
10 years ago
Hello, I added taxonomies to the attachment post type and now get "WordPress database error: [Column 'post_date' in where clause is ambiguous]" when trying to use 'tax_query' and 'date_query' for 'post_type' => 'attachment' and 'post_status' => 'inherit'. WP 4.0
#31
@
10 years ago
- Keywords has-patch needs-testing added; needs-refresh needs-unit-tests 4.1-early removed
- Milestone changed from Future Release to 4.1
25775.patch is an attempt at fixing the problem that maintains backward compatibility with the filter and with existing use of WP_Date_Query
. Much of the patch is making some minor alterations to existing unit tests so that they work with the modified validate_column()
. Here's the executive summary:
- When you pass a value to
validate_column()
that contains a dot, it's trusted - no validation takes place. - Return results of
validate_column()
are run throughesc_sql()
, since we're no longer so strictly checking against a whitelist (though tbh the whitelist check that's currently there does not look very airtight to me). - When a non-dotted value is passed to
validate_column()
, check to see if it's a core date column ($known_columns
), and if so, add the appropriate prefix.
How this fixes the problem:
test_date_query_with_taxonomy_join()
demonstrates the problem raised in the original ticket. Since no column is passed to the query, 'post_date' is assumed; 'post_date' then validates to$wpdb->posts.post_date
.test_validate_column_with_date_query_valid_columns_filter()
shows that the filter continues to work. If you are using this filter to add a custom column, it will continue to be validated.validate_column()
won't add a table prefix, because it's not a "known column". But (a) this is not a regression - currently, if you add a custom column like this and then do a table join, you'll probably break something; and (b) devs using the filter should be encouraged to add table-prefixed columns to the list. (ie `$columns[] = 'my_table_name.my_column_name')
Some of the changes suggested by wonderboymusic https://core.trac.wordpress.org/ticket/25775#comment:9 are interesting and worth pursuing (would you expect any less? <3 <3 <3) but are beyond the scope of this ticket. I also think his patch misunderstands what WP_Comments_Query::$date_query
is supposed to do - we don't need to join against the posts table because we only care about the 'comment_date', as passed to get_sql()
here https://core.trac.wordpress.org/browser/tags/4.0/src/wp-includes/comment.php#L441. (If his tests were converting comment_date
to post_date
, it must be because he had one of the other patches on this ticket applied to his build - they broke date queries for comments.)
In the future might look for a more robust syntax for table-specific support in WP_Date_Query
. See #29823. But that's not necessary to fix the problem at hand.
#32
follow-up:
↓ 33
@
10 years ago
Thanks for your work on this! :)
One quick question: why set $dotpos
if you aren't going to use it? Why not just do false === strpos( $column, '.' )
?
#33
in reply to:
↑ 32
@
10 years ago
Replying to Viper007Bond:
Thanks for your work on this! :)
One quick question: why set
$dotpos
if you aren't going to use it? Why not just dofalse === strpos( $column, '.' )
?
Whoops - this was a remnant of a different strategy I'd tried. Fixed in 25775.2.patch.
Viper007Bond and others - I'd like some feedback on the following:
- Does the patch deal adequately with the situations listed above? Does it resolve the original problem of taxonomy/meta JOINs? Does it maintain compatibility with real-world uses of the 'date_query_valid_columns' filter that you're familiar with? I think the unit tests show that it does all of these things, but a sanity check would be nice.
- The one potential downside of my approach is that
validate_column()
does a bit less than it did before. It used to be that you could circumvent its logic only by filtering 'date_query_valid_columns'. Now you can also pass a table-prefixed value. Do we care? Note that we are *sanitizing* the value (ie against SQL injection). The "validation" in question is merely a developer service: we try to make sure that no one attempts to do a date_query against an incompatible column. My personal opinion is that it's actually *better* to do less validation, because it means that passing a bad column name will result in SQL errors rather than mysteriously falling back to the value of 'post_date', which might silently return unintended results. So, with my patch, the real purpose of the method is not so much validation as it is convenience: it allows you to pass merely 'post_date' instead of requiring you to concatenate$wpdb->posts . '.post_date'
.
#35
@
10 years ago
Yeah, the validation was there for security reasons not to prevent invalid columns from being used. A whitelist seemed like the best way to do it.
#36
@
10 years ago
nacin, Viper007Bond - Thanks, guys. You're right.
So, the current state of affairs is that we check column names against a whitelist, but the whitelist is filtered, and no sanitization happens after that. So a plugin author could do damage by using this filter, but of course an installed plugin can do all sorts of damage. In the name of being conservative, 25775.3.patch improves the current situation by stripping forbidden characters from column names - even those added to the whitelist via filter.
reworking of validate_column to prepend the table name