Make WordPress Core

Opened 11 years ago

Closed 10 years ago

#25775 closed defect (bug) (fixed)

WP_Date_Query table prefixing

Reported by: ew_holmes's profile ew_holmes Owned by: boonebgorges's profile boonebgorges
Milestone: 4.1 Priority: normal
Severity: major Version: 3.7
Component: Query Keywords: has-patch needs-testing
Focuses: Cc:

Description

I believe I've found a bug within WP_Date_Query. The lack of a $wpdb table prefix will break queries that have any joins put into them.

If the $column specified is within the posts table, the $column variable is not being prefixed by $wpdb->posts as it should be.

Alternatively, if you are comparing comment_date or comment_date_gmt, $wpdb->comments should be prefixed.

Attachments (8)

date.php.patch (1.2 KB) - added by ew_holmes 11 years ago.
reworking of validate_column to prepend the table name
25775.diff (6.3 KB) - added by wonderboymusic 11 years ago.
date.php.2.patch (331 bytes) - added by neoxx 11 years ago.
25775.viper007bond.patch (1.9 KB) - added by Viper007Bond 10 years ago.
Changes existing filter behavior. :( See comments.
25775.ew_holmes.patch (2.4 KB) - added by ew_holmes 10 years ago.
Attempt to extend Viper007Bond's patch.
25775.patch (8.0 KB) - added by boonebgorges 10 years ago.
25775.2.patch (8.0 KB) - added by boonebgorges 10 years ago.
25775.3.patch (8.3 KB) - added by boonebgorges 10 years ago.

Download all attachments as: .zip

Change History (45)

@ew_holmes
11 years ago

reworking of validate_column to prepend the table name

#1 @ew_holmes
11 years ago

  • Keywords has-patch needs-testing dev-feedback added; needs-patch removed

#2 @ew_holmes
11 years ago

  • Cc eric@… added

#3 @tomauger
11 years ago

I think that in order to keep WP_Date_Query consistent with the other SQL-generating components, it's a good idea to prefix each field.

#4 @tomauger
11 years ago

  • Cc tomaugerdotcom@… added

#5 @SergeyBiryukov
11 years ago

  • Milestone changed from Awaiting Review to 3.8
  • Version changed from 3.7.1 to 3.7

#6 @ew_holmes
11 years ago

I've tested this on a WordPress Multisite installation, with 3.7.1 installed. Anyone want to take charge on some unit testing with this? I agree with @tomauger that this adds consistency within the other Query classes (WP_Query, Tax_Query, Meta_Query).

#7 @wonderboymusic
11 years ago

  • Keywords needs-unit-tests added; needs-testing dev-feedback removed

#8 @neoxx
11 years ago

  • Cc mail@… added

#9 @wonderboymusic
11 years ago

So, I found some delightful bugs when running some of the Date Query unit tests directly.

Things:

  • Some unit tests produce database errors, but when they run in certain contexts, database error suppression has been triggered in a previous test and not toggled back on
  • Date Query only affects the WHERE clause, but allows arguments that would otherwise produce a JOIN - there is currently no code enforcing that JOIN, so the queries can blow up
  • Prefixing raises errors for unknown tables, where previously it was only an unknown column error
  • If you don't pass a column arg when adding date_query to a query for comments, the default value is post_date, but there was no code even attempting to JOIN- I am not sure how some of these tests ever passed.

What I Did in 25775.diff​:

  • Made a unit test that makes date_querys with each of the valid columns and then asserts RegEx on the generated SQL
  • in WP_Date_Query, add 2 properties: $join_comments and $joins_posts which are toggled on when the column is being validated
  • In WP_Query, if date_query is an arg, check for ->joins_comments after ->get_sql(), and then way down the road, enforce the JOIN if the resulting $join var doesn't contain the table
  • In WP_Comment_Query, if date_query is an arg, check for ->joins_posts after ->get_sql(), and then way down the road, enforce the JOIN if the resulting $join var doesn't contain the table

#10 @nunomorgadinho
11 years ago

  • Cc nuno.morgadinho@… added
  • Keywords needs-unit-tests removed

#11 @ryan
11 years ago

  • Milestone changed from 3.8 to 3.7.2

Punting from 3.8 in prep for 3.8 RC1. This seems a bit large to go in at this point. Moving to 3.7.2 in a fit of possibly fanciful optimism.

#12 @Viper007Bond
11 years ago

Prefixing table names seems like a good idea to avoid ambiguity but I'm confused as to how you think comments were broken. When querying for comments that match certain date parameters using WP_Comment_Query (used by get_comments()), the column is defaulted to the correct value:

http://core.trac.wordpress.org/browser/trunk/src/wp-includes/comment.php?rev=26492#L396

Perhaps I'm forgetting something but when is the comments table JOINed with anything else?

All of the unit tests that I wrote passed when I wrote them. They shouldn't have been affected by other unit tests either as I ran only my groups.

Can we get some sample code here?

Last edited 11 years ago by Viper007Bond (previous) (diff)

#13 follow-up: @nacin
11 years ago

I understand what wonderboymusic is tackling, but I am not sure what the original reported bug is here. This seems like a 3.9 thing at this point. WP_Date_Query is new; if something isn't quite right about it, it's not a major regression, just a minor defect we can fix in a future release.

More: http://irclogs.wordpress.org/chanlog.php?channel=wordpress-dev&day=2013-12-05&sort=asc#m744879

#14 @neoxx
11 years ago

For instance querying for a custom taxonomy and a date will fail.

#15 @nacin
11 years ago

  • Keywords 3.9-early added
  • Milestone changed from 3.7.2 to Future Release

A full code example would be helpful, for both general confirmation and also so we can unit test it.

#16 @neoxx
11 years ago

here you go. - code has been simplified for the sake of readability:

<?php
/*
basic-set consists of
all published posts
*/

$query_params=array(
        'posts_per_page' => -1,
        'post_status' => 'publish'
);

/*
custom taxonomy
*/

if (!empty($_REQUEST['writer']))
        $query_params['writers']=esc_sql($_REQUEST['writer']);

/*
start_date and end_date
*/

$date_query=array(
        'column' => 'post_date_gmt',
        'inclusive' => true
);

if (!empty($_REQUEST['start_date'])) {

        /*
        start_date will be the
        input date at 00:00:00
        */

        $date_query['after']=esc_sql($_REQUEST['start_date']).' 00:00:00';
}

if (!empty($_REQUEST['end_date'])) {

        /*
        end_date will be the
        input date at 23:59:59
        */

        $date_query['before']=esq_sql($_REQUEST['end_date']).' 23:59:59';
}

$query_params['date_query']=array(
        $date_query
);

/*
query database
*/

$query=query_posts($query_params);

results in the following sql statement

SELECT wp_2_posts.* FROM wp_2_posts INNER JOIN wp_2_term_relationships ON (wp_2_posts.ID = wp_2_term_relationships.object_id) LEFT JOIN wp_2_posts AS p2 ON (wp_2_posts.post_parent = p2.ID) WHERE 1=1 AND ( ( post_date_gmt >= '2011-05-17 22:00:00' AND post_date_gmt <= '2013-12-06 22:59:59' ) ) AND ( wp_2_term_relationships.term_taxonomy_id IN (5) ) AND wp_2_posts.post_type = 'post' AND ((wp_2_posts.post_status = 'publish') OR (wp_2_posts.post_status = 'inherit' AND (p2.post_status = 'publish'))) GROUP BY wp_2_posts.ID ORDER BY wp_2_posts.post_date DESC

and causes a mysql exception

#1052 - Column 'post_date_gmt' in where clause is ambiguous

I'll attach the quick-fix which works in our multi-site setup

with the fix the sql-statement looks like this:

SELECT wp_2_posts.* FROM wp_2_posts INNER JOIN wp_2_term_relationships ON (wp_2_posts.ID = wp_2_term_relationships.object_id) LEFT JOIN wp_2_posts AS p2 ON (wp_2_posts.post_parent = p2.ID) WHERE 1=1 AND ( ( wp_2_posts.post_date_gmt >= '2011-05-17 22:00:00' AND wp_2_posts.post_date_gmt <= '2013-12-06 22:59:59' ) ) AND ( wp_2_term_relationships.term_taxonomy_id IN (5) ) AND wp_2_posts.post_type = 'post' AND ((wp_2_posts.post_status = 'publish') OR (wp_2_posts.post_status = 'inherit' AND (p2.post_status = 'publish'))) GROUP BY wp_2_posts.ID ORDER BY wp_2_posts.post_date DESC
Last edited 11 years ago by DrewAPicture (previous) (diff)

@neoxx
11 years ago

#17 in reply to: ↑ 13 @tomauger
11 years ago

Replying to nacin:

I understand what wonderboymusic is tackling, but I am not sure what the original reported bug is here. This seems like a 3.9 thing at this point.

It feels like the original ticket has been hijacked. I think the original patch just addressed a simple oversight in the way the SQL was generated in the new WP_Date_Query - namely that the table names were not being prefixed nor aliased. The original patch addresses this.

As Viper007Bond says, it's not urgent, but then on the other hand, it's a very small fix, that won't break anything, and will increase the robustness of the SQL. I don't see a downside to pushing this through for 3.7.2 or 3.8.1?

Last edited 11 years ago by tomauger (previous) (diff)

#18 follow-up: @wonderboymusic
11 years ago

It feels like the original ticket has been hijacked.

With some basic / arbitrary unit tests for prefixing, other issues were discovered - the unit tests are a pre-req for the patch.

#19 in reply to: ↑ 18 @tomauger
11 years ago

Replying to wonderboymusic:

With some basic / arbitrary unit tests for prefixing, other issues were discovered - the unit tests are a pre-req for the patch.

I hear that. Just wondering whether at that point another ticket should have been spun off for those additional issues?

#20 @wonderboymusic
11 years ago

  • Milestone changed from Future Release to 3.9

#21 @danielbachhuber
11 years ago

  • Keywords needs-refresh needs-unit-tests added; has-patch removed

Discovered this thread via trying to use a date query with a tax query. +1 to doing whatever need to get this fixed — going to fix with a filter for now. I can put time into it later this week / early next if no one else is already working on it.

@Viper007Bond
10 years ago

Changes existing filter behavior. :( See comments.

#22 @Viper007Bond
10 years ago

ew_holmes: I think your patch will work but it seems kind of hacky. I think it'd be better to make $valid_columns an associative array of column => table so that it's explicit.

wonderboymusic: As mentioned, your patch is an enhancement rather than a bug fix. Let's open a separate ticket for that.

neoxx: Your patch won't work because the class is also used on the comments table, not just the posts table.


I just attached a new patch that is similar to ew_holmes's but explicitly defines what column belongs to what table. This allows the class to be used with other tables, such as the meta tables. Custom uses is why I included the filter.

The problem with my patch is that it breaks the existing filter. The filter is super new but that's probably still bad.

I'm not sure what to do. Rename the filter? Add a new filter and try to make the existing one work as before?

Opinions welcome.

This ticket was mentioned in IRC in #wordpress-dev by Viper007Bond. View the logs.


10 years ago

#24 @ew_holmes
10 years ago

Viper007Bond: Perhaps a combination of our patches will address the de-hackification of my patch, the associative array idea you have, and still keep the existing filter in tact? I'll give it a thought tonight or tomorrow.

#25 @wonderboymusic
10 years ago

  • Keywords 4.0-early added; 3.9-early removed
  • Milestone changed from 3.9 to Future Release

Let's regroup on these early 4dotOh

#26 @wonderboymusic
10 years ago

  • Milestone changed from Future Release to 4.0

I have some ideas for this

@ew_holmes
10 years ago

Attempt to extend Viper007Bond's patch.

#27 @ew_holmes
10 years ago

Viper007Bond: I tried sitting down to work with what you had written an expand on it, but really couldn't think of a clean way of approaching it.

The filter definitely cannot be salvaged with an associative array, without some arbitrary checks later on (as mine did).

Here's an attempt to catch any existing filters that would be adding into valid columns. The only edge case I can think of not catching is someone trying to remove valid columns.

Perhaps altering the use of a filter ASAP is acceptable? Would need some Core Contrib opinions on that matter.

#28 @DrewAPicture
10 years ago

  • Keywords 4.1-early added; 4.0-early removed
  • Milestone changed from 4.0 to Future Release

Seems like the spirit of this ticket has gone off the rails. Needs a consensus and looks like updated unit tests. Maybe we can hit the ground running in 4.1. Punting.

#29 @danielbachhuber
10 years ago

#29432 was marked as a duplicate.

#30 @webbistro
10 years ago

Hello, I added taxonomies to the attachment post type and now get "WordPress database error: [Column 'post_date' in where clause is ambiguous]" when trying to use 'tax_query' and 'date_query' for 'post_type' => 'attachment' and 'post_status' => 'inherit'. WP 4.0

@boonebgorges
10 years ago

#31 @boonebgorges
10 years ago

  • Keywords has-patch needs-testing added; needs-refresh needs-unit-tests 4.1-early removed
  • Milestone changed from Future Release to 4.1

25775.patch is an attempt at fixing the problem that maintains backward compatibility with the filter and with existing use of WP_Date_Query. Much of the patch is making some minor alterations to existing unit tests so that they work with the modified validate_column(). Here's the executive summary:

  • When you pass a value to validate_column() that contains a dot, it's trusted - no validation takes place.
  • Return results of validate_column() are run through esc_sql(), since we're no longer so strictly checking against a whitelist (though tbh the whitelist check that's currently there does not look very airtight to me).
  • When a non-dotted value is passed to validate_column(), check to see if it's a core date column ($known_columns), and if so, add the appropriate prefix.

How this fixes the problem:

  • test_date_query_with_taxonomy_join() demonstrates the problem raised in the original ticket. Since no column is passed to the query, 'post_date' is assumed; 'post_date' then validates to $wpdb->posts.post_date.
  • test_validate_column_with_date_query_valid_columns_filter() shows that the filter continues to work. If you are using this filter to add a custom column, it will continue to be validated. validate_column() won't add a table prefix, because it's not a "known column". But (a) this is not a regression - currently, if you add a custom column like this and then do a table join, you'll probably break something; and (b) devs using the filter should be encouraged to add table-prefixed columns to the list. (ie `$columns[] = 'my_table_name.my_column_name')

Some of the changes suggested by wonderboymusic https://core.trac.wordpress.org/ticket/25775#comment:9 are interesting and worth pursuing (would you expect any less? <3 <3 <3) but are beyond the scope of this ticket. I also think his patch misunderstands what WP_Comments_Query::$date_query is supposed to do - we don't need to join against the posts table because we only care about the 'comment_date', as passed to get_sql() here https://core.trac.wordpress.org/browser/tags/4.0/src/wp-includes/comment.php#L441. (If his tests were converting comment_date to post_date, it must be because he had one of the other patches on this ticket applied to his build - they broke date queries for comments.)

In the future might look for a more robust syntax for table-specific support in WP_Date_Query. See #29823. But that's not necessary to fix the problem at hand.

Last edited 10 years ago by boonebgorges (previous) (diff)

#32 follow-up: @Viper007Bond
10 years ago

Thanks for your work on this! :)

One quick question: why set $dotpos if you aren't going to use it? Why not just do false === strpos( $column, '.' ) ?

#33 in reply to: ↑ 32 @boonebgorges
10 years ago

Replying to Viper007Bond:

Thanks for your work on this! :)

One quick question: why set $dotpos if you aren't going to use it? Why not just do false === strpos( $column, '.' ) ?

Whoops - this was a remnant of a different strategy I'd tried. Fixed in 25775.2.patch.

Viper007Bond and others - I'd like some feedback on the following:

  1. Does the patch deal adequately with the situations listed above? Does it resolve the original problem of taxonomy/meta JOINs? Does it maintain compatibility with real-world uses of the 'date_query_valid_columns' filter that you're familiar with? I think the unit tests show that it does all of these things, but a sanity check would be nice.
  2. The one potential downside of my approach is that validate_column() does a bit less than it did before. It used to be that you could circumvent its logic only by filtering 'date_query_valid_columns'. Now you can also pass a table-prefixed value. Do we care? Note that we are *sanitizing* the value (ie against SQL injection). The "validation" in question is merely a developer service: we try to make sure that no one attempts to do a date_query against an incompatible column. My personal opinion is that it's actually *better* to do less validation, because it means that passing a bad column name will result in SQL errors rather than mysteriously falling back to the value of 'post_date', which might silently return unintended results. So, with my patch, the real purpose of the method is not so much validation as it is convenience: it allows you to pass merely 'post_date' instead of requiring you to concatenate $wpdb->posts . '.post_date'.

#34 @nacin
10 years ago

Fairly important note: esc_sql() is useless, security-wise, for column names.

#35 @Viper007Bond
10 years ago

Yeah, the validation was there for security reasons not to prevent invalid columns from being used. A whitelist seemed like the best way to do it.

#36 @boonebgorges
10 years ago

nacin, Viper007Bond - Thanks, guys. You're right.

So, the current state of affairs is that we check column names against a whitelist, but the whitelist is filtered, and no sanitization happens after that. So a plugin author could do damage by using this filter, but of course an installed plugin can do all sorts of damage. In the name of being conservative, 25775.3.patch improves the current situation by stripping forbidden characters from column names - even those added to the whitelist via filter.

#37 @boonebgorges
10 years ago

  • Owner set to boonebgorges
  • Resolution set to fixed
  • Status changed from new to closed

In 29933:

Use table aliases for columns in SQL generated by WP_Date_Query.

The use of non-aliased column names (eg 'post_date' instead of 'wp_posts.post_date')
in WP_Date_Query causes SQL notices and other failures when queries involve
table joins, such as date_query combined with tax_query or meta_query. This
changeset modifies WP_Date_Query::validate_column() to add the table alias when
it can be detected from the column name (ie, in the case of core columns).

A side effect of this change is that it is now possible to use WP_Date_Query
to build WHERE clauses across multiple tables, though there is currently no
core support for the automatic generation of the necessary JOIN clauses. See

Props ew_holmes, wonderboymusic, neoxx, Viper007Bond, boonebgorges.
Fixes #25775.

Note: See TracTickets for help on using tickets.