Make WordPress Core

Opened 17 years ago

Closed 17 years ago

Last modified 14 years ago

#4553 closed task (blessed) (fixed)

Consider using local prepared-statement/sprintf()-like system for last-second SQL escaping

Reported by: markjaquith's profile markjaquith Owned by: markjaquith's profile markjaquith
Milestone: 2.5 Priority: high
Severity: normal Version: 2.3
Component: Security Keywords: sql prepared statement sprintf injection security early
Focuses: Cc:

Description

See: #4545 comments for background.

nbachiyski:


We can also make a prepared statement-like/printf-like method of wpdb, which can handle escaping internally and get rid of the few lines, before every query, spent in escaping.



Example:

$result = $wpdb->get_results(
	$wpdb->prepare("SELECT something FROM $wpdb->tablename WHERE foo = '%s' LIMIT %d", $unslashed_value, $unslashed_uninted_limit)
);

Benefits:

  • Works well with last-second escaping of data as proposed in #4545
  • Backwards compatible
  • Makes for VERY obvious escaping -- helps us find SQL injection holes
  • Reduces a lot of $wpdb->escape(); lines
  • Allows original unescaped data used in query to remain unescaped in the function. No need to have $var and $var_sql floating around. Unescaped data is more usable.

Attachments (1)

prepared.diff (889 bytes) - added by markjaquith 17 years ago.

Download all attachments as: .zip

Change History (23)

#1 @markjaquith
17 years ago

  • Owner changed from anonymous to markjaquith
  • Status changed from new to assigned
<?php

function prepare($args=NULL) {
	if ( NULL === $args )
		return;
	$args = func_get_args();
	$query = array_shift($args);
	array_walk($args, 'escape_by_ref');
	return call_user_func_array('sprintf', array_merge($query, $args));
}

function escape_by_ref(&$a) {
	//global $wpdb;
	//$a = $wpdb->escape($a);
	$a = addslashes($a);
}

$dangerous_string = "It's raining";
$untrusted_limit = "10STRING";

echo prepare("SELECT foo FROM sometable WHERE bar = '%s' LIMIT %d", $dangerous_string, $untrusted_limit);
?>

Output:

SELECT foo FROM sometable WHERE bar = 'It\'s raining' LIMIT 10

Thoughts?

#2 follow-up: @markjaquith
17 years ago

See patch (introduces the new method).

I've started going through WP core and moving to this type of escaping. Thoughts so far:

  • This will result in a net decrease in code
  • It's not hard to make the change
  • It makes places that lack adequate escaping GLARINGLY obvious.
  • While going through, we can mark functions that expect pre-escaped data, for fixing in 2.4, with // pre-escaped
  • You escape literal % symbols with %% (two in a row). You have to remember this for queries that use LIKE

#3 @markjaquith
17 years ago

(In [5778]) Introducing "prepare", a WPDB method for sprintf()-prepared SQL statements. see #4553. Implementation details to follow.

#4 @markjaquith
17 years ago

Old:

$var = $wpdb->escape($var);
$var2 = $wpdb->escape($var2);
$limit = (int) $limit;
$wpdb->query("UPDATE $wpdb->tablename SET foo = '$var' WHERE blah = '$var2' LIMIT $limit");

New

$wpdb->query($wpdb->prepare("UPDATE $wpdb->tablename SET foo = '%s' WHERE blah = '%s' LIMIT %d", $var, $var2, $limit));

Notes:

  • Variables passed to $wpdb->prepare() should not be pre-escaped.
  • All uses of '%s' should either be quoted (for values) or strictly filtered through a list of allowed values (for things like %s representing a column name for ORDER BY). If the latter, this should be done as close to the SQL query as possible so that we can verify that it is not a SQL injection hole.
  • sprintf() takes care of int-casting %d variables.
  • As "%" is a special char, it is recommended that you concatenate that char into a string when doing LIKE queries instead of putting the "%" character into the query (as you will have to escape it with %% if you use it in the query).

Questions, concerns? Would be good to raise them before we start changing things.

#5 in reply to: ↑ 2 ; follow-up: @Nazgul
17 years ago

Replying to markjaquith:

  • While going through, we can mark functions that expect pre-escaped data, for fixing in 2.4, with // pre-escaped

Why wait till 2.4 to fix those? We'll have to go over all the code for this change anyway, so why not fix it while we're at it? (Tracking of those changes could and should be done in separate tickets linking to this one though)

Replying to markjaquith:

  • All uses of '%s' should either be quoted (for values) or strictly filtered through a list of allowed values (for things like %s representing a column name for ORDER BY). If the latter, this should be done as close to the SQL query as possible so that we can verify that it is not a SQL injection hole.

We could quote %s automatically if it isn't and introduce a %t for column names. The prepare function could validate those and turn them into %s for sprintf if they are valid or 'abort' otherwise.

#6 in reply to: ↑ 5 @markjaquith
17 years ago

Replying to Nazgul:

Why wait till 2.4 to fix those? We'll have to go over all the code for this change anyway, so why not fix it while we're at it? (Tracking of those changes could and should be done in separate tickets linking to this one though)

Because it will break a few plugins, and there's not enough lead time. We can, however, look for functions that expect pre-escaped date and then do stripslashes() on the data and then mark that for removal in 2.4

We could quote %s automatically if it isn't and introduce a %t for column names. The prepare function could validate those and turn them into %s for sprintf if they are valid or 'abort' otherwise.

I considered that, but it'd require us to write our own sprintf()-like code in order to figure out which parameter corresponds to the %t. That seems clunky and slow. If only sprintf() had callback functionality. Unless you see an elegant workaround for that. I'm not too worried... those unquoted %s's are going to stick out like a sore thumb (we could grep for them, even) and we'll be able to scan them and verify that they are filtered appropriately.

#7 follow-up: @nbachiyski
17 years ago

  • About the automated quoting: in most of the cases it will cause no problems. And in the rare cases, in which we don't have to quote some part, we can just escape it manually and insert it directly as an interpolated variable (like the table names now).
  • sprintf converts non-int values to zero. Is it the desired behaviour?
  • Couldn't we have a method, which both supports arguments and runs the query? Something like:
    $wpdb->smartnamehere("UPDATE t SET foo = '%s'", $foo);
    

#8 in reply to: ↑ 7 ; follow-up: @Otto42
17 years ago

Replying to nbachiyski:

  • Couldn't we have a method, which both supports arguments and runs the query? Something like:
    $wpdb->smartnamehere("UPDATE t SET foo = '%s'", $foo);
    

+1. Calling query(prepare(...)) seems like something you're going to be doing a lot. Making a function to do all this at once seems like an obvious move. Also, it will discourage direct use of query by plugin authors.

I have no idea what to name it. execute? dbgetf? ;)

#9 @markjaquith
17 years ago

About the automated quoting: in most of the cases it will cause no problems. And in the rare cases, in which we don't have to quote some part, we can just escape it manually and insert it directly as an interpolated variable (like the table names now).

That might be good. But we'd have to either look out (manually or via regex) for '%s', because otherwise '%s' would turn into ''%s'' (that's two grouping of single quotes) which is dangerous. Also, without a lot of expensive regex, we'll have to limit ourselves to simple %s ... no getting fancy with sprintf() features.

And actually, using naked %s might be close to actual SQL prepared statements, in which they use a naked question mark. So yeah, that sounds like a good idea. I'll work on that.

Couldn't we have a method, which both supports arguments and runs the query?

Not really, because the existing methods take multiple arguments. And there are a lot of methods. get_var(), get_row(), query(), get_results(), get_col() So I think it is best to have one escaping function.

sprintf converts non-int values to zero. Is it the desired behaviour?

<?php
printf("This is a sprintf() int: %d\n", '10BLAH');
printf("This is a PHP int: %s\n", (int) '10BLAH');
printf("This is a sprintf() int: %d\n", 'foo');
printf("This is an PHP int: %s", (int) 'foo');
?>

Output:

This is a sprintf() int: 10
This is a PHP int: 10
This is a sprintf() int: 0
This is an PHP int: 0

Behaves the same as PHP (int) casting in those situations.

#10 @markjaquith
17 years ago

(In [5779]) Automatically quote strings in $wpdb->prepare(). Use vsprintf(). see #4553

#11 @markjaquith
17 years ago

Okay, now %s gets quoted automatically, after first being unquoted, just to be sure.

New:

$wpdb->query($wpdb->prepare("UPDATE $wpdb->tablename SET foo = %s WHERE blah = %s LIMIT %d", $var, $var2, $limit));

Do we have a function that can be used to sanitize column names? A-Za-z0-9_\. should be fine. It's more restrictive than MySQL is, but it'd just be used internally.

#12 @JeremyVisser
17 years ago

Mark, regarding [5779], the automatic un-quoter does not un-quote double-quoted (") strings.

#13 @markjaquith
17 years ago

(In [5791]) Undo pre-doublequoting in prepare(). Props JeremyVisser. see #4553

#14 @Nazgul
17 years ago

We did the groundwork for this one in 2.3. Are we going to convert all queries in 2.3 as well, or push that to 2.4?

#15 @ryan
17 years ago

  • Milestone changed from 2.3 to 2.4
  • Priority changed from normal to high

Looks like we're pushing to 2.4. Let's get this in first thing for 2.4.

#16 @foolswisdom
17 years ago

  • Keywords early added

#17 @markjaquith
17 years ago

(In [6173]) prepare() for wp-includes/ bookmark.php, canonical.php, comment.php, comment-template.php. see #4553

#18 @markjaquith
17 years ago

I'm going to be working my way though wp-includes alphabetically, then wp-admin. I'm also looking for places where prepare() can't be cleanly implemented (place that expect data to already be slashed). I'm making up those areas like so:

// expected_slashed ($variable1, $variable2)

That way, if we decide to change those functions so they expect unslashed data, we know where to look.

Also, I skipped prepare()ing a few complicated queries constructed by a lot of branched PHP logic.

#19 @markjaquith
17 years ago

(In [6180]) prepare() for wp-includes/ link-template.php, post.php, general-template.php, pluggable.php, functions.php. see #4553

#20 in reply to: ↑ 8 @robmil
17 years ago

Replying to Otto42:

I have no idea what to name it. execute? dbgetf? ;)

+1 for "execute". It fits in with the general "prepare->execute" nomenclature used by most other prepared statement implementations.

#21 @lloydbudd
17 years ago

It seems it is time to close this ticket as fixed for 2.5 and create a new ticket to continue any remaining work in 2.6 .

#22 @Nazgul
17 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

Closing as fixed for now for the 2.5 release, as Lloyd suggests.

The remaining work can be done in a new ticket for 2.6

Note: See TracTickets for help on using tickets.