Make WordPress Core

Opened 8 years ago

Closed 7 years ago

#34222 closed defect (bug) (invalid)

Impossible to correctly make a REGEXP meta query's value safe from untrusted input

Reported by: johnbillion's profile johnbillion Owned by:
Milestone: Priority: normal
Severity: normal Version:
Component: Query Keywords:
Focuses: Cc:

Description

Given a meta query that places untrusted input into the value field and uses REGEXP for its comparison, it's not possible to make the query safe if it contains preg quotable characters (such as $) without invalidating the query.

Example:

add_action( 'init', function() {

	$unsafe = wp_unslash( $_GET['foo'] );
	$safe   = preg_quote( $unsafe );
	$value  = '^' . $safe;

	new WP_Query( array(
		'meta_query' => array(
			array(
				'key'     => 'foo',
				'value'   => $value,
				'compare' => 'REGEXP',
			),
		),
	) );

} );

The above code uses preg_quote() to escape the user input, as it'll be passed to the query as REGEXP '{value}' and needs to be escaped. In WP_Meta_Query::get_sql_for_clause(), the value then gets inserted into the SQL query using wpdb::prepare(), which results in escaping of the slash that preg_quote() adds to the value, thus invalidating the query.

Given the URL example.com?foo=$100, the following SQL will be generated. Note the double escaped $ character.

SELECT SQL_CALC_FOUND_ROWS wp_posts.ID
FROM wp_posts 
INNER JOIN wp_postmeta
ON ( wp_posts.ID = wp_postmeta.post_id )
WHERE 1=1 
AND ( ( wp_trunk_postmeta.meta_key = 'foo'
AND CAST(wp_trunk_postmeta.meta_value AS CHAR) REGEXP '^\\$100' ) )
[snip]

tl;dr You can't use preg_quote() on a value that subsequently gets inserted into an SQL query using wpdb::prepare().

Attachments (1)

34222-mysql.jpg (39.1 KB) - added by boonebgorges 8 years ago.

Download all attachments as: .zip

Change History (3)

#1 @boonebgorges
8 years ago

In my testing, I'm seeing the same SQL as you're reporting. But the SQL is correctly matching posts. As far as I can see, in WHERE meta_value REGEXP '^\\$100', MySQL inteprets the first slash as escaping the second one as a literal slash, and it then interprets the literal slash as escaping the $, which inside of a REGEXP means that it's not parsed as a regex special character. See attached screenshot.

#2 @johnbillion
7 years ago

  • Keywords dev-feedback removed
  • Milestone Awaiting Review deleted
  • Resolution set to invalid
  • Status changed from new to closed
Note: See TracTickets for help on using tickets.