Make WordPress Core

Opened 9 years ago

Closed 9 years ago

#32090 closed defect (bug) (fixed)

Bug: Can not update tables in non-wordpress-installation databases when the query contains non ASCII characters

Reported by: willstedt's profile willstedt Owned by: pento's profile pento
Milestone: 4.2.2 Priority: normal
Severity: normal Version: 4.1.2
Component: Database Keywords: fixed-major
Focuses: Cc:

Description

When writing a query that connects to a database other than the database used to install Wordpress, and tries to store a string value which is non-ascii, the update is never performed.

This bug was implemented in the 4.1.2 security update and exists also in the 4.2 update.

Here is an example query that fails to execute:
UPDATE mydatabase.wp_customers SET firstname='Örjan', lastname = 'Åkesson' WHERE friendid = '55'

The reason for the bug is that the table name is used to get the collation from the table when non-ascii characters are being stored in the table.
The regular expression used to get the table name in the function get_table_from_query doesn't allow a dot (.) in the table name and therefore returns only the database name without the trailing table name.
When the function is called for, the example above would return the database name (mydatabase) instead of the full table name with database (mydatabase.wp_customers).

The solution to this part is to allow a dot in the table name, as written below.

wp-db.php (2693):

if ( preg_match( '/^\s*(?:'
	. 'SELECT.*?\s+FROM'
	. '|INSERT(?:\s+LOW_PRIORITY|\s+DELAYED|\s+HIGH_PRIORITY)?(?:\s+IGNORE)?(?:\s+INTO)?'
	. '|REPLACE(?:\s+LOW_PRIORITY|\s+DELAYED)?(?:\s+INTO)?'
	. '|UPDATE(?:\s+LOW_PRIORITY)?(?:\s+IGNORE)?'
	. '|DELETE(?:\s+LOW_PRIORITY|\s+QUICK|\s+IGNORE)*(?:\s+FROM)?'
	. ')\s+`?([\w-.]+)`?/is', $query, $maybe ) ) {
		return $maybe[1];
	}

When the database name is returned instead of the table name, the function fails to execute the SQL: SHOW FULL COLUMNS FROM ´mydatabase´ in the get_table_charset function (as you can't get columns from a database). The function then returns wpdb_get_table_charset_failure and the update is never performed.

When allowing a dot in the regex, the query "SHOW FULL COLUMNS FROM ´$table´" has to be updated, as the apostrophes shouldn't be used when a database name is given. Therefore the function get_table_charset also has to be updated with something like this:

wp-db.php (2236):

		if (strpos($table,'.') === false) {
		    $table = "`$table`";
		}
		$results = $this->get_results( "SHOW FULL COLUMNS FROM $table" );

For all systems that allows multilingual stuff sorted in different databases, this bug really destroys the possibility to use Wordpress altogether as there is no workaround except changing the bug directly in the core. I have done the above fix as a temporary solution so that our employees can work, but the system will of course break every time Wordpress gets updated - until the bug is fixed.

Attachments (6)

wp-db.diff (2.3 KB) - added by willstedt 9 years ago.
Patch to fix the bug #32090 - also fixes the lacking support for UTF8 characters in schema names, as such naming is allowed ackording to https://dev.mysql.com/doc/refman/5.5/en/identifiers.html
32090-41.patch (2.7 KB) - added by johnbillion 9 years ago.
4.1
32090-40.patch (2.7 KB) - added by johnbillion 9 years ago.
4.0
32090-39.patch (2.7 KB) - added by johnbillion 9 years ago.
3.9
32090-38.patch (2.7 KB) - added by johnbillion 9 years ago.
3.8
32090-37.patch (2.7 KB) - added by johnbillion 9 years ago.
3.7

Download all attachments as: .zip

Change History (30)

#1 @SergeyBiryukov
9 years ago

  • Milestone changed from Awaiting Review to 4.2.1

#2 follow-up: @ewerkstatt
9 years ago

Same problem here with german umlauts in fieldnames, although UTF-8 is used as characterset (#32104)

Last edited 9 years ago by SergeyBiryukov (previous) (diff)

#3 in reply to: ↑ 2 @willstedt
9 years ago

Replying to ewerkstatt:

Same problem here with german umlauts in fieldnames, although UTF-8 is used as characterset (#32104)

It doesn't matter which charset you have - Wordpress never gets so far that it can look up the charset. The bug is where they try to reverse engineer the SQL query in order to find the charset. They missed the scenario when the lookup is performed on databases outside the one which Wordpress is installed in.

Luckily it is very easy to fix and if you need a quick fix (like I did with an office with 30 employees that suddenly couldn't work), you can edit the wp-db.php file and do the two changes in my bug report.

#4 @ewerkstatt
9 years ago

Thank you. My hope is that the Wordpress team will cover this, since I do not like the idea of custom changes in core files ;-)

#5 @pento
9 years ago

#32108 was marked as a duplicate.

#6 @pento
9 years ago

Another note for when we fix this - I'm not sure that \w is the correct thing to use in the regex, to match table names. Per the MySQL docs, there's a large range of characters that are valid in object names. The PHP docs imply that \w only matches a locale-specific set of characters.

This ticket was mentioned in Slack in #core by hinaloe. View the logs.


9 years ago

#8 follow-ups: @DrewAPicture
9 years ago

  • Keywords needs-patch added

@willstedt: Care to add a patch?

#9 in reply to: ↑ 8 @willstedt
9 years ago

Replying to DrewAPicture:

@willstedt: Care to add a patch?

Sure, it seems fairly straight forward to provide a patch. I'll have time on Wednesday - will then also read up on the allowed characters for the schema and table names and fix that in the regex at the same time (as @pento noticed MySQL allowes many more characters than
\w).

#10 @JRGould
9 years ago

This bug is also experienced when executing a CREATE TABLE query where the field names or comments contain non-ascii characters. However, the dot fix doesn't seem like it would work in this scenario since the table doesn't exist in the database yet.

Example query that fails due to wpdb_get_table_charset_failure:

CREATE TABLE IF NOT EXISTS `foo2` (
`foo_id` int(11) NOT NULL COMMENT 'fübar'
) ENGINE=InnoDB DEFAULT CHARSET=utf8;

#11 @ewerkstatt
9 years ago

True, that's what we also experience. In order to import data (which needs also the CREATE command) we constantly have to use the wp-db.php from 4.1.1

I hope, a solution is provided soon.

#12 @sgissinger
9 years ago

The bug still exists in version 4.2.1 and for people who want to quickfix it, you should look at line 2808 and line 2279.
I applied it and it works.

@willstedt
9 years ago

Patch to fix the bug #32090 - also fixes the lacking support for UTF8 characters in schema names, as such naming is allowed ackording to https://dev.mysql.com/doc/refman/5.5/en/identifiers.html

#13 in reply to: ↑ 8 @willstedt
9 years ago

  • Keywords has-patch added; needs-patch removed

Replying to DrewAPicture:

@willstedt: Care to add a patch?

Patch added. Also fixed the bug that didn't allow UTF8 characters in table or schema names. Do I need to do anything more now, or will this automatically be included in the next release?

#14 @pento
9 years ago

In 32368:

WPDB: Allow queries to reference tables in the dbname.tablename format, and allow table names to contain any valid character, rather than just ASCII.

Props pento, willstedt for the initial patch.

See #32090.

#15 follow-up: @pento
9 years ago

  • Keywords fixed-major added; has-patch removed
  • Owner set to pento
  • Status changed from new to assigned

Hey @willstedt, sorry I didn't get to work through this patch with you - we're aiming for 4.2.2 to be released in the next day or two, so I needed to get it committed.

Anyway, I have some feedback on your initial patch, if you're interested:

  • We can't use inline functions, as WordPress still supports PHP 5.2.
  • When searching for the table name, your patch was looking for any UTF-8 character, where MySQL only allows 1 and 2 byte UTF-8 characters.
  • The /u modifier is quite slow. Unless we absolutely have to, it's good to avoid using it in functions that need to be fast.
  • No unit tests.

That said, it was a great first pass, it put me in the right direction to getting it fixed. Thanks for the contribution, I hope to see you sticking around! :-)

#16 @pento
9 years ago

In 32370:

WPDB: Allow queries to reference tables in the dbname.tablename format, and allow table names to contain any valid character, rather than just ASCII.

Merge of [32368] to the 4.2 branch.

Props pento, willstedt for the initial patch.

See #32090.

#17 @pento
9 years ago

Leaving open for merge to the 3.7-4.1 branches, after #32165 is done.

#18 in reply to: ↑ 15 @willstedt
9 years ago

Replying to pento:

Hey @willstedt, sorry I didn't get to work through this patch with you - we're aiming for 4.2.2 to be released in the next day or two, so I needed to get it committed.

Anyway, I have some feedback on your initial patch, if you're interested:

  • We can't use inline functions, as WordPress still supports PHP 5.2.
  • When searching for the table name, your patch was looking for any UTF-8 character, where MySQL only allows 1 and 2 byte UTF-8 characters.
  • The /u modifier is quite slow. Unless we absolutely have to, it's good to avoid using it in functions that need to be fast.
  • No unit tests.

That said, it was a great first pass, it put me in the right direction to getting it fixed. Thanks for the contribution, I hope to see you sticking around! :-)

Thanks for your feedback - I am glad you cut me some slack as this is the first time ever I contribute to an open source project :)

I know I made it a bit more tolerant than the actual MySql spec, but I thought that didn't matter as that particular function is built to retrieve the table name and not verify the SQL. But I am happy you are trying to get the code to be perfect :)

@johnbillion
9 years ago

4.1

@johnbillion
9 years ago

4.0

@johnbillion
9 years ago

3.9

@johnbillion
9 years ago

3.8

@johnbillion
9 years ago

3.7

#19 @johnbillion
9 years ago

In 32412:

WPDB: Allow queries to reference tables in the dbname.tablename format, and allow table names to contain any valid character, rather than just ASCII.

Merge of [32368] to the 4.1 branch.

Props pento, willstedt for the initial patch.

See #32090.

#20 @johnbillion
9 years ago

In 32413:

WPDB: Allow queries to reference tables in the dbname.tablename format, and allow table names to contain any valid character, rather than just ASCII.

Merge of [32368] to the 4.0 branch.

Props pento, willstedt for the initial patch.

See #32090.

#21 @johnbillion
9 years ago

In 32414:

WPDB: Allow queries to reference tables in the dbname.tablename format, and allow table names to contain any valid character, rather than just ASCII.

Merge of [32368] to the 3.9 branch.

Props pento, willstedt for the initial patch.

See #32090.

#22 @johnbillion
9 years ago

In 32415:

WPDB: Allow queries to reference tables in the dbname.tablename format, and allow table names to contain any valid character, rather than just ASCII.

Merge of [32368] to the 3.8 branch.

Props pento, willstedt for the initial patch.

See #32090.

#23 @johnbillion
9 years ago

In 32416:

WPDB: Allow queries to reference tables in the dbname.tablename format, and allow table names to contain any valid character, rather than just ASCII.

Merge of [32368] to the 3.7 branch.

Props pento, willstedt for the initial patch.

See #32090.

#24 @samuelsidler
9 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

Fixed for 4.2.2.

Note: See TracTickets for help on using tickets.