Opened 9 years ago
Closed 9 years ago
#32090 closed defect (bug) (fixed)
Bug: Can not update tables in non-wordpress-installation databases when the query contains non ASCII characters
Reported by: | willstedt | Owned by: | pento |
---|---|---|---|
Milestone: | 4.2.2 | Priority: | normal |
Severity: | normal | Version: | 4.1.2 |
Component: | Database | Keywords: | fixed-major |
Focuses: | Cc: |
Description
When writing a query that connects to a database other than the database used to install Wordpress, and tries to store a string value which is non-ascii, the update is never performed.
This bug was implemented in the 4.1.2 security update and exists also in the 4.2 update.
Here is an example query that fails to execute:
UPDATE mydatabase.wp_customers SET firstname='Örjan', lastname = 'Åkesson' WHERE friendid = '55'
The reason for the bug is that the table name is used to get the collation from the table when non-ascii characters are being stored in the table.
The regular expression used to get the table name in the function get_table_from_query doesn't allow a dot (.) in the table name and therefore returns only the database name without the trailing table name.
When the function is called for, the example above would return the database name (mydatabase) instead of the full table name with database (mydatabase.wp_customers).
The solution to this part is to allow a dot in the table name, as written below.
wp-db.php (2693):
if ( preg_match( '/^\s*(?:' . 'SELECT.*?\s+FROM' . '|INSERT(?:\s+LOW_PRIORITY|\s+DELAYED|\s+HIGH_PRIORITY)?(?:\s+IGNORE)?(?:\s+INTO)?' . '|REPLACE(?:\s+LOW_PRIORITY|\s+DELAYED)?(?:\s+INTO)?' . '|UPDATE(?:\s+LOW_PRIORITY)?(?:\s+IGNORE)?' . '|DELETE(?:\s+LOW_PRIORITY|\s+QUICK|\s+IGNORE)*(?:\s+FROM)?' . ')\s+`?([\w-.]+)`?/is', $query, $maybe ) ) { return $maybe[1]; }
When the database name is returned instead of the table name, the function fails to execute the SQL: SHOW FULL COLUMNS FROM ´mydatabase´ in the get_table_charset function (as you can't get columns from a database). The function then returns wpdb_get_table_charset_failure and the update is never performed.
When allowing a dot in the regex, the query "SHOW FULL COLUMNS FROM ´$table´" has to be updated, as the apostrophes shouldn't be used when a database name is given. Therefore the function get_table_charset also has to be updated with something like this:
wp-db.php (2236):
if (strpos($table,'.') === false) { $table = "`$table`"; } $results = $this->get_results( "SHOW FULL COLUMNS FROM $table" );
For all systems that allows multilingual stuff sorted in different databases, this bug really destroys the possibility to use Wordpress altogether as there is no workaround except changing the bug directly in the core. I have done the above fix as a temporary solution so that our employees can work, but the system will of course break every time Wordpress gets updated - until the bug is fixed.
Attachments (6)
Change History (30)
#3
in reply to:
↑ 2
@
9 years ago
Replying to ewerkstatt:
Same problem here with german umlauts in fieldnames, although UTF-8 is used as characterset (#32104)
It doesn't matter which charset you have - Wordpress never gets so far that it can look up the charset. The bug is where they try to reverse engineer the SQL query in order to find the charset. They missed the scenario when the lookup is performed on databases outside the one which Wordpress is installed in.
Luckily it is very easy to fix and if you need a quick fix (like I did with an office with 30 employees that suddenly couldn't work), you can edit the wp-db.php file and do the two changes in my bug report.
#4
@
9 years ago
Thank you. My hope is that the Wordpress team will cover this, since I do not like the idea of custom changes in core files ;-)
#6
@
9 years ago
Another note for when we fix this - I'm not sure that \w
is the correct thing to use in the regex, to match table names. Per the MySQL docs, there's a large range of characters that are valid in object names. The PHP docs imply that \w
only matches a locale-specific set of characters.
This ticket was mentioned in Slack in #core by hinaloe. View the logs.
9 years ago
#9
in reply to:
↑ 8
@
9 years ago
Replying to DrewAPicture:
@willstedt: Care to add a patch?
Sure, it seems fairly straight forward to provide a patch. I'll have time on Wednesday - will then also read up on the allowed characters for the schema and table names and fix that in the regex at the same time (as @pento noticed MySQL allowes many more characters than
\w
).
#10
@
9 years ago
This bug is also experienced when executing a CREATE TABLE query where the field names or comments contain non-ascii characters. However, the dot fix doesn't seem like it would work in this scenario since the table doesn't exist in the database yet.
Example query that fails due to wpdb_get_table_charset_failure:
CREATE TABLE IF NOT EXISTS `foo2` ( `foo_id` int(11) NOT NULL COMMENT 'fübar' ) ENGINE=InnoDB DEFAULT CHARSET=utf8;
#11
@
9 years ago
True, that's what we also experience. In order to import data (which needs also the CREATE command) we constantly have to use the wp-db.php from 4.1.1
I hope, a solution is provided soon.
#12
@
9 years ago
The bug still exists in version 4.2.1 and for people who want to quickfix it, you should look at line 2808 and line 2279.
I applied it and it works.
@
9 years ago
Patch to fix the bug #32090 - also fixes the lacking support for UTF8 characters in schema names, as such naming is allowed ackording to https://dev.mysql.com/doc/refman/5.5/en/identifiers.html
#13
in reply to:
↑ 8
@
9 years ago
- Keywords has-patch added; needs-patch removed
Replying to DrewAPicture:
@willstedt: Care to add a patch?
Patch added. Also fixed the bug that didn't allow UTF8 characters in table or schema names. Do I need to do anything more now, or will this automatically be included in the next release?
#15
follow-up:
↓ 18
@
9 years ago
- Keywords fixed-major added; has-patch removed
- Owner set to pento
- Status changed from new to assigned
Hey @willstedt, sorry I didn't get to work through this patch with you - we're aiming for 4.2.2 to be released in the next day or two, so I needed to get it committed.
Anyway, I have some feedback on your initial patch, if you're interested:
- We can't use inline functions, as WordPress still supports PHP 5.2.
- When searching for the table name, your patch was looking for any UTF-8 character, where MySQL only allows 1 and 2 byte UTF-8 characters.
- The
/u
modifier is quite slow. Unless we absolutely have to, it's good to avoid using it in functions that need to be fast. - No unit tests.
That said, it was a great first pass, it put me in the right direction to getting it fixed. Thanks for the contribution, I hope to see you sticking around! :-)
#18
in reply to:
↑ 15
@
9 years ago
Replying to pento:
Hey @willstedt, sorry I didn't get to work through this patch with you - we're aiming for 4.2.2 to be released in the next day or two, so I needed to get it committed.
Anyway, I have some feedback on your initial patch, if you're interested:
- We can't use inline functions, as WordPress still supports PHP 5.2.
- When searching for the table name, your patch was looking for any UTF-8 character, where MySQL only allows 1 and 2 byte UTF-8 characters.
- The
/u
modifier is quite slow. Unless we absolutely have to, it's good to avoid using it in functions that need to be fast.- No unit tests.
That said, it was a great first pass, it put me in the right direction to getting it fixed. Thanks for the contribution, I hope to see you sticking around! :-)
Thanks for your feedback - I am glad you cut me some slack as this is the first time ever I contribute to an open source project :)
I know I made it a bit more tolerant than the actual MySql spec, but I thought that didn't matter as that particular function is built to retrieve the table name and not verify the SQL. But I am happy you are trying to get the code to be perfect :)
Same problem here with german umlauts in fieldnames, although UTF-8 is used as characterset (#32104)