#53623 closed defect (bug) (fixed)
MariaDB 10.6 renamed utf8 to utf8mb3
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | 6.1 | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | Database | Keywords: | has-patch |
| Focuses: | | Cc: | |
Description
See MariaDB ticket MDEV-8334 "Rename utf8 to utf8mb3"
As a result, the charset tests are now failing.
Attachments (1)
Change History (16)
This ticket was mentioned in Slack in #forums by yui.
4 years ago
This ticket was mentioned in Slack in #hosting-community by yui.
4 years ago
#4
4 years ago
MySQL 8.0.26 also has related changes: https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-26.html
These statements now report utf8mb3 rather than utf8 when writing character set names: EXPLAIN, SHOW CREATE PROCEDURE, SHOW CREATE EVENT.
Stored program definitions retrieved from the data dictionary now report utf8mb3 rather than utf8 in character set references. This affects any output produced from those definitions, such as SHOW CREATE statements.
This error message now reports utf8mb3 rather than utf8 when writing character set names: ER_INVALID_CHARACTER_STRING. (Bug #32233614, Bug #32392077, Bug #32392209, Bug #32428538, Bug #32428598)
#6
follow-up:
↓ 7
4 years ago
Tested with:
- PHP 7.1 -> 8.1
- MariaDB 10.6
This produces the following test failures:
- Tests_DB_Charset::test_set_charset_changes_the_connection_collation
Failed asserting that two strings are identical. --- Expected +++ Actual @@ @@ -'utf8_general_ci' +'utf8mb3_general_ci'
- Tests_DB_Charset::test_get_column_charset::test_get_column_charset with data set #5
Failed asserting that two strings are identical. --- Expected +++ Actual @@ @@ -'utf8' +'utf8mb3'
- Tests_DB_Charset::test_get_column_charset::test_get_column_charset with data set #6
Failed asserting that two strings are identical. --- Expected +++ Actual @@ @@ -'utf8' +'utf8mb3'
- Tests_DB_Charset::test_table_collation_check::test_table_collation_check with data set #0
('CREATE TABLE table_collation_..._bin )', true, 'SELECT * FROM table_collation... a='😈'', 'DROP TABLE IF EXISTS table_co...heck_0', array('SELECT * FROM table_collation...='foo'', 'SHOW FULL TABLES LIKE table_c...heck_0', 'DESCRIBE table_collation_check_0', 'DESC table_collation_check_0', 'EXPLAIN SELECT * FROM table_c...heck_0')) Failed asserting that false is identical to true.
- Tests_DB_Charset::test_table_collation_check::test_table_collation_check with data set #1
('CREATE TABLE table_collation_...l_ci )', true, 'SELECT * FROM table_collation... a='😈'', 'DROP TABLE IF EXISTS table_co...heck_1', array('SELECT * FROM table_collation...='foo'', 'SHOW FULL TABLES LIKE table_c...heck_1', 'DESCRIBE table_collation_check_1', 'DESC table_collation_check_1', 'EXPLAIN SELECT * FROM table_c...heck_1')) Failed asserting that false is identical to true.
- Tests_DB_Charset::test_table_collation_check::test_table_collation_check with data set #4
('CREATE TABLE table_collation_... INT )', true, 'SELECT * FROM table_collation... a='😈'', 'DROP TABLE IF EXISTS table_co...heck_4', array('SELECT * FROM table_collation...='foo'', 'SHOW FULL TABLES LIKE table_c...heck_4', 'DESCRIBE table_collation_check_4', 'DESC table_collation_check_4', 'EXPLAIN SELECT * FROM table_c...heck_4')) Failed asserting that false is identical to true.
Also... should WordPress set the default charset to "utf8mb4"?
#7
in reply to:
↑ 6
4 years ago
Replying to JavierCasares:
Also... should WordPress set the default charset to "utf8mb4"?
I believe that would be best discussed in #48285. Related: #45697.
#8
follow-up:
↓ 11
4 years ago
- utf8mb3: included as of MariaDB 10.2.
- utf8 -> utf8mb3: enforced as of MariaDB 10.6.
Checking MySQL and MariaDB versions, all supported versions support utf8mb3, so we should replace "utf8" with "utf8mb3" by default and do some testing.
This ticket was mentioned in Slack in #hosting-community by skithund.
4 years ago
This ticket was mentioned in Slack in #hosting-community by javier.
4 years ago
#11
in reply to:
↑ 8
3 years ago
- Keywords has-patch added; needs-patch removed
- Milestone changed from Future Release to 6.1
Replying to ayeshrajans:
MySQL 8.0.26 also has related changes: https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-26.html
These statements now report utf8mb3 rather than utf8 when writing character set names: EXPLAIN, SHOW CREATE PROCEDURE, SHOW CREATE EVENT.
Stored program definitions retrieved from the data dictionary now report utf8mb3 rather than utf8 in character set references. This affects any output produced from those definitions, such as SHOW CREATE statements.
This error message now reports utf8mb3 rather than utf8 when writing character set names: ER_INVALID_CHARACTER_STRING.
Thanks! These changes are indeed related, but they don't appear to cause the test failures here.
In my testing, the current tests still pass on MySQL up until version 8.0.29, which is no longer available for download, but has some more character set support changes. The tests start failing on MySQL 8.0.30, with the same six failures as listed in comment:6.
From MySQL 8.0.30 release notes:
Important Change: A previous change renamed character sets having deprecated names prefixed with utf8_ to use utf8mb3_ instead. In this release, we rename the utf8_ collations as well, using the utf8mb3_ prefix; this is to make the collation names consistent with those of the character sets, not to rely any longer on the deprecated collation names, and to clarify the distinction between utf8mb3 and utf8mb4. The names using the utf8mb3_ prefix are now used exclusively for these collations in the output of SHOW statements such as SHOW CREATE TABLE, as well as in the values displayed in the columns of Information Schema tables including the COLLATIONS and COLUMNS tables.
Replying to JavierCasares:
- utf8mb3: included as of MariaDB 10.2.
- utf8 -> utf8mb3: enforced as of MariaDB 10.6.
Checking MySQL and MariaDB versions, all supported versions support utf8mb3, so we should replace "utf8" with "utf8mb3" by default and do some testing.
It is worth noting that WordPress does automatically upgrade to utf8mb4 when possible, see comment:1:ticket:48285.
Reading the MariaDB ticket MDEV-8334 Rename utf8 to utf8mb3:
In long terms we want the name utf8 mean the full featured UTF-8.
We'll do a few preparatory steps:
- Change the main name of the 3-byte character set from utf8 to utf8mb3 and make utf8 alias for utf8mb3. This will change all SHOW and INFORMATION_SCHEMA output to display utf8mb3 instead of utf8, as well as change mysqldump to dump utf8mb3 instead of just utf8.
- Add a new server option, say --utf8-is-utf8mb3, which will be true by default, but the DBA will be able to change it to false and thus make utf8 mean utf8mb4.
- A few releases later we'll change --utf8-is-utf8mb3 to be false by default.
Or
- Do not add any new server options and
- Add a new old_mode value for reverting utf8 to utf8mb3 when the default will mean utf8mb4.
The latter appears to be implemented in MariaDB 10.6.1.
Also reading the MySQL note on The utf8mb3 Character Set (3-Byte UTF-8 Unicode Encoding):
Historically, MySQL has used utf8 as an alias for utf8mb3; beginning with MySQL 8.0.28, utf8mb3 is used exclusively in the output of SHOW statements and in Information Schema tables when this character set is meant.
At some point in the future utf8 is expected to become a reference to utf8mb4. To avoid ambiguity about the meaning of utf8, consider specifying utf8mb4 explicitly for character set references instead of utf8.
You should also be aware that the utf8mb3 character set is deprecated and you should expect it to be removed in a future MySQL release. Please use utf8mb4 instead.
If the long-term goal of both projects is to make utf8 an alias for utf8mb4 as mentioned above, it seems like utf8mb3 is an intermediate step, and there is no need for WordPress to use that as the default charset at this time, since it already uses utf8mb4 when possible.
I believe the only changes required here would be:
- Adding utf8mb3_bin and utf8mb3_general_ci to the list of safe collations recognized by wpdb::check_safe_collation(). This would be the only change for WordPress core.
- Adding some conditional version checking for the expected test results as suggested in comment:1. This would only affect the unit tests.
See 53623.diff. Tested on:
- MariaDB 10.6.8
- MySQL 8.0.25
- MySQL 8.0.27
- MySQL 8.0.28
- MySQL 8.0.29
- MySQL 8.0.30
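As a rough illustration of the core change described above (extending the list of safe collations), here is a minimal sketch. It is written in Python purely for illustration and is not the code in 53623.diff or in wpdb::check_safe_collation(). The names `SAFE_COLLATIONS` and `all_collations_safe` are hypothetical, and the list contents are an assumption: the two utf8mb3 collations named above plus their utf8 and utf8mb4 counterparts.

```python
# Hypothetical model of a "safe collation" allowlist; not WordPress's actual code.
# Assumption: the list holds the utf8, utf8mb3 and utf8mb4 variants of
# the _bin and _general_ci collations.
SAFE_COLLATIONS = frozenset({
    "utf8_bin",
    "utf8_general_ci",
    "utf8mb3_bin",         # name reported by newer servers instead of utf8_bin
    "utf8mb3_general_ci",  # name reported by newer servers instead of utf8_general_ci
    "utf8mb4_bin",
    "utf8mb4_general_ci",
})


def all_collations_safe(column_collations):
    """Return True only if every column collation is in the allowlist.

    The comparison is case-insensitive. None entries (columns without a
    collation, such as integer columns) are skipped.
    """
    for collation in column_collations:
        if collation is None:
            continue
        if collation.lower() not in SAFE_COLLATIONS:
            return False
    return True
```

With the utf8mb3 names in the list, a table whose columns report utf8mb3_general_ci is treated the same way as one reporting utf8_general_ci, which is the behavior the failing test_table_collation_check data sets expect.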
#12
3 years ago
- Owner set to SergeyBiryukov
- Resolution set to fixed
- Status changed from new to closed
In 53918:
#13
3 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
The tests now pass on PHP 8.0.x + MariaDB 10.6.1+, but still fail on PHP 7.4.x + MariaDB 10.6.1+.
I forgot about MariaDB version being reported differently between PHP versions, see comment:33:ticket:49364:
- PHP 8.0.21: 10.6.8-MariaDB
- PHP 7.4.30: 5.5.5-10.6.8-MariaDB
Reopening to correct the version check for setting the $utf8_is_utf8mb3 flag.
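To make the version problem concrete, here is a minimal sketch of a check that tolerates both reported formats. It is an illustration in Python, not the committed fix. The function names `parse_server_info` and `utf8_is_utf8mb3` are hypothetical (the latter mirrors the $utf8_is_utf8mb3 flag), and the thresholds are the versions mentioned in this ticket: MariaDB 10.6.1 and MySQL 8.0.30.

```python
import re

# Thresholds taken from this ticket; treat them as assumptions of the sketch.
MARIADB_MIN = (10, 6, 1)
MYSQL_MIN = (8, 0, 30)


def parse_server_info(server_info):
    """Return (is_mariadb, (major, minor, patch)) from a server info string."""
    is_mariadb = "mariadb" in server_info.lower()
    if is_mariadb:
        # Some PHP versions report MariaDB with a leading "5.5.5-" prefix,
        # e.g. "5.5.5-10.6.8-MariaDB"; drop it before reading the version.
        server_info = re.sub(r"^5\.5\.5-", "", server_info)
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", server_info)
    if match is None:
        raise ValueError(f"Unrecognized server info: {server_info!r}")
    return is_mariadb, tuple(int(part) for part in match.groups())


def utf8_is_utf8mb3(server_info):
    """Return True if the server is expected to report utf8 as utf8mb3."""
    is_mariadb, version = parse_server_info(server_info)
    return version >= (MARIADB_MIN if is_mariadb else MYSQL_MIN)
```

Both `utf8_is_utf8mb3("10.6.8-MariaDB")` and `utf8_is_utf8mb3("5.5.5-10.6.8-MariaDB")` give the same answer here, which is the property the original check lacked on PHP 7.4.x.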
Very nice find :)
I suppose we can conditionally assertSame by checking the db version and the name.
From what I see, self::$server_info contains MariaDB, and self::$_wpdb->db_version() returns the version. I don't have a MariaDB test setup at the moment, and will try to put forth a patch this weekend. Just wanted to share my 2 cents in the meantime.
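The conditional assertion idea could look roughly like the sketch below. It is illustrative Python rather than the PHPUnit code, and `expected_name` is a hypothetical helper: given the legacy name a test currently expects and a flag saying whether the server reports utf8 as utf8mb3, it returns the name the assertion should compare against.

```python
def expected_name(legacy_name, utf8_is_utf8mb3):
    """Map a legacy utf8 charset or collation name to its utf8mb3 form.

    Names that are not plain "utf8" or prefixed with "utf8_" (for example
    utf8mb4 and its collations) are returned unchanged.
    """
    if not utf8_is_utf8mb3:
        return legacy_name
    if legacy_name == "utf8":
        return "utf8mb3"
    if legacy_name.startswith("utf8_"):
        return "utf8mb3_" + legacy_name[len("utf8_"):]
    return legacy_name
```

A test would then compare the actual value with `expected_name("utf8_general_ci", flag)` instead of the hard-coded string, so the same data set passes on both older and newer servers.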