#53623 closed defect (bug) (fixed)
MariaDB 10.6 renamed utf8 to utf8mb3
Reported by: | | Owned by: | |
---|---|---|---|
Milestone: | 6.1 | Priority: | normal |
Severity: | normal | Version: | |
Component: | Database | Keywords: | has-patch |
Focuses: | | Cc: | |
Description
See MariaDB ticket MDEV-8334, "Rename utf8 to utf8mb3".
As a result of this change, the charset tests are now failing.
Attachments (1)
Change History (16)
This ticket was mentioned in Slack in #forums by yui. View the logs.
4 years ago
This ticket was mentioned in Slack in #hosting-community by yui. View the logs.
4 years ago
#4
@
4 years ago
MySQL 8.0.26 also has related changes: https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-26.html
These statements now report utf8mb3 rather than utf8 when writing character set names: EXPLAIN, SHOW CREATE PROCEDURE, SHOW CREATE EVENT.
Stored program definitions retrieved from the data dictionary now report utf8mb3 rather than utf8 in character set references. This affects any output produced from those definitions, such as SHOW CREATE statements.
This error message now reports utf8mb3 rather than utf8 when writing character set names: ER_INVALID_CHARACTER_STRING. (Bug #32233614, Bug #32392077, Bug #32392209, Bug #32428538, Bug #32428598)
#6
follow-up:
↓ 7
@
3 years ago
Tested with:
- PHP 7.1 -> 8.1
- MariaDB 10.6
This produces several test failures:
- Tests_DB_Charset::test_set_charset_changes_the_connection_collation
Failed asserting that two strings are identical. --- Expected +++ Actual @@ @@ -'utf8_general_ci' +'utf8mb3_general_ci'
- Tests_DB_Charset::test_get_column_charset::test_get_column_charset with data set #5
Failed asserting that two strings are identical. --- Expected +++ Actual @@ @@ -'utf8' +'utf8mb3'
- Tests_DB_Charset::test_get_column_charset::test_get_column_charset with data set #6
Failed asserting that two strings are identical. --- Expected +++ Actual @@ @@ -'utf8' +'utf8mb3'
- Tests_DB_Charset::test_table_collation_check::test_table_collation_check with data set #0
('CREATE TABLE table_collation_..._bin )', true, 'SELECT * FROM table_collation... a='😈'', 'DROP TABLE IF EXISTS table_co...heck_0', array('SELECT * FROM table_collation...='foo'', 'SHOW FULL TABLES LIKE table_c...heck_0', 'DESCRIBE table_collation_check_0', 'DESC table_collation_check_0', 'EXPLAIN SELECT * FROM table_c...heck_0')) Failed asserting that false is identical to true.
- Tests_DB_Charset::test_table_collation_check::test_table_collation_check with data set #1
('CREATE TABLE table_collation_...l_ci )', true, 'SELECT * FROM table_collation... a='😈'', 'DROP TABLE IF EXISTS table_co...heck_1', array('SELECT * FROM table_collation...='foo'', 'SHOW FULL TABLES LIKE table_c...heck_1', 'DESCRIBE table_collation_check_1', 'DESC table_collation_check_1', 'EXPLAIN SELECT * FROM table_c...heck_1')) Failed asserting that false is identical to true.
- Tests_DB_Charset::test_table_collation_check::test_table_collation_check with data set #4
('CREATE TABLE table_collation_... INT )', true, 'SELECT * FROM table_collation... a='😈'', 'DROP TABLE IF EXISTS table_co...heck_4', array('SELECT * FROM table_collation...='foo'', 'SHOW FULL TABLES LIKE table_c...heck_4', 'DESCRIBE table_collation_check_4', 'DESC table_collation_check_4', 'EXPLAIN SELECT * FROM table_c...heck_4')) Failed asserting that false is identical to true.
Also... should WordPress set the default charset to "utf8mb4"?
#7
in reply to:
↑ 6
@
3 years ago
Replying to JavierCasares:
Also... should WordPress set the default charset to "utf8mb4"?
I believe that would be best discussed in #48285. Related: #45697.
#8
follow-up:
↓ 11
@
3 years ago
- utf8mb3: available since MariaDB 10.2.
- utf8 -> utf8mb3: enforced as of MariaDB 10.6.
Looking at MySQL and MariaDB versions, all supported versions support utf8mb3, so we should replace "utf8" with "utf8mb3" by default and do some testing.
This ticket was mentioned in Slack in #hosting-community by skithund. View the logs.
3 years ago
This ticket was mentioned in Slack in #hosting-community by javier. View the logs.
3 years ago
#11
in reply to:
↑ 8
@
2 years ago
- Keywords has-patch added; needs-patch removed
- Milestone changed from Future Release to 6.1
Replying to ayeshrajans:
MySQL 8.0.26 also has related changes: https://dev.mysql.com/doc/relnotes/mysql/8.0/en/news-8-0-26.html
These statements now report utf8mb3 rather than utf8 when writing character set names: EXPLAIN, SHOW CREATE PROCEDURE, SHOW CREATE EVENT.
Stored program definitions retrieved from the data dictionary now report utf8mb3 rather than utf8 in character set references. This affects any output produced from those definitions, such as SHOW CREATE statements.
This error message now reports utf8mb3 rather than utf8 when writing character set names: ER_INVALID_CHARACTER_STRING.
Thanks! These changes are indeed related, but they don't appear to cause the test failures here.
In my testing, the current tests still pass on MySQL up until version 8.0.29, which is no longer available for download, but has some more character set support changes. The tests start failing on MySQL 8.0.30, with the same six failures as listed in comment:6.
From MySQL 8.0.30 release notes:
Important Change: A previous change renamed character sets having deprecated names prefixed with utf8_ to use utf8mb3_ instead. In this release, we rename the utf8_ collations as well, using the utf8mb3_ prefix; this is to make the collation names consistent with those of the character sets, not to rely any longer on the deprecated collation names, and to clarify the distinction between utf8mb3 and utf8mb4. The names using the utf8mb3_ prefix are now used exclusively for these collations in the output of SHOW statements such as SHOW CREATE TABLE, as well as in the values displayed in the columns of Information Schema tables including the COLLATIONS and COLUMNS tables.
Replying to JavierCasares:
- utf8mb3: available since MariaDB 10.2.
- utf8 -> utf8mb3: enforced as of MariaDB 10.6.
Looking at MySQL and MariaDB versions, all supported versions support utf8mb3, so we should replace "utf8" with "utf8mb3" by default and do some testing.
It is worth noting that WordPress does automatically upgrade to utf8mb4 when possible, see comment:1:ticket:48285.
Reading the MariaDB ticket MDEV-8334 Rename utf8 to utf8mb3:
In long terms we want the name utf8 mean the full featured UTF-8.
We'll do a few preparatory steps:
- Change the main name of the 3-byte character set from utf8 to utf8mb3 and make utf8 alias for utf8mb3. This will change all SHOW and INFORMATION_SCHEMA output to display utf8mb3 instead of utf8, as well as change mysqldump to dump utf8mb3 instead of just utf8.
- Add a new server option, say --utf8-is-utf8mb3, which will be true by default, but the DBA will be able to change it to false and thus make utf8 mean utf8mb4.
- A few releases later we'll change --utf8-is-utf8mb3 to be false by default.
Or
- Do not add any new server options and
- Add a new old_mode value for reverting utf8 to utf8mb3 when the default will mean utf8mb4.
The latter appears to be implemented in MariaDB 10.6.1.
Also reading the MySQL note on The utf8mb3 Character Set (3-Byte UTF-8 Unicode Encoding):
Historically, MySQL has used utf8 as an alias for utf8mb3; beginning with MySQL 8.0.28, utf8mb3 is used exclusively in the output of SHOW statements and in Information Schema tables when this character set is meant.
At some point in the future utf8 is expected to become a reference to utf8mb4. To avoid ambiguity about the meaning of utf8, consider specifying utf8mb4 explicitly for character set references instead of utf8.
You should also be aware that the utf8mb3 character set is deprecated and you should expect it to be removed in a future MySQL release. Please use utf8mb4 instead.
If the long-term goal of both projects is to make utf8 an alias for utf8mb4 as mentioned above, it seems like utf8mb3 is an intermediate step, and there is no need for WordPress to use that as the default charset at this time, since it already uses utf8mb4 when possible.
I believe the only changes required here would be:
- Adding utf8mb3_bin and utf8mb3_general_ci to the list of safe collations recognized by wpdb::check_safe_collation(). This would be the only change for WordPress core.
- Adding some conditional version checking for the expected test results as suggested in comment:1. This would only affect the unit tests.
See 53623.diff. Tested on:
- MariaDB 10.6.8
- MySQL 8.0.25
- MySQL 8.0.27
- MySQL 8.0.28
- MySQL 8.0.29
- MySQL 8.0.30
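For illustration only, the allow-list idea behind the first change can be modelled as below. This is a simplified Python sketch, not the code of wpdb::check_safe_collation(); the function name is_safe_collation and the set SAFE_COLLATIONS are hypothetical, and the non-utf8mb3 entries are assumed rather than taken from this ticket.

```python
# Hypothetical allow-list. Only the two utf8mb3_* names come from this ticket;
# the other entries are assumptions made for the sake of the example.
SAFE_COLLATIONS = {
    "utf8_bin",
    "utf8_general_ci",
    "utf8mb3_bin",         # proposed addition
    "utf8mb3_general_ci",  # proposed addition
    "utf8mb4_bin",
    "utf8mb4_general_ci",
}


def is_safe_collation(collation: str) -> bool:
    """Return True if the collation name is on the allow-list (case-insensitive)."""
    return collation.lower() in SAFE_COLLATIONS
```

With the two new entries, a column that the server reports as utf8mb3_general_ci is treated the same way as one reported as utf8_general_ci on older servers.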
#12
@
2 years ago
- Owner set to SergeyBiryukov
- Resolution set to fixed
- Status changed from new to closed
In 53918:
#13
@
2 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
The tests now pass on PHP 8.0.x + MariaDB 10.6.1+, but still fail on PHP 7.4.x + MariaDB 10.6.1+.
I forgot about MariaDB version being reported differently between PHP versions, see comment:33:ticket:49364:
- PHP 8.0.21: 10.6.8-MariaDB
- PHP 7.4.30: 5.5.5-10.6.8-MariaDB
Reopening to correct the version check for setting the $utf8_is_utf8mb3 flag.
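A rough sketch of a version check that tolerates both reporting styles, written in Python for illustration rather than as the actual test-suite code. The function names are hypothetical, and the thresholds (MariaDB 10.6.1, MySQL 8.0.30) are assumptions taken from the versions mentioned in this thread.

```python
import re


def parse_server_version(server_info: str) -> tuple:
    """Return (is_mariadb, (major, minor, patch)) for a server version string."""
    is_mariadb = "mariadb" in server_info.lower()
    if is_mariadb:
        # Some PHP versions report MariaDB with a leading "5.5.5-" prefix.
        server_info = re.sub(r"^5\.5\.5-", "", server_info)
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", server_info)
    if match is None:
        raise ValueError("Unrecognized version string: " + server_info)
    return is_mariadb, tuple(int(part) for part in match.groups())


def utf8_is_utf8mb3(server_info: str) -> bool:
    """Guess whether the server reports utf8 as utf8mb3 (thresholds assumed)."""
    is_mariadb, version = parse_server_version(server_info)
    threshold = (10, 6, 1) if is_mariadb else (8, 0, 30)
    return version >= threshold
```

Stripping the prefix before comparing means both strings listed above resolve to the same version, so the flag no longer depends on the PHP version.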
Very nice find :)
I suppose we can conditionally assertSame by checking the db version and the name.
From what I see, self::$server_info contains MariaDB, and self::$_wpdb->db_version() returns the version. I don't have a MariaDB test setup at the moment, and will try to put forth a patch this weekend. Just wanted to share my 2 cents in the meantime.
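One way the conditional expectation could look, sketched in Python purely for illustration. The helper expected_name is hypothetical, and the boolean argument stands in for whatever version check the tests end up using.

```python
def expected_name(name: str, utf8_is_utf8mb3: bool) -> str:
    """Return the charset or collation name a test should expect from the server."""
    if not utf8_is_utf8mb3:
        return name
    if name == "utf8":
        return "utf8mb3"
    if name.startswith("utf8_"):
        # For example, utf8_general_ci becomes utf8mb3_general_ci.
        return "utf8mb3_" + name[len("utf8_"):]
    return name
```

A test would then compare the value reported by the server against expected_name("utf8_general_ci", flag) instead of a hard-coded string.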