Opened 2 years ago
Last modified 2 months ago
#60002 new defect (bug)
maybe_convert_table_to_utf8mb4() not work with new version of MySQL/MariaDB.
| Reported by: |
|
Owned by: | |
|---|---|---|---|
| Milestone: | Awaiting Review | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | General | Keywords: | |
| Focuses: | Cc: |
Description
From [this document](https://mariadb.com/kb/en/unicode/). They said:
From MariaDB 10.6, utf8 is by default an alias for utf8mb3
And from [this document](https://dev.mysql.com/doc/refman/8.0/en/charset-unicode-utf8mb3.html) in MySQL. They said:
Historically, MySQL has used utf8 as an alias for utf8mb3; beginning with MySQL 8.0.28, utf8mb3 is used exclusively in the output of SHOW statements and in Information Schema tables when this character set is meant.
It can be refer to [this issue](https://core.trac.wordpress.org/changeset/53918).
So, from this simple MySQL code:
CREATE TABLE testing (
id bigint(20) NOT NULL AUTO_INCREMENT,
name varchar(50) DEFAULT NULL,
PRIMARY KEY (id)
) DEFAULT CHARSET=utf8 AUTO_INCREMENT=1
If you run on MariaDB 10.6 or newer (for me it is 10.6.7) the utf8 column will be automatically changed to utf8mb3_general_ci.
But, WordPress function maybe_convert_table_to_utf8mb4() contain this condition:
<?php if ( 'utf8' !== $charset && 'utf8mb4' !== $charset ) { // Don't upgrade tables that have non-utf8 columns. return false; }
Which will be result in no change from utf8mb3_general_ci to utf8mb4_unicode_ci.
Change History (4)
#3
@
3 months ago
This cause problems when plugins are trying to create/alter their tables and cause wrong collation. Please fix.
#4
@
2 months ago
Found another possible bug.
File: wp-includes/class-wpdb.php
Line: 886
Method: determine_charset()
if ( 'utf8' === $charset ) {
$charset = 'utf8mb4';
}
Should be change to
if ( 'utf8' === $charset || 'utf8mb3' === $charset ) {
$charset = 'utf8mb4';
}
From same reason.
Historically, MySQL has used utf8 as an alias for utf8mb3; beginning with MySQL 8.0.28, utf8mb3 is used exclusively in the output of SHOW statements and in Information Schema tables when this character set is meant.
My hot fix function that improved to always use better collation (mentioned by WordPress wpdb.php file).
function rundizcomhotfix_maybe_convert_table_to_utf8mb4( $table ) {
global $wpdb;
$results = $wpdb->get_results("SHOW FULL COLUMNS FROM `$table`");
if (!$results) {
return false;
}
foreach ($results as $column) {
if ($column->Collation) {
list( $charset ) = explode('_', $column->Collation);
$charset = strtolower($charset);
if ('utf8' !== $charset && 'utf8mb3' !== $charset && 'utf8mb4' !== $charset) {
// Don't upgrade tables that have non-utf8 columns.
return false;
}
}
}
$table_details = $wpdb->get_row("SHOW TABLE STATUS LIKE '$table'");
if (!$table_details) {
return false;
}
list( $table_charset ) = explode('_', $table_details->Collation);
$table_charset = strtolower($table_charset);
if ('utf8mb4' === $table_charset) {
return true;
}
// the code above has been copied from original function.
// custom code that upgrade to best collate. ---------------------------------
$table_charset = 'utf8mb4';
$collate = 'utf8mb4_unicode_ci';
$charset_collate = $wpdb->determine_charset($table_charset, $collate);
$table_charset = $charset_collate['charset'];
$collate = $charset_collate['collate'];
unset($charset_collate);
// end custom code that upgrade to best collate. ---------------------------------
// phpcs:ignore WordPress.DB.DirectDatabaseQuery.SchemaChange
return $wpdb->query("ALTER TABLE $table CONVERT TO CHARACTER SET $table_charset COLLATE $collate");
}
Tested with MySQL 8.2, and
SHOW FULL COLUMNS FROM$tableindeed returnsutf8mb3_...as the collation name. So I think you are right, that we need to change that line to something like this:if ( 'utf8' !== $charset && 'utf8mb3' !== $charset && 'utf8mb4' !== $charset ) {