#48285 closed enhancement (fixed)
wp-config-sample.php should default to `utf8mb4` instead of `utf8` character set
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | 6.9 | Priority: | normal |
| Severity: | minor | Version: | 5.3 |
| Component: | Database | Keywords: | has-patch add-to-field-guide |
| Focuses: | | Cc: | |
Description
MySQL's utf8 character encoding is not a correct implementation of the standard and doesn't work with 4-byte characters, which include many emoji. utf8mb4 is the corrected implementation.
See https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434 or just google "mysql utf8 vs utf8mb4"
It would therefore seem wise for wp-config-sample.php to default to utf8mb4 instead of utf8, so that new installations get the improved character set.
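To illustrate the 4-byte limitation described above, here is a minimal Python sketch that reports how many UTF-8 bytes each character of a string needs. The helper names `max_utf8_bytes` and `fits_in_3_byte_utf8` are hypothetical and exist only for this example; the sketch does not talk to MySQL, it only shows which text would exceed a 3-byte-per-character encoding.

```python
def max_utf8_bytes(text: str) -> int:
    """Return the largest number of UTF-8 bytes any single character in text needs."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


def fits_in_3_byte_utf8(text: str) -> bool:
    """True if every character encodes to at most 3 UTF-8 bytes."""
    return max_utf8_bytes(text) <= 3


if __name__ == "__main__":
    # Sample strings chosen for illustration: ASCII, accented Latin, CJK, emoji.
    for sample in ["WordPress", "naïve", "日本語", "😀"]:
        print(sample, max_utf8_bytes(sample), fits_in_3_byte_utf8(sample))
```

Characters outside the Basic Multilingual Plane, such as most emoji, need four bytes, so only the last sample fails the 3-byte check.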
Change History (16)
#2
@
4 years ago
It's also worth noting that wp-admin/setup-config.php does write DB_CHARSET as utf8mb4 instead of utf8 if the server supports that, see [31349] / #21212 and comment:2:ticket:33122.
#3
@
4 years ago
Right now, with the latest MySQL 8.0 and MariaDB 10.6 versions, there is no "utf8" because they changed it to "utf8mb3".
If we want to support all international language character sets, WordPress should default to "utf8mb4" (supported by all WordPress-SQL supported databases).
This ticket was mentioned in PR #2214 on WordPress/wordpress-develop by bchecketts.
4 years ago
#4
- Keywords has-patch added
Change DB_CHARSET in wp-config-sample.php from utf8 to utf8mb4
Trac ticket: https://core.trac.wordpress.org/ticket/48285
#5
follow-up:
↓ 7
@
4 years ago
- Keywords has-patch removed
WordPress requirements listed at https://wordpress.org/about/requirements/ indicate that MySQL version 5.7 is required.
The utf8mb4 character set was released in MySQL version 5.5.3 in 2010. (See page 159 of https://downloads.mysql.com/docs/mysql-5.5-relnotes-en.pdf. The MySQL Release Notes on mysql.com no longer go back to v5.5.)
Pull Request at https://github.com/WordPress/wordpress-develop/pull/2214
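The version reasoning in this comment can be sketched as a simple comparison against MySQL 5.5.3, the release cited above as introducing utf8mb4. This is only an illustration in Python: `parse_version` and `pick_charset` are hypothetical helpers, not WordPress functions, and WordPress's own detection may consider more than the server version.

```python
import re

# Assumption taken from the release notes cited above: utf8mb4 first shipped in MySQL 5.5.3.
UTF8MB4_MIN_VERSION = (5, 5, 3)


def parse_version(version: str) -> tuple:
    """Extract the leading major.minor[.patch] numbers from a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def pick_charset(server_version: str) -> str:
    """Hypothetical helper: choose utf8mb4 when the server is new enough, else utf8."""
    if parse_version(server_version) >= UTF8MB4_MIN_VERSION:
        return "utf8mb4"
    return "utf8"
```

For example, `pick_charset("5.0.96")` falls back to `utf8`, while `pick_charset("5.7.44")` selects `utf8mb4`.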
#7
in reply to:
↑ 5
@
4 years ago
Replying to bchecketts:
WordPress requirements listed at https://wordpress.org/about/requirements/ indicate that MySQL version 5.7 is required.
Please note that MySQL 5.7 or greater is the recommended version, not required. It was updated in [meta11407] after the discussion in comment:11:ticket:41490.
The required versions are mentioned a bit further down the page and have not changed in a while:
Note: If you are in a legacy environment where you only have older PHP or MySQL versions, WordPress also works with PHP 5.6.20+ and MySQL 5.0+, but these versions have reached official End Of Life and as such may expose your site to security vulnerabilities.
#8
follow-up:
↓ 10
@
4 years ago
- utf8mb4 introduction: MySQL 5.5 and MariaDB 5.5 (it is definitely present in MariaDB 10.2).
- utf8mb3: included in MariaDB 10.2.
- utf8 -> utf8mb3: forced as of MariaDB 10.6.
Based on this, yes, new versions of WordPress may have utf8mb4 by default.
This ticket was mentioned in Slack in #hosting-community by javier. View the logs.
4 years ago
#10
in reply to:
↑ 8
@
3 years ago
Replying to JavierCasares:
Right now, with the latest MySQL 8.0 and MariaDB 10.6 versions, there is no "utf8" because they changed it to "utf8mb3".
Thanks! This should now be addressed in #53623.
If we want to support all the international language charset, WordPress should support by default "utf8mb4" (supported by all WordPress-SQL supported databases).
As noted in comment:1 and comment:2, WordPress does automatically upgrade to utf8mb4 when possible.
Replying to JavierCasares:
Based on this, yes, new versions of WordPress may have utf8mb4 by default.
I might be missing something, but as noted in comment:7, WordPress still has MySQL 5.0 as a minimum requirement at this time, which did not include utf8mb4. So it looks like until the minimum version is bumped to MySQL 5.5, it is neither safe nor required to change the default charset in wp-config-sample.php.
On a related note, reading the MariaDB ticket MDEV-8334 Rename utf8 to utf8mb3:
In long terms we want the name `utf8` mean the full featured UTF-8.
We'll do a few preparatory steps:
- Change the main name of the 3-byte character set from `utf8` to `utf8mb3` and make `utf8` alias for `utf8mb3`. This will change all `SHOW` and `INFORMATION_SCHEMA` output to display `utf8mb3` instead of `utf8`, as well as change `mysqldump` to dump `utf8mb3` instead of just `utf8`.
- Add a new server option, say `--utf8-is-utf8mb3`, which will be `true` by default, but the DBA will be able to change it to false and thus make `utf8` mean `utf8mb4`.
- A few releases later we'll change `--utf8-is-utf8mb3` to be `false` by default.
Or
- Do not add any new server options and
- Add a new `old_mode` value for reverting `utf8` to `utf8mb3` when the default will mean `utf8mb4`.
The latter appears to be implemented in MariaDB 10.6.1.
Also reading the MySQL note on The utf8mb3 Character Set (3-Byte UTF-8 Unicode Encoding):
Historically, MySQL has used `utf8` as an alias for `utf8mb3`; beginning with MySQL 8.0.28, `utf8mb3` is used exclusively in the output of `SHOW` statements and in Information Schema tables when this character set is meant.
At some point in the future `utf8` is expected to become a reference to `utf8mb4`. To avoid ambiguity about the meaning of `utf8`, consider specifying `utf8mb4` explicitly for character set references instead of `utf8`.
You should also be aware that the `utf8mb3` character set is deprecated and you should expect it to be removed in a future MySQL release. Please use `utf8mb4` instead.
If the long-term goal of both projects is to make utf8 an alias for utf8mb4 as mentioned above, the default charset in wp-config-sample.php may not technically need any changes at all, though it still might be a good idea to explicitly change it to utf8mb4 when the minimum version is bumped to MySQL 5.5.
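The ambiguity of the bare name `utf8` discussed above can be made concrete with a small Python sketch. `effective_charset` is a hypothetical helper, and its `utf8_means_utf8mb4` flag is an assumption that stands in for whatever server-side setting or future default decides what `utf8` resolves to.

```python
def effective_charset(name: str, utf8_means_utf8mb4: bool = False) -> str:
    """Resolve a configured charset name to the charset it would actually denote.

    Hypothetical helper for illustration only: the flag models whether the
    server treats the bare name "utf8" as utf8mb4 or as the 3-byte utf8mb3.
    """
    normalized = name.strip().lower()
    if normalized == "utf8":
        return "utf8mb4" if utf8_means_utf8mb4 else "utf8mb3"
    # Explicit names such as utf8mb3 or utf8mb4 are unambiguous and pass through.
    return normalized
```

Only the bare `utf8` changes meaning with the flag; an explicit `utf8mb4` resolves the same way in both cases, which is the argument for spelling it out in the config file.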
#11
@
3 months ago
@SergeyBiryukov We have passed the minimum required version bump. It seems like this patch may be ready to go?
This ticket was mentioned in PR #9452 on WordPress/wordpress-develop by @SergeyBiryukov.
3 months ago
#13
Trac ticket: https://core.trac.wordpress.org/ticket/48285
Previously: #21212, #32105, #32405, #33122.
Thanks for the ticket!
On both new and existing WordPress installs, WordPress will automatically upgrade the tables to `utf8mb4` if the server supports that, and when `DB_CHARSET` is defined as `utf8`, it will automatically switch to `utf8mb4` instead. `wp-config-sample.php` still needs to default to `utf8` though, as not all sites can support `utf8mb4`.