Opened 5 years ago
Last modified 2 years ago
#48285 assigned enhancement
wp-config-sample.php should default to `utf8mb4` instead of `utf8` character set
Reported by: | | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | minor | Version: | 5.3 |
Component: | Database | Keywords: | has-patch |
Focuses: | Cc: |
Description
MySQL's `utf8` character encoding is not a correct implementation of the standard and doesn't work with 4-byte characters, which include many emoji. `utf8mb4` is the corrected implementation.
See https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434 or just google "mysql utf8 vs utf8mb4"
It would seem wise for `wp-config-sample.php` to default then to `utf8mb4` instead of `utf8` so that new installations have the improved character set.
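To make the 4-byte limitation concrete, here is a minimal Python sketch. The helper names `utf8_len` and `fits_in_utf8mb3` are hypothetical and exist only for illustration; they are not part of MySQL or WordPress. The sketch assumes the 3-byte character set can hold exactly the code points up to U+FFFF, which is what a 3-byte UTF-8 limit implies.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes the given text occupies when encoded as standard UTF-8."""
    return len(ch.encode("utf-8"))


def fits_in_utf8mb3(text: str) -> bool:
    """True if every code point needs at most 3 UTF-8 bytes (U+0000..U+FFFF).

    Assumption: this mirrors the limit of MySQL's 3-byte `utf8` charset;
    it is an illustration, not a check performed by MySQL itself.
    """
    return all(ord(ch) <= 0xFFFF for ch in text)


if __name__ == "__main__":
    for sample in ("e", "\u00e9", "\u20ac", "\U0001F600"):
        print(repr(sample), utf8_len(sample), fits_in_utf8mb3(sample))
```

Any string for which `fits_in_utf8mb3` returns `False`, such as one containing U+1F600, is the kind of content the 3-byte character set cannot store.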
Change History (10)
#2
@
3 years ago
It's also worth noting that `wp-admin/setup-config.php` does write `DB_CHARSET` as `utf8mb4` instead of `utf8` if the server supports that, see [31349] / #21212 and comment:2:ticket:33122.
#3
@
3 years ago
Right now, with the latest MySQL 8.0 and MariaDB 10.6 versions, there is no "utf8" because they changed it for "utf8mb3".
If we want to support all the international language charset, WordPress should support by default "utf8mb4" (supported by all WordPress-SQL supported databases).
This ticket was mentioned in PR #2214 on WordPress/wordpress-develop by bchecketts.
3 years ago
#4
- Keywords has-patch added
Change `DB_CHARSET` in wp-config-sample.php from `utf8` to `utf8mb4`
Trac ticket: https://core.trac.wordpress.org/ticket/48285
#5
follow-up:
↓ 7
@
3 years ago
- Keywords has-patch removed
WordPress requirements listed at https://wordpress.org/about/requirements/ indicate that MySQL version 5.7 is required.
The `utf8mb4` character set was released in MySQL version 5.5.3 in 2010. (See page 159 of https://downloads.mysql.com/docs/mysql-5.5-relnotes-en.pdf. The MySQL Release Notes on mysql.com no longer go back to v5.5.)
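As a rough illustration of what that version floor means in practice, here is a small Python sketch that compares a MySQL version string against 5.5.3. The names `parse_version` and `mysql_supports_utf8mb4` are hypothetical, the 5.5.3 threshold is taken from the release notes cited above, and this is not how WordPress itself performs the check.

```python
import re

# Assumption: 5.5.3 is the first MySQL release with utf8mb4, per the release notes cited above.
UTF8MB4_MIN_MYSQL = (5, 5, 3)


def parse_version(version: str) -> tuple:
    """Extract (major, minor, patch) from a string such as '5.7.44-log'."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    # A missing patch component is treated as 0.
    return tuple(int(part or 0) for part in match.groups())


def mysql_supports_utf8mb4(version: str) -> bool:
    """True if the given MySQL version is at least 5.5.3."""
    return parse_version(version) >= UTF8MB4_MIN_MYSQL
```

The sketch only handles MySQL-style version numbers; MariaDB version strings would need their own threshold.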
Pull Request at https://github.com/WordPress/wordpress-develop/pull/2214
#7
in reply to:
↑ 5
@
3 years ago
Replying to bchecketts:
WordPress requirements listed at https://wordpress.org/about/requirements/ indicate that MySQL version 5.7 is required.
Please note that MySQL 5.7 or greater is the recommended version, not required. It was updated in [meta11407] after the discussion in comment:11:ticket:41490.
The required versions are mentioned a bit further down the page and have not changed in a while:
Note: If you are in a legacy environment where you only have older PHP or MySQL versions, WordPress also works with PHP 5.6.20+ and MySQL 5.0+, but these versions have reached official End Of Life and as such may expose your site to security vulnerabilities.
#8
follow-up:
↓ 10
@
3 years ago
- utf8mb4 introduction: MySQL 5.5 and MariaDB 5.5 (it's at MariaDB 10.2 for sure).
- utf8mb3: is at MariaDB 10.2 included.
- utf8 -> utf8mb3: forced at MariaDB 10.6.
Based on this, yes, new versions of WordPress may have utf8mb4 by default.
This ticket was mentioned in Slack in #hosting-community by javier. View the logs.
3 years ago
#10
in reply to:
↑ 8
@
2 years ago
Replying to JavierCasares:
Right now, with the latest MySQL 8.0 and MariaDB 10.6 versions, there is no "utf8" because they changed it for "utf8mb3".
Thanks! This should now be addressed in #53623.
If we want to support all the international language charset, WordPress should support by default "utf8mb4" (supported by all WordPress-SQL supported databases).
As noted in comment:1 and comment:2, WordPress does automatically upgrade to `utf8mb4` when possible.
Replying to JavierCasares:
Based on this, yes, new versions of WordPress may have utf8mb4 by default.
I might be missing something, but as noted in comment:7, WordPress still has MySQL 5.0 as a minimum requirement at this time, which did not include `utf8mb4`. So it looks like until the minimum version is bumped to MySQL 5.5, it is neither safe nor required to change the default charset in `wp-config-sample.php`.
On a related note, reading the MariaDB ticket MDEV-8334 Rename utf8 to utf8mb3:
In long terms we want the name `utf8` mean the full featured UTF-8.
We'll do a few preparatory steps:
- Change the main name of the 3-byte character set from `utf8` to `utf8mb3` and make `utf8` alias for `utf8mb3`. This will change all `SHOW` and `INFORMATION_SCHEMA` output to display `utf8mb3` instead of `utf8`, as well as change `mysqldump` to dump `utf8mb3` instead of just `utf8`.
- Add a new server option, say `--utf8-is-utf8mb3`, which will be `true` by default, but the DBA will be able to change it to false and thus make `utf8` mean `utf8mb4`.
- A few releases later we'll change `--utf8-is-utf8mb3` to be `false` by default.
Or
- Do not add any new server options and
- Add a new `old_mode` value for reverting `utf8` to `utf8mb3` when the default will mean `utf8mb4`.
The latter appears to be implemented in MariaDB 10.6.1.
Also reading the MySQL note on The utf8mb3 Character Set (3-Byte UTF-8 Unicode Encoding):
Historically, MySQL has used `utf8` as an alias for `utf8mb3`; beginning with MySQL 8.0.28, `utf8mb3` is used exclusively in the output of `SHOW` statements and in Information Schema tables when this character set is meant.
At some point in the future `utf8` is expected to become a reference to `utf8mb4`. To avoid ambiguity about the meaning of `utf8`, consider specifying `utf8mb4` explicitly for character set references instead of `utf8`.
You should also be aware that the `utf8mb3` character set is deprecated and you should expect it to be removed in a future MySQL release. Please use `utf8mb4` instead.
If the long-term goal of both projects is to make `utf8` an alias for `utf8mb4` as mentioned above, the default charset in `wp-config-sample.php` may not technically need any changes at all, though it still might be a good idea to explicitly change it to `utf8mb4` when the minimum version is bumped to MySQL 5.5.
Previously: #21212, #32105, #32405, #33122.
Thanks for the ticket!
On both new and existing WordPress installs, WordPress will automatically upgrade the tables to `utf8mb4` if the server supports that, and when `DB_CHARSET` is defined as `utf8`, it will automatically switch to `utf8mb4` instead. `wp-config-sample.php` still needs to default to `utf8` though, as not all sites can support `utf8mb4`.
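For readers following along, the behaviour described here can be sketched in a few lines of Python. This is only an illustration of the decision rule as stated in this comment, not WordPress's actual code; the function name `effective_charset` and its parameters are hypothetical.

```python
def effective_charset(configured: str, server_supports_utf8mb4: bool) -> str:
    """Charset that would be used for the connection under the rule described above.

    Assumption: only a configured value of 'utf8' is upgraded, and only when
    the server is known to support utf8mb4; any other value is left untouched.
    """
    if configured == "utf8" and server_supports_utf8mb4:
        return "utf8mb4"
    return configured


if __name__ == "__main__":
    print(effective_charset("utf8", True))     # upgraded on a capable server
    print(effective_charset("utf8", False))    # left alone on an older server
    print(effective_charset("latin1", True))   # explicit non-utf8 choice respected
```

Under this rule a sample config that says `utf8` works on servers both with and without `utf8mb4` support, whereas hard-coding `utf8mb4` would only work on the former.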