Opened 8 years ago
Closed 4 years ago
#37956 closed defect (bug) (worksforme)
DB_COLLATE doesn't override $collate when defining utf8mb4_unicode_ci in wp-config
| Reported by: | MikeGillihan | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | normal |
| Severity: | normal | Version: | 4.6 |
| Component: | Database | Keywords: | |
| Focuses: | | Cc: | |
Description
As of 4.6, when I spin up a site locally, it's created using utf8mb4_unicode_520_ci as the default collation. However, this causes issues whenever I push to a live server that does not yet support the newer collation.
If a constant is specifically defined in wp-config.php, should it override the default behavior? Currently, if utf8mb4_unicode_ci is defined it is still upgraded to utf8mb4_unicode_520_ci regardless of the constant.
Attachments (1)
Change History (8)
#1 @pento, 8 years ago (follow-ups: ↓ 2, ↓ 3)
- Milestone Awaiting Review deleted
- Resolution set to wontfix
- Status changed from new to closed
#2, 8 years ago (in reply to: ↑ 1)
Replying to pento:
The workaround for this is to either have your development and production environments match, or to include a step in your data migration process to change the collation.
This is frustrating. May I suggest a constant that would override the override? Something like:
wp-config.php
define( 'DB_COLLATE_OVERRIDE', true ); // proposed constant: true or false, default true
#3, 8 years ago (in reply to: ↑ 1)
Replying to pento:
This behaviour is on purpose - if DB_COLLATE is defined as utf8mb4_unicode_ci, but the server supports utf8mb4_unicode_520_ci, it's better to use the latter, in much the same way as setting DB_CHARSET to utf8 will be automatically upgraded to utf8mb4 when possible.
Sorry, I missed your response @pento. Thanks for being so speedy!
I understand the behavior is intended and I agree that 520 is better. My point was more about the fact that the core file overrides the global constant defined in wp-config.php.
It's a bit abstract, but if I incorrectly define DB_NAME, the constant is respected and it breaks the install. Why, then, do we not respect the DB_COLLATE constant?
@discern While it could achieve the desired result, it feels a bit heavy. The patch I provided just adds a conditional wrapper that fully respects the constant.
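To make the disagreement concrete, here is a minimal Python model of the two behaviours discussed above. It is not WordPress source code: the function name and the `respect_constant` flag are hypothetical, and the upgrade rule is only what the quoted comment describes.

```python
def effective_collation(db_collate, server_supports_520, respect_constant=False):
    """Model of the collation choice debated in this ticket (illustrative only).

    db_collate          -- value of DB_COLLATE from wp-config.php ("" if unset)
    server_supports_520 -- whether the server offers utf8mb4_unicode_520_ci
    respect_constant    -- hypothetical switch standing in for the proposed patch
    """
    # Proposed behaviour: an explicitly defined constant always wins.
    if respect_constant and db_collate:
        return db_collate
    # Behaviour described in the ticket: silently upgrade when the server can.
    if db_collate == "utf8mb4_unicode_ci" and server_supports_520:
        return "utf8mb4_unicode_520_ci"
    return db_collate


print(effective_collation("utf8mb4_unicode_ci", True))
print(effective_collation("utf8mb4_unicode_ci", True, respect_constant=True))
```

The first call models the current upgrade, the second models a constant that is fully respected.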
#4, 8 years ago
- Resolution wontfix deleted
- Status changed from closed to reopened
I want to echo the sentiment here that WP should respect DB_COLLATE/DB_CHARSET if explicitly set.
There is an open bug in MySQL where utf8mb4_unicode_520_ci doesn't correctly distinguish certain Japanese characters (https://bugs.mysql.com/bug.php?id=79977).
So when I use get_page_by_title in WordPress with a dakuten, it will incorrectly return a result if the characters share a base (searching for ぺ will return a post with へ). The simple solution is to not use the 520 collation.
It would be OK to say it's a bug in MySQL (which is true), but I should be able to select the best charset/collation for my particular use case.
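For readers unfamiliar with the characters involved, the following Python sketch shows what "share a base" means at the Unicode level. It only illustrates the decomposition of ぺ; it is not how MySQL implements the collation, and the helper name `base_form` is made up for this example.

```python
import unicodedata


def base_form(text):
    """Return text with combining marks (such as voicing marks) removed."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# ぺ decomposes into へ followed by a combining semi-voiced sound mark,
# so a comparison that ignores such marks cannot tell the two apart.
print("ぺ" == "へ")
print(base_form("ぺ") == base_form("へ"))
```

The strings differ as written but compare equal once the mark is dropped, which matches the false match described in the report.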
This ticket was mentioned in Slack in #core by noisysocks, 4 years ago.
#7, 4 years ago
- Milestone Awaiting Review deleted
- Resolution set to worksforme
- Status changed from reopened to closed
This was followed up on in a triage session today.
Since the report, MySQL have resolved the bug by adding the utf8mb4_ja_0900_as_cs and utf8mb4_ja_0900_as_cs_ks collations. These can be used to resolve the issue.
As there is an upstream resolution, it was decided to close this ticket.
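As a rough illustration of adopting one of these collations, the sketch below builds a MySQL `ALTER TABLE ... CONVERT TO CHARACTER SET` statement. It assumes a MySQL version that ships the collation and uses `wp_posts` (the posts table under the default prefix) as the example table. The helper name is hypothetical, and the statement should be tried on a backup first.

```python
import re


def convert_table_sql(table, collation="utf8mb4_ja_0900_as_cs"):
    """Build a statement converting one table to the given utf8mb4 collation."""
    # Accept only plain identifiers so nothing unexpected reaches the SQL text.
    for name in (table, collation):
        if not re.fullmatch(r"[A-Za-z0-9_]+", name):
            raise ValueError(f"unexpected identifier: {name!r}")
    return (
        f"ALTER TABLE `{table}` "
        f"CONVERT TO CHARACTER SET utf8mb4 COLLATE {collation};"
    )


print(convert_table_sql("wp_posts"))
```

Whether WordPress keeps or overrides a collation set this way is exactly the question this ticket raised, so treat it as a starting point rather than a guaranteed fix.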
This behaviour is on purpose - if DB_COLLATE is defined as utf8mb4_unicode_ci, but the server supports utf8mb4_unicode_520_ci, it's better to use the latter, in much the same way as setting DB_CHARSET to utf8 will be automatically upgraded to utf8mb4 when possible.
The workaround for this is to either have your development and production environments match, or to include a step in your data migration process to change the collation.
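A migration step of the kind mentioned above could look like the following Python sketch, which rewrites the collation name in the text of a SQL dump before import. The function name and the sample dump are assumptions for illustration, not part of any WordPress tooling.

```python
import re

OLD_COLLATION = "utf8mb4_unicode_520_ci"
NEW_COLLATION = "utf8mb4_unicode_ci"


def downgrade_collation(sql_dump):
    """Replace every whole-word occurrence of the 520 collation in a dump."""
    # \b stops the pattern from matching inside a longer identifier.
    pattern = rf"\b{re.escape(OLD_COLLATION)}\b"
    return re.sub(pattern, NEW_COLLATION, sql_dump)


sample = (
    "CREATE TABLE wp_posts (post_title text COLLATE utf8mb4_unicode_520_ci) "
    "DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_520_ci;"
)
print(downgrade_collation(sample))
```

Note that a plain text replacement also touches stored content that happens to contain the collation name, so it is worth reviewing the result before importing it.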