Opened 5 years ago
Closed 5 years ago
#48044 closed defect (bug) (fixed)
Site Health use of "emoticons" in UTF8MB4 check is ambiguous
Reported by: | | Owned by: | |
---|---|---|---|
Milestone: | 5.3 | Priority: | normal |
Severity: | minor | Version: | |
Component: | Site Health | Keywords: | has-patch needs-copy-review |
Focuses: | ui-copy | Cc: |
Description
The UTF8MB4 test says:
UTF8MB4 is a database storage attribute that makes sure your site can store non-English text and other strings (for instance emoticons) without unexpected problems.
Emoticons, by definition, are (emphasis mine):
a pictorial representation of a facial expression using characters—usually punctuation marks, numbers, and letters—to express a person's feelings or mood
"Emoticons" might refer to the more technical alternative meaning, the Unicode block of that name. If so, it seems oddly technical to include here, and it is still not completely accurate as written.
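For illustration, here is a minimal Python sketch of the distinction at issue. It assumes the commonly documented behaviour that MySQL's older `utf8` (utf8mb3) character set stores at most three bytes per character, while utf8mb4 allows four. The helper names `utf8_width` and `fits_in_three_bytes` are hypothetical and are not part of WordPress or the Site Health check.

```python
# Illustrative sketch only; the helper names are hypothetical, not WordPress APIs.

def utf8_width(ch: str) -> int:
    """Return the number of bytes UTF-8 needs to encode a single character."""
    return len(ch.encode("utf-8"))


def fits_in_three_bytes(text: str) -> bool:
    """True if every character in text encodes to three UTF-8 bytes or fewer,
    i.e. it would fit a 3-byte-per-character set such as MySQL's utf8mb3."""
    return all(utf8_width(ch) <= 3 for ch in text)


samples = {
    "text emoticon": ":-)",   # plain ASCII punctuation
    "accented letter": "é",   # non-English text, 2 bytes
    "CJK character": "日",    # non-English text, 3 bytes
    "emoji": "😀",            # U+1F600, in the Unicode Emoticons block
}

for label, text in samples.items():
    widths = [utf8_width(ch) for ch in text]
    print(label, widths, "fits in 3 bytes:", fits_in_three_bytes(text))
```

A text emoticon such as `:-)` is plain ASCII and much non-English text fits in three bytes or fewer, so under this assumption it is characters like emoji, which need four bytes, that actually depend on utf8mb4.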
Attachments (3)
Change History (20)
#2 in reply to: ↑ 1, 5 years ago
- Keywords good-first-bug added
- Milestone changed from Awaiting Review to 5.3
Replying to johnjamesjacoby:
UTF8MB4 is the database format WordPress prefers because it safely supports the widest set of characters and letters, specifically for publishing in languages other than American English, including Emoji.
I'd suggest "database encoding" instead of "database format"; otherwise it looks good to me.
#3, 5 years ago
I agree: "database encoding" is more accurate and less confusing.
Why do we use "languages" in there, though? Emoji, numbers, symbols, flags, etc. are not from a particular language.
#4, 5 years ago
I think "Emoji" refers to "the widest set of characters and letters" here, rather than "languages". Perhaps that could be made clearer, though? I would also drop "American"; it seems unnecessarily specific :)
UTF8MB4 is the database encoding WordPress prefers because it safely supports the widest set of characters and letters, including Emoji, specifically for publishing in languages other than English.
This ticket was mentioned in Slack in #core-site-health by afragen. View the logs.
5 years ago
This ticket was mentioned in Slack in #core by desrosj. View the logs.
5 years ago
This ticket was mentioned in Slack in #core-site-health by afragen. View the logs.
5 years ago
This ticket was mentioned in Slack in #core by david.baumwald. View the logs.
5 years ago
#13, 5 years ago
- Keywords 2nd-opinion good-first-bug removed
- Owner set to garrett-eclipse
- Status changed from new to accepted
Thanks for the patch, @chetan200891. I've taken the feedback from @johnjamesjacoby, @SergeyBiryukov, and @ayeshrajans into account in the refreshed 48044.2.diff.
I've left this in review to get thoughts on the current string below:
'UTF8MB4 is the character encoding WordPress prefers for database storage because it safely supports the widest set of characters and letters, including Emoji, enabling better support for non-English languages.'
Note: I switched to "character encoding" over "database encoding" as it is more accurate.
Thoughts? If we can get consensus, this may be able to make 5.3 beta3.
#15, 5 years ago
Sorry, I noticed some minor inaccuracies as I read the definitions: it's a character set used for database storage, and a character set is a set of characters and encodings.
To account for this, I refreshed the patch in 48044.3.diff.
New string for review:
'UTF8MB4 is the character set WordPress prefers for database storage because it safely supports the widest set of characters and encodings, including Emoji, enabling better support for non-English languages.'
#16, 5 years ago
Thanks, @ayeshrajans. I tweaked it slightly in comment#15 after reading some more definitions online.
One quick suggestion: