Make WordPress Core

Opened 6 years ago

Last modified 6 years ago

#44386 new enhancement

Problem with utf8mb4_unicode_ci collation for Arabic content

Reported by: array064 Owned by:
Milestone: Awaiting Review Priority: normal
Severity: major Version: 4.9.6
Component: Database Keywords: needs-testing
Focuses: Cc:


I see that since version 4.6, WordPress uses utf8mb4_unicode_ci as the default collation. I see this in the determine_charset function in the /wp-includes/wp-db.php file (CMIIW).

In my experience, utf8mb4_unicode_ci seems to have problems with content that uses Arabic letters.


I created a tag with the name:


And I created another tag with the name:


Then when I do a tag search (via wp-admin), with keyword:


the search results that appear are:




tags, whereas only this tag should appear:


according to the search keyword.

This becomes a problem when a post needs to use the tag


, but cannot because of the existing tag


My guess is that this is not a bug in WordPress but a bug in MySQL.

For reference, this link may describe a related issue:


Change History (1)

#1 @array064
6 years ago

I forgot to write this:

The above problem does not occur when utf8mb4_general_ci (or utf8_general_ci) is used as the collation.

So when installing WordPress for some of my websites that contain Arabic text, I set the above collation both in wp-config.php and in MySQL.
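
For anyone who wants to try the same workaround, here is a minimal sketch of the wp-config.php side. DB_CHARSET and DB_COLLATE are the standard WordPress configuration constants; the values are simply the ones mentioned in this ticket. It assumes a fresh install: as far as I know, this setting only affects tables WordPress creates and does not convert existing tables, which would have to be altered in MySQL separately.

```php
// wp-config.php (fragment). Assumption: set before running the installer.
// Charset stays utf8mb4; only the collation is pinned to the general_ci variant.
define( 'DB_CHARSET', 'utf8mb4' );
define( 'DB_COLLATE', 'utf8mb4_general_ci' );
```

Leaving DB_COLLATE empty lets WordPress pick the collation itself, which is the default behaviour this ticket is about.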
