WordPress.org

Make WordPress Core

Opened 5 years ago

Closed 19 months ago

#33731 closed defect (bug) (invalid)

Post's tags do not support accented characters

Reported by: Cimmo
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 4.3
Component: Taxonomy
Keywords: reporter-feedback
Focuses:
Cc:

Description

WordPress 4.3

Steps:

  • create a post
  • add a tag that contains an accented letter such as 'ü'
  • save

Result:

  • tags do not appear to support UTF-8 correctly: the accented letter is corrupted

Expected result:

  • tags correctly support UTF-8 and accented letters

Attachments (5)

33731.png (4.5 KB) - added by SergeyBiryukov 5 years ago.
33731.2.png (5.1 KB) - added by SergeyBiryukov 5 years ago.
tags.png (10.2 KB) - added by Cimmo 5 years ago.
Tags after post has been saved.
Screenshot from 2015-09-14 20:25:01.png (20.0 KB) - added by mehulkaklotar 5 years ago.
The category szél should be added, but it is rejected because the category szel already exists.
33731.diff (756 bytes) - added by mehulkaklotar 5 years ago.
Checking with BINARY at the SQL level lets the category szél be added even if the category szel already exists.

Change History (20)

#1 @SergeyBiryukov
5 years ago

  • Component changed from Posts, Post Types to Taxonomy
  • Summary changed from Post's tags do not support accented charactars to Post's tags do not support accented characters

@SergeyBiryukov
5 years ago

#2 @SergeyBiryukov
5 years ago

Hi @Cimmo, thanks for the report.

I could not reproduce the issue; the tag is displayed correctly after saving: 33731.png.

What is the database collation on your install?

#3 follow-up: ↓ 6 @Cimmo
5 years ago

You are right, that letter on its own works. Try the tag 'Üürikomisjoni' and save the post.
Anyway:

All tables are:
Database: MyISAM
Collation: utf8mb4_unicode_ci

The database defaulted to latin1_swedish_ci, not sure why. I have now changed it to utf8mb4_unicode_ci, and the problem persists.

#4 @SergeyBiryukov
5 years ago

"Üürikomisjoni" works too: 33731.2.png.

Does the issue still happen with all plugins disabled and a default theme (Twenty Fifteen or Twenty Fourteen) activated?

#5 @Cimmo
5 years ago

This is with the Twenty Fourteen theme and all plugins deactivated. It still reproduces.
Note that the problem does not occur in the post content itself, only in tags.

Last edited 5 years ago by Cimmo

@Cimmo
5 years ago

Tags after post has been saved.

#6 in reply to: ↑ 3 ; follow-up: ↓ 11 @swissspidy
5 years ago

  • Keywords reporter-feedback added

Replying to Cimmo:

All tables are:
Database: MyISAM
Collation: utf8mb4_unicode_ci

The database defaulted to latin1_swedish_ci, not sure why. I have now changed it to utf8mb4_unicode_ci, and the problem persists.

Really sounds like something's off with the database config.
What about the collation of the name field in the wp_terms table? This should be utf8mb4_unicode_ci too. Which version of MySQL are you using?

#7 @swissspidy
5 years ago

#33864 was marked as a duplicate.

@mehulkaklotar
5 years ago

The category szél should be added, but it is rejected because the category szel already exists.

@mehulkaklotar
5 years ago

Checking with BINARY at the SQL level lets the category szél be added even if the category szel already exists.
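
To illustrate why szél and szel collide under an accent-insensitive collation but not under a byte-wise check, here is a minimal Python sketch. The helper names `accent_insensitive_key` and `binary_key` are hypothetical, and stripping combining marks is only a rough stand-in for utf8mb4_unicode_ci, not MySQL's actual collation algorithm or the contents of 33731.diff.

```python
import unicodedata


def accent_insensitive_key(name: str) -> str:
    """Approximate an accent- and case-insensitive comparison key.

    Assumption: a rough stand-in for utf8mb4_unicode_ci, not MySQL's
    real collation rules.
    """
    # Split each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", name)
    # Drop the combining marks so only base letters remain.
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return without_marks.casefold()


def binary_key(name: str) -> bytes:
    """Byte-wise key, similar in spirit to a BINARY comparison."""
    return name.encode("utf-8")


existing = "szel"
candidate = "szél"

# Accent-insensitive comparison treats both names as the same term.
print(accent_insensitive_key(existing) == accent_insensitive_key(candidate))  # True
# Byte-wise comparison keeps them distinct.
print(binary_key(existing) == binary_key(candidate))  # False
```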

#8 @mehulkaklotar
5 years ago

  • Keywords has-patch dev-feedback added

#9 @swissspidy
5 years ago

  • Keywords needs-unit-tests added

#10 @SergeyBiryukov
5 years ago

  • Keywords has-patch dev-feedback needs-unit-tests removed

@mehulkaklotar, could you move the patch over to #33864?

This ticket seems to be unrelated and looks more like a configuration issue (broken encoding for AJAX requests).
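
As an illustration of what a broken request encoding can look like, the following Python sketch shows one common failure mode: UTF-8 bytes decoded as Latin-1. It is an assumption that this matches the corruption in tags.png; the ticket does not confirm the exact mechanism, and the helper names `misdecode` and `repair` are hypothetical.

```python
def misdecode(text: str, wrong_encoding: str = "latin-1") -> str:
    """Encode text as UTF-8, then decode the bytes with the wrong charset."""
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "latin-1") -> str:
    """Reverse the mistake, assuming exactly one wrong decode happened."""
    return garbled.encode(wrong_encoding).decode("utf-8")


garbled = misdecode("ü")
print(garbled)          # Ã¼
print(repair(garbled))  # ü

# The round trip also restores the longer tag from this ticket.
print(repair(misdecode("Üürikomisjoni")) == "Üürikomisjoni")  # True
```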

#11 in reply to: ↑ 6 ; follow-up: ↓ 13 @Cimmo
5 years ago

Replying to swissspidy:

Really sounds like something's off with the database config.
What about the collation of the name field in the wp_terms table? This should be utf8mb4_unicode_ci too. Which version of MySQL are you using?

'name' has utf8mb4_unicode_ci collation
MySQL 5.5.44

Anything wrong with it?

#12 @mehulkaklotar
5 years ago

Moving the patch to #33864.

#13 in reply to: ↑ 11 @SergeyBiryukov
5 years ago

Replying to Cimmo:

'name' has utf8mb4_unicode_ci collation
MySQL 5.5.44

Anything wrong with it?

Nope. Could you try adding AddDefaultCharset utf-8 at the top of your .htaccess file and see if that helps?
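
AddDefaultCharset makes Apache add a charset parameter to the Content-Type header of text/html and text/plain responses. A minimal Python sketch for reading that parameter from a header value follows; the helper name `declared_charset` and the sample header strings are assumptions for illustration, not output captured from the reporter's server.

```python
from email.message import Message
from typing import Optional


def declared_charset(content_type: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type header value, if any."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() lowercases the value and returns None if absent.
    return msg.get_content_charset()


# Hypothetical header values, with and without a declared charset.
print(declared_charset("text/html; charset=UTF-8"))  # utf-8
print(declared_charset("text/html"))                 # None
```

If the value copied from the site's real response headers comes back as None or as something other than utf-8, that would be consistent with the charset configuration issue suspected here.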

#14 @swissspidy
4 years ago

@Cimmo Is this still an issue on your site? Have you tried updating the .htaccess file as suggested above?

#15 @boonebgorges
19 months ago

  • Milestone Awaiting Review deleted
  • Resolution set to invalid
  • Status changed from new to closed

As it appears that this problem is due to a charset configuration issue, and since the ticket hasn't been confirmed since the last update several years ago, I believe the ticket is safe to close.
