WordPress.org

Make WordPress Core

Opened 7 years ago

Closed 6 years ago

#17738 closed defect (bug) (fixed)

remove_accents() can't handle Vietnamese vowels

Reported by: tgeorge
Owned by: nacin
Milestone: 3.4
Priority: normal
Severity: normal
Version: 3.1.3
Component: Formatting
Keywords: has-patch
Focuses:
Cc:

Description

replace_accents() can't handle many of the vowels present in Vietnamese. For the complete list of vowels:

http://en.wikipedia.org/wiki/Vietnamese_alphabet#Tone_marks

Here are the precise vowels that replace_accents() can't handle currently:

ẰằẦầỀềỒồỜờỪừỲỳẢảẲẳẨẩẺẻỂểỈỉỎỏỔổỞởỦủỬửỶỷẴẵẪẫẼẽỄễỖỗỠỡỮữỸỹẮắẤấẾếỐốỚớỨứẠạẶặẬậẸẹỆệỊịỌọỘộỢợỤụỰựỴỵ

And here are those same vowels without accents:

AaAaEeOoOoUuYyAaAaAaEeEeIiOoOoOoUuUuYyAaAaEeEeOoOoUuYyAaAaEeOoOoUuAaAaAaEeEeIiOoOoOoUuUuYy
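
The two lists above line up position by position, so they can be read as a translation table. As a language-neutral illustration (in Python, not the PHP of any patch on this ticket), a minimal sketch might look like this. The names `ACCENTED`, `PLAIN`, `TABLE`, and `strip_vietnamese_tones` are hypothetical, and this is not how remove_accents() is implemented in WordPress.

```python
import unicodedata

# Vowels copied from the ticket, grouped by tone mark. NFC normalization
# makes sure each vowel is a single precomposed code point.
ACCENTED = unicodedata.normalize(
    "NFC",
    "ẰằẦầỀềỒồỜờỪừỲỳ"
    "ẢảẲẳẨẩẺẻỂểỈỉỎỏỔổỞởỦủỬửỶỷ"
    "ẴẵẪẫẼẽỄễỖỗỠỡỮữỸỹ"
    "ẮắẤấẾếỐốỚớỨứ"
    "ẠạẶặẬậẸẹỆệỊịỌọỘộỢợỤụỰựỴỵ",
)

# The same vowels without accents, in the same order.
PLAIN = (
    "AaAaEeOoOoUuYy"
    "AaAaAaEeEeIiOoOoOoUuUuYy"
    "AaAaEeEeOoOoUuYy"
    "AaAaEeOoOoUu"
    "AaAaAaEeEeIiOoOoOoUuUuYy"
)

# str.maketrans raises ValueError if the two strings differ in length,
# which guards against a mismatched pair of lists.
TABLE = str.maketrans(ACCENTED, PLAIN)


def strip_vietnamese_tones(text: str) -> str:
    """Replace each listed vowel with its unaccented counterpart."""
    # Normalize the input too, so decomposed sequences match the table.
    return unicodedata.normalize("NFC", text).translate(TABLE)
```

With this sketch, `strip_vietnamese_tones("Tiếng Việt")` returns `"Tieng Viet"`. Characters outside the list pass through unchanged.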

Attachments (4)

17738.patch (3.8 KB) - added by SergeyBiryukov 6 years ago.
17738.tests.patch (1.5 KB) - added by ampt 6 years ago.
17738.tests.2.patch (1.6 KB) - added by ampt 6 years ago.
Updated tests
17738.tests.3.patch (1.2 KB) - added by SergeyBiryukov 6 years ago.


Change History (21)

#1 @tgeorge
7 years ago

  • Summary changed from replace_accents() can't handle Vietnamese vowels to remove_accents() can't handle Vietnamese vowels

I meant "remove_accents()", not "replace_accents()". Sorry! The "remove_accents()" function is defined in formatting.php.

#2 @toscho
7 years ago

  • Cc info@… added

#3 @johnbillion
7 years ago

  • Cc johnbillion@… added

#4 @tgeorge
6 years ago

  • Cc tgeorge added

There are four additional vowels that remove_accents() can't handle. I forgot them in my original message:

ƠơƯư

And here are those same vowels without accents:

OoUu
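
These four letters carry a horn rather than a tone mark, and they follow the same position-by-position mapping. A small Python sketch, again only an illustration with hypothetical names (`HORN_TABLE`, `strip_horns`) and not the WordPress code:

```python
# The four horn vowels from this comment, written as escapes so each is
# guaranteed to be one code point: Ơ (U+01A0), ơ (U+01A1), Ư (U+01AF), ư (U+01B0).
HORN_TABLE = str.maketrans("\u01a0\u01a1\u01af\u01b0", "OoUu")


def strip_horns(text: str) -> str:
    """Replace Ơ, ơ, Ư, ư with O, o, U, u; other characters are unchanged."""
    return text.translate(HORN_TABLE)
```

Horn vowels that also carry a tone mark, such as ớ or ự, are already in the list from the ticket description, so this table only needs the four bare letters.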

#5 follow-up: @SergeyBiryukov
6 years ago

  • Keywords has-patch added

I've made a patch, but it's a huge chunk of characters, and I wonder whether this should instead be included in the Vietnamese package as a filter.

Perhaps remove_accents() needs a filter for this, so that replacements in sanitize_title() would occur only in the save context.

#6 in reply to: ↑ 5 @nacin
6 years ago

  • Milestone changed from Awaiting Review to 3.3

Replying to SergeyBiryukov:

Perhaps remove_accents() needs a filter for this, so that replacements in sanitize_title() would occur only in the save context.

remove_accents() is already called there only in the save context, so this should be good.

#7 @SergeyBiryukov
6 years ago

I meant hooking into sanitize_title() from wp-content/languages/vi.php.
I missed that the context is passed to the sanitize_title filter, so that's currently possible too.

#8 @SergeyBiryukov
6 years ago

  • Keywords needs-unit-tests added

@ampt
6 years ago

#9 @ampt
6 years ago

Added unit tests. This patch works on its own, but it should probably be incorporated into the tests in #9591.

@ampt
6 years ago

Updated tests

#10 @ampt
6 years ago

Updated tests to apply to [UT 471]

#11 @ampt
6 years ago

Before patch: Tests: 12, Assertions: 13, Failures: 6.

With attachment:17738.patch: OK (12 tests, 13 assertions).

#12 @duck_
6 years ago

  • Milestone changed from 3.3 to Future Release

When the version is 3.1.3 and the ticket is still marked needs-unit-tests, it is not going to make 3.3. Punting.

#13 @SergeyBiryukov
6 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed

#14 @SergeyBiryukov
6 years ago

  • Keywords needs-unit-tests added; has-unit-tests removed

Per IRC chat, it's better to keep the keyword until the tests are reviewed and committed.

#15 @SergeyBiryukov
6 years ago

  • Milestone changed from Future Release to 3.4

#16 @SergeyBiryukov
6 years ago

  • Keywords needs-unit-tests removed

#17 @nacin
6 years ago

  • Owner set to nacin
  • Resolution set to fixed
  • Status changed from new to closed

In [20687]:

Add Vietnamese vowels to remove_accents(). props SergeyBiryukov. fixes #17738.
