WordPress.org

Make WordPress Core

Opened 3 years ago

Closed 2 years ago

#17738 closed defect (bug) (fixed)

remove_accents() can't handle Vietnamese vowels

Reported by: tgeorge
Owned by: nacin
Milestone: 3.4
Priority: normal
Severity: normal
Version: 3.1.3
Component: Formatting
Keywords: has-patch
Focuses:
Cc:

Description

replace_accents() can't handle many of the vowels used in Vietnamese. For the complete list of vowels, see:

http://en.wikipedia.org/wiki/Vietnamese_alphabet#Tone_marks

Here are the precise vowels that replace_accents() can't handle currently:

ẰằẦầỀềỒồỜờỪừỲỳẢảẲẳẨẩẺẻỂểỈỉỎỏỔổỞởỦủỬửỶỷẴẵẪẫẼẽỄễỖỗỠỡỮữỸỹẮắẤấẾếỐốỚớỨứẠạẶặẬậẸẹỆệỊịỌọỘộỢợỤụỰựỴỵ

And here are those same vowels without accents:

AaAaEeOoOoUuYyAaAaAaEeEeIiOoOoOoUuUuYyAaAaEeEeOoOoUuYyAaAaEeOoOoUuAaAaAaEeEeIiOoOoOoUuUuYy
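The two strings above line up position by position, so they can be read as a lookup table. Below is a minimal sketch of that idea in Python, not the PHP used by WordPress itself. It covers only the first group of letters (the grave-accent vowels), and the names `ACCENTED`, `PLAIN`, `VIETNAMESE_TABLE` and `remove_listed_accents` are hypothetical, chosen for illustration.

```python
import unicodedata

# Illustrative subset only: the grave-accent vowels from the ticket and
# their unaccented forms. NFC ensures one code point per accented letter.
ACCENTED = unicodedata.normalize("NFC", "ẰằẦầỀềỒồỜờỪừỲỳ")
PLAIN = "AaAaEeOoOoUuYy"

# The pairing is positional, so both strings must have the same length.
assert len(ACCENTED) == len(PLAIN)

# str.maketrans maps the i-th accented letter to the i-th plain letter.
VIETNAMESE_TABLE = str.maketrans(ACCENTED, PLAIN)


def remove_listed_accents(text: str) -> str:
    """Replace only the letters present in the table (hypothetical helper)."""
    # Normalize the input too, so decomposed input still matches the table.
    return unicodedata.normalize("NFC", text).translate(VIETNAMESE_TABLE)
```

Letters that are not in the table pass through unchanged, which is the behaviour the ticket reports for any character missing from the existing list.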

Attachments (4)

17738.patch (3.8 KB) - added by SergeyBiryukov 3 years ago.
17738.tests.patch (1.5 KB) - added by ampt 3 years ago.
17738.tests.2.patch (1.6 KB) - added by ampt 2 years ago.
Updated tests
17738.tests.3.patch (1.2 KB) - added by SergeyBiryukov 2 years ago.


Change History (21)

comment:1 tgeorge 3 years ago

  • Summary changed from replace_accents() can't handle Vietnamese vowels to remove_accents() can't handle Vietnamese vowels

I meant "remove_accents()", not "replace_accents()". Sorry! The "remove_accents()" function is defined in formatting.php.

comment:2 toscho 3 years ago

  • Cc info@… added

comment:3 johnbillion 3 years ago

  • Cc johnbillion@… added

comment:4 tgeorge 3 years ago

  • Cc tgeorge added

There are four additional vowels that remove_accents() can't handle. I forgot them in my original message:

ƠơƯư

And here are those same vowels without accents:

OoUu
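As an aside, these four horn vowels (like the letters in the original list) have canonical Unicode decompositions into a base letter plus combining marks. The sketch below, in Python, uses that property as an alternative to a hand-written table. It is an assumption-laden illustration, not how remove_accents() is implemented, and `strip_combining_marks` is a hypothetical name.

```python
import unicodedata


def strip_combining_marks(text: str) -> str:
    """Drop combining marks after canonical decomposition (hypothetical helper)."""
    # NFD splits e.g. U+01A0 (O with horn) into "O" + U+031B (combining horn).
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" (nonspacing mark) covers the horn, tone and vowel marks.
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
```

A letter with no canonical decomposition, such as Đ, would pass through this function unchanged, so a decomposition-based approach does not fully replace an explicit table.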

Attachment 17738.patch added by SergeyBiryukov 3 years ago.

comment:5 follow-up: ↓ 6 SergeyBiryukov 3 years ago

  • Keywords has-patch added

I've made a patch, but it's a huge chunk of characters, and I wonder whether this should instead be included in the Vietnamese language package as a filter.

Perhaps remove_accents() needs a filter for this, so that replacements in sanitize_title() could only occur with save context.

comment:6 in reply to: ↑ 5 nacin 3 years ago

  • Milestone changed from Awaiting Review to 3.3

Replying to SergeyBiryukov:

Perhaps remove_accents() needs a filter for this, so that replacements in sanitize_title() could only occur with save context.

remove_accents() is already called there only in the save context, so this should be good.

comment:7 SergeyBiryukov 3 years ago

I meant hooking into sanitize_title() from wp-content/languages/vi.php.
I missed that the context is passed to the sanitize_title filter, so that's already possible too.
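The idea discussed here, a language-specific callback that only transliterates in the save context, can be sketched roughly as follows. This is a Python analogue under stated assumptions, not WordPress code: the tiny `add_filter`/`apply_filters` registry, the `sanitize_title` wrapper and the `vi_strip_horns` callback are all hypothetical stand-ins, and the table covers only the four horn vowels from comment:4.

```python
from collections import defaultdict

# Hypothetical stand-in for a filter registry: hook name -> callbacks.
_filters = defaultdict(list)


def add_filter(tag, callback):
    """Register a callback on a named hook."""
    _filters[tag].append(callback)


def apply_filters(tag, value, *args):
    """Pass the value through every callback registered on the hook."""
    for callback in _filters[tag]:
        value = callback(value, *args)
    return value


# U+01A0/U+01A1 are O/o with horn, U+01AF/U+01B0 are U/u with horn.
HORN_TABLE = str.maketrans("\u01a0\u01a1\u01af\u01b0", "OoUu")


def vi_strip_horns(title, raw_title, context):
    """Language-package callback: transliterate only when saving."""
    if context != "save":
        return title
    return title.translate(HORN_TABLE)


add_filter("sanitize_title", vi_strip_horns)


def sanitize_title(raw_title, context="display"):
    """Simplified wrapper that forwards the context to the filter."""
    return apply_filters("sanitize_title", raw_title, raw_title, context)
```

With this arrangement the callback sees the context on every call and leaves titles untouched outside the save context, which is the property the comment relies on.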

comment:8 SergeyBiryukov 3 years ago

  • Keywords needs-unit-tests added

Attachment 17738.tests.patch added by ampt 3 years ago.

comment:9 ampt 3 years ago

Added unit tests. This patch works on its own, but it should probably be incorporated into the tests in #9591.

Attachment 17738.tests.2.patch added by ampt 2 years ago.

Updated tests

comment:10 ampt 2 years ago

Updated tests to apply to [UT 471]

comment:11 ampt 2 years ago

Before patch: Tests: 12, Assertions: 13, Failures: 6.

With attachment:17738.patch: OK (12 tests, 13 assertions).

comment:12 duck_ 2 years ago

  • Milestone changed from 3.3 to Future Release

When the version is 3.1.3 and the ticket still has needs-unit-tests, it is not going to make 3.3. Punting.

comment:13 SergeyBiryukov 2 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed

comment:14 SergeyBiryukov 2 years ago

  • Keywords needs-unit-tests added; has-unit-tests removed

Per IRC chat, it's better to keep the keyword until the tests are reviewed and committed.

comment:15 SergeyBiryukov 2 years ago

  • Milestone changed from Future Release to 3.4

comment:16 SergeyBiryukov 2 years ago

  • Keywords needs-unit-tests removed

comment:17 nacin 2 years ago

  • Owner set to nacin
  • Resolution set to fixed
  • Status changed from new to closed

In [20687]:

Add Vietnamese vowels to remove_accents(). props SergeyBiryukov. fixes #17738.
