Make WordPress Core

Opened 10 years ago

Closed 10 years ago

#37086 closed defect (bug) (fixed)

Remove Middle Dot (U+00B7) from URL (for Catalan only?)

Reported by: xavivars Owned by: ocean90
Milestone: 4.6 Priority: normal
Severity: normal Version:
Component: Formatting Keywords: has-patch has-unit-tests commit
Focuses: Cc:

Description (last modified by ocean90)

Currently, remove_accents() converts accented characters to their ASCII equivalents so that they look "nice" in URLs without needing to be escaped (which would show % sequences in the links).

However, the middle dot (U+00B7) is not removed. The middle dot is used in Catalan between two Ls (like this: l·l).
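To illustrate the escaping problem described above, here is a minimal Python sketch (WordPress itself is PHP; this is only an illustration, not WordPress code). It percent-encodes a slug that still contains the middle dot and one where the dot has been dropped. The variable names are made up for this example.

```python
from urllib.parse import quote

# "cel·la" with the middle dot (U+00B7) left in place.
slug_with_dot = "cel\u00b7la"
# The same word once the dot has been dropped.
slug_without_dot = "cella"

# quote() percent-encodes non-ASCII characters as their UTF-8 bytes.
encoded_with_dot = quote(slug_with_dot)
encoded_without_dot = quote(slug_without_dot)

print(encoded_with_dot)     # cel%C2%B7la
print(encoded_without_dot)  # cella
```

The middle dot becomes the two-byte sequence %C2%B7, which is the kind of "%" noise in links that the accent removal is meant to avoid.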

Quoting from Wikipedia:

The flown dot (Catalan: punt volat) is used in Catalan between two Ls in cases where each belongs to a separate syllable, for example cel·la, "cell". This distinguishes such "geminate Ls" (ela geminada), which are pronounced [ɫː], from "double L" (doble ela), which are written without the flown dot and are pronounced [ʎ].

On top of being inconsistent (all other Catalan diacritics are removed), not removing this character has side effects, because some URL libraries don't take it into account (like the one Twitter uses: see https://twitter.com/VilaWeb/status/738348674137399296).

My proposal is to remove that character when it appears between two Ls.
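The proposal can be sketched roughly as follows. This is a Python illustration under my own assumptions, not the attached patch: the function name strip_flown_dot is hypothetical, and treating upper-case Ls the same as lower-case ones is an assumption that the ticket does not spell out.

```python
import re

MIDDLE_DOT = "\u00b7"

# Match a middle dot only when it sits directly between two Ls.
# Lookbehind/lookahead keep the surrounding letters out of the match.
_FLOWN_DOT = re.compile(r"(?<=[lL])" + MIDDLE_DOT + r"(?=[lL])")


def strip_flown_dot(text: str) -> str:
    """Drop U+00B7 where it appears between two Ls; leave other dots alone."""
    return _FLOWN_DOT.sub("", text)


print(strip_flown_dot("cel\u00b7la"))  # cella
print(strip_flown_dot("a\u00b7b"))     # a·b (unchanged: not between two Ls)
```

Restricting the match to the l·l context leaves any other use of the middle dot untouched.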

Attachments (5)

formatting.php.ca-only.patch (475 bytes) - added by xavivars 10 years ago.
Removes middle dot when Catalan is set as a language
formatting.php.all-lang.patch (440 bytes) - added by xavivars 10 years ago.
Removes middle dot for all languages
formatting.php.patch (487 bytes) - added by xavivars 10 years ago.
Formatting.php patch
RemoveAccents.php.patch (815 bytes) - added by xavivars 10 years ago.
RemoveAccents.php test patch
37086.patch (2.0 KB) - added by SergeyBiryukov 10 years ago.

Change History (15)

@xavivars
10 years ago

Removes middle dot when Catalan is set as a language

@xavivars
10 years ago

Removes middle dot for all languages

#1 @swissspidy
10 years ago

  • Keywords has-patch added

#2 @ocean90
10 years ago

  • Description modified (diff)
  • Keywords needs-refresh needs-unit-tests added
  • Milestone changed from Awaiting Review to Future Release

@xavivars Thanks for your patches. The replacement should only be done for Catalan. Removing the dots could maybe be handled by sanitize_title_with_dashes().

Can you make sure that the patches are relative to the root directory? And there should be a unit test for this change in /tests/phpunit/tests/formatting/RemoveAccents.php.
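As a rough picture of what a Catalan-only replacement plus a unit test might look like, here is a Python sketch. It is an illustration only: the real change and its test would be PHP in WordPress core, and the names remove_flown_dot and FlownDotTest, as well as passing the locale as a plain string argument, are assumptions made for this example.

```python
import unittest


def remove_flown_dot(text: str, locale: str) -> str:
    """Replace l·l with ll, but only when the locale is Catalan ("ca")."""
    if locale != "ca":
        return text
    # Plain substring replacement for both letter cases.
    return text.replace("l\u00b7l", "ll").replace("L\u00b7L", "LL")


class FlownDotTest(unittest.TestCase):
    def test_catalan_locale_removes_dot(self):
        self.assertEqual(remove_flown_dot("al\u00b7lot", "ca"), "allot")

    def test_other_locale_keeps_dot(self):
        self.assertEqual(remove_flown_dot("al\u00b7lot", "en_US"), "al\u00b7lot")
```

Saved to a file, the test case can be run with `python -m unittest`. Gating on the locale keeps the behaviour unchanged for every other language.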

#3 @xavivars
10 years ago

@ocean90: which repo's root directory should the patches be relative to? I've found contradictory information (sometimes pointing to develop.svn.wordpress.org and other times pointing to core.svn).

I'll also add unit tests for that.

#4 @swissspidy
10 years ago

develop.svn.wordpress.org (or develop.git.wordpress.org) would be the correct repository for patches.

@xavivars
10 years ago

Formatting.php patch

@xavivars
10 years ago

RemoveAccents.php test patch

#5 @xavivars
10 years ago

@ocean90: I don't think I agree that the removal of those dots should be done in sanitize_title_with_dashes(). The middot affects how the Ls are pronounced, and in fact, the first case was already covered in the same remove_accents() function (I've removed it from the new formatting.php.patch). However, if you think those changes belong in sanitize_title_with_dashes(), I'm open to discussing that.

#6 @xavivars
10 years ago

  • Keywords has-unit-tests dev-feedback added; needs-refresh needs-unit-tests removed

#7 @SergeyBiryukov
10 years ago

  • Milestone changed from Future Release to 4.6

#8 @SergeyBiryukov
10 years ago

  • Keywords commit added; dev-feedback removed

37086.patch combines the patch and the test and also updates the docs.

I think it's fine to handle this in remove_accents().

#9 @xavivars
10 years ago

Is there anything pending for this ticket to be closed that I can help with?

#10 @ocean90
10 years ago

  • Owner set to ocean90
  • Resolution set to fixed
  • Status changed from new to closed

In 37853:

I18N: Add support for the Catalan flown dot in remove_accents().

Props xavivars, SergeyBiryukov.
Fixes #37086.
