Make WordPress Core

Opened 16 years ago

Closed 12 years ago

#7897 closed enhancement (fixed)

Commas should be internationalized

Reported by: huji's profile huji Owned by: nbachiyski's profile nbachiyski
Milestone: 3.4 Priority: normal
Severity: normal Version: 2.9
Component: I18N Keywords: needs-patch
Focuses: Cc:


There are many places where commas should be internationalized. A very typical example is this: You have to separate tags using "," (Latin comma). RTL languages like Arabic and Persian use a different character for comma ("،") so there should be away to internationalize it let Wordpress understand it correctly, per content language.

Attachments (2)

arabic-comma.diff (3.5 KB) - added by AhmadSherif 15 years ago.
7897.diff (10.9 KB) - added by nacin 12 years ago.

Download all attachments as: .zip

Change History (24)

#1 @DD32
16 years ago

The theme controls the seperators, There is no reason why a given theme cannot internationalise itself, or the theme author internationalise the theme.
A user can also replace the characters in question in their choice of theme if they choose.

#2 @huji
16 years ago

Correct; however, at least the default theme should support this.

#3 @nbachiyski
16 years ago

  • Resolution set to invalid
  • Status changed from new to closed

Reopen if there are places, where the text is i18n-ed and a coma can't be translated.

#4 @huji
16 years ago

  • Resolution invalid deleted
  • Status changed from closed to reopened

I have to reopen it. It was not about themes in the first place. It was about dashboard which is i18n-ed. When you right a new post, you can add tags, but you have to separate them with Latin commas; if the interface is in a non-Latin language, there should be a way to use non-Latin commas instead.

#5 @janeforshort
15 years ago

  • Milestone changed from 2.8 to Future Release

Punting due to feature freeze. Reconsider with next release.

#6 @Denis-de-Bernardy
15 years ago

  • Keywords needs-patch added

#7 @Denis-de-Bernardy
15 years ago

  • Milestone changed from Future Release to 2.9

#8 @AhmadSherif
15 years ago

The patch assume that there's a message called "," which contains the localized comma.

Two notes on that patch:

  • The patch doesn't contain diff for wp-admin/js/post.js 'cause I don't know how you guys are doing that compression stuff.
  • Some changes in post-dev.js might look ugly (or it's ugly), but I made it that way to ease any possible debugging.

#9 @janeforshort
15 years ago

  • Type changed from defect (bug) to enhancement

#10 @westi
15 years ago

  • Keywords has-patch reporter-feedback added; needs-patch removed
  • Version set to 2.9

This looks like a worthwhile fix.

Couple of questions/observations:

  • Why do we need the evals() in the javascript code?
  • It looks like there are still some hardcoded commas in the javascript code as well.

#11 @ryan
15 years ago

  • Milestone changed from 2.9 to 3.0

#12 @nacin
14 years ago

  • Keywords needs-patch added; has-patch reporter-feedback removed

#13 @nacin
14 years ago

We should be able to yank the code in question in westi's comment, this looks pretty good otherwise.

#14 @nacin
14 years ago

  • Owner changed from anonymous to nbachiyski
  • Status changed from reopened to assigned

#15 @Denis-de-Bernardy
14 years ago

  • Milestone changed from 3.0 to Future Release

punting pending new patch

#16 @uadroid
14 years ago

  • Cc uadroid added

Is there any progress on this? I'm using and I have to switch my keyboard to Latin when ever I want to separate tags.

It's 'ARABIC COMMA' (U+060C)

#17 @nacin
12 years ago

  • Milestone changed from Future Release to 3.4

12 years ago

#18 @nacin
12 years ago

7897.diff approaches this holistically. It works for bulk edit, quick edit, and the tags meta box (including without JS). It also works for both an internationalized comma and the existing standard comma, intermingled, in case they are different, or if force of habit causes someone to continue to use the standard comma.

Standardization occurs in very specific places.

Rendering tag strings (PHP displays):

  • get_terms_to_edit(), which joins tags with a comma and escapes the result, is not modified. This is for compatibility reasons with plugins, who may expect to do a PHP explode or JS split on this data.
  • get_inline_data(), which uses get_terms_to_edit(), is not modified. It currently does a replacement on the result from get_terms_to_edit(), adding a space after each comma. Again, this is not modified for compatibility reasons.
  • What is modified is the tags meta box code. In this case, the terms string is going directly into a textarea. So, we find all commas and replace it with the internationalized comma, along with a trailing space. (This did not occur in 3.3, resulting in "apple,pear,orange" with no JS.)

Saving tag strings (PHP saves):

  • wp_set_post_terms() is where the explode-by-commas may take place. It will now first find all L10n commas and replace them with standard commas before continuing with the explode. This is done in part to allow both commas to work, and in part because we can't reliably trim() an html entity as well as we can trim a leading or trailing comma.
  • bulk_edit_posts() is where the other explode-by-commas takes place on save. It will now do the same thing as wp_set_post_terms(), improving its cleaning regex in the process.

Working with tag strings (in JS):

  • In, care is taken to properly clean/sanitize the new string. Since the comma is sometimes used in a character class (tagBox.clean), and an html entity could cause problems here, the L10n comma is first replaced with a standard comma, and then switched back when done.
  • In, there is a find-replace for converting the standard comma to the L10n comma. This is because it is relying on a standard comma-separated term string from get_inline_data() (see above for more).
  • In and, the only other changes are to specify the localized separator for tag suggestions, and properly split or join on the new localized comma.

Tag suggestions:

  • In ajax-tag-search, L10n commas are first replaced with standard commas before exploding.

This was the code I was using to test:

add_filter( 'gettext_with_context', 'nacin_test_l10n_comma', 10, 3 );
function nacin_test_l10n_comma( $translated, $text, $context ) {
	if ( $text == ',' && $context == 'tag delimiter' )
		return '`';
	return $translated;

Let's soak this in trunk.

#19 @nacin
12 years ago

In [19853]:

Allow localized commas to be used as tag separators. see #7897.

#20 @nacin
12 years ago

It should be noted that there is an assumption made about localized commas — that, like the standard comma, they use word-comma-space-word.

I'll be looking into this with the translators to see if this needs to be modified. It will not take much, only a new translated string that is leveraged in a few places.

#21 @nacin
12 years ago

The existing locales that have translated ", " (with space) do include a trailing space after their character. Arabic, Persian, Burmese, Chinese (Taiwan), Chinese (China), Uighur, and Kurdish.

#22 @nacin
12 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

Pending bug reports, closing as fixed.

Note: See TracTickets for help on using tickets.