Make WordPress Core

Opened 15 years ago

Closed 9 years ago

Last modified 9 years ago

#10373 closed defect (bug) (fixed)

Proper number formatting related to i18n

Reported by: honzaskypala's profile honza.skypala Owned by: sergeybiryukov's profile SergeyBiryukov
Milestone: 4.5 Priority: normal
Severity: normal Version: 2.8.1
Component: I18N Keywords: has-patch 4.5-early
Focuses: Cc:

Description

Hi there,

Technical description of the situation: WordPress function number_format_i18n() uses native PHP function number_format() for formatting numbers. Unfortunately this PHP function is not able to handle separators (both decimal and thousands separators) that fall into one of the following categories:

  • separator would contain more than one char, e.g. " "; in this case only the first char from the supplied string is used, in the example given above it would be "&".
  • separator is ASCII > 128, e.g. ASCII 160;

Impact: Although this is not a native bug to WordPress, the problem source is in PHP, the impact is caused also in WordPress. This situation is problematic for users in Eastern Europe (Czech Republic, Russia, etc.) where the standard thousands separator is " " (space). The users in these countries usually don't want to use real space (ASCII 32) as thousands separator, as it could be word wrapped and the number would be saparated into two lines. Unfortunatelly supplying both variants of non-breakable space (i.e. " " or ASCII 160) fails.

Proposed solution: unforunatelly this bug is reported in PHP for years and nothing happens there. Because of this situation I suggest fixing such problem in the next level, i.e. in WordPress. This means to create a custom function for formatting numbers, with identical input params as number_format(), just that this one would be able to handle separator types mentioned above.

I have such change already deployed on my WordPress blog for several months, so I am going to attach a diff file for the latest SVN state dealing with this issue.

Honza

Attachments (4)

functions.php.diff (2.4 KB) - added by honza.skypala 15 years ago.
Diff solving the issue described in the ticket.
10373.diff (1.3 KB) - added by SergeyBiryukov 9 years ago.
10373.2.diff (1.4 KB) - added by SergeyBiryukov 9 years ago.
10373.3.diff (1.7 KB) - added by SergeyBiryukov 9 years ago.

Download all attachments as: .zip

Change History (42)

@honza.skypala
15 years ago

Diff solving the issue described in the ticket.

#1 @Denis-de-Bernardy
15 years ago

Given the very specific nature of the problem it might make more sense to just do:

$number = str_replace(' ', ' ', $number);

where we use it...

#2 @Denis-de-Bernardy
15 years ago

  • Keywords has-patch added
  • Milestone changed from Unassigned to 2.9

#3 @ryan
15 years ago

  • Milestone changed from 2.9 to Future Release

#4 @westi
15 years ago

  • Cc westi added

#5 @westi
15 years ago

Cross reference #10555

#6 @hakre
15 years ago

  • Milestone changed from Future Release to 3.0

#7 @westi
15 years ago

  • Owner changed from nbachiyski to westi
  • Status changed from new to accepted

#8 follow-up: @nbachiyski
15 years ago

I agree with Denis that in this specific case translators should add a number_format_i18n filter in the <locale>.php file and fix the problem there. I vote for wontfix.

#9 @nbachiyski
15 years ago

  • Resolution set to wontfix
  • Status changed from accepted to closed

#10 @nacin
15 years ago

  • Milestone 3.0 deleted

#11 @nbachiyski
15 years ago

One day I will start deleting milestones, too.

#12 in reply to: ↑ 8 @pavelevap
15 years ago

  • Cc pavelevap@… added

Replying to nbachiyski:

I agree with Denis that in this specific case translators should add a number_format_i18n filter in the <locale>.php file and fix the problem there. I vote for wontfix.

What do you mean by <locale>.php file?

#13 @nbachiyski
15 years ago

If you put a php file named after your locale in wp-content/languages/ it will be loaded automatically if the current locale matches the basename of the php file.

You can use it like a locale-specific plugin.

#14 @pavelevap
15 years ago

OK, thank you. So, my file will be cs_CZ.php and it works the same way as usual plugin? Any required header? Is it documented somewhere? And one more question - how can I distribute it with translated Czech version? Can I insert it to dist folder, for example http://svn.automattic.com/wordpress-i18n/cs_CZ/trunk/dist/? In dist folder I will create folder structure with this file dist/wp-content/languages/cs_CS.php? Thank you very much for your answers...

#15 @nbachiyski
15 years ago

Yes, it works like a plugin. You don't need anything special. Putting it in dist/ is perfcetly fine. Here is mine, for example:

http://svn.automattic.com/wordpress-i18n/bg_BG/trunk/dist/wp-content/languages/bg_BG.php

#16 @pavelevap
15 years ago

Great. I was not sure... Thank you for your help!

#17 @lkraav
13 years ago

  • Cc lkraav added

#18 @nacin
13 years ago

  • Keywords reporter-feedback added; number formating i18n removed
  • Milestone set to Future Release
  • Resolution wontfix deleted
  • Status changed from closed to reopened

PHP 5.4.0 allows multiple bytes to be used for decimal points and thousands separators. Of course, we won't have 5.4.0 as a minimum version requirement for probably a decade. I am re-opening this to reflect our decision to begin incorporating locale-specific changes in core. (#19603, others.)

Which locales specifically need multiple characters to properly represent numbers?

#19 @pavelevap
13 years ago

For example Czech and probably also other languages with "space" as thousands separator (non-breakable space needed).

#20 @xibe
10 years ago

Forgot to comment here at the time.
France does need that. It did burn us (as translators) in 3.9 and 3.5, and will possibly burn us again in the future.
It would be nice to have this fixed.

@SergeyBiryukov
9 years ago

#21 @SergeyBiryukov
9 years ago

  • Keywords reporter-feedback removed
  • Milestone changed from Future Release to 4.4

10373.diff implements comment:1. Uses the actual character, in case number_format_i18n() is used in a context where entities are not allowed.

#22 @SergeyBiryukov
9 years ago

Tested on PHP 5.2.17, works as expected.

10373.2.diff also replaces &nbsp; and &#160; from translations with the actual character.

We could also probably add html_entity_decode() in case some locales need multiple characters other that space for thousands separators, but it looks like a fix for the non-breaking space is sufficient for now.

#23 @SergeyBiryukov
9 years ago

  • Keywords commit added
  • Owner changed from westi to SergeyBiryukov
  • Status changed from reopened to accepted

#24 @SergeyBiryukov
9 years ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 35372:

In WP_Locale::init(), replace space as a thousands separator with a non-breaking space.

Also replace '&nbsp; and &#160; HTML entities with the actual character, since PHP < 5.4.0 does not allow multiple bytes to be used for decimal points or thousands separators.

Fixes #10373.

#25 follow-up: @ocean90
9 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

echo number_format( 1000, 0, ',', "\xA0" ); returns on PHP 5.6.14 and 5.5.40 1�000 for me. Looking at http://www.fileformat.info/info/unicode/char/00a0/index.htm and it seems like the replacement should be "\xC2\xA0".

Edit: Looking at SimplePie_Misc::windows_1252_to_utf8() and it looks like "\xA0" is Windows-1252 and "\xC2\xA0" is UTF-8.

Last edited 9 years ago by ocean90 (previous) (diff)

This ticket was mentioned in Slack in #meta by sergey. View the logs.


9 years ago

#27 in reply to: ↑ 25 @ocean90
9 years ago

  • Keywords needs-patch added; has-patch commit removed

#28 @ocean90
9 years ago

In 35522:

Revert [35372] because of possible encoding issues, needs more investigation.

See #10373.

#29 @ocean90
9 years ago

  • Milestone changed from 4.4 to Future Release

#30 @SergeyBiryukov
9 years ago

  • Keywords has-patch 4.5-early added; needs-patch removed

10373.3.diff is another, more thoroughly tested approach:

  • If PHP version is 5.4.0 or greater, a regular space as a thousands separator is replaced with a non-breaking space, decoded using the blog_charset option. We could probably skip html_entity_decode() and use the &nbsp; entity directly, but I think that might cause issues in themes that do weird things like using substr() instead of wp_trim_words() for the excerpt.
  • If PHP version is 5.2.x or 5.3.x, &nbsp; or &#160; as a thousands separator is replaced with a regular space, as multiple bytes are not supported there.

#31 @SergeyBiryukov
9 years ago

  • Milestone changed from Future Release to 4.5

#32 @SergeyBiryukov
9 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 35884:

I18N: In WP_Locale::init(), replace space as a thousands separator with a non-breaking space.

  • If PHP version is 5.4.0 or greater, a regular space as a thousands separator is replaced with a non-breaking space, decoded using the blog_charset option.
  • If PHP version is 5.2.x or 5.3.x, &nbsp; or &#160; as a thousands separator is replaced with a regular space, as multiple bytes are not supported there.

Fixes #10373.

#33 @danielbachhuber
9 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Sorry @SergeyBiryukov :( WP-CLI found another issue for you.

PHP Fatal error:  Call to a member function suppress_errors() on null in /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/option.php on line 102
    PHP Stack trace:
    PHP   1. {main}() /tmp/wp-cli-phar/wp:0
    PHP   2. include() /tmp/wp-cli-phar/wp:4
    PHP   3. include() phar:///tmp/wp-cli-phar/wp/php/boot-phar.php:5
    PHP   4. WP_CLI\Runner->start() phar:///tmp/wp-cli-phar/wp/php/wp-cli.php:21
    PHP   5. WP_CLI\Runner->load_wordpress() phar:///tmp/wp-cli-phar/wp/php/WP_CLI/Runner.php:697
    PHP   6. require() phar:///tmp/wp-cli-phar/wp/php/WP_CLI/Runner.php:739
    PHP   7. require_wp_db() phar:///tmp/wp-cli-phar/wp/php/wp-settings-cli.php:100
    PHP   8. wpdb->__construct() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/load.php:369
    PHP   9. wpdb->db_connect() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/wp-db.php:658
    PHP  10. wpdb->select() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/wp-db.php:1560
    PHP  11. wp_load_translations_early() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/wp-db.php:1054
    PHP  12. WP_Locale->__construct() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/load.php:863
    PHP  13. WP_Locale->init() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/locale.php:360
    PHP  14. get_option() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/locale.php:194

In short, it's not safe to use get_option() inside of WP_Locale::init() because WPDB calls wp_load_translations_early() when it can't connect to the database.

#34 @SergeyBiryukov
9 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 35951:

I18N: After [35884], remove html_entity_decode() for the thousands separator in WP_Locale::init(), use the &nbsp; entity directly.

We can't get_option( 'blog_charset' ) here, because the database may not be available.

Fixes #10373.

#35 @Reynard
9 years ago

  • Severity changed from normal to minor

I would like to add this, yet I am not as familiar with what to do in this situation.

Should I just donwload functions.php.diff and 10373.3.diff and copy and paste the included code unto functions.php and locale.php?

Thank you!

#36 follow-up: @grapplerulrich
9 years ago

@Reynard - From the looks of it only was edited. locale.php. You can just copy the whole file here. https://github.com/WordPress/WordPress/blob/master/wp-includes/locale.php It seems like this will be released in WP 4.5.

#37 @ocean90
9 years ago

  • Severity changed from minor to normal

#38 in reply to: ↑ 36 @Reynard
9 years ago

Replying to grapplerulrich:

@Reynard - From the looks of it only was edited. locale.php. You can just copy the whole file here. https://github.com/WordPress/WordPress/blob/master/wp-includes/locale.php It seems like this will be released in WP 4.5.

Hm, I see. I was not able to accomplish anything from it. Just to recap. Would this modification allow me to present numbers which have decimals as commas? For example: 1,280,000.00 would become: 1.280.000 ?

Note: See TracTickets for help on using tickets.