WordPress.org

Make WordPress Core

Opened 8 years ago

Closed 18 months ago

Last modified 17 months ago

#10373 closed defect (bug) (fixed)

Proper number formatting related to i18n

Reported by: honza.skypala Owned by: SergeyBiryukov
Milestone: 4.5 Priority: normal
Severity: normal Version: 2.8.1
Component: I18N Keywords: has-patch 4.5-early
Focuses: Cc:

Description

Hi there,

Technical description of the situation: WordPress function number_format_i18n() uses native PHP function number_format() for formatting numbers. Unfortunately this PHP function is not able to handle separators (both decimal and thousands separators) that fall into one of the following categories:

  • separator would contain more than one char, e.g. " "; in this case only the first char from the supplied string is used, in the example given above it would be "&".
  • separator is ASCII > 128, e.g. ASCII 160;

Impact: Although this is not a native bug to WordPress, the problem source is in PHP, the impact is caused also in WordPress. This situation is problematic for users in Eastern Europe (Czech Republic, Russia, etc.) where the standard thousands separator is " " (space). The users in these countries usually don't want to use real space (ASCII 32) as thousands separator, as it could be word wrapped and the number would be saparated into two lines. Unfortunatelly supplying both variants of non-breakable space (i.e. " " or ASCII 160) fails.

Proposed solution: unforunatelly this bug is reported in PHP for years and nothing happens there. Because of this situation I suggest fixing such problem in the next level, i.e. in WordPress. This means to create a custom function for formatting numbers, with identical input params as number_format(), just that this one would be able to handle separator types mentioned above.

I have such change already deployed on my WordPress blog for several months, so I am going to attach a diff file for the latest SVN state dealing with this issue.

Honza

Attachments (4)

functions.php.diff (2.4 KB) - added by honza.skypala 8 years ago.
Diff solving the issue described in the ticket.
10373.diff (1.3 KB) - added by SergeyBiryukov 20 months ago.
10373.2.diff (1.4 KB) - added by SergeyBiryukov 20 months ago.
10373.3.diff (1.7 KB) - added by SergeyBiryukov 19 months ago.

Download all attachments as: .zip

Change History (42)

@honza.skypala
8 years ago

Diff solving the issue described in the ticket.

#1 @Denis-de-Bernardy
8 years ago

Given the very specific nature of the problem it might make more sense to just do:

$number = str_replace(' ', ' ', $number);

where we use it...

#2 @Denis-de-Bernardy
8 years ago

  • Keywords has-patch added
  • Milestone changed from Unassigned to 2.9

#3 @ryan
8 years ago

  • Milestone changed from 2.9 to Future Release

#4 @westi
7 years ago

  • Cc westi added

#5 @westi
7 years ago

Cross reference #10555

#6 @hakre
7 years ago

  • Milestone changed from Future Release to 3.0

#7 @westi
7 years ago

  • Owner changed from nbachiyski to westi
  • Status changed from new to accepted

#8 follow-up: @nbachiyski
7 years ago

I agree with Denis that in this specific case translators should add a number_format_i18n filter in the <locale>.php file and fix the problem there. I vote for wontfix.

#9 @nbachiyski
7 years ago

  • Resolution set to wontfix
  • Status changed from accepted to closed

#10 @nacin
7 years ago

  • Milestone 3.0 deleted

#11 @nbachiyski
7 years ago

One day I will start deleting milestones, too.

#12 in reply to: ↑ 8 @pavelevap
7 years ago

  • Cc pavelevap@… added

Replying to nbachiyski:

I agree with Denis that in this specific case translators should add a number_format_i18n filter in the <locale>.php file and fix the problem there. I vote for wontfix.

What do you mean by <locale>.php file?

#13 @nbachiyski
7 years ago

If you put a php file named after your locale in wp-content/languages/ it will be loaded automatically if the current locale matches the basename of the php file.

You can use it like a locale-specific plugin.

#14 @pavelevap
7 years ago

OK, thank you. So, my file will be cs_CZ.php and it works the same way as usual plugin? Any required header? Is it documented somewhere? And one more question - how can I distribute it with translated Czech version? Can I insert it to dist folder, for example http://svn.automattic.com/wordpress-i18n/cs_CZ/trunk/dist/? In dist folder I will create folder structure with this file dist/wp-content/languages/cs_CS.php? Thank you very much for your answers...

#15 @nbachiyski
7 years ago

Yes, it works like a plugin. You don't need anything special. Putting it in dist/ is perfcetly fine. Here is mine, for example:

http://svn.automattic.com/wordpress-i18n/bg_BG/trunk/dist/wp-content/languages/bg_BG.php

#16 @pavelevap
7 years ago

Great. I was not sure... Thank you for your help!

#17 @lkraav
5 years ago

  • Cc lkraav added

#18 @nacin
5 years ago

  • Keywords reporter-feedback added; number formating i18n removed
  • Milestone set to Future Release
  • Resolution wontfix deleted
  • Status changed from closed to reopened

PHP 5.4.0 allows multiple bytes to be used for decimal points and thousands separators. Of course, we won't have 5.4.0 as a minimum version requirement for probably a decade. I am re-opening this to reflect our decision to begin incorporating locale-specific changes in core. (#19603, others.)

Which locales specifically need multiple characters to properly represent numbers?

#19 @pavelevap
5 years ago

For example Czech and probably also other languages with "space" as thousands separator (non-breakable space needed).

#20 @xibe
3 years ago

Forgot to comment here at the time.
France does need that. It did burn us (as translators) in 3.9 and 3.5, and will possibly burn us again in the future.
It would be nice to have this fixed.

#21 @SergeyBiryukov
20 months ago

  • Keywords reporter-feedback removed
  • Milestone changed from Future Release to 4.4

10373.diff implements comment:1. Uses the actual character, in case number_format_i18n() is used in a context where entities are not allowed.

#22 @SergeyBiryukov
20 months ago

Tested on PHP 5.2.17, works as expected.

10373.2.diff also replaces &nbsp; and &#160; from translations with the actual character.

We could also probably add html_entity_decode() in case some locales need multiple characters other that space for thousands separators, but it looks like a fix for the non-breaking space is sufficient for now.

#23 @SergeyBiryukov
20 months ago

  • Keywords commit added
  • Owner changed from westi to SergeyBiryukov
  • Status changed from reopened to accepted

#24 @SergeyBiryukov
20 months ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 35372:

In WP_Locale::init(), replace space as a thousands separator with a non-breaking space.

Also replace '&nbsp; and &#160; HTML entities with the actual character, since PHP < 5.4.0 does not allow multiple bytes to be used for decimal points or thousands separators.

Fixes #10373.

#25 follow-up: @ocean90
19 months ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

echo number_format( 1000, 0, ',', "\xA0" ); returns on PHP 5.6.14 and 5.5.40 1�000 for me. Looking at http://www.fileformat.info/info/unicode/char/00a0/index.htm and it seems like the replacement should be "\xC2\xA0".

Edit: Looking at SimplePie_Misc::windows_1252_to_utf8() and it looks like "\xA0" is Windows-1252 and "\xC2\xA0" is UTF-8.

Last edited 19 months ago by ocean90 (previous) (diff)

This ticket was mentioned in Slack in #meta by sergey. View the logs.


19 months ago

#27 in reply to: ↑ 25 @ocean90
19 months ago

  • Keywords needs-patch added; has-patch commit removed

#28 @ocean90
19 months ago

In 35522:

Revert [35372] because of possible encoding issues, needs more investigation.

See #10373.

#29 @ocean90
19 months ago

  • Milestone changed from 4.4 to Future Release

#30 @SergeyBiryukov
18 months ago

  • Keywords has-patch 4.5-early added; needs-patch removed

10373.3.diff is another, more thoroughly tested approach:

  • If PHP version is 5.4.0 or greater, a regular space as a thousands separator is replaced with a non-breaking space, decoded using the blog_charset option. We could probably skip html_entity_decode() and use the &nbsp; entity directly, but I think that might cause issues in themes that do weird things like using substr() instead of wp_trim_words() for the excerpt.
  • If PHP version is 5.2.x or 5.3.x, &nbsp; or &#160; as a thousands separator is replaced with a regular space, as multiple bytes are not supported there.

#31 @SergeyBiryukov
18 months ago

  • Milestone changed from Future Release to 4.5

#32 @SergeyBiryukov
18 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 35884:

I18N: In WP_Locale::init(), replace space as a thousands separator with a non-breaking space.

  • If PHP version is 5.4.0 or greater, a regular space as a thousands separator is replaced with a non-breaking space, decoded using the blog_charset option.
  • If PHP version is 5.2.x or 5.3.x, &nbsp; or &#160; as a thousands separator is replaced with a regular space, as multiple bytes are not supported there.

Fixes #10373.

#33 @danielbachhuber
18 months ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Sorry @SergeyBiryukov :( WP-CLI found another issue for you.

PHP Fatal error:  Call to a member function suppress_errors() on null in /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/option.php on line 102
    PHP Stack trace:
    PHP   1. {main}() /tmp/wp-cli-phar/wp:0
    PHP   2. include() /tmp/wp-cli-phar/wp:4
    PHP   3. include() phar:///tmp/wp-cli-phar/wp/php/boot-phar.php:5
    PHP   4. WP_CLI\Runner->start() phar:///tmp/wp-cli-phar/wp/php/wp-cli.php:21
    PHP   5. WP_CLI\Runner->load_wordpress() phar:///tmp/wp-cli-phar/wp/php/WP_CLI/Runner.php:697
    PHP   6. require() phar:///tmp/wp-cli-phar/wp/php/WP_CLI/Runner.php:739
    PHP   7. require_wp_db() phar:///tmp/wp-cli-phar/wp/php/wp-settings-cli.php:100
    PHP   8. wpdb->__construct() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/load.php:369
    PHP   9. wpdb->db_connect() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/wp-db.php:658
    PHP  10. wpdb->select() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/wp-db.php:1560
    PHP  11. wp_load_translations_early() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/wp-db.php:1054
    PHP  12. WP_Locale->__construct() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/load.php:863
    PHP  13. WP_Locale->init() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/locale.php:360
    PHP  14. get_option() /tmp/wp-cli-test-run-566e1192a6b4b1.97422551/wp-includes/locale.php:194

In short, it's not safe to use get_option() inside of WP_Locale::init() because WPDB calls wp_load_translations_early() when it can't connect to the database.

#34 @SergeyBiryukov
18 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 35951:

I18N: After [35884], remove html_entity_decode() for the thousands separator in WP_Locale::init(), use the &nbsp; entity directly.

We can't get_option( 'blog_charset' ) here, because the database may not be available.

Fixes #10373.

#35 @Reynard
17 months ago

  • Severity changed from normal to minor

I would like to add this, yet I am not as familiar with what to do in this situation.

Should I just donwload functions.php.diff and 10373.3.diff and copy and paste the included code unto functions.php and locale.php?

Thank you!

#36 follow-up: @grapplerulrich
17 months ago

@Reynard - From the looks of it only was edited. locale.php. You can just copy the whole file here. https://github.com/WordPress/WordPress/blob/master/wp-includes/locale.php It seems like this will be released in WP 4.5.

#37 @ocean90
17 months ago

  • Severity changed from minor to normal

#38 in reply to: ↑ 36 @Reynard
17 months ago

Replying to grapplerulrich:

@Reynard - From the looks of it only was edited. locale.php. You can just copy the whole file here. https://github.com/WordPress/WordPress/blob/master/wp-includes/locale.php It seems like this will be released in WP 4.5.

Hm, I see. I was not able to accomplish anything from it. Just to recap. Would this modification allow me to present numbers which have decimals as commas? For example: 1,280,000.00 would become: 1.280.000 ?

Note: See TracTickets for help on using tickets.