Make WordPress Core

Opened 11 years ago

Closed 10 years ago

#28162 closed defect (bug) (fixed)

seems_utf8() fails with mbstring.func_overload enabled

Reported by: sergeybiryukov's profile SergeyBiryukov Owned by: sergeybiryukov's profile SergeyBiryukov
Milestone: 4.0 Priority: normal
Severity: normal Version: 1.2.1
Component: Formatting Keywords: has-patch commit
Focuses: Cc:

Description

seems_utf8() tests fail with mbstring.func_overload enabled:

S:\home\wordpress\develop>phpunit tests/phpunit/tests/formatting/SeemsUtf8.php
Installing...
Running as single site... To run multisite, use -c tests/phpunit/multisite.xml
Not running ajax tests... To execute these, use --group ajax.
PHPUnit 3.6.11 by Sebastian Bergmann.

Configuration read from S:\home\wordpress\develop\phpunit.xml.dist

..F.FI

Time: 15 seconds, Memory: 24.25Mb

There were 2 failures:

1) Tests_Formatting_SeemsUtf8::test_returns_true_for_utf8_strings with data set #2 ('საქართველო')

This also leads to broken post slugs:

  1. Create a post titled "абвгдеёжзийклмнопрстуфхцчшщъыьэюя".
  2. Note the auto-generated slug: "dhdhdhdhdhdhundhdhdhdhdhdhdhdhdhdhnennnfnnnnnnnsnnoennzn".

Attachments (7)

28162.patch (519 bytes) - added by SergeyBiryukov 11 years ago.
28162.2.patch (780 bytes) - added by SergeyBiryukov 11 years ago.
28162.3.patch (2.1 KB) - added by SergeyBiryukov 11 years ago.
28162.4.patch (3.4 KB) - added by SergeyBiryukov 11 years ago.
28162.5.patch (3.7 KB) - added by SergeyBiryukov 10 years ago.
28162.6.patch (3.8 KB) - added by SergeyBiryukov 10 years ago.
Refreshed
28162.7.patch (3.1 KB) - added by SergeyBiryukov 10 years ago.

Download all attachments as: .zip

Change History (18)

#1 @SergeyBiryukov
11 years ago

  • Keywords has-patch commit added

28162.patch uses mbstring_binary_safe_encoding() before calling strlen(), which fixes both the tests and the slug generation.

#2 follow-up: @SergeyBiryukov
11 years ago

mbstring.func_overload also reduces our usual 33 character limit for non-English slugs to 16 characters:

  1. Create a post titled "абвгдеёжзийклмнопрстуфхцчшщъыьэюя".
  2. Note the auto-generated slug (with the patch): "абвгдеёжзийклмно".

Leaving a note here to create a separate ticket for that.

#3 in reply to: ↑ 2 @SergeyBiryukov
11 years ago

Replying to SergeyBiryukov:

Leaving a note here to create a separate ticket for that.

Or we could fix both issues in one go. utf8_uri_encode() needs the same change: 28162.2.patch.

#4 @SergeyBiryukov
11 years ago

utf8_uri_encode() tests currently fail with mbstring.func_overload enabled too.

S:\home\wordpress\develop>phpunit tests/phpunit/tests/formatting/Utf8UriEncode.php
Installing...
Running as single site... To run multisite, use -c tests/phpunit/multisite.xml
Not running ajax tests... To execute these, use --group ajax.
PHPUnit 3.6.11 by Sebastian Bergmann.

Configuration read from S:\home\wordpress\develop\phpunit.xml.dist

FFFFF.....

Time: 9 seconds, Memory: 24.50Mb

There were 5 failures:

1) Tests_Formatting_Utf8UriEncode::test_percent_encodes_non_reserved_characters with data set #0 ('章子怡', '%e7%ab%a0%e5%ad%90%e6%80%a1')
%a1')
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'%e7%ab%a0%e5%ad%90%e6%80%a1'
+'%e7%ab%a0'

S:\home\wordpress\develop\tests\phpunit\tests\formatting\Utf8UriEncode.php:15
S:\usr\local\php5\phpunit:46

...

28162.2.patch fixes them.

#5 @SergeyBiryukov
11 years ago

28162.4.patch also fixes failures in Tests_Image_Meta::test_utf8_iptc_tags() and Tests_POMO_MO::test_export_mo_file():

There were 2 failures:

1) Tests_Image_Meta::test_utf8_iptc_tags
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'This is a comment. / Это комментарий. / Βλέπετε ένα σχόλιο.'
+''

S:\home\wordpress\develop\tests\phpunit\tests\image\meta.php:130
S:\usr\local\php5\phpunit:46

S:\usr\local\php5\phpunit:46

2) Tests_POMO_MO::test_export_mo_file
Failed asserting that two objects are equal.
--- Expected
+++ Actual
@@ @@
     'translations' => Array (
-        0 => 'розов'
+        0 => 'пу'
     )
     'translator_comments' => ''
     'extracted_comments' => ''
     'references' => Array ()
     'flags' => Array ()
 )

S:\home\wordpress\develop\tests\phpunit\tests\pomo\mo.php:96
S:\usr\local\php5\phpunit:46

#6 @SergeyBiryukov
11 years ago

To summarize, I get 12 failures when running test suite with mbstring.func_overload enabled without a patch, and zero failures with 28162.4.patch.

Looking at the patch, it would be nice if we had a binary-safe wrapper for strlen(), similar to the one we have in POMO_Reader: tags/3.9/src/wp-includes/pomo/streams.php#L70.

#7 @SergeyBiryukov
10 years ago

28162.5.patch adds a binary-safe wrapper for strlen().

@SergeyBiryukov
10 years ago

Refreshed

#8 @SergeyBiryukov
10 years ago

  • Owner set to SergeyBiryukov
  • Resolution set to fixed
  • Status changed from new to closed

In 28806:

Introduce a binary-safe wrapper for strlen() and use it in seems_utf8(), utf8_uri_encode(), and wp_read_image_metadata().

Use binary-safe POMO_Reader::strlen() in MO::export_to_file_handle().

fixes #28162.

#9 @nacin
10 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

This looks good. Is it worth re-considering 28162.4.patch? That'd be the patch without a wrapper. It's minimally used; and often having a wrapper will result in performance degradation as it'll get used in a loop. That doesn't occur here within these functions, but it could occur elsewhere in the future.

#10 @SergeyBiryukov
10 years ago

It seemed a bit cleaner, but I see your point. 28162.7.patch removes the wrapper.

#11 @SergeyBiryukov
10 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 28808:

Remove mbstring_binary_safe_strlen(). Use mbstring_binary_safe_encoding() and reset_mbstring_encoding() directly.

fixes #28162.

Note: See TracTickets for help on using tickets.