Opened 15 years ago
Closed 13 years ago
#12251 closed defect (bug) (worksforme)
mb_substr() behaves strangely in some environments.
Reported by:
Owned by:
Milestone:
Priority: high
Severity: normal
Version: 2.9.2
Component: Charset
Keywords:
Focuses:
Cc:
Description
http://wordpress.org/support/topic/357562
First of all, this is not a P2 theme bug; this is a WP core bug.
I use the English version of WordPress, but I write posts in Korean.
To summarize the link above: mb_substr() and mb_strlen() malfunction on non-English characters because the encoding parameter is not specified. This happens when the PHP MB extension is enabled, because the backward-compatibility function _mb_substr() automatically assumes the encoding is UTF-8, but the extension does not.
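The difference can be reproduced outside WordPress with a minimal sketch like the following. It assumes the mbstring extension is loaded and that the file is saved as UTF-8; the ISO-8859-1 setting is only a stand-in for a server whose mbstring default is not UTF-8, and the variable names are illustrative.

```php
<?php
// Assumption: mbstring is available and this file is saved as UTF-8.
$text = '안녕하세요'; // five Korean syllables, three bytes each in UTF-8

// Stand-in for a server whose mbstring internal encoding is not UTF-8.
mb_internal_encoding( 'ISO-8859-1' );

// Without the encoding argument, every byte is treated as one character.
$implicit_len = mb_strlen( $text );
$implicit_cut = mb_substr( $text, 0, 2 ); // cut falls inside the first syllable

// With the encoding argument, whole characters are counted and kept.
$explicit_len = mb_strlen( $text, 'UTF-8' );
$explicit_cut = mb_substr( $text, 0, 2, 'UTF-8' );

var_dump( $implicit_len, $explicit_len );
var_dump( mb_check_encoding( $implicit_cut, 'UTF-8' ) ); // is the implicit cut still valid UTF-8?
var_dump( $explicit_cut === '안녕' );
```

A cut that lands in the middle of a multibyte sequence is what a browser renders as the Unicode replacement character.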
The following changes should be applied in 2.9.3:
FILE: /wp-admin/includes/plugin-install.php, line 332
FROM:
$description = mb_substr($description, 0, 400) . '…';
TO:
$description = mb_substr($description, 0, 400, 'UTF-8') . '…';
FILE: /wp-admin/includes/post.php, line 1037
FROM:
if ( mb_strlen($post_name) > 30 ) {
TO:
if ( mb_strlen($post_name, 'UTF-8') > 30 ) {
FILE: /wp-admin/includes/post.php, line 1038
FROM:
$post_name_abridged = mb_substr($post_name, 0, 14). '…' . mb_substr($post_name, -14);
TO:
$post_name_abridged = mb_substr($post_name, 0, 14, 'UTF-8'). '…' . mb_substr($post_name, -14, 14, 'UTF-8');
FILE: /wp-includes/formatting.php, line 2708
FROM:
$str = mb_substr( $str, 0, $count );
TO:
$str = mb_substr( $str, 0, $count, 'UTF-8' );
Personally, I find this very inconvenient: it is stressful to keep seeing the Unicode replacement character.
Change History (12)
#3
15 years ago
This is not a regression. I'm moving it to 3.0 for now. Any commits can be backported to the 2.9 and 2.8 branches if desired.
#5
15 years ago
- Milestone changed from 3.0 to 3.1
Using mb_internal_encoding() sounds like a good option to me, but I don't know enough about character sets and I'm not in a position to test it well enough.
I'm moving this to 3.1, as it's not something I would feel comfortable changing close to the end of a dev cycle. If someone who understands the underlying issues can test this and take it on, I'm all for this approach.
#6
15 years ago
WordPress 3.0 already uses mb_internal_encoding() in wp-includes/load.php. This ticket is probably fixed now.
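As a rough illustration of that approach (not the code in wp-includes/load.php), the sketch below sets the mbstring internal encoding once, early, and falls back to UTF-8 when the requested charset is rejected. The function name example_set_internal_encoding() and its $charset parameter are hypothetical; in WordPress the charset would come from the site's settings.

```php
<?php
/**
 * Hypothetical helper: set the mbstring internal encoding once, early.
 * Falls back to UTF-8 if $charset is empty or not recognized by mbstring.
 */
function example_set_internal_encoding( $charset ) {
	if ( ! function_exists( 'mb_internal_encoding' ) ) {
		return; // mbstring is not loaded; nothing to configure.
	}

	try {
		// Older PHP versions return false for an unknown encoding.
		$ok = $charset && @mb_internal_encoding( $charset );
	} catch ( \Throwable $e ) {
		// PHP 8 throws instead of returning false.
		$ok = false;
	}

	if ( ! $ok ) {
		mb_internal_encoding( 'UTF-8' );
	}
}

// An unknown charset falls back to UTF-8, so calls that omit the
// encoding argument count characters rather than bytes.
example_set_internal_encoding( 'no-such-charset' );
var_dump( mb_strlen( '안녕하세요' ) );
```

With the internal encoding set this way, calls such as mb_substr( $str, 0, $count ) no longer depend on the server's mbstring default, which is why the per-call 'UTF-8' argument becomes unnecessary.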
#8
15 years ago
- Keywords reporter-feedback added
- Milestone changed from Awaiting Triage to Awaiting Review
wp_set_internal_encoding() is new, but the code isn't.
#10
13 years ago
- Keywords reporter-feedback removed
- Resolution set to fixed
- Status changed from new to closed
I think this issue can be closed as fixed. If it is still reproducible, feel free to reopen it.
#11
13 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
mb_internal_encoding() was added in [7140], two years before this ticket was created:
http://core.trac.wordpress.org/browser/tags/2.5/wp-settings.php#L354
I guess "worksforme" would be a proper resolution.
I’d rather see something like this very early in WP: