WordPress.org

Make WordPress Core

Opened 5 years ago

Closed 5 years ago

#9616 closed defect (bug) (fixed)

Trunk throws warnings on php4 sites

Reported by: Denis-de-Bernardy
Owned by:
Milestone: 2.8
Priority: high
Severity: blocker
Version: 2.8
Component: Warnings/Notices
Keywords: has-patch needs-testing dev-feedback
Focuses:
Cc:

Description

On my customer's site, the warnings were triggered by RSS widgets fetching eBay feeds, but there are other calls to html_entity_decode() elsewhere in the code.

Symptom:

Trying to decode HTML entities into UTF-8 results in the following error message:

Warning: cannot yet handle MBCS in html_entity_decode()!

The line is repeated about 200 times, and html_entity_decode() then just falls back to the ISO-8859-1 charset.

The related php bugs:

It also occurs in bbPress:

http://bbpress.org/forums/topic/warning-cannot-yet-handle-mbcs-in-html_entity_decode

Based on this thread, there is a potential fix:

Attachments (3)

4-21-2009 10-56-22 PM.png (19.5 KB) - added by Denis-de-Bernardy 5 years ago.
another example…
String.inc.php (24.2 KB) - added by Denis-de-Bernardy 5 years ago.
on a separate note, this class has a couple of useful functions
9616.diff (5.7 KB) - added by Denis-de-Bernardy 5 years ago.

Change History (10)

comment:1 Denis-de-Bernardy, 5 years ago

  • Keywords needs-patch dev-feedback added

comment:3 Denis-de-Bernardy, 5 years ago

yeah, I'm aware, but wp is still supporting it. plus, it might fix your plugin too. ;-)

comment:4 ryan, 5 years ago

  • Component changed from General to Warnings/Notices
  • Owner anonymous deleted

comment:5 Denis-de-Bernardy, 5 years ago

Their implementation is in classes/core/String.inc.php, and goes like this:

function html2utf($str) {
	// convert named entities to numeric entities
	$str = strtr($str, String::getHTMLEntities());

	// use PCRE-aware replace function to replace numeric entities
	$str = String::regexp_replace('~&#x([0-9a-f]+);~ei', 'String::code2utf(hexdec("\\1"))', $str);
	$str = String::regexp_replace('~&#([0-9]+);~e', 'String::code2utf(\\1)', $str);

	return $str;
 }

Side note: as far as I'm aware, strtr() is not multibyte-safe, so the above code is potentially buggy with multibyte character sets.
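
For comparison, here is a minimal sketch of just the numeric-entity step, written with preg_replace_callback() instead of the `e` regex modifier (which later PHP versions removed). The names code2utf_sketch() and numeric_entities_to_utf8() are hypothetical and are not part of String.inc.php or WordPress. The named-entity step is left out, and the closures assume PHP 5.3 or later, so this would not run on php4 as-is.

```php
<?php
// Hypothetical helper: encode a Unicode code point as a UTF-8 byte string.
function code2utf_sketch($num) {
	if ($num < 0x80) {
		return chr($num);
	}
	if ($num < 0x800) {
		return chr(0xC0 | ($num >> 6)) . chr(0x80 | ($num & 0x3F));
	}
	if ($num < 0x10000) {
		if ($num >= 0xD800 && $num <= 0xDFFF) {
			return ''; // surrogate halves are not valid code points
		}
		return chr(0xE0 | ($num >> 12))
			. chr(0x80 | (($num >> 6) & 0x3F))
			. chr(0x80 | ($num & 0x3F));
	}
	if ($num < 0x110000) {
		return chr(0xF0 | ($num >> 18))
			. chr(0x80 | (($num >> 12) & 0x3F))
			. chr(0x80 | (($num >> 6) & 0x3F))
			. chr(0x80 | ($num & 0x3F));
	}
	return ''; // outside the Unicode range: drop it
}

// Hypothetical function: replace numeric entities with UTF-8 bytes.
function numeric_entities_to_utf8($str) {
	// hexadecimal entities such as &#xE9;
	$str = preg_replace_callback('~&#x([0-9a-f]+);~i', function ($m) {
		return code2utf_sketch(hexdec($m[1]));
	}, $str);
	// decimal entities such as &#233;
	$str = preg_replace_callback('~&#([0-9]+);~', function ($m) {
		return code2utf_sketch((int) $m[1]);
	}, $str);
	return $str;
}
```

Because this only matches ASCII patterns and emits UTF-8 bytes, it does not call html_entity_decode() at all, so the MBCS warning cannot be triggered by it.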

comment:7 ryan, 5 years ago

  • Resolution set to fixed
  • Status changed from new to closed

(In [11081]) Silence html_entity_decode warnings. Props Denis-de-Bernardy. fixes #9616
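
The changeset itself is not reproduced on this page, so the following is only a sketch of one common way to silence a warning from a single call in PHP: the @ error-control operator. The wrapper name quiet_entity_decode() is hypothetical and is not a WordPress function. Whether [11081] took exactly this approach is an assumption.

```php
<?php
// Hypothetical wrapper, not part of WordPress.
function quiet_entity_decode($text, $charset = 'UTF-8') {
	// The @ operator suppresses warnings raised by this one call only;
	// the decoded string is returned unchanged.
	return @html_entity_decode($text, ENT_QUOTES, $charset);
}
```

Note that this hides the message but does not change what the function returns. On a PHP build that cannot handle multibyte charsets in html_entity_decode(), the ISO-8859-1 fallback described in the ticket would presumably still apply.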