Opened 7 years ago

Closed 5 years ago

#4984 closed defect (bug) (fixed)

RSS Widget does not change encoding of RSS feed

Reported by: camilohollanda
Owned by: widgets
Milestone: 2.8
Priority: normal
Severity: normal
Version: 2.2.3
Component: I18N
Keywords: reporter-feedback
Focuses:
Cc:

Description

The RSS widget does not detect the encoding of the imported RSS feed; it keeps the encoding of the original feed. Since my site uses UTF-8, ISO-encoded feeds come out corrupted. So I adapted my widgets.php:

// Returns 1 if $string contains at least one valid multi-byte UTF-8 sequence, 0 otherwise.
function detectUTF8($string) {
    return preg_match('%(?:
        [\xC2-\xDF][\x80-\xBF]              # non-overlong 2-byte
        |\xE0[\xA0-\xBF][\x80-\xBF]         # excluding overlongs
        |[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}  # straight 3-byte
        |\xED[\x80-\x9F][\x80-\xBF]         # excluding surrogates
        |\xF0[\x90-\xBF][\x80-\xBF]{2}      # planes 1-3
        |[\xF1-\xF3][\x80-\xBF]{3}          # planes 4-15
        |\xF4[\x80-\x8F][\x80-\xBF]{2}      # plane 16
        )+%xs', $string);
}

// utf8_encode() converts ISO-8859-1 to UTF-8; apply it only when no UTF-8 sequence was found.
if (!detectUTF8($desc)) { $desc = utf8_encode($desc); }
if (!detectUTF8($summary)) { $summary = utf8_encode($summary); }
if (!detectUTF8($title)) { $title = utf8_encode($title); }

But I think the ideal would be for the widget to convert the feed to the encoding chosen in the Options menu of the Admin. In my case, UTF-8.

Sorry for my poor English,
Camilo

Change History (5)

comment:1 Otto42 7 years ago

The detectUTF8 function is nice (it comes from here: http://www.php.net/manual/en/function.mb-detect-encoding.php ), but wouldn't the RSS document actually specify its own encoding? It seems to me that simply reading the encoding the XML declares would make more sense than trying to detect it.
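
Reading the declared encoding, as suggested here, could look roughly like the sketch below. The function name `declaredFeedEncoding` and the UTF-8 fallback are assumptions made for illustration, not code from WordPress. The sketch only inspects the XML declaration and ignores any charset sent in an HTTP Content-Type header.

```php
<?php
/**
 * Hypothetical helper: return the encoding named in the XML declaration
 * of a feed, or $default when the declaration does not name one.
 */
function declaredFeedEncoding($xml, $default = 'UTF-8') {
    // Look for the encoding="..." pseudo-attribute of the XML declaration
    // at the very start of the document.
    $pattern = '/^\s*<\?xml[^>]*\bencoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']/i';
    if (preg_match($pattern, $xml, $m)) {
        return strtoupper($m[1]);
    }
    // No declared encoding: fall back to the assumed default.
    return $default;
}

$feed = '<?xml version="1.0" encoding="iso-8859-1"?><rss version="2.0"></rss>';
echo declaredFeedEncoding($feed), "\n"; // prints ISO-8859-1
```

A feed that declares the wrong encoding would still be mishandled by this approach, which is where a content-based check such as detectUTF8 could serve as a fallback.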

comment:2 foolswisdom 7 years ago

  • Milestone changed from 2.2.3 to 2.4
  • Version set to 2.2.3

comment:3 Denis-de-Bernardy 7 years ago

True, but the point remains valid (I've seen this kind of bug a few times as well).

Currently, the content of feeds whose encoding differs from the blog's gets rendered with the blog's encoding.

What should happen instead is that, when the feed's encoding differs from the blog's, the content gets converted to the blog's encoding before being processed any further, and in particular before being rendered.
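
The conversion step described here might be sketched as follows. The helper name `convertToBlogCharset` is hypothetical, and the sketch assumes the mbstring extension is available. In WordPress the target charset would typically be read with `get_option('blog_charset')`; here it is passed in as a parameter so the example runs on its own.

```php
<?php
/**
 * Hypothetical helper: convert feed text from the feed's encoding to the
 * blog's encoding, leaving it untouched when the two already match.
 * Assumes the mbstring extension is loaded.
 */
function convertToBlogCharset($text, $feedCharset, $blogCharset = 'UTF-8') {
    // Encoding names are case-insensitive, so compare them that way.
    if (strcasecmp($feedCharset, $blogCharset) === 0) {
        return $text; // same encoding, nothing to convert
    }
    // mb_convert_encoding(string, to_encoding, from_encoding)
    return mb_convert_encoding($text, $blogCharset, $feedCharset);
}

// "Ola" with an acute accent on the a, encoded as ISO-8859-1 (single byte 0xE1).
$title = "Ol\xE1";
$converted = convertToBlogCharset($title, 'ISO-8859-1', 'UTF-8');
echo bin2hex($converted), "\n"; // prints 4f6cc3a1
```

Converting once, right after the feed is fetched, would keep every later step (truncation, escaping, rendering) working on text in a single known encoding.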

comment:4 darkdragon 6 years ago

  • Component changed from General to i18n
  • Summary changed from Bug in RSS Widget to RSS Widget does not change encoding of RSS feed
  • Type changed from task to defect

comment:5 Denis-de-Bernardy 5 years ago

  • Keywords reporter-feedback added; widgets utf iso removed
  • Milestone changed from 2.9 to 2.8
  • Resolution set to fixed
  • Status changed from new to closed

Please reopen with feedback if this occurs with SimplePie.
