﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	severity	resolution	keywords	cc
19368	UTF-8 characters truncated mid-byte sequence in excerpt in RSS2 feed	kurtmckee		"I received [https://code.google.com/p/feedparser/issues/detail?id=306 a bug report at a project I maintain] and discovered what appears to be a bug in Wordpress 3.2.1.

The trouble is that the `description` element is being truncated in the middle of a UTF-8 multibyte character, which is producing garbage binary data. An example can be found at:

http://www.arnaudmontebourg.fr/?feed=rss2

I downloaded [http://themocracy.com/2009/07/alibi3col-free-wordpress-theme/ the site's theme] but found nothing that would affect `post_excerpt` or `the_excerpt_rss`. I then downloaded Wordpress trunk and attempted to figure out where the problem might be, but I'm unfamiliar with the Wordpress source and couldn't find anything after tracing through multiple files using grep.

I did discover that `trackback_url_list()` in `wp-includes/post.php` appears to be using a simple `substr()` call that might cause problems with multibyte characters. However, I'm more concerned with the potential for malformed feeds.

I've included a copy of the feed XML in question for longevity."	defect (bug)	new	normal	Awaiting Review	Feeds		normal			
