Make WordPress Core

Opened 8 years ago

Last modified 5 weeks ago

#6269 new defect (bug)

RSS Import Doesn't Properly Strip CDATA Tags

Reported by: sweetdeal
Owned by:
Milestone: WordPress.org
Priority: normal
Severity: normal
Version: 2.3.3
Component: Import
Keywords: dev-feedback has-patch
Focuses:
Cc:


When importing an RSS feed that uses the <description> tag as opposed to <content:encoded>, I noticed that WP's RSS import doesn't strip the CDATA tags as it does for the <content:encoded>.

=========Code Lines (83-87)===============

if (!$post_content) {
    // This is for feeds that put content in description
    preg_match('|<description>(.*?)</description>|is', $post, $post_content);
    $post_content = $wpdb->escape($this->unhtmlentities(trim($post_content[1])));
}


I tweaked the code to solve the problem (see below)

==========Tweaked Code===============

if (!$post_content) {
    // This is for feeds that put content in description
    preg_match('|<description>(.*?)</description>|is', $post, $post_content);
    // Strip the CDATA markers left around the description content
    $post_content = str_replace(array('<![CDATA[', ']]>'), '', $wpdb->escape($this->unhtmlentities(trim($post_content[1]))));
}
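
To see what the tweak does in isolation, here is a minimal sketch. The helper name strip_cdata_markers and the sample feed item are hypothetical, and the importer's $wpdb->escape() and unhtmlentities() calls are left out so that the snippet runs on its own.

```php
<?php
// Hypothetical helper: not part of the importer, just the str_replace() from the tweak.
function strip_cdata_markers( $text ) {
    return str_replace( array( '<![CDATA[', ']]>' ), '', $text );
}

// Hypothetical sample item, shaped like a feed that puts its content in <description>.
$post = '<item><title>Hello</title><description><![CDATA[<p>Some content</p>]]></description></item>';

// Same pattern the importer uses to pull out the description.
preg_match( '|<description>(.*?)</description>|is', $post, $post_content );

$raw     = trim( $post_content[1] );
$cleaned = strip_cdata_markers( $raw );

echo $raw, "\n";     // still wrapped in the CDATA markers
echo $cleaned, "\n"; // markers removed, inner HTML left intact
```

Note that this removes the marker strings wherever they occur in the text, not only at the start and end of the description.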


I'd be happy to submit a patch, except I'm not quite that savvy yet. It would be great if someone could incorporate it. Thanks.

Attachments (3)

rss.php (5.0 KB) - added by sweetdeal 8 years ago.
tweaked rss.php (import) file
6269.patch (615 bytes) - added by Viper007Bond 8 years ago.
.diff file of sweetdeal's rss.php modifications
rss-importer-rewrite.diff (19.7 KB) - added by solarissmoke 5 years ago.


Change History (24)

@sweetdeal 8 years ago

tweaked rss.php (import) file

comment:1 @sweetdeal 8 years ago

Just an update -- I became savvy a few seconds after writing this and uploaded the tweaked rss.php file.

@Viper007Bond 8 years ago

.diff file of sweetdeal's rss.php modifications

comment:2 @Viper007Bond 8 years ago

I'm not for or against this patch; I merely made a diff of the original vs. sweetdeal's copy, as that's the preferred patch type.

comment:3 @sweetdeal 8 years ago

Thanks Viper007. :)

comment:4 @sweetdeal 8 years ago

  • Milestone changed from 2.6 to 2.5.1

comment:5 @Nazgul 8 years ago

  • Keywords has-patch added
  • Milestone changed from 2.5.1 to 2.6

Resetting the milestone to 2.6.

Fixes first go into the current development release and if they're deemed important/critical enough will be backported for maintenance releases.

comment:6 @thee17 7 years ago

  • Component changed from General to Import
  • Owner anonymous deleted

comment:7 @Denis-de-Bernardy 6 years ago

  • Keywords needs-patch added; has-patch removed

IMO, it would be better to add the CDATA handling to the regexp itself and make it optional, e.g.:


comment:8 @Denis-de-Bernardy 6 years ago

forgot the ! in the above regex, too.
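
One way to make the CDATA wrapper optional inside the pattern is sketched below. This is not the regex from comment:7, only an illustration of the idea; the function name extract_description and the sample inputs are hypothetical.

```php
<?php
// Hypothetical helper illustrating an optional CDATA wrapper inside the pattern.
function extract_description( $post ) {
    // (?:<!\[CDATA\[)? and (?:\]\]>)? match the CDATA markers only when they are present.
    $pattern = '|<description>\s*(?:<!\[CDATA\[)?(.*?)(?:\]\]>)?\s*</description>|is';
    if ( ! preg_match( $pattern, $post, $matches ) ) {
        return '';
    }
    return $matches[1];
}

// Hypothetical sample items: one with a CDATA wrapper, one without.
$with_cdata    = '<item><description><![CDATA[<p>Hi</p>]]></description></item>';
$without_cdata = '<item><description>Plain text</description></item>';

echo extract_description( $with_cdata ), "\n";
echo extract_description( $without_cdata ), "\n";
```

Compared with a str_replace() after the match, this keeps the stripping in one place and only touches markers that sit directly inside the <description> element.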

comment:9 @azaozz 6 years ago

  • Milestone changed from 2.9 to Future Release

comment:10 @solarissmoke 5 years ago

  • Keywords dev-feedback added; rss import removed

I think the entire RSS importer needs an overhaul - it should stop parsing XML using regexes and switch to simplexml like the WordPress importer ( #5460 ). I'm happy to spend some time on this, if there is traction?
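
As a rough illustration of why an XML parser sidesteps the CDATA problem, here is a minimal sketch using PHP's SimpleXML extension. The sample feed is hypothetical, the snippet assumes the simplexml extension is loaded, and it is not the code from any patch on this ticket.

```php
<?php
// Hypothetical sample feed; a real importer would read the uploaded file instead.
$feed = <<<XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example</title>
    <item>
      <title>First post</title>
      <description><![CDATA[<p>Wrapped in CDATA</p>]]></description>
    </item>
    <item>
      <title>Second post</title>
      <description>&lt;p&gt;Entity-encoded&lt;/p&gt;</description>
    </item>
  </channel>
</rss>
XML;

$xml = simplexml_load_string( $feed );
if ( false === $xml ) {
    exit( "Could not parse the feed\n" );
}

$descriptions = array();
foreach ( $xml->channel->item as $item ) {
    // Casting to string returns the node's text, whether or not it was in a CDATA section.
    $descriptions[] = trim( (string) $item->description );
}

print_r( $descriptions );
```

Both items come out as plain HTML strings, so no separate CDATA stripping or entity decoding step is needed for the description.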

comment:11 @nacin 5 years ago

I'm game for that.

comment:12 @solarissmoke 5 years ago

Cool. I'll work on it and ping you once it's ready. There are a few other tickets related to RSS import which I'll try and deal with as well.

comment:13 @Viper007Bond 5 years ago

It'll be nice to ditch all of this PHP4 code in favor of PHP5 sexiness.

comment:14 @solarissmoke 5 years ago

  • Keywords has-patch added; needs-patch removed

Right, here is an overhauled rss-importer. It has a similar structure to wordpress-importer. It also addresses #7061 and #8982. It will probably need a bit of testing, as the parser is written from scratch.

Last edited 5 years ago by solarissmoke

comment:15 @duck_ 5 years ago

Looks like you accidentally left in some testing code, the false &&:

if ( false && extension_loaded( 'simplexml' ) ) {

comment:16 @solarissmoke 5 years ago

Whoops, yes! I'll replace it in a sec.

comment:17 @solarissmoke 5 years ago

This should also fix #9678, if there is indeed a bug there.

comment:18 @SergeyBiryukov 3 years ago

  • Milestone changed from Future Release to WordPress.org

comment:19 @Viper007Bond 3 years ago

  • Milestone changed from WordPress.org to Future Release

Unless I'm mistaken, the WordPress.org milestone is for things affecting the actual WordPress.org website. This ticket is about WordPress the software, not WordPress the website.

comment:20 @SergeyBiryukov 3 years ago

  • Milestone changed from Future Release to WordPress.org

"Future Release" only applies to core, and importers are no longer a part of the core.

A special milestone is WordPress.org. In this milestone, we manage tickets for core plugins such as the importers (under the Plugins or Import components), current and former default themes (Bundled Theme component), and the WordPress.org site (component of the same name).


comment:21 @chriscct7 5 weeks ago

  • Priority changed from low to normal
  • Severity changed from minor to normal