Opened 17 years ago
Last modified 6 years ago
#6269 new defect (bug)
RSS Import Doesn't Properly Strip CDATA Tags
Reported by: | sweetdeal | Owned by: | |
---|---|---|---|
Milestone: | WordPress.org | Priority: | normal |
Severity: | normal | Version: | 2.3.3 |
Component: | Import | Keywords: | dev-feedback has-patch |
Focuses: | | Cc: | |
Description
When importing an RSS feed that uses the <description> tag instead of <content:encoded>, I noticed that WP's RSS importer doesn't strip the CDATA markers from <description> the way it does for <content:encoded>.
=========Code Lines (83-87)===============
if (!$post_content) {
    // This is for feeds that put content in description
    preg_match('|<description>(.*?)</description>|is', $post, $post_content);
    $post_content = $wpdb->escape($this->unhtmlentities(trim($post_content[1])));
}
=====================================
I tweaked the code to solve the problem (see below)
==========Tweaked Code===============
if (!$post_content) {
    // This is for feeds that put content in description
    preg_match('|<description>(.*?)</description>|is', $post, $post_content);
    $post_content = str_replace(array('<![CDATA[', ']]>'), '', $wpdb->escape($this->unhtmlentities(trim($post_content[1]))));
}
======================================
I'd be happy to submit a patch, except I'm not quite that savvy yet. It would be great if someone could incorporate it. Thanks.
Attachments (3)
Change History (24)
#1
@
17 years ago
Just an update -- I became savvy a few seconds after writing this and uploaded the tweaked rss.php file.
#2
@
17 years ago
I'm not for or against this patch, I merely made a diff of the original vs. sweetdeal's copy as that's the preferred patch type.
#5
@
17 years ago
- Keywords has-patch added
- Milestone changed from 2.5.1 to 2.6
Resetting the milestone to 2.6.
Fixes first go into the current development release and if they're deemed important/critical enough will be backported for maintenance releases.
#7
@
16 years ago
- Keywords needs-patch added; has-patch removed
IMO it would be better to build the CDATA markers into the regexp as optional groups, e.g.:
|<description>(?:<!\[CDATA\[)?(.*?)(?:\]\]>)?</description>|is
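To illustrate the suggestion, here is a minimal standalone sketch of a pattern with optional CDATA groups. It assumes the CDATA opener is spelled <![CDATA[ (with the exclamation mark), and extract_description() is a hypothetical helper for this example, not a function in the importer.

```php
<?php
// Hypothetical helper (not part of the RSS importer): return the body of the
// first <description> element, whether or not it is wrapped in CDATA.
function extract_description( $item ) {
    // Both CDATA markers are optional, non-capturing groups, so only the
    // content itself ends up in $matches[1].
    $pattern = '|<description>(?:<!\[CDATA\[)?(.*?)(?:\]\]>)?</description>|is';
    if ( ! preg_match( $pattern, $item, $matches ) ) {
        return null; // No <description> element found.
    }
    return trim( $matches[1] );
}

$plain = '<item><description>Hello world</description></item>';
$cdata = '<item><description><![CDATA[Hello <b>world</b>]]></description></item>';

echo extract_description( $plain ), "\n"; // Hello world
echo extract_description( $cdata ), "\n"; // Hello <b>world</b>
```

Because the capture group is lazy, it stops as soon as the optional ]]> group and the closing tag can match, so the markers are left out of the captured text without a separate str_replace() pass.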
#10
@
14 years ago
- Keywords dev-feedback added; rss import removed
I think the entire RSS importer needs an overhaul: it should stop parsing XML with regexes and switch to SimpleXML, like the WordPress importer ( #5460 ). I'm happy to spend some time on this if there is interest.
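As a rough sketch of what a SimpleXML-based approach could look like (this is not the patch attached to this ticket), the following reads each item of an RSS 2.0 feed and falls back from <content:encoded> to <description>. parse_rss_items() is a hypothetical name, and the code assumes the feed uses the standard content module namespace URI.

```php
<?php
// Hypothetical helper: return the title and body of every <item> in an
// RSS 2.0 feed. The XML parser handles CDATA sections, so no marker
// stripping is needed.
function parse_rss_items( $xml_string ) {
    libxml_use_internal_errors( true ); // Collect parse errors instead of emitting warnings.
    // LIBXML_NOCDATA merges CDATA sections into ordinary text nodes.
    $feed = simplexml_load_string( $xml_string, 'SimpleXMLElement', LIBXML_NOCDATA );
    if ( false === $feed ) {
        return array(); // Malformed XML.
    }

    $items = array();
    foreach ( $feed->channel->item as $item ) {
        // Prefer <content:encoded>; fall back to <description>.
        $content = $item->children( 'http://purl.org/rss/1.0/modules/content/' );
        $body    = (string) $content->encoded;
        if ( '' === $body ) {
            $body = (string) $item->description;
        }
        $items[] = array(
            'title' => (string) $item->title,
            'body'  => trim( $body ),
        );
    }
    return $items;
}

$feed = <<<XML
<?xml version="1.0"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
<item><title>A</title><description><![CDATA[From <b>description</b>]]></description></item>
<item><title>B</title><description>short</description><content:encoded><![CDATA[<p>Full</p>]]></content:encoded></item>
</channel>
</rss>
XML;

foreach ( parse_rss_items( $feed ) as $post ) {
    echo $post['title'], ': ', $post['body'], "\n";
}
```

A real importer would still need to escape the values before inserting them into the database and to handle dates, categories and other feed formats; those parts are left out here.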
#12
@
14 years ago
Cool. I'll work on it and ping you once it's ready. There are a few other tickets related to RSS import which I'll try and deal with as well.
#14
@
14 years ago
- Keywords has-patch added; needs-patch removed
#15
@
14 years ago
Looks like you accidentally left in some testing code, the false && in this line:
if ( false &&extension_loaded( 'simplexml' ) ) {
#19
@
12 years ago
- Milestone changed from WordPress.org to Future Release
Unless I'm mistaken, the WordPress.org milestone is for things affecting the actual WordPress.org website. This ticket is about WordPress the software, not WordPress the website.
#20
@
12 years ago
- Milestone changed from Future Release to WordPress.org
"Future Release" only applies to core, and importers are no longer a part of the core.
A special component is WordPress.org. In this milestone, we manage tickets for core plugins such as the importers (under the Plugins or Import components), current and former default themes (Bundled Theme component), and the WordPress.org site (component of the same name).
tweaked rss.php (import) file