Opened 13 months ago
Last modified 13 months ago
#59082 new defect (bug)
Titles and descriptions in RSS feeds need to use CDATA to encode special characters.
Reported by: | jsmoriss | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | major | Version: | 6.3 |
Component: | Feeds | Keywords: | |
Focuses: | | Cc: | |
Description
Only some RSS XML tag values are enclosed in CDATA, but every value that can include special characters (UTF-8, emojis, etc.) should be enclosed in CDATA so that those characters are encoded safely (see https://validator.w3.org/feed/).
For example the item description in feed-rss.php uses CDATA:
<description><![CDATA[<?php the_excerpt_rss(); ?>]]></description>
But the channel description does not:
<description><?php bloginfo_rss( 'description' ); ?></description>
Only some description tags use CDATA:
wordpress/wp-includes$ grep '<description>' feed*
feed-rdf.php: <description><?php bloginfo_rss( 'description' ); ?></description>
feed-rdf.php: <description><![CDATA[<?php the_excerpt_rss(); ?>]]></description>
feed-rdf.php: <description><![CDATA[<?php the_excerpt_rss(); ?>]]></description>
feed-rss2-comments.php: <description><?php bloginfo_rss( 'description' ); ?></description>
feed-rss2-comments.php: <description><?php echo ent2ncr( __( 'Protected Comments: Please enter your password to view comments.' ) ); ?></description>
feed-rss2-comments.php: <description><![CDATA[<?php comment_text_rss(); ?>]]></description>
feed-rss2.php: <description><?php bloginfo_rss( 'description' ); ?></description>
feed-rss2.php: <description><![CDATA[<?php the_excerpt_rss(); ?>]]></description>
feed-rss2.php: <description><![CDATA[<?php the_excerpt_rss(); ?>]]></description>
feed-rss.php: <description><?php bloginfo_rss( 'description' ); ?></description>
feed-rss.php: <description><![CDATA[<?php the_excerpt_rss(); ?>]]></description>
And none of the title tags use CDATA:
wordpress/wp-includes$ grep '<title>' feed*
feed-atom-comments.php: <title>
feed.php: <title>' . $rss_title . '</title>
feed-rdf.php: <title><?php wp_title_rss(); ?></title>
feed-rdf.php: <title><?php the_title_rss(); ?></title>
feed-rss2-comments.php: <title>
feed-rss2-comments.php: <title>
feed-rss2.php: <title><?php wp_title_rss(); ?></title>
feed-rss2.php: <title><?php the_title_rss(); ?></title>
feed-rss.php: <title><?php wp_title_rss(); ?></title>
feed-rss.php: <title><?php the_title_rss(); ?></title>
I found this issue when a site's title contained an encoded special character: an RSS reader refused to parse the XML because the title value was not in a CDATA enclosure.
js.
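As a rough illustration of the change proposed in the description, the sketch below wraps a value in a CDATA section. The helper name `example_wrap_in_cdata()` and the sample description are hypothetical and not part of WordPress core. The one subtlety it handles is that a literal `]]>` inside the value would end the section early, so it is split across two sections.

```php
<?php
/**
 * Hypothetical helper (not a WordPress core function): wrap text in a
 * CDATA section, splitting any literal "]]>" so it cannot close the
 * section prematurely.
 */
function example_wrap_in_cdata( $text ) {
	// Close the section after "]]" and reopen it before ">".
	$safe = str_replace( ']]>', ']]]]><![CDATA[>', $text );

	return '<![CDATA[' . $safe . ']]>';
}

// Assumed sample value; in a feed template this would be the channel
// description that bloginfo_rss( 'description' ) currently prints.
$description = 'Caf&eacute; news & reviews';

echo '<description>' . example_wrap_in_cdata( $description ) . '</description>' . "\n";
```

Inside a CDATA section the parser treats `&eacute;` and the bare `&` as plain text, so the XML stays well-formed; whether a given feed reader then decodes the entity for display is up to that reader.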
Change History (2)
#2
13 months ago
An example HTML entity fix that passes the W3C validator without using CDATA is:
add_filter( 'get_wp_title_rss', 'fix_document_title_for_rss' );
// Convert named HTML entities in the feed title to numeric character references.
function fix_document_title_for_rss( $title ) {
	return ent2ncr( $title );
}
The get_wp_title_rss() function could be modified to apply ent2ncr() to every result of the 'get_wp_title_rss' filter, for example:
return ent2ncr( apply_filters( 'get_wp_title_rss', wp_get_document_title(), $deprecated ) );
js.
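To show why converting to numeric character references can satisfy an XML parser without CDATA, here is a minimal standalone sketch. XML predefines only five named entities (`&amp;`, `&lt;`, `&gt;`, `&quot;`, `&apos;`), so an HTML name such as `&eacute;` is undefined in a feed, while `&#233;` is always valid. The function `example_named_to_numeric()` and its tiny lookup table are assumptions made for illustration; this is not the core ent2ncr() implementation, which covers far more entities.

```php
<?php
/**
 * Hypothetical stand-in for the idea behind ent2ncr(): replace named
 * HTML entities with numeric character references.
 */
function example_named_to_numeric( $text ) {
	// Illustrative subset only; a real table would be much larger.
	$map = array(
		'&nbsp;'   => '&#160;',
		'&eacute;' => '&#233;',
		'&ndash;'  => '&#8211;',
		'&mdash;'  => '&#8212;',
		'&hellip;' => '&#8230;',
	);

	// strtr() replaces each listed entity and leaves everything else alone.
	return strtr( $text, $map );
}

// Assumed sample title containing named entities.
echo '<title>' . example_named_to_numeric( 'Caf&eacute; &ndash; News' ) . '</title>' . "\n";
```

Entities that XML already defines, such as `&amp;`, are not in the table and pass through unchanged.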
wp_title_rss() and get_wp_title_rss() (for example) return the value of wp_get_document_title(), which often includes HTML and/or UTF-8 encoded characters, so the RSS XML title tags should use CDATA.
I would suggest that this is incorrect:
js.