WordPress.org

Make WordPress Core

Opened 5 years ago

Closed 5 years ago

#9689 closed defect (bug) (wontfix)

SimplePie auto-detection ignores feeds with incorrect content-type in HTTP header

Reported by: Denis-de-Bernardy Owned by: westi
Milestone: 2.8 Priority: normal
Severity: major Version: 2.8
Component: Feeds Keywords: has-patch
Focuses: Cc:

Description

Reported by an end-user, tried it on my end as well and failed too:

http://www.safestyle-windows.co.uk/news-rssfeed/news.rss

Can anyone else try feed work in an RSS widget?

Attachments (2)

9689.patch (683 bytes) - added by hakre 5 years ago.
Missing Content Type added to SimplePie locator
9689.2.patch (406 bytes) - added by hakre 5 years ago.
Propper SimplePie Configuration to not auto-locate (which can fail)

Download all attachments as: .zip

Change History (22)

comment:1 hakre5 years ago

I tried the Feed in the browser: It was displayed, but Opera didn't display it as Feed.

I tried the Feed on a RSS Reader in the Dashboard: RSS Error: A feed could not be found at http://www.safestyle-windows.co.uk/news-rssfeed/news.rss

Maybe the Feed is not valid?
http://validator.w3.org/feed/check.cgi?url=http%3A%2F%2Fwww.safestyle-windows.co.uk%2Fnews-rssfeed%2Fnews.rss

Says Feed is valid but suggests to not send it with the "text/plain" media type. Maybe that is the cause of Problems?

Simplepies Demo does not find the Feed:
http://simplepie.org/demo/?feed=http%3A%2F%2Fwww.safestyle-windows.co.uk%2Fnews-rssfeed%2Fnews.rss

Feedburner has no probs to eat that thing. I will take a closer look.

hakre5 years ago

Missing Content Type added to SimplePie locator

hakre5 years ago

Propper SimplePie Configuration to not auto-locate (which can fail)

comment:2 hakre5 years ago

This is a flaw with simple pie. The automatic Feed locating mechnaism does not identify "text/plain" as a valid contenttype for a feed. Technically spoken it is not a valid contenttype.

Since the decision is based on contenttype, it could be added (as in the patch) to get it working.

WP should not fix it but SimplePie or there is another option, called force_feed which can be set with ->force_feed(true) in feed.php (second patch). That seems far more reasonable, since

comment:3 hakre5 years ago

  • Keywords has-patch added

comment:4 follow-up: DD325 years ago

  • Cc rmccue added
  • Component changed from Widgets to Feeds

-1 for forcing it.

By forcing it, You're preventing any true non-feeds from displaying an error.

This needs to be sent upstream to SimplePie, to be more robust with incorrect Feeds - That sniffer is an hopeless way to determine if its a RSS feed or not..

Hopefully rmccue will see this ticket and deal with it..

comment:5 follow-up: rmccue5 years ago

I recommend forcing it. As far as I'm aware, forcing it will simply just skip the content-type check, but any parse errors will still occur if it can't be parsed.

Of course, the real solution would be for the site to change to an actual valid content-type for RSS, but I doubt that can happen. As far as I can see, there's no point in adding an upstream patch to SP. I'm fairly sure that the RSS specification, somewhere, mentions that a correct MIME type must be used, though I can't find this.

comment:6 DD325 years ago

I recommend forcing it. As far as I'm aware, forcing it will simply just skip the content-type check, but any parse errors will still occur if it can't be parsed.

Ah.. Ok, Then Forcing it sounds alright to me then. I misunderstood the force functionality.

though I can't find this.

I think i've seen something similar, Perhaps part of the RSS2 standard, or one of the XML standards... Here's a drupal ticket about text/plain RSS feeds, some sourses that state which mime types to be used are mentioned. http://drupal.org/node/142895

comment:7 DD325 years ago

So The only downside of using force_feed() is that we loose auto-search for feeds, ie. if someone enters 'http://dd32.id.au/' instead of 'http://dd32.id.au/feed/' then they'll get no feed displayed. Pretty sure this worked under MagPie as well.

comment:8 rmccue5 years ago

Magpie itself doesn't include built-in autodiscovery, so that would have needed to be provided by WP itself. You can always run SP's autodiscovery manually if needed.

comment:9 in reply to: ↑ 5 hakre5 years ago

Replying to rmccue:

[...] As far as I can see, there's no point in adding an upstream patch to SP. I'm fairly sure that the RSS specification, somewhere, mentions that a correct MIME type must be used, though I can't find this.

Well SP actually handles the Feed. What happens here is that when the URL is requested for the first time and (only) some mimetypes are not found, SP assumes that it must be a website and tries to autodiscover. In this case it fails because it is actually a feed.

for a real improvement SP should make the autodiscover that clever if checking a document for further links to feeds to actually think about to check if the whole document already is a feed.

a workaround for a user might be to not pass the feed url but the website that links the feed. unfourtionatly with the feed URL provided i cannot test.

comment:10 in reply to: ↑ 4 ; follow-up: hakre5 years ago

Replying to DD32:

-1 for forcing it.

By forcing it, You're preventing any true non-feeds from displaying an error.

no, in this case (the feed itself is valid, see w3c validation results) if you not force you get an error even if the feed is valid.

comment:11 in reply to: ↑ 10 DD325 years ago

Replying to hakre:

Replying to DD32:

-1 for forcing it.

By forcing it, You're preventing any true non-feeds from displaying an error.

no, in this case (the feed itself is valid, see w3c validation results) if you not force you get an error even if the feed is valid.

Always read follow up replies before commenting on a early comment.

comment:12 Denis-de-Bernardy5 years ago

  • Keywords commit added; 2nd-opinion removed

@dev: please commit or wontfix, so we can remove ticket clutter.

comment:13 westi5 years ago

  • Summary changed from SimplePie doesn't like rss 0.91? to SimplePie auto-detection ignores feeds with incorrect content-type in HTTP header

comment:14 follow-up: westi5 years ago

  • Keywords reporter-feedback added; commit removed
  • Owner set to westi
  • Status changed from new to assigned

I think the best option here is to turn off auto-discovery in SimplePie.

rmccue is there any benifit of using set_autodiscovery_level(SIMPLEPIE_LOCATOR_NONE) over force_feed() or do both end up doing the same thing?

comment:15 in reply to: ↑ 14 link925 years ago

I'll change handling of text/plain (and any other bit of Content-Type sniffing) in SimplePie if and only if you can convince Adam/Ian to change http://tools.ietf.org/id/draft-abarth-mime-sniff. Treating text/plain as a privileged type (such as any syndication format) allows scripting, which opens up whole extra security holes on top of what it claims to be. text/plain most certainly is not a media type that represents any sort of XML, and treating something sent as text/plain as a feed is most certainly wrong.

Replying to westi:

I think the best option here is to turn off auto-discovery in SimplePie.

rmccue is there any benifit of using set_autodiscovery_level(SIMPLEPIE_LOCATOR_NONE) over force_feed() or do both end up doing the same thing?

The former stops auto-discovery from looking for a feed in an HTML document (via link[@rel='feed']/@href or link[@rel='alternate' and (@type='application/rss+xml' or @type='application/atom+xml')]/@href) whereas the latter forces the URL given to be treated as a feed regardless of media type.

comment:16 follow-up: Denis-de-Bernardy5 years ago

so wontfix, I take it?

comment:17 in reply to: ↑ 16 westi5 years ago

  • Keywords 2nd-opinion added; reporter-feedback removed

Replying to Denis-de-Bernardy:

so wontfix, I take it?

The wontfix is for changing the behaviour of SimplePie upstream to accept text/plain as a feed.

It looks like if we want to have the same behaviour as before we need to enable the force_feed option.

Need to get impact from other devs on there view.

comment:18 Denis-de-Bernardy5 years ago

personally, I'd suggest forcing. after all, if a user decides to grab a feed using an rss widget, we could safely assume that it's a feed's url indeed.

the idea behavior, though, would be to do:

  • if it's html, try to autodetect a feed
  • else force

that way we'd have the best of both worlds

comment:19 Denis-de-Bernardy5 years ago

  • Keywords 2nd-opinion removed

comment:20 westi5 years ago

  • Resolution set to wontfix
  • Status changed from assigned to closed

More users are going to reference a blog, instead of a feed, and expect it to work than are going to point it at a malformed feed like this.

So WONTFIX.

Note: See TracTickets for help on using tickets.