Make WordPress Core

Opened 9 years ago

Closed 4 years ago

Last modified 4 years ago

#33591 closed defect (bug) (fixed)

Media File Missing Length Causes WordPress Atom Feed to Be Invalid

Reported by: jonnybot's profile jonnybot Owned by: stevenkword's profile stevenkword
Milestone: 5.5 Priority: normal
Severity: normal Version: 4.2.4
Component: Feeds Keywords: good-first-bug has-patch has-unit-tests
Focuses: administration Cc:

Description

In wp-includes\feed.php, the atom_enclosure() function can produce invalid XML, which will break some feed readers and XML parsers that might digest the feed. This happens when the get_post_custom() function returns data that doesn't include all the metadata that it ought to for a given enclosure.

This line-

$enclosure = explode("\n", $enc);

...creates an array by splitting the enclosure meta information on newline characters ("\n"). Then, this line:

echo apply_filters( 'atom_enclosure', '<link href="' . trim( htmlspecialchars( $enclosure[0] ) ) . '" rel="enclosure" length="' . trim( $enclosure[1] ) . '" type="' . trim( $enclosure[2] ) . '" />' . "\n" );

...assumes that the url will be at position 0 in the array, the length in position 1, and the type in position 2.

If the length line is simply missing, as is the case for a particular post I have on my blog, then the next line in the metadata is some kind of configuration data that looks something like this:

a:1:{s:8:"duration";s:8:"00:05:29";}" }  

This breaks the feed in a couple of ways. First, because the length attribute is missing from the metadata, the length attribute of the resulting link tag will read "audio/mpeg" (or whatever type of file it is).

Second, it tries to put a type that contains quotes, which ends the attribute's value early and creates a broken link tag.

<link href="http://myblog.com/wp-content/uploads/2015/03/media.mp3" rel="enclosure" length="audio/mpeg" type="a:1:{s:8:"duration";s:8:"00:05:20";}" />

The broken link tag will break some feed readers and other XML parsers that are pointed at the blog's ATOM feed.

The same issue may also affect the rss_enclosure function in feed.php, though I haven't tested it for that case.

I'm not sure whether the more important bug bug is that that the ATOM feed breaks if the length is missing, or if the bug is, "Hey, the length meta data is missing for this object, how did THAT happen?"

Still, it would seem that the assumption that the length will be at the second (or 1th) position in the array is not a safe one. At the very least, it might be good to encode the output so that double quotes from the metadata don't break the XML of the feed?

Also, any pointers on sleuthing out why the metadata is missing would be much appreciated.

At the very least, it seems prudent to wrap the length (contents of $enclosure[1]) and the type ($enclosure[2]) in the htmlspecialchars() function, the same way the href attribute is.

Attachments (8)

feed-missing-length-bug.php (1.2 KB) - added by jonnybot 9 years ago.
33591.diff (1.1 KB) - added by vtieu 9 years ago.
33591.2.diff (1.7 KB) - added by vtieu 9 years ago.
Searches the enclosure for url, length, and type. Should prevent the problem.
33591.3.diff (1.7 KB) - added by vtieu 9 years ago.
Uploaded correct version.
33591.4.diff (1.9 KB) - added by vtieu 9 years ago.
Fixed coding style.
33591.5.diff (1.6 KB) - added by stevenkword 7 years ago.
33591.6.patch (6.7 KB) - added by birgire 7 years ago.
33591.6-refresh.patch (4.5 KB) - added by rebasaurus 4 years ago.
Refresh

Download all attachments as: .zip

Change History (31)

#1 @jonnybot
9 years ago

You can see the results of the attached feed-missing-length-bug.php on my site at http://news.missouristate.edu/feed-missing-length-bug.php.

#2 @stevenkword
9 years ago

  • Keywords needs-patch good-first-bug needs-unit-tests added

Thank you @jonnybot for the bug report!

I'm in agreement that mapping the array based on position is not desirable and I am in favor of a safer approach. I'm taking a look.

Last edited 9 years ago by stevenkword (previous) (diff)

#3 @stevenkword
9 years ago

  • Owner set to stevenkword
  • Status changed from new to assigned

#4 @stevenkword
9 years ago

Related to #16747

#5 @stevenkword
9 years ago

I spoke today with @vtieu on Slack, who is going to take a look at submitting a patch. We decided that the first step is to determine what the required fields for an enclosure and to map those explicitly by key, rather than by position. Then, once the required fields are mapped, we can map the existing fields returned by get_post_cstom() by position.

I think we should also take a look at address #16747 in this ticket if possible, but that shouldn't become a blocker either.

@vtieu
9 years ago

#6 @vtieu
9 years ago

33591.diff Doesn't fix the bug yet, however I've mapped the array.

#7 @stevenkword
9 years ago

@vtieu,

Thank you for the patch. Your response identifies a new problem and the next step in the process. We need unit tests to reveal the reported bug. Once we have that, we can apply a patch in an attempt to fix that problem. If the new tests fail before our patch is applied, and pass afterwards, we can be more confident that we've solved the problem. What we need now is a unit test.

#8 @stevenkword
9 years ago

This going to end up blocked by #35160 since there is not currently test coverage in core for Atom.

@vtieu
9 years ago

Searches the enclosure for url, length, and type. Should prevent the problem.

@vtieu
9 years ago

Uploaded correct version.

#9 follow-up: @vtieu
9 years ago

Here is the patch.

#10 in reply to: ↑ 9 @stevenkword
9 years ago

Replying to vtieu:

Here is the patch.

There are a couple of Coding Standards improvements we could make, but the big blocker is going to be the lack of unit tests in Core for Atom. I've raised the priority of ticket #35160 in hopes that we can get those tests in ASAP. Once that's done, we can write additional tests proving this patch is useful.

@vtieu
9 years ago

Fixed coding style.

#11 @vtieu
9 years ago

I've modified the patch file to better follow the coding standards suggested by Steven.

@stevenkword
7 years ago

#12 @stevenkword
7 years ago

  • Keywords has-patch added; needs-patch needs-unit-tests removed
  • Milestone changed from Awaiting Review to Future Release

#13 @stevenkword
7 years ago

  • Milestone changed from Future Release to 4.8

This ticket was mentioned in Slack in #core by obenland. View the logs.


7 years ago

#15 @obenland
7 years ago

  • Milestone changed from 4.8 to Future Release

@birgire
7 years ago

#16 @birgire
7 years ago

  • Keywords has-unit-tests added

In 33591.6.patch we try to extend the url, length and type parsing, to support these variations:

url 
length
type

url
length

url
type

url

Few notes:

  • I'm also wondering if the link tag attributes can have single quotes, like WordPress uses for the stylesheets? We use double quotes here.
  • Should the link tags be removed for empty url attributes? Here we allow empty ones.
  • I assume the type attribute can be empty?

Tests are included.

Last edited 7 years ago by birgire (previous) (diff)

This ticket was mentioned in Slack in #core by sergey. View the logs.


4 years ago

#18 @SergeyBiryukov
4 years ago

  • Milestone changed from Future Release to 5.5
  • Summary changed from Media File Missing Length Causes Wordpress Atom Feed to Be Invalid to Media File Missing Length Causes WordPress Atom Feed to Be Invalid

#19 @davidbaumwald
4 years ago

  • Keywords needs-refresh added

Latest patch fails against trunk, so marking for a refresh.

This ticket was mentioned in Slack in #core by david.baumwald. View the logs.


4 years ago

@rebasaurus
4 years ago

Refresh

#21 @rebasaurus
4 years ago

  • Keywords needs-refresh removed

#22 @whyisjake
4 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 48429:

Feeds: Ensure that enclosures produce valid XML.

Metadata that is stored on newlines has the possibility of missing values, so rather then coercing values, we can check for them and then implicity set the values.

Fixes #33591.
Props jonnybot, stevenkword, vtieu, birgire, SergeyBiryukov, davidbaumwald, rebasaurus, whyisjake.

#23 @SergeyBiryukov
4 years ago

In 48435:

Coding Standards: Use strict type check for in_array() in wp-includes/feed.php.

Additionally:

  • Correct inline comments per the documentation standards.
  • Correct the @ticket reference in tests/feed/atom.php.

Follow-up to [48429].

See #33591.

Note: See TracTickets for help on using tickets.