Opened 12 years ago
Closed 11 years ago
#23219 closed defect (bug) (fixed)
Duplicate enclosure metadata created because of disagreement on meta field formatting
Reported by: | redsweater | Owned by: | nacin |
---|---|---|---|
Milestone: | 3.6 | Priority: | normal |
Severity: | normal | Version: | 2.8 |
Component: | XML-RPC | Keywords: | has-patch needs-unit-tests |
Focuses: | Cc: |
Description
Years ago I reported an issue in Ticket #7773 that the enclosures would be duplicated each time the post was edited and resubmitted through the XMLRPC API. That bug was fixed in [10383] but a nuance was overlooked that causes this kind of duplication to occur with every edit if the ORIGINAL custom field meta for the enclosure was generated by the automatic content-scraper in WordPress.
If a post with links to "enclosure" type files is submitted either through the web interface or through the XMLRPC API, it will automatically generate the insertion of enclosure metadata as a custom field with fragile-structured content of the format:
add_post_meta( $post_ID, 'enclosure', "$url\n$len\n$mime\n" );
But the XMLRPC implementation, where the attempt is made to avoid generating duplicates, uses a slightly different fragile template, with no trailing newline:
$encstring = $enclosure['url'] . "\n" . $enclosure['length'] . "\n" . $enclosure['type'];
To make a long story short: the attempt to detect an existing enclosure on the post always fails because the existing enclosure string has a newline, and the XMLRPC-based one does not.
I propose that this be fixed by at enacting 1 and probably also 2:
- Make sure duplicate detection method insensitive to trailing whitespace. This will fix the bug in a way that doesn't require retroactively "fixing" the existing stored metadata with trailing newline.
- Change the format of the "scraped" enclosure generation to match the format used by XMLRPC. Either with or without the newline would probably be fine, but it seems cleaner to stick with the one that doesn't require a trailing newline, just three blobs of text separated by newlines. It also seems lucky that the last blob is a MIME type and thus would never have a meaningful newline at the trailing end.
I'm attaching a patch that achieves both of these goals. I report this to the XML-RPC component because that's the impact is, even if some of the issue is in the default scraping code from functions.php.
I'm a little less certain about how to tackle a unit test for these cases, but if somebody wants to point me in the right direction I can take a hack at that too.
Attachments (3)
Change History (12)
#2
@
12 years ago
Please note this is slightly worse than just a little extra metadata in the database. When the RSS feed is generated for a blog, it will include multiple enclosure links for the same asset:
<enclosure url="http://www.coreint.org/audio/CoreInt_72.mp3" length="21660995" type="audio/mpeg" /> <enclosure url="http://www.coreint.org/audio/CoreInt_72.mp3" length="21660995" type="audio/mpeg" /> <enclosure url="http://www.coreint.org/audio/CoreInt_72.mp3" length="21660995" type="audio/mpeg" />
Judging from the fact this problem has gone unnoticed ;) I suspect that most feed clients including Apple's iTunes parser are not bothered by the presence of multiple, identical enclosures. But it's obviously a little messy.
#3
@
12 years ago
For completeness, I neglected to include steps to reproduce the bug:
- Create a new post from a XMLRPC client such as the WordPress for iOS app or MarsEdit.
- Include in the post some reference to an enclosure-style media asset such as an MP3. E.g.:
<a href="http://www.coreint.org/audio/CoreInt_72.mp3">Download</a>
- Send the post to the blog.
- Observe that the post has automatically earned itself an enclosure and the custom field entry that goes with that.
- Open and make whatever basic edit to the post, and send to the blog again.
- Observe that it now has an extra custom field metadata entry for the same enclosure.
This will happen again and again now matter how many edits you make. So it can become quite egregious depending on the frequency of editing a particular post.
#4
@
12 years ago
- Keywords needs-unit-tests added
- Milestone changed from Awaiting Review to 3.6
- Version changed from trunk to 2.8
#5
@
11 years ago
This is a great candidate for unit tests. I think we should standardize on \n
trailing. So, we can do an rtrim() comparison. I'm going to toss this patch in, but would like to see unit test coverage as well.
#6
@
11 years ago
Actually, here's a patch: 23219.diff. Would have committed it but need to test first.
#7
@
11 years ago
- Cc tollmanz@… added
Added tests that pass with patch filed "23219.diff" applied and fail without it. I wasn't sure of the best place to put the tests, but I think the location in which I added them makes sense.
#8
@
11 years ago
In 1308/tests:
Patch to be lenient about matching existing enclosures, and to agree on the formatting of enclosure string for future