Opened 11 months ago
#60788 new defect (bug)
Content-Disposition support in download_url() seems broken
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | 6.4.3 |
Component: | HTTP API | Keywords: | |
Focuses: | Cc: |
Description
In https://core.trac.wordpress.org/changeset/51939 a change was made to resolve ticket https://core.trac.wordpress.org/ticket/38231 to make download_url()
use the Content-Disposition
header to specify the name of the downloaded file. I realize I'm a bit late to the party here since the ticket was opened more than 7 years ago and the change was made more than 2 years ago, but I think that this change was flawed and needs to be reconsidered (and possibly reverted entirely). The way it was implemented seems fundamentally broken and has the effect of making download_url()
basically impossible to use reliably.
Consider the following simple PHP script named echo.php
:
<?php header( 'Content-Type: text/plain' ); header( 'Content-Disposition: attachment; filename="echo.txt"' ); if ( isset( $_GET['echo'] ) ) { echo $_GET['echo']; }
This script just takes a query string parameter echo
and outputs it, in plain text format, with suggested filename echo.txt
.
Now consider the following file, download.php
, which is intended to be used with WP-CLI:
<?php $foo = download_url( 'http://example.test/echo.php?echo=foo' ); echo file_get_contents( $foo ); echo "\n"; unlink( $foo ); $bar = download_url( 'http://example.test/echo.php?echo=bar' ); echo file_get_contents( $bar ); echo "\n"; unlink( $bar );
If you run this, it works as expected:
$ wp eval-file download.php foo bar
Now consider the following file, download-bad.php
:
<?php $foo = download_url( 'http://example.test/echo.php?echo=foo' ); $bar = download_url( 'http://example.test/echo.php?echo=bar' ); echo file_get_contents( $foo ); echo "\n"; echo file_get_contents( $bar ); echo "\n"; unlink( $foo ); unlink( $bar );
If you try to run this, the results are not good:
$ wp eval-file download-bad.php bar bar PHP Warning: unlink(...tmp/echo.txt): No such file or directory...
You might argue that the code for download-bad.php
is flawed, and should be rewritten to look more like download.php
, and I would agree. However, note that download-bad.php
would have worked fine before the change https://core.trac.wordpress.org/changeset/51939 was made. Furthermore, even download.php
is likely to display buggy behavior if two users attempt to run it at the same time. I don't think there is any reliable way to use download_url()
in a manner that will protect against concurrent access (short of adding some kind of external locking mechanism). Consider the following PHP files, foo.php
...
<?php $foo = download_url( 'http://example.test/echo.php?echo=foo' ); sleep( 10 ); echo file_get_contents( $foo ); echo "\n"; sleep( 10 ); @ unlink( $foo );
...and bar.php
:
<?php sleep( 5 ); $bar = download_url( 'http://example.test/echo.php?echo=bar' ); sleep( 10 ); echo file_get_contents( $bar ); echo "\n"; @ unlink( $bar );
Now try running them both at the same time:
$ wp eval-file foo.php & wp eval-file bar.php bar bar
Obviously the output might vary because there's a race condition here, but I've added sleep()
calls in such a way that the output will usually be bar
and bar
.
I'm not really sure what should be done with download_url()
, but as it currently stands it seems essentially impossible to use in a reliable manner. Personally I would probably recommend just removing all Content-Disposition
support entirely. That would technically be a backward compatibility break, but I'm not sure anyone is actually using this functionality? To me it doesn't really even make any sense as currently implemented. Why would you want a temporary file stored under a fixed name?
It looks like the Content-Disposition
feature was originally proposed as a solution to the problem that download_url()
uses wp_tempnam()
in such a way that it can create very long filenames for some URLs, as described in https://core.trac.wordpress.org/ticket/34938 - but that doesn't make much sense either:
- The change to use the
Content-Disposition
header bears no resemblance to the change that was originally requested in https://core.trac.wordpress.org/ticket/34938 - Adding support for the
Content-Disposition
header doesn't actually fix the original problem (what if the URL being downloaded doesn't have aContent-Disposition
header?) - Even if the URL being downloaded has a
Content-Disposition
header, the way theContent-Disposition
support was implmented means that it still creates a file based on the URL first, then renames it later, so the first file could still have a filename which is too long - Another change was made in https://core.trac.wordpress.org/changeset/37598 which should (mostly) address the original problem
That having been said, the original patch https://core.trac.wordpress.org/attachment/ticket/38231/38231.diff actually makes a lot more sense than what was finally committed. The original patch takes the Content-Disposition
filename and feeds it as input into wp_tempnam()
, so you will get a random filename which should prevent the issues in the examples I've given. Somehow, in the multiple iterations of the patch, the call to wp_tempnam()
got lost, so the current code just uses the Content-Disposition
filename directly instead of using a random filename based on the Content-Disposition
filename.
I suppose the behavior could be changed to use a random filename based on the Content-Disposition
filename, as was originally intended, but that would also technically be a backward compatibility break (assuming anyone is actually using this functionality and expecting the current behavior).
To sum up - I'm not sure exactly why this Content-Disposition
functionality was added in the first place, but it should probably just be removed. If it can't be removed, then the behavior should be changed so that it uses a random filename based on the Content-Disposition
filename rather than using the Content-Disposition
filename directly. If that can't be done either, perhaps there could be another argument added to the download_url()
function to turn the Content-Disposition
functionality on or off? Ideally the default would be off.