WordPress.org

Make WordPress Core

Opened 5 years ago

Last modified 2 months ago

#38231 assigned enhancement

Allow download_url to respect content-disposition header

Reported by: cklosows Owned by: johnjamesjacoby
Milestone: 5.9 Priority: normal
Severity: normal Version: 4.7
Component: HTTP API Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

In #34938 it was proposed to use a filter in the wp_tempnam function which was rejected in favor of ignore the query string when downloading a file. This, however can still propose a problem when someone is the function to download a tokenized file where the path can be long.

In the case where the headers of the response provide a content-disposition header including a file name, it would be ideal to have the download_url function respect this file name and move the file name (which by default is the token in the path) to the file name requested.

Attachments (6)

38231.diff (810 bytes) - added by cklosows 5 years ago.
38231-tests.diff (1.2 KB) - added by psrpinto 2 months ago.
Unit tests
38231.1.diff (1.1 KB) - added by psrpinto 2 months ago.
Check return values and other improvements
38231-tests.1.diff (1.2 KB) - added by psrpinto 2 months ago.
Test that the file is actually created
38231.2.diff (982 bytes) - added by psrpinto 2 months ago.
Refactor according to @dd32's suggestions
38231-tests.2.diff (1.8 KB) - added by psrpinto 2 months ago.
Add a test for Content-Disposition=filename=foo.txt (i.e. without quotes)

Download all attachments as: .zip

Change History (15)

@cklosows
5 years ago

#1 @cklosows
5 years ago

  • Keywords has-patch added

#2 @dd32
5 years ago

  • Component changed from Filesystem API to HTTP API

#3 @johnjamesjacoby
4 months ago

  • Keywords needs-testing added
  • Milestone changed from Awaiting Review to 5.8
  • Owner set to johnjamesjacoby
  • Status changed from new to assigned

Let's try to make this happen 5.8.

@psrpinto
2 months ago

Unit tests

#4 @psrpinto
2 months ago

  • Keywords has-unit-tests added

@psrpinto
2 months ago

Check return values and other improvements

@psrpinto
2 months ago

Test that the file is actually created

#5 follow-up: @psrpinto
2 months ago

  • Keywords needs-testing removed

38231.1.diff improves on @cklosows's original patch by:

  1. Explicitly checking the return value of preg_match()
  2. Returning an error when the call to wp_tempnam() fails
  3. Returning an error when the call to rename() fails

38231-tests.1.diff adds an extra assertion to make sure the file is actually created.

I've also now manually tested this and can confirm the Content-Disposition header is respected when calling download_url().

#6 @desrosj
2 months ago

  • Milestone changed from 5.8 to 5.9

Today is feature freeze for the 5.8 release. This one is really close! But unfortunately the time for including in 5.8 has passed.

I'm going to punt to 5.9 instead of Future Release since there is a recent momentum, and @psrpinto has done a great job iterating on the approaches here so far.

#7 in reply to: ↑ 5 @dd32
2 months ago

Replying to psrpinto:

38231.1.diff improves on @cklosows's original patch by:

  1. Explicitly checking the return value of preg_match()
  2. Returning an error when the call to wp_tempnam() fails
  3. Returning an error when the call to rename() fails

Hi @psrpinto! Thanks for the improved patch :)

In general, WordPress tries to use the most basic loose boolean checking it can, so for example (I'm being nit-picky here, since you're new to Core patches, this is intended to be constructive feedback)

$content_disposition = wp_remote_retrieve_header( $response, 'Content-Disposition' );
if ( ! empty( $content_disposition ) && 1 === preg_match( '/filename="([^ ]+)"/', $content_disposition, $matches ) ) {

can be simplified to simply:

$content_disposition = wp_remote_retrieve_header( $response, 'Content-Disposition' );
if ( $content_disposition && preg_match( '/filename="([^ ]+)"/', $content_disposition, $matches ) ) {

as a) We only want to know that $content_disposition is 'something' the preg_match will validate it further and b) any truthful value return is acceptable.

I personally think in the event that the Content-Disposition filename can't be used, it should be something like this, what do you think?

set tmpName = create random name tmp file
if ( disposition header is set ) {
   set tmpNameDisposition = create disposition-named tmp file
   if ( can create tmpNameDisposition AND can rename tmpName to tmpNameDisposition ) {
      set tmpName = tmpNameDisposition
      continue on with processing
   } else {
      continue on like no error was encountered, tmpName is still set to random
   }
}
/// .. continue on

So that if the header is invalid, can't be created on the filesystem, or something else odd the function can still succeed without erroring out. As this function is used within the WordPress update process, failing on some edgecase here could be disastrous.
In other words, treating this as a progressive enhancement if possible but retaining existing behaviour in the event that it can't be used.

Other edgecases worth considering here:

  • filename="file.zip" and filename=file.zip are both valid but the patch only handles the former
  • What if the filename in the URL and in the Content-Disposition header match? For example https://wordpress.org/wordpress-5.7.2.zip

@psrpinto
2 months ago

Refactor according to @dd32's suggestions

@psrpinto
2 months ago

Add a test for Content-Disposition=filename=foo.txt (i.e. without quotes)

#8 @psrpinto
2 months ago

Thanks for the feedback @dd32! I agree with pretty much all your points so I've now refactored the code to reflect your suggested changes.

Please do let me know if you have further suggestions, they will be very welcome and I will be happy to consider them.


What if the filename in the URL and in the Content-Disposition header match?

Filenames are generated through wp_tempnam() so there can be no collisions, since that method makes sure the filename is unique. So even if the filename in the URL and in the Content-Disposition header are the same, the temporary files that are created will have unique names.


Changes in 38231.2.diff:

  • No longer check that $content_disposition is not empty before the call to preg_match(). If it is empty, it won't match, so that check is unnecessary. (wp_remote_retrieve_header() returns either the header or an empty string).
  • Treat Content-Disposition as "progressive enhancement", falling back to original randomly-generated filename in case of error.
  • Support the case where the header contains no quotes: Content-Disposition: filename=foo.txt

Changes in 38231-tests.2.diff

  • Add test for the case where the header contains no quotes: Content-Disposition: filename=foo.txt
Last edited 2 months ago by psrpinto (previous) (diff)

#9 @dd32
2 months ago

38231-tests.2.diff Looks good to me @psrpinto! Thanks

The only thought I have is that the unlink could be moved to the else to avoid a file_exists() call, but that's all I got :) I'll leave that up to you or whomever commits this.

if ( tmpfname_disposition && rename ) {
   set ...
} else if ( tmpfname_disposition ) {
   unlink tmpfname_disposition
}
Note: See TracTickets for help on using tickets.