WordPress.org

Make WordPress Core

Opened 17 months ago

Closed 16 months ago

Last modified 5 months ago

#39875 closed defect (bug) (fixed)

PDF previews overwrite existing images with the same name.

Reported by: gitlost Owned by: joemcgill
Milestone: 4.7.3 Priority: normal
Severity: normal Version: 4.7
Component: Media Keywords: has-patch needs-testing fixed-major
Focuses: Cc:

Description

This issue was raised by Benjamin de Jong on the Make blog post Enhanced PDF Support in WordPress 4.7.

Upload a JPEG "test.jpg". Upload a PDF "test.pdf". The original "test.jpg" is overwritten with the generated PDF preview.

(Note that as the thumbnail sizes probably differ you'll have to click into "test.jpg" to see it's been overwritten.)

Attachments (9)

39875.patch (5.0 KB) - added by gitlost 16 months ago.
Special case pdf preview to avoid overwriting exising JPEGs.
39875.diff (2.7 KB) - added by desrosj 16 months ago.
39875-major.diff (4.4 KB) - added by desrosj 16 months ago.
minimal-us-letter.pdf (739 bytes) - added by gitlost 16 months ago.
A public domain PDF, from https://brendanzagaeski.appspot.com/0004.html, modified to be US Letter size. Its very small file size is handy speed-wise for tests.
39875.2.diff (2.7 KB) - added by desrosj 16 months ago.
39875.2.patch (5.1 KB) - added by desrosj 16 months ago.
Uses wp_unique_filename() with no random string in PDF file previews.
39875.3.patch (5.8 KB) - added by gitlost 16 months ago.
Uses comment:8 change with 39875.patch unit tests.
39875.4.patch (5.2 KB) - added by joemcgill 16 months ago.
39875-fix-tests.diff (668 bytes) - added by joemcgill 16 months ago.

Download all attachments as: .zip

Change History (37)

#1 @desrosj
16 months ago

  • Keywords needs-patch added

Just did some testing locally and I was able to reproduce it using the steps specified.

I tested the reverse scenario by uploading a PDF named test.pdf first followed by uploading a JPG named test.jpg) and the JPG was renamed to test-1.jpg correctly.

Not all thumbnail sizes are overwritten when this bug occurs, but the main issue here is that the original test.jpg file is always overwritten by the PDF preview image.

This ticket was mentioned in Slack in #core-media by desrosj. View the logs.


16 months ago

#3 @desrosj
16 months ago

Also discovered another scenario in my testing. PDF files that were uploaded before 4.7 do not have image previews until an image regeneration script is run, or wp media regenerate in WP-CLI.

When one of those scripts are run, the JPG file with the same name is also overwritten.

Steps to reproduce:

  1. Upload PDF file.
  2. Delete all JPG image previews for that PDF from uploads.
  3. Upload a JPG file with the same name.
  4. Run wp media regenerate --yes in WP-CLI.

This ticket was mentioned in Slack in #core-media by joemcgill. View the logs.


16 months ago

@gitlost
16 months ago

Special case pdf preview to avoid overwriting exising JPEGs.

#5 @mikeschroder
16 months ago

  • Milestone changed from Awaiting Review to 4.7.3

@desrosj
16 months ago

#6 @desrosj
16 months ago

  • Keywords has-patch needs-testing added; needs-patch removed

Most recent patch uses the approach [hammered out in this week's Media meeting (https://wordpress.slack.com/archives/core-media/p1487364216001228).

Borrowed some code from wp_save_image(). For the minor release, there will be some code duplication. I am going to whip up a different patch moving the repeated code into a function for the next major release.

#7 @desrosj
16 months ago

Added first pass for moving the repeated logic into a function for a major release.

#8 @gitlost
16 months ago

Although I can see an argument for doing the unique file name outside the class, and also for adding a PDF marker, I don't see a reason for using a random name. What's the benefit over wp_unique_filename()?

An argument against a random name is that it makes linked thumbnails (especially relevant when #39618 gets fixed) of the PDF brittle - for instance if one had to regenerate thumbnails. Another argument is that it loses all association with the original PDF on a file level. Another argument is that it's non-semantic. Another argument is that it's a pretty violent departure from what's there now and is not very WordPressy, not matching practice elsewhere, even in the instance of "wp-admin/includes/image-edit.php" from which it's culled, which only uses it as a suffix on the file name and has its own reasons and context (eg it knows there will be clashes and needs a pregable "unique" marker) for using such a scheme, not present here. Another argument is that non-deterministic outcomes make writing tests harder.

An argument for I can see is that it busts caching, which is nice, but I'd suggest outside the remit of the ticket.

My preference (if 39875.patch is out) would be something like the following (which uses _pdf rather than -pdf so as not to step on the toes of the PDF Image Generator plugin):

$dirname = dirname( $file ) . '/';
$ext = '.' . pathinfo( $file, PATHINFO_EXTENSION );
$preview_file = $dirname . wp_unique_filename( $dirname, wp_basename( $file, $ext ) . '_pdf.jpg' );
$uploaded = $editor->save( $preview_file, 'image/jpeg' );

Failing that, if a random string is really wanted, maybe put the attachment id in there somewhere...

(As an aside, I'll upload a PDF which could be used instead of "wordpress-gsoc-flyer.pdf" in tests to speed them up a bit. It's from Minimal PDF, modified to be US Letter size, and is public domain.)

@gitlost
16 months ago

A public domain PDF, from https://brendanzagaeski.appspot.com/0004.html, modified to be US Letter size. Its very small file size is handy speed-wise for tests.

This ticket was mentioned in Slack in #core-media by desrosj. View the logs.


16 months ago

This ticket was mentioned in Slack in #core-media by joemcgill. View the logs.


16 months ago

@desrosj
16 months ago

#11 @desrosj
16 months ago

Updated the patch per discussion in this week's dev chat (https://wordpress.slack.com/archives/core-media/p1487877384001584). Leave the original file name intact, but add a random string, and -pdf to the end.

#12 @gitlost
16 months ago

Re patch note that PATHINFO_FILENAME is not UTF-8 safe (same problem as basename() and PATHINFO_BASENAME).

I still see no reason for using a random string.

#13 @dd32
16 months ago

I still see no reason for using a random string.

I can't see a need for it either, lets just use wp_unique_filename() here, in most cases it'll not conflict with anything and be used straight up - but in the event another plugin has created the file, or the user uploaded a conflicting name, it'll cause it to not overwrite something that exists already. -pdf2.jpg would work fine - afterall the filename will be stored in the DB, no code in core is ever going to base decisions off the filesystem alone.

This ticket was mentioned in Slack in #core by jnylen. View the logs.


16 months ago

#15 @jnylen0
16 months ago

  • Owner set to joemcgill
  • Status changed from new to assigned

Per above Slack discussion, handing off to @joemcgill to coordinate sign-off/commit with @mikeschroder.

@desrosj
16 months ago

Uses wp_unique_filename() with no random string in PDF file previews.

This ticket was mentioned in Slack in #core-media by desrosj. View the logs.


16 months ago

#17 @gitlost
16 months ago

There's a typo in the patch - '-pdf.pdf' should be '-pdf.jpg'.

Also why not the code as suggested in comment:8 above instead of a variation using preg_replace()? - which is non-semantic and doesn't seem to add anything and is worse in some (admittedly very) edge cases (and frankly is an anti-pattern whose use should be removed from core - /var/www/my.wordpress/blah_noext anyone?!).

Also the unit test doesn't allow for the PDF marker, and the other tests aren't fixed to clean up properly after themselves.

Following is a patch which just uses the change suggested in comment:8 with -pdf instead of _pdf, and with the unit tests from 39875.patch, adjusted for -pdf and cleaned up a bit and with an extra renaming test.

@gitlost
16 months ago

Uses comment:8 change with 39875.patch unit tests.

@joemcgill
16 months ago

#18 @joemcgill
16 months ago

Thansk @desrosj and @gitlost. I agree that using pathinfo() and wp_basename() is an improvement over preg_replace() in this case. The second unit test in 39875.3.patch looks helpful in terms of checking the logic there, but unless we intend to abstract that out into a new function, I'd rather leave that test out.

39875.4.patch Simplifies the unit tests from 39875.3.patch a bit, but should be identical otherwise. I don't think we need to test that the attachments and metadata were generated correctly in this test, that functionality should be tested elsewhere. If it isn't, we should add additional unit test coverage that is focussed specifically on that functionality.

#19 @gitlost
16 months ago

Looks good to me!

#20 @mikeschroder
16 months ago

I liked the random string for consistency's sake with other resizing, but I'm okay with this approach.

I'd be more comfortable if we stopped accessing /tmp/ directly in tests, but otherwise this looks okay.

Last edited 16 months ago by mikeschroder (previous) (diff)

#21 @joemcgill
16 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 40130:

Media: Keep PDF previews from overwriting files.

Since support for PDF previews were added in [38949], it's possible
that the generated image file could overwrite an existing image file
with the same name. This uses wp_unique_filename() to avoid this
issue and adds a '-pdf' identifier on the end of filenames.

Props gitlost, derosj, mikeschroder, joemcgill.
Fixes #39875. See #31050.

#22 @joemcgill
16 months ago

  • Keywords fixed-major added
  • Resolution fixed deleted
  • Status changed from closed to reopened

#23 @joemcgill
16 months ago

I've opened #39975 to address the various places where /tmp/ is used in the tests.

#24 @joemcgill
16 months ago

We need to skip the tests when PDF's aren't supported by the current image editor. See 39875-fix-tests.diff.

#25 @joemcgill
16 months ago

In 40131:

Media: Skip PDF preview tests when image editor doens't support.

Following [40130] tests fail on environments that don't support PDF
previews because the expected meta data doesn't get written.

See #39875.

#26 @joemcgill
16 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 40133:

Media: Keep PDF previews from overwriting files.

Since support for PDF previews were added in [38949], it's possible
that the generated image file could overwrite an existing image file
with the same name. This uses wp_unique_filename() to avoid this
issue and adds a '-pdf' identifier on the end of filenames.

Props gitlost, desrosj, mikeschroder, joemcgill.
Merges [40130] and [40131] to the 4.7 branch.
Fixes #39875. See #31050.

This ticket was mentioned in Slack in #core-media by desrosj. View the logs.


5 months ago

This ticket was mentioned in Slack in #core-media by desrosj. View the logs.


5 months ago

Note: See TracTickets for help on using tickets.