#42026 closed defect (bug) (duplicate)
Regression in 4.8.2: WordPress now prevents certain unzipping valid filename patterns (but not creating them)
Reported by: | DavidAnderson | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 4.8.2 |
Component: | Filesystem API | Keywords: | |
Focuses: | Cc: |
Description
WordPress 4.8.2 added a new check into both the _unzip_file_pclzip()
and _unzip_file_ziparchive()
methods:
if ( 0 !== validate_file( $info['name'] ) ) {
return new WP_Error( 'invalid_file_ziparchive', __( 'Could not extract file from archive.' ), $info['name'] );
}
To return 0, validate_file()
requires, among other things, that the filename cannot include two consecutive periods. The aim here is presumably to prevent directory traversal attacks. Unfortunately, the implementation is naive and includes much more than this.
As a result, WordPress will allow you to upload files (e.g. into the media library) with names like picture..jpg
, but not to unzip them.
My particular problem here is that this prevents the restoration of backups. (I am the lead developer of UpdraftPlus, the #1 most-installed WP backup/restore plugin - over 1 million active installs). If your media library includes files with patterns like this (and we've already encountered multiple users who do), then you can no longer restore the backup. There's no filter provided in either of the unzip functions, or in validate_file()
, to get around this. The only work-around is to clone large chunks of WP's zip-handling API (from https://codex.wordpress.org/Function_Reference/unzip_file downwards) in order to alter one line, to get back the old behaviour.
In my view the check in validate_file()
is the problem - forbidding two consecutive periods is too crude. If they're going to be forbidden when unzipping, they should be forbidden everywhere else too (e.g. uploading)... but this should not be enforced on existing WP installs which already have such existing content. Another satisfactory solution would be to allow the result of validate_file()
to be filtered so that particular consumes of unzip_file()
can apply their own checks.
To follow up: when I say that the check is too crude, I mean that it's doing a lot more than just preventing directory traversal attacks. It should be checking for actual directory traversal, i.e.
../
or..\
, not just two consecutive periods, which is something real-world users have been using as a legitimate filename element. (It's not a pretty filename convention, but that's not relevant).