Make WordPress Core

Opened 4 months ago

Last modified 4 months ago

#63780 new defect (bug)

[core] Upgrades over FTP fail for files owned by users or groups with space in the name (non-Windows systems)

Reported by: ivucica's profile ivucica Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 6.8.2
Component: Upgrade/Install Keywords:
Focuses: administration Cc:

Description

Hi,

When core or plugin is being upgraded on a system where some files are owned by a user that has space in the system username, or some files are owned by a group that has space in the system group name, parsing of the output of ftp_rawlist in WP_Filesystem_FTPext in parselisting() will likely fail.

This is in class-wp-filesystem-ftpext.php.

Since the code splits the passed values by the space character 0x20, any extra spaces will result in incorrect filename being used.

For example (artificially constructed example using output of ls -al, but resembling output on proftpd experienced earlier this week):

-rw-r--r--   1 username Domain Users 17123432 Jul 16 20:17 file-name.html

The computed file will be 20:17 file-name.html instead of the expected file-name.html. This has resulted in an attempt to delete /some/path/to/WP/wp-contents/updates/some-plugin-1.23/20:17 file-name.html during recursive directory delete.

Unfortunately, it is not impossible for group names to have spaces in them, or to guarantee FTP will do this correctly. The user authorized for FTP access could have space in its primary group, potentially resulting in new files created with such group ownership.

It is difficult to say what is the correct way to parse these; assuming English locale, perhaps the occurrence of +[0-9]+ +[A-Z][a-z]{2} +[0-9]+ +([0-9]{2}:[0-9]{2}+|[0-9]{4}) + (e.g. " 12345 Jul 1 14:45 " within the year, and " 12345 Jul 1 2022 " for a preceding year) -- so spaces, then numbers for filesize, then spaces, then capital letter followed by two lowercase letters (for month), then spaces, then number, then spaces, then either 4 numbers (year) or two numbers, colon, two numbers (time), is one way to find what precedes the filename.

Finding the split between usernames and groups would be a losing battle, since both username and group could theoretically have a space in them.

If parsing cannot be legitimately improved (perhaps also renaming the variable from $lucifer to something clearer and better commented?), then at least throwing an error telling the user that the line cannot be legitimately parsed would be an improvement: I spent quite a bit of time tracking the inability to upgrade over FTP down to fixing my LDAP NSSwitch setup.

My local workaround ended up being

  1. temporarily update code to just detect that a filename starts with "HH:MM " and remove first 6 characters if it does (bad fix), perform upgrades of plugins, then upgrade core (which, as I expected, nuked the fix)
  2. changing the blog directory's user's primary group to something that does not have spaces in it
  3. chgrp'ing everything in that directory to that group

This may not be that easily fixable in enterprise environments.

I am unsure if this belongs in Upgrade or in Filesystem API components, please reassign as appropriate.

Present at least between 6.0.0 and 6.8.2, inclusive.

Change History (2)

#1 @q0rban
4 months ago

Perhaps this should use ftp_mlsd() instead, which outputs an array:

  array(8) {
    ["name"]=> string(10) "public_ftp"
    ["modify"]=> string(14) "20171211174536"
    ["perm"]=> string(7) "flcdmpe"
    ["type"]=> string(3) "dir"
    ["unique"]=> string(11) "811U57405EE"
    ["UNIX.group"]=> string(2) "33"
    ["UNIX.mode"]=> string(4) "0755"
    ["UNIX.owner"]=> string(2) "33"
  }

This is available in PHP versions (PHP 7 >= 7.2.0, PHP 8)

#2 @ivucica
4 months ago

Looks like that might do the trick, if the server supports RFC 3659 (March 2007).

The 'facts' in MLSD seem to be composed of SCHARs:


      RCHAR          = ALPHA / DIGIT / "," / "." / ":" / "!" /
                       "@" / "#" / "$" / "%" / "^" /
                       "&" / "(" / ")" / "-" / "_" /
                       "+" / "?" / "/" / "\" / "'" /
                       DQUOTE   ; <"> -- double quote character (%x22)
      SCHAR          = RCHAR / "=" ;

and example of the traffic from the RFC:


    C> MLSD
    S> 150 I tink I tee a trisector tree
    D> Type=file;Unique=aaab8cUYqaaa;Perm=rf;Size=15039; list.c
    D> Type=pdir;Unique=aaaaacUYqaaa;Perm=cpmel; /
    D> Type=pdir;Unique=aaaaacUYqaaa;Perm=cpmel; ..
    D> Type=file;Unique=aaab8bUYqaaa;Perm=rf;Size=34589; ftpd.c
    D> Type=cdir;Unique=aaab8aUYqaaa;Perm=cpmel; /files
    S> 226 That's all folks...

It feels it's differently underspecified, given that facts seem to be prohibited from many characters -- including, amusingly, whitespace. RFC mainly talks about encodings such as UTF-8 in context of pathnames themselves; this might be fine.

But if the server supports it, this does seem to be more correct, and if not, PHP core would need to solve it.

Note: See TracTickets for help on using tickets.