Make WordPress Core

Opened 3 years ago

Closed 17 months ago

#54166 closed defect (bug) (fixed)

Updating errors due to PHP timeouts

Reported by: aristath's profile aristath Owned by: sergeybiryukov's profile SergeyBiryukov
Milestone: 6.2 Priority: normal
Severity: normal Version: 6.0
Component: Upgrade/Install Keywords: has-patch
Focuses: Cc:

Description

Initially reported in https://core.trac.wordpress.org/ticket/51857#comment:128
After [51815] there is an error when updating large plugins such as Gutenberg or WooCommerce.
After debugging the issue, it seems to be due to a PHP timeout and depends on the server's configuration (which is why the failure only occurs for large plugins).
The way we delete a directory in PHP is by going through all files in that directory and deleting them one by one (as far as I know this is the only way). As a result, deleting or restoring a backup of a large plugin takes more time than expected.

Change History (65)

This ticket was mentioned in PR #1696 on WordPress/wordpress-develop by aristath.


3 years ago
#1

  • Keywords has-patch added; needs-patch removed

Fixes PHP timeouts by moving the backup deletion & restore to the shutdown action.
Processes running on shutdown are immune to PHP timeouts, so moving them there allows them to run _after_ the main process, without affecting the update.

Trac ticket: https://core.trac.wordpress.org/ticket/54166#ticket

#2 @SergeyBiryukov
3 years ago

  • Owner set to SergeyBiryukov
  • Status changed from new to reviewing

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

peterwilsoncc commented on PR #1696:


3 years ago
#4

I'm still hitting fatal errors occasionally during the update process.

In the screenshot it happened with Gutenberg, updating via "Dashboard > Updates > View details about... > Click install".

During testing last week it was happening consistently but now it's only happening intermittently, so this PR is an improvement.

https://i0.wp.com/user-images.githubusercontent.com/519727/135203206-4d79c87f-f776-42da-bf5b-d5c9178d5d24.png

#5 @peterwilsoncc
3 years ago

As noted on the pull request, I am still seeing fatal errors when testing this locally.

Both gutenberg and wpforms-lite intermittently end up with either some or all of their sub-directories empty.

There are backward compatibility implications but I am wondering if it's possible to switch to a series of AJAX requests for each step:

  1. Create the backup of the previous install
  2. Download the update from w.org
  3. Unzip the download
  4. Activate the plugin
  5. Restore if needs be

As fetch returns a promise, it's offers quite a lot of control to ensure the process doesn't get ahead of itself.

  1. No requirements
  2. No requirements
  3. Requires 1, 2 be complete
  4. Requires 1, 2, 3, be complete
  5. Requires 1 be complete, 4 to fail

#6 @peterwilsoncc
3 years ago

Please note: @aristath, @pbiron and I have some further notes relating to this issue on the original ticket from comment #137 onwards.

This ticket was mentioned in PR #1727 on WordPress/wordpress-develop by afragen.


3 years ago
#7

Add a move_dir() function for use in WP_Upgrader instead of copy_dir(). There seem to be timeout issues in slower systems like Vagrant or the WP Docker test env with installation or update of large plugins like WooCommerce, Yoast SEO, or MailPoet.

This new function attempts a rename() and falls back to the previous copy_dir().

Trac ticket: https://core.trac.wordpress.org/ticket/54166

#8 @afragen
3 years ago

  • Keywords early dev-feedback added

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 years ago

#12 @afragen
3 years ago

  • Keywords needs-testing added

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 years ago

#14 @SergeyBiryukov
3 years ago

  • Resolution set to fixed
  • Status changed from reviewing to closed

In 51899:

Upgrade/Install: Introduce a move_dir() function.

This replaces the copy_dir() usage in WP_Upgrader::install_package() and aims to avoid PHP timeout issues when installing or updating large plugins on slower systems like Vagrant or the WP Docker test environment.

The new function attempts a native PHP rename() function first and falls back to the previous copy_dir().

Follow-up to [51815], [51898].

Props afragen, aristath, peterwilsoncc, galbaras, noisysocks, pbiron.
Fixes #54166. See #51857.

afragen commented on PR #1727:


3 years ago
#15

Merged in r51899

#16 follow-up: @peterwilsoncc
3 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening this as I think r51899 will need to be reverted:

  • rename() should not be in move_dir() -- different file systems use different functions, eg ftp_rename()
  • If the destination exists, it is always overwritten. This seems problematic and there should really be an $overwrite parameter per the other functions
  • In _wp_handle_upload() it's noted that rename() fails on streams, so that may be problematic & streams handled differently ('may' is the key word, with the other issues I haven't researched this)

When trying to debug this issue with @afragen on Friday, I was testing with the pull request applied and still having the problem so I am not sure it fixes the reported issue completely either.


Reading the code and doing some logging earlier today, I think something that runs asynchronously isn't been accounted for which is leading to the error.

Installing/updating akismet (a very small plugin) was showing different contents within the working directory at several points within the procedure.

#17 @afragen
3 years ago

Just for reference, if rename() fails the function reverts to copy_dir(), which was there previously. File systems not having a rename() command should continue to function as they did previously.

The previous flow always emptied the specific plugin folder in wp-content/plugins/ that's why a failed update would leave an empty folder. The $clear_destination parameter is always true when updating. https://github.com/WordPress/wordpress-develop/blob/6367c01844e34dd372b999beb0e370f52d8c42d5/src/wp-admin/includes/class-plugin-upgrader.php#L222-L239

I don't believe streams are ever utilized in the plugin update process.

While Peter and I haven't yet been able to figure out the issue on his system using Vagrant and Chassis, I don't believe that reverting r51899 will fix it.

Last edited 3 years ago by afragen (previous) (diff)

#18 @SergeyBiryukov
3 years ago

In a discussion with @aristath today, it was pointed out that PR 1696 is also ready for commit, so let's try that as well.

#19 @SergeyBiryukov
3 years ago

In 51902:

Upgrade/Install: Restore or clean up the temporary plugin or theme backup on shutdown.

This allows these actions to run after the main process, without affecting the update. Actions running on shutdown are immune to PHP timeouts, so in case the failure was due to a PHP timeout, we'll still be able to properly restore the previous version.

Follow-up to [51815], [51898], [51899].

Props aristath, peterwilsoncc.
See #54166.

#20 in reply to: ↑ 16 ; follow-up: @SergeyBiryukov
3 years ago

Replying to peterwilsoncc:

Reopening this as I think r51899 will need to be reverted

Yeah, now that I think of it, I also have some concerns with the function:

  • It does not use the WP_Filesystem_* methods, so the rename() call will only work if direct PHP file access is available.
  • The fallback to copy_dir() causes inconsistent behavior:
    • If the rename() call worked, the source directory is successfully renamed and does not need any further actions.
    • If the rename() call failed, the source directory is left behind and has to be cleaned separately.

I wonder [51899] is still needed after [51902], or after getting to the bottom of the issue from comment:16.

This ticket was mentioned in PR #1747 on WordPress/wordpress-develop by afragen.


3 years ago
#22

Update for r51899 to address issues, https://core.trac.wordpress.org/ticket/54166#comment:20

  • Add check for direct PHP flle access and only use rename() if true.
  • More consistent behavior for fallback to copy_dir()

Trac ticket: https://core.trac.wordpress.org/ticket/54166

afragen commented on PR #1747:


3 years ago
#23

Use move_dir() for moving and restoring to backup directory as $wp_filesystem->move() fallback is a file copy and that will cause problems.

#24 in reply to: ↑ 20 ; follow-up: @afragen
3 years ago

Replying to SergeyBiryukov:

Replying to peterwilsoncc:

Reopening this as I think r51899 will need to be reverted

Yeah, now that I think of it, I also have some concerns with the function:

  • It does not use the WP_Filesystem_* methods, so the rename() call will only work if direct PHP file access is available.
  • The fallback to copy_dir() causes inconsistent behavior:
    • If the rename() call worked, the source directory is successfully renamed and does not need any further actions.
    • If the rename() call failed, the source directory is left behind and has to be cleaned separately.

I wonder [51899] is still needed after [51902], or after getting to the bottom of the issue from comment:16.

I think I have addressed the above concerns in a new PR#1747

I believe move_dir() is still needed it does address a potential timeout issue and does create a directory move that doesn't fallback to a PHP copy() that may fail when attempting to copy directories.

#25 in reply to: ↑ 24 @SergeyBiryukov
3 years ago

Replying to afragen:

Yeah, now that I think of it, I also have some concerns with the function:

  • It does not use the WP_Filesystem_* methods, so the rename() call will only work if direct PHP file access is available.
  • The fallback to copy_dir() causes inconsistent behavior:
    • If the rename() call worked, the source directory is successfully renamed and does not need any further actions.
    • If the rename() call failed, the source directory is left behind and has to be cleaned separately.

I wonder [51899] is still needed after [51902], or after getting to the bottom of the issue from comment:16.

I think I have addressed the above concerns in a new PR#1747

Thanks for the PR! I think it addresses the first concern, but not the second. After the fallback to copy_dir(), I think the source directory should be removed. Currently, in that scenario, move_dir() is just an alias for copy_dir(), as it performs the copying part, but does not clean up the source directory like the move_dir() name suggests, unless I'm missing something.

Addressing that would make the clear_working parameter of WP_Upgrader::install_package() a bit redundant, as the working directory would always be cleared regardless of its value. However, it is already redundant in case the rename() call worked. If move_dir() stays, the clear_working parameter would probably have to be deprecated.

afragen commented on PR #1747:


3 years ago
#26

Updated to clear the working directory so there is internal parity within the function with the results of a successful rename().

As @SergeyBiryukov points out, this may deprecate the clear_working option.

#27 @afragen
3 years ago

  • Keywords dev-feedback removed

This ticket was mentioned in Slack in #core by sergey. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by peterwilsoncc. View the logs.


3 years ago

#30 @peterwilsoncc
3 years ago

Please see my full comment on the original ticket.

I've come to the conclusion that the following commits need to be reverted and the reliability improvements added in a subsequent release:

Repeating the tl;dr here but let's discuss pros and cons on the other ticket.

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 years ago

#33 @afragen
3 years ago

@SergeyBiryukov we should commit the rest of this PR for 5.9?

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-test by boniu91. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by hellofromtonya. View the logs.


3 years ago

#37 @SergeyBiryukov
3 years ago

In 52289:

Upgrade/Install: Make some adjustments to the move_dir() function:

  • Check for direct PHP flle access and only use rename() if true.
  • Check whether the destination directory was successfully created.
  • Clear the working directory so there is internal parity within the function between the results of a successful rename() and a fallback to copy_dir().
  • Use move_dir() in WP_Upgrader::move_to_temp_backup_dir() and ::restore_temp_backup().

Follow-up to [51815], [51898], [51899], [51902], [52192], [52284].

Props afragen, peterwilsoncc, dd32, SergeyBiryukov.
See #54166, #51857.

#38 @SergeyBiryukov
3 years ago

Thanks everyone! Keeping this open for now to see if the clear_working parameter is still relevant or should be deprecated, as per comment:25.

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

#41 @hellofromTonya
3 years ago

  • Milestone changed from 5.9 to 6.0

The Upgrade/Install team working on #51857 and this ticket decided to revert all of the changesets and move the work to early 6.0.

Why? An unidentified problem exists on virtualized systems using Vagrant and/or VirtualBox (with tools such as VVV, chassis, and potential others).

Once the 5.9 is branched and 6.0-alpha in trunk is open for business, the changesets will be recommitted to restore the work for continued investigation, refinement, and extensive testing.

This ticket was mentioned in Slack in #core-auto-updates by peterwilsoncc. View the logs.


3 years ago

#43 @peterwilsoncc
3 years ago

In 52351:

Upgrade/install: Revert upgrader rollback features.

Revert the rollback features introduced for theme and plugin upgrades during the WordPress 5.9 cycle. A bug (suspected to be in third party virtualisation software) causes the upgrades to fail consistently on some set ups. The revert is to allow contributors further time to investigate mitigation options.

Reverts [52337], [52289], [52284], [51951], [52192], [51902], [51899], [51898], [51815].

Props pbiron, dlh, peterwilsoncc, galbaras, SergeyBiryukov, afragen, costdev, bronsonquick, aristath, noisysocks, desrosj, TobiasBg, hellofromTonya, francina, Boniu91.
See #54543, #54166, #51857.

#44 @TJNowell
3 years ago

I strongly suspect this is related to this issue on the composer github:

https://github.com/composer/composer/issues/9627

Specifically:

https://github.com/composer/composer/blob/62dfd0af231996b7d3afc306700e026cb0b4aae4/src/Composer/Util/Platform.php#L205-L210

    public static function workaroundFilesystemIssues()
    {
        if (self::isVirtualBoxGuest()) {
            usleep(200000);
        }
    }

You need to sleep for a small amount of time so that filesystem caches are flushed.

When dealing with shared folders and mounts there are many approaches to tackle the performance vs latency tradeoff. For example some Docker implementations sacrifice CPU performance for filesystem immediacy. VirtualBox uses small caches with ultra short expiration dates. By calling usleep composer worked around this issue. Composer calls that function after every batch of filesystem calls to ensure that the FS can catch up before the next batch happens.

Other things to note:

  • this can be tested by performing the tests on a WordPress install that is not located in a shared/mounted folder
  • other filesystems that have latency issues or micro-caches should exhibit intermittent forms of this behaviour
  • On Windows+Hyper-V VirtualBox will transparently use Hyper-V behind the scenes due to nested virtualisation restrictions. People trying to reproduce this on Windows may not get the same results because of this.

#45 @TJNowell
3 years ago

I also suspect this command may influence the results of the tests:

VBoxManage storagectl "VM name" --name <controllername> --hostiocache off

Based on https://www.virtualbox.org/manual/ch05.html#iocaching

I would note that this is the main reason we do not mount the MySQL/MariaDB data folders as shared folders by default. The database server can become unstable when shared DB folders are turned on depending on the workload and the nature of the underlying OS and Hardware.

#46 @costdev
2 years ago

  • Keywords dev-feedback added
  • Version changed from 5.9 to trunk

Per the discussion in the bug scrub, I'm adding the dev-feedback keyword to get as much input on this as possible.

This ticket is touched by PR 2225. As such, this would greatly benefit from additional testing, particularly on different filesystems. See this comment for the current state of things.

Changing the version to trunk as [51815] was reverted in [52351] prior to 5.9's release.

This ticket was mentioned in Slack in #core by chaion07. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by chaion07. View the logs.


2 years ago

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


2 years ago

#50 @hellofromTonya
2 years ago

  • Milestone changed from 6.0 to 6.1

With RC1 tomorrow, moving this to 6.1.

This ticket was mentioned in Slack in #core by costdev. View the logs.


2 years ago

#52 @costdev
2 years ago

As a reminder, per comment 46, this ticket is touched by PR 2225 for the Rollback feature.

This ticket is being kept open to ensure that the fix is verified once the Rollback ticket (#51857) lands in Core.

This ticket was mentioned in Slack in #core by afragen. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by chaion07. View the logs.


2 years ago

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


2 years ago

#56 @SergeyBiryukov
2 years ago

In 53578:

Upgrade/Install: Add a conditional to facilitate testing of the Rollbacks feature project.

The Rollback Update Failure feature project creates a temporary backup of plugins and themes before updating. This aims to make the update process more reliable and ensure that if a plugin or theme update fails, the previous version can be safely restored.

If the Rollback Update Failure plugin is installed, WP_Upgrader::install_package() will use the move_dir() function from there for better performance. Instead of copying a directory from one location to another, it uses the rename() PHP function to speed up the process, which is instrumental in creating a temporary backup without a delay. If the renaming failed, it falls back to copy_dir() WP function.

This conditional aims to facilitate broader testing of the feature. It is temporary, until the plugin is merged into core.

Props afragen, pbiron, costdev, davidbaumwald, audrasjb, jrf, SergeyBiryukov.
Fixes #56057. See #51857, #54166.

This ticket was mentioned in Slack in #core by chaion07. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by chaion07. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


2 years ago

#61 @audrasjb
2 years ago

  • Keywords early removed

Removing early workflow keyword, as per today's bugscrub.

This ticket was mentioned in Slack in #core-auto-updates by costdev. View the logs.


2 years ago

#63 @costdev
2 years ago

  • Milestone changed from 6.1 to Future Release

#64 follow-up: @afragen
20 months ago

Consider closing this ticket due to #57375 and #57557 having been committed.

#65 in reply to: ↑ 64 @SergeyBiryukov
17 months ago

  • Keywords needs-testing dev-feedback removed
  • Milestone changed from Future Release to 6.2
  • Resolution set to fixed
  • Status changed from reopened to closed

Replying to afragen:

Consider closing this ticket due to #57375 and #57557 having been committed.

Right, I believe this was fixed with the introduction of move_dir() in those tickets.

Note: See TracTickets for help on using tickets.