Make WordPress Core

Opened 3 years ago

Closed 13 months ago

Last modified 7 months ago

#51857 closed enhancement (fixed)

Add rollback for failed plugin/theme updates

Reported by: pbiron's profile pbiron Owned by: pbiron's profile pbiron
Milestone: 6.3 Priority: normal
Severity: normal Version:
Component: Upgrade/Install Keywords: has-testing-info has-patch commit has-dev-note
Focuses: Cc:

Description (last modified by pbiron)

When a core auto-update fails, core has long had the ability to automatically attempt a "rollback" to the latest stable release (in the branch that the site is running). Note: for core auto-updates, "rollback" is not attempted for certain failures (e.g. "disk full", etc).

This capability should be extended to be available for plugin and theme updates.

[note: originally this description mentioned `auto-updates`...that was a mistake on my part. This is about rollback for both `manual` and `auto-updates`]

The current functionality for core rollbacks relies on the availability of a "rollback" package being returned by the Updates API. It is unclear (to me, at this time) whether extending this functionality to plugins/themes will require changes to the API or not.

Important: This ticket is not about backup/restore in case a site admin later determines that a successful plugin/theme update actually causes problem on a site!!!! Rather it is about covering cases such as #51823, where a plugin/theme update failure happens between the time the existing plugin/theme is deleted but before the new version successfully installed.

Attachments (12)

51857.diff (513 bytes) - added by afragen 3 years ago.
In copy_dir() check $wp_filesystem->dirlist() and return WP_Error if false
51857.2.diff (1.7 KB) - added by afragen 3 years ago.
only test on first pass
51857.3.diff (1.7 KB) - added by afragen 3 years ago.
pass slug in WP_Error instead of FS_METHOD
51857.4.diff (5.9 KB) - added by afragen 3 years ago.
add rollback
51857.5.diff (7.7 KB) - added by afragen 3 years ago.
also update _copy_dir() per @dd32 rec
51857.6.diff (7.6 KB) - added by afragen 3 years ago.
bubble up error for extract_rollback
Screen Shot 2021-09-23 at 10.55.44 am.png (19.1 KB) - added by peterwilsoncc 3 years ago.
upgrade-warnings.png (1.2 MB) - added by noisysocks 3 years ago.
Screen Shot 2021-09-30 at 10.34.40 am.png (35.3 KB) - added by peterwilsoncc 3 years ago.
51857-cron-fix.diff (4.6 KB) - added by pbiron 3 years ago.
51857.7.diff (897 bytes) - added by pbiron 3 years ago.
pr-2225.diff (6.5 KB) - added by pbiron 2 years ago.

Download all attachments as: .zip

Change History (324)

#1 @pbiron
3 years ago

  • Owner set to pbiron

#2 @afragen
3 years ago

Just a quick thought. The plugin/theme update transient could have an additional element containing the most recent download package. Then some error during the update process might be able to attempt an update using the alternate download package.

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by pbiron. View the logs.


3 years ago

#5 @pbiron
3 years ago

  • Description modified (diff)
  • Summary changed from Add rollback for failed plugin/theme auto-updates to Add rollback for failed plugin/themeupdates

#6 @pbiron
3 years ago

  • Summary changed from Add rollback for failed plugin/themeupdates to Add rollback for failed plugin/theme updates

#7 @afragen
3 years ago

It seems that some additional error reporting is needed. In referencing the core failure data provided by @dd32 in https://wordpress.slack.com/archives/CULBN711P/p1606869594134200 it seems that copy_dir() is a focal point of failure.

In digging a little if $wp_filesystem->dirlist( $from ) is false what happens is the contents of the plugin folder is deleted and the folder remains.

I added a patch to test for this and return a WP_Error. At a minimum it can give us more information. Perhaps it can mitigate this type of failure.

Last edited 3 years ago by afragen (previous) (diff)

@afragen
3 years ago

In copy_dir() check $wp_filesystem->dirlist() and return WP_Error if false

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

@afragen
3 years ago

only test on first pass

#9 @afragen
3 years ago

Only running of the first pass of copy_dir() allows plugins to empty subfolders.

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

#11 @audrasjb
3 years ago

@afragen indeed this would at least prevent the updater to delete plugin without being able to upload the new version.

I tested it by changing writing permissions on my test install and it looks good to me (plugin not deleted). Is it the best way to test this patch?

#12 @afragen
3 years ago

@audrasjb part of the issue is that I don't even know if this is issue. It does seem to solve this particular problem ( $dirlist returning false ).

I test by settings a breakpoint after $dirlist = $wp_filesystem->dirlist( $from ); in copy_dir() and then setting $dirlist = false; in VSCode.

I'm working on sending this failure data to dot org but I have no way to tell if they get it. I'll make another ticket for that.

Last edited 3 years ago by afragen (previous) (diff)

#13 @afragen
3 years ago

See #51928 as possible method of data collection by dot org.

@afragen
3 years ago

pass slug in WP_Error instead of FS_METHOD

#14 @afragen
3 years ago

The Automatic Upgrader Skin doesn't really pass any specific data about plugin/theme so I'm passing the slug in WP_Error as something to help.

Last edited 3 years ago by afragen (previous) (diff)

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 years ago

#16 @afragen
3 years ago

  • Keywords has-patch dev-feedback needs-testing added

51857.4.diff added rollback of plugin/theme to patch.

To test copy_dir() must return a WP_Error. This would likely happen if $dirlist = false.

@afragen
3 years ago

add rollback

@afragen
3 years ago

also update _copy_dir() per @dd32 rec

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 years ago

#18 @lukecarbis
3 years ago

  • Keywords early added

#19 @poena
3 years ago

During today's scrub,
@afragen suggested that:

Easiest way to test is install patch, set a breakpoint in copy_dir() after $dirlist assignment and set $dirlist = false; This will return WP_Error and trigger the rollback during a plugin/theme update.

Last edited 3 years ago by poena (previous) (diff)

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by hellofromtonya. View the logs.


3 years ago

#23 follow-up: @TimothyBlynJacobs
3 years ago

I'm not sure if zipping a copy is the right approach here. In my experience, most sites that have issues with a partial update failure ( ie some files are not copied resulting in a crash ) are due to resources timing out during the upgrade process. By taking a zip, this will significantly increase the amount of processing that needs to be done during the upgrade process and thus may increase the likelihood of failed updates. Further, since this is a fatal error issue, copy_dir won't return a result and the rollback code won't execute in that case.

#24 @afragen
3 years ago

ATM this is not quite meant to be all encompassing. Some of the issue is getting a WP_Error out of copy_dir().

It is meant to function if the $dirlist returns false as this is clearly an issue likely relating to the download, extraction , copy process. There's another ticket #51928 to help provide more data for where and when failures happen.

The zipping is done before the any unpacking of the downloaded package and I'm not sure that it should increase the server resources significantly. If this can be demonstrated that would be great. But since the failure seems to be in copy_dir(), zipping and unzipping the current plugin seems like a better solution than copying the data somewhere and then copying it back.

Last edited 3 years ago by afragen (previous) (diff)

#25 @afragen
3 years ago

I did some quick time tests locally and it takes almost no time to create a zip even of something as large as Jetpack.

This ticket was mentioned in Slack in #core by metalandcoffee. View the logs.


3 years ago

#27 @afragen
3 years ago

It would be helpful to test this on slow servers to see if the zipping process introduces any potential timeout issues.

Please post results here.

Last edited 3 years ago by afragen (previous) (diff)

#28 @audrasjb
3 years ago

I successfully tested the above patch on the worst server I can access to: 1 vCore, RAM 2Gb, 20 Gb SATA SSD. That's one of the cheapest services provided by big hosting companies here in France.
I'll try to find a non-SSD server next week.

But FWIW: rollback worked :-)

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


3 years ago

This ticket was mentioned in Slack in #hosting-community by pbiron. View the logs.


3 years ago

This ticket was mentioned in Slack in #hosting-community by crixu. View the logs.


3 years ago

#33 in reply to: ↑ 23 ; follow-up: @kirasong
3 years ago

Replying to TimothyBlynJacobs:

I'm not sure if zipping a copy is the right approach here. In my experience, most sites that have issues with a partial update failure ( ie some files are not copied resulting in a crash ) are due to resources timing out during the upgrade process. By taking a zip, this will significantly increase the amount of processing that needs to be done during the upgrade process and thus may increase the likelihood of failed updates. Further, since this is a fatal error issue, copy_dir won't return a result and the rollback code won't execute in that case.

This was brought up in the hosting community channel for discussion and testing, but wanted to leave a quick note that this is also my experience.

#34 follow-up: @afragen
3 years ago

Mike, are you saying you have seen/experienced this problem or that you agree it's a theoretical concern?

This ticket was mentioned in Slack in #hosting-community by jadonn. View the logs.


3 years ago

#36 in reply to: ↑ 33 @a2hosting
3 years ago

I'd just like to add in that we've seen what should be very small zip tasks called from PHP fail if there are resource issues and that should be avoided if at all possible in cases such as these.

#37 @afragen
3 years ago

Can we please try to obtain actual data? Let's try to test the actual patch.

I understand theoretical issues, but I also understand that one type of zip task is not the same as another.

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

#40 @afragen
3 years ago

From the description.

The current functionality for core rollbacks relies on the availability of a "rollback" package being returned by the Updates API. It is unclear (to me, at this time) whether extending this functionality to plugins/themes will require changes to the API or not.

Obtaining the currently installed version package from dot org is possible but attempting to reinstall it in the same manner as the failure doesn't seem like the practical way forward.

This approach also has the issue of not working with any plugin/theme not hosted on dot org. I think this is a hard block.

#41 @afragen
3 years ago

Lots of discussion between @TimothyBlynJacobs and myself, https://wordpress.slack.com/archives/CULBN711P/p1609968242405800

My impression of the conclusion is that if there is a server resource issue that occurs a result of the zip creation process then a simple page reload brings the site back to it's previous state showing a plugin/theme update being available.

This shouldn't be an issue as long as the frequency of this happening is rare. If this is happens with small plugins it's an issue.

If it happens with large plugins like Mailpoet, Jetpack, Yoast SEO, etc. then perhaps it's not ideal to run a resource intensive site on a resource poor shared environment.

#42 @TimothyBlynJacobs
3 years ago

My impression of the conclusion is that if there is a server resource issue that occurs a result of the zip creation process then a simple page reload brings the site back to it's previous state showing a plugin/theme update being available.

I think this is accurate. But the failure happening _during_ the zip process specifically is not my big worry, but that it consumes resources so that when the file transfer happens, the site no longer has enough resources to complete the upgrade. Which is how you can end up with a fatal erroring site.

@afragen
3 years ago

bubble up error for extract_rollback

This ticket was mentioned in PR #860 on WordPress/wordpress-develop by afragen.


3 years ago
#43

Moving previous patch(es) to GitHub.

  • Updates copy_dir() and _copy_dir() to include WP_Error return early if the $dirlist is empty.
  • Adds function zip_to_rollback_dir() which creates a zip of the current plugin/theme that is being updated.
  • Adds function extract_rollback() which on a WP_Error returned from copy_dir() will extract the above zip back into the appropriate location.

Trac ticket: https://core.trac.wordpress.org/ticket/51857

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 years ago

#48 in reply to: ↑ 34 @kirasong
3 years ago

Replying to afragen:

Mike, are you saying you have seen/experienced this problem or that you agree it's a theoretical concern?

Sorry, I missed this question before. I'm saying I've seen and experienced this problem.
It is not a theoretical concern.

I agree that data would be helpful here, whether from the WordPress project, or hosts.

I'll ask around a bit and see if I can find recent data. It is one of those things that I helped with a lot in tech support, but that was some time ago.

#49 follow-ups: @SergeyBiryukov
3 years ago

Thanks for the patch!

It looks like there are some concerns about performance implications of building (and then potentially unpacking) a zip file with the current plugin version, especially for large plugins. Instead of building a zip file, could we rename the plugin directory to something reasonably unique like plugin-slug.__rollback__, and then delete it on successful update, or rename back on failure? I think that would not require any additional resources. Creating a zip file might save some disk space, but if the space is already low enough for that to make a difference, then it seems like updates on that site would start failing anyway sooner rather than later.

Personally, I would like the code for plugin/theme upgraders to be as close to core as possible. At a glance, it looks like Core_Upgrader::upgrade() has a lot more than Plugin_Upgrader::upgrade() or Theme_Upgrader::upgrade(), so perhaps unifying these methods and documenting the differences would be a good first step. I would prefer to adapt the core upgrader for plugins and themes as appropriate, and not create a completely different implementation.

#50 @afragen
3 years ago

@SergeyBiryukov there’s a fundamental problem with trying to use the same method for a rollback that is the cause of the failure in the first place. In this case copy_dir().

#51 in reply to: ↑ 49 @afragen
3 years ago

Replying to SergeyBiryukov:

Personally, I would like the code for plugin/theme upgraders to be as close to core as possible. At a glance, it looks like Core_Upgrader::upgrade() has a lot more than Plugin_Upgrader::upgrade() or Theme_Upgrader::upgrade(), so perhaps unifying these methods and documenting the differences would be a good first step. I would prefer to adapt the core upgrader for plugins and themes as appropriate, and not create a completely different implementation.

Sounds like this should be another ticket.

#52 in reply to: ↑ 49 ; follow-up: @afragen
3 years ago

Replying to SergeyBiryukov:

Thanks for the patch!

It looks like there are some concerns about performance implications of building (and then potentially unpacking) a zip file with the current plugin version, especially for large plugins. Instead of building a zip file, could we rename the plugin directory to something reasonably unique like plugin-slug.__rollback__, and then delete it on successful update, or rename back on failure? I think that would not require any additional resources. Creating a zip file might save some disk space, but if the space is already low enough for that to make a difference, then it seems like updates on that site would start failing anyway sooner rather than later.

Thanks for the feedback.

Unfortunately rename() can be flaky/unreliable and fallback from rename() involves file copy which is the likely cause of the failure in the first place. Without testing the patch are we looking to solve a problem that doesn't exist?

A server timeout issue is mostly fixed in a page reload and the site isn't likley any worse off than it currently would be.

Isn't this what alpha is for? Unless there's some other concern?

#53 @TimothyBlynJacobs
3 years ago

A server timeout issue is mostly fixed in a page reload and the site isn't likley any worse off than it currently would be.

Calling this out again.

But the failure happening _during_ the zip process specifically is not my big worry, but that it consumes resources so that when the file transfer happens, the site no longer has enough resources to complete the upgrade. Which is how you can end up with a fatal erroring site.

This ticket was mentioned in Slack in #core-auto-updates by hellofromtonya. View the logs.


3 years ago

#55 in reply to: ↑ 52 @dd32
3 years ago

Replying to SergeyBiryukov:

It looks like there are some concerns about performance implications [...] could we rename the plugin directory to something reasonably unique like plugin-slug.__rollback__, and then delete it on successful update, or rename back on failure?

This looks like the best option from a filesystem-io perspective to me.

Replying to afragen:

Unfortunately rename() can be flaky/unreliable and fallback from rename() involves file copy which is the likely cause of the failure in the first place.

I'm not sure what the flaky/unreliable experience is here, but I feel like creating a ZIP backup is solving a problem that doesn't actually exist while adding significant CPU/IO into the process at the same time that's not needed.

Looking at the filesystem handlers move(), it looks like the Direct IO one has some behaviour to use a copy, which _probably_ isn't needed and is probably not useful, [13001] added the usage of rename() into it. A lot has changed in the last decade host IO wise.

Also of note, is that the FTP variants don't check the overwrite functionality.. but that's mostly irrelevant here.

I would probably suggest just cleaning up those move() methods to be consistent and not fallback on a copy() if it's not actually performing a move()/removing the source directory.

IMHO if move() fails, either the delete of the existing directory will also fail, or rollback should just be skipped for those edge-cases.

#56 follow-ups: @dd32
3 years ago

An alternative option would be to flip this around, and not add a rollback functionality into it, but rather install into a temporary directory first, and then remove existing/rename new into place.

(edit: I guess that's kind of how it works now, except the temporary install directory is wp-content/upgrade/somethinghere. It's the copy from there to the plugins directory that sometimes fails, so avoiding that copy by extracting into the plugins directory in the first place and then performing a move() would mean far less IO is needed and as a result could be faster too)

That would result in less downtime of the plugin not existing on the site, and also not delete the plugin from the site in the event of a failure to create the new files.

eg:

  • Plugin update into .hello-dolly.upgrade.1234..
  • Once successful:
    • hello-dolly directory is deleted.
    • move( '.hello-dolly.upgrade.1234', 'hello-dolly' );
    • return true;
  • Upon failure:
    • Delete .hello-dolly.upgrade.1234
    • return false
Last edited 3 years ago by dd32 (previous) (diff)

#57 in reply to: ↑ 56 @dd32
3 years ago

Replying to dd32:

I guess that's kind of how it works now, except the temporary install directory is wp-content/upgrade/somethinghere. It's the copy from there to the plugins directory that sometimes fails

I think this might have originally been a copy, rather than a move, as moving items cross-partition isn't possible. Sometimes each site would have it's own wp-content directory, but the wp-content/plugins or themes folder was actually a symlink to another disk or some such.

I could also be mis-remembering it, and it was purely bad planning and development skills that I wrote it like this :) If I was doing it again, I'd do it how I've outlined in comment 56 above

Last edited 3 years ago by dd32 (previous) (diff)

This ticket was mentioned in Slack in #core-auto-updates by hellofromtonya. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


3 years ago

#60 in reply to: ↑ 56 @hellofromTonya
3 years ago

From core scrub today, @afragen noted that he'll open a new ticket to address the copy_dir changes Dion mentioned above.

For this ticket though, just want to make sure we all understand. Replying to dd32:

An alternative option would be to flip this around, and not add a rollback functionality into it, but rather install into a temporary directory first, and then remove existing/rename new into place.

Hey @dd32 > Is the proposed implementation how it currently works?

If yes, then nothing else needs to be done in this ticket.

Last edited 3 years ago by hellofromTonya (previous) (diff)

#61 @hellofromTonya
3 years ago

  • Keywords close added

Marking as a close candidate. Why? Our understanding is that Dion's proposed solution does not require any additional work in this ticket. And it resolves the server resources concerns.

#62 @SergeyBiryukov
3 years ago

It's my understanding that comment:56 is not exactly how it currently works and still needs to be implemented, so I don't think the ticket should be closed yet.

#63 @hellofromTonya
3 years ago

  • Keywords close removed

This ticket was mentioned in Slack in #core by lukecarbis. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


3 years ago

#66 @hellofromTonya
3 years ago

  • Milestone 5.7 deleted
  • Resolution set to maybelater
  • Status changed from assigned to closed

Closing this ticket for now as a maybelater. Why?

  • #52342 is the new direction proposed by Dion.
  • #52381 proposes a new filter to enable moving this ticket into a feature plugin.

#67 @dd32
3 years ago

Just noting, that while this has been closed as maybelater, #52342 is not the suggestion as I mentioned above, and #52381 is a suggestion for a filter to make this ticket possible as a feature plugin, which could've just been kept here.

#68 @SergeyBiryukov
3 years ago

  • Milestone set to 5.8
  • Resolution maybelater deleted
  • Status changed from closed to reopened

Reopening per comment:62 and comment:67.

#69 follow-ups: @SergeyBiryukov
3 years ago

  • Keywords needs-patch added; has-patch dev-feedback needs-testing removed

I think comment:56 is the preferred direction here.

This ticket was mentioned in Slack in #core by sergey. View the logs.


3 years ago

This ticket was mentioned in Slack in #hosting-community by javier. View the logs.


3 years ago

This ticket was mentioned in Slack in #hosting-community by jadonn. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

#74 @KZeni
3 years ago

Question... does this also accommodate a plugin/theme update that's fully completed (no issue during copying of files or anything like the example stated), but then actually has a bug within the code of the latest release that causes a fatal server error?

I know WP has precautions in place when trying to activate a plugin that'll cause a server error so I'm thinking this might be the case for what's being proposed here (just applying to when updating a plugin with it then rolling back if an error is encountered.)

If that isn't being accommodated, currently, is there anything preventing that situation from being included in this functionality as well?

Last edited 3 years ago by KZeni (previous) (diff)

#75 follow-up: @afragen
3 years ago

The intent of this is currently accommodating update failures, not complete updates where the new version causes a fatal.

#76 in reply to: ↑ 75 @KZeni
3 years ago

Replying to afragen:

The intent of this is currently accommodating update failures, not complete updates where the new version causes a fatal.

I wanted to check since I'm not certain on what it might entail to expand this to include that scenario as well if it didn't already (seems like it could be a natural expansion on the idea.)

For what it's worth, I'm thinking it could be an incredibly helpful thing that could greatly improve ongoing reliability of WordPress sites if it's a reasonable thing to consider for inclusion as part/expansion of this. I mean, unfortunately, there are still plugins out there that have a release buggy release from time to time (ex. I encountered 2 in the last month with them being widely used [one having 100,000+ active installs] plugins where the issue really did live in the plugin release itself), and this is more important now that auto-updates are becoming more of a thing since a buggy release that auto-updates (or even is encountered during a manual update) means everyone encounters the issue rather than the sporadic cases where a copy didn't complete properly due to a server shortcoming/hiccup like this is currently set to address.

#77 @pbiron
3 years ago

@KZeni thanx for the comment.

Personally, I would love to be able to detect whether a plugin/theme update will cause a fatal. However, that would be very hard to do so, since in many cases, such fatals won't occur until the plugin is actually used (e.g., they often don't happen until an admin- or end-user actually does something with the plugin/theme).

Since WP 5.1 was release, there is Fatal Error Protection built into core, where a plugin will be "suspended" if a fatal occurs at run-time and admins will be notified (see the "Fatal Error Protection" section of PHP Site Health Mechanisms in 5.1 for more info).

#78 @KZeni
3 years ago

@pbiron Gotcha. I was just hoping for this since, in my recent case, the Simple History plugin had a syntax error in a recent release where it took down all sites using it for me per the sites being set to auto-update. The other case was the hCaptcha for WordPress plugin where they left in development code where it was trying to require a file that didn't exist.

Both cases were involving a very straightforward & immediate server error simply by way of simply having errors when active without any further action needed (errors in the code itself), and WordPress (running 5.6.2) simply applied the update & had things break site-wide as a result (when it'd be ideal to have it prevent activation or even rollback to the previous version, if encountered.)

Meanwhile, it seems like the kind of situation that the error-checking upon attempting to activate a plugin that'd cause a server error would prevent, but then that protection isn't happening on update & is only on activation (seems like partway towards having plugins not cause server errors where this ticket focuses on issues during installing the new version while it could/should also ideally include the actual plugin being at fault during update just like it does upon the attempted first activation.)

I guess I was hoping that type of error prevention could be inserted during the update process as well somehow while I was uncertain if this particular ticket would take care of that situation (with this ticket's rollback behavior being ideal, if an issue is encountered), it'd be possible to expand this ticket to include that scenario, and/or this ticket is using a different mechanism from that where it'd be a wholly separate request/implementation (with it sounding like the latter, currently & unfortunately, unless these details have helped clarify things.)

Last edited 3 years ago by KZeni (previous) (diff)

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

#80 in reply to: ↑ 69 @SergeyBiryukov
3 years ago

Replying to SergeyBiryukov:

I think comment:56 is the preferred direction here.

Related: #52832

#81 @galbaras
3 years ago

How about treating old versions like trashed items and deleting them later?

This can be easily done by a background job, as well as manual links, e.g. "remove previous version backup" in the plugin list and a "remove previous version backups" action for selected plugins?

Last edited 3 years ago by galbaras (previous) (diff)

#82 follow-up: @SergeyBiryukov
3 years ago

#52832 was marked as a duplicate.

#83 in reply to: ↑ 69 @galbaras
3 years ago

I think comment:56 is the preferred direction here.

This great approach can be combined with avoiding copy_dir() altogether and using move() again.

  1. Unpack the zip file to hello.update.1234
  2. If unsuccessful, return false
  3. If hello.old exists, remove it
  4. Move hello to hello.old
  5. Move hello.update.1234 to hello
  6. Later on (manually and/or scheduled), remove hello.old.

move() should fail a lot less often than copy_dir(), especially since it uses delete & copy anyway when direct renaming fails.

This should greatly simplify the entire upgrade process, make it faster and provide a way to recover from problems by quickly restoring hello.old to whence it came, also by deleting and renaming.

#84 @desrosj
3 years ago

  • Milestone changed from 5.8 to Future Release

As far as I can tell, this effort has been moved to a feature plugin. With 5.8 feature freeze fast approaching (a few short weeks away) and no published merge proposal, I think the reasonable early cutoff has passed. I'm going to punt this to Future Release for now.

If that's an incorrect assessment, please do move back and add the needed context.

This ticket was mentioned in Slack in #core-auto-updates by francina. View the logs.


3 years ago

#86 follow-up: @richards1052
3 years ago

I am a victim of this auto update failure. Two plug-ins, Yoast and Mailpoet, when they update have failed to do so multiple times AND disappeared from my directory.

So when I discover they're missing (long after they've been deleted) I have to manually reinstall them.

I'm surprised a roll back hasn't been implemented for failed plugin upgrades. Also disappointed the auto updater doesn't notify the user when such failures occur. It would permit me to discover the failure, notice the disappearance and reinstall immediately, rather than waiting till I notice the plugin is gone.

#87 in reply to: ↑ 86 @afragen
3 years ago

Replying to richards1052:

I am a victim of this auto update failure. Two plug-ins, Yoast and Mailpoet, when they update have failed to do so multiple times AND disappeared from my directory.

So when I discover they're missing (long after they've been deleted) I have to manually reinstall them.

I'm surprised a roll back hasn't been implemented for failed plugin upgrades. Also disappointed the auto updater doesn't notify the user when such failures occur. It would permit me to discover the failure, notice the disappearance and reinstall immediately, rather than waiting till I notice the plugin is gone.

Have you tried installing the feature plugin Rollback Update Failure??

#88 @richards1052
3 years ago

I probably will. But I have so many plug-ins that I try to restrain myself from adding another.

#89 @pbiron
3 years ago

Richard, please do install and test/use Rollback Update Failure.

It is basically a proposal for how core could do rollbacks and we need real world usage of the proposed solution before we're comfortable incorporating it into core.

#90 in reply to: ↑ 82 ; follow-up: @galbaras
3 years ago

Replying to SergeyBiryukov:

#52832 was marked as a duplicate.

It's related, but not a duplicate, unless my suggestions are integrated into this change. So far, the feature plugin isn't covering them, though, because it's somewhat of an alternative approach to the same thing.

Does anyone disagree that moving is quicker and more flexible than copying?

#91 in reply to: ↑ 90 @SergeyBiryukov
3 years ago

Replying to galbaras:

Replying to SergeyBiryukov:

#52832 was marked as a duplicate.

It's related, but not a duplicate, unless my suggestions are integrated into this change. So far, the feature plugin isn't covering them, though, because it's somewhat of an alternative approach to the same thing.

Does anyone disagree that moving is quicker and more flexible than copying?

I think moving instead of copying is exactly what was proposed in comment:56, and is the preferred direction for this ticket, as previously noted in comment:69.

The feature plugin indeed offers an alternative approach that could be added later, if still necessary after the approach from comment:56 is implemented.

#92 @afragen
3 years ago

comment:56 is not a rollback provision per se but a new method in the course of plugin updating.

#93 @galbaras
3 years ago

Looks like this ticket is getting so long that we just look at the tail end of it. Thank you @SergeyBiryukov for clarifying.

@afragen Given the preferred direction, the only failure scenarios I see are when WP fails to delete the existing directory or move the temporary directory to the live one, likely due to insufficient permissions, which should almost never happen.

An even rarer case is when part of the directory is deleted/moved, and some fails to be deleted/moved.

In the entire deletion fails, WP should return false (with some meaningful message), but at least the old plugin is still there. If only some of the deletion fails, or the move fails (partly or completely), WP may be crippled, but rolling back may suffer the same fate :(

This ticket was mentioned in PR #1492 on WordPress/wordpress-develop by aristath.


3 years ago
#94

  • Keywords has-patch added; needs-patch removed

This patch aims to improve the stability of plugin/theme updates, by adding the ability to "rollback" failed updates.

  • When updating a plugin/theme, the old version of the plugin/theme gets moved to a wp-content/upgrade/rollback/plugins/PLUGINNAME or wp-content/upgrade/rollback/themes/THEMENAME folder. The reason we chose to move instead of zip, is because zipping/unzipping are very resources-intensive processes, and would increase the risk on low-end, shared hosts. Moving on the other hand is performed instantly and won't be a bottleneck.
  • If the update fails, then the "backup" we kept in the upgrade/rollback folder gets restored to its original location
  • If the update succeeds, then the "backup" is deleted
  • A new section was added in the site-health screen, to make sure that the rollbacks folder is writable.

To avoid confusion: The "rollback" folder will NOT be used to "roll-back" a plugin to a previous version after an update. This folder will simply contain a transient backup of the previous version of a plugins/themes getting updated, and as soon as the update process finishes, the folder will be empty.

Trac ticket: https://core.trac.wordpress.org/ticket/51857

#95 @pbiron
3 years ago

Following on the core-auto-updates slack discussion of this today, see this comment on the PHP manual page for rename() for some of the pitfalls when the $source and $destination directions are on different filesystems.

I don't think that's much of a problem for the PR, but it does mention there can be cases where rename() returns true even though there were "problems" with the operation. We may want to add some logic to WP_Filesystem_Direct::move() to trap those warnings and deal with them appropriately.

#96 in reply to: ↑ 56 @SergeyBiryukov
3 years ago

Replying to dd32:

An alternative option would be to flip this around, and not add a rollback functionality into it, but rather install into a temporary directory first, and then remove existing/rename new into place.

(edit: I guess that's kind of how it works now, except the temporary install directory is wp-content/upgrade/somethinghere. It's the copy from there to the plugins directory that sometimes fails, so avoiding that copy by extracting into the plugins directory in the first place and then performing a move() would mean far less IO is needed and as a result could be faster too)

That would result in less downtime of the plugin not existing on the site, and also not delete the plugin from the site in the event of a failure to create the new files.

@dd32 When you have a moment, could you take a look at PR 1492?

In this implementation, we've explored the approach you suggested, with a notable difference that we chose not to use wp-content/plugins as a temporary directory, as that might lead to multiple copies of the plugin being displayed on the Plugins screen. Even if it's only for a short period of time, I think it might still cause confusion if someone tries to activate a temporary copy of the plugin.

We did replace copying with a call to ::move() though, for better perfomance. Most of the time, that should be a simple rename() call. If the rollback directory is on another filesystem, I guess a rename() would not work, but in that case the ::move() method uses copy()/delete() as a fallback, which I think should work :)

As per the PR description:

  • When updating a plugin/theme, the old version of the plugin/theme gets moved to a wp-content/upgrade/rollback/plugins/PLUGINNAME or wp-content/upgrade/rollback/themes/THEMENAME folder. The reason we chose to move instead of zip, is because zipping/unzipping are very resources-intensive processes, and would increase the risk on low-end, shared hosts. Moving on the other hand is performed instantly and won't be a bottleneck.
  • If the update fails, then the "backup" we kept in the upgrade/rollback folder gets restored to its original location.
  • If the update succeeds, then the "backup" is deleted.
  • A new section was added in the site-health screen, to make sure that the rollbacks folder is writable.

To avoid confusion: The "rollback" folder will NOT be used to "roll-back" a plugin to a previous version after an update. This folder will simply contain a transient backup of the previous version of a plugin/theme getting updated, and as soon as the update process finishes, the folder will be empty.

This ticket was mentioned in Slack in #core by pbiron. View the logs.


3 years ago

#98 follow-up: @kirasong
3 years ago

I surfaced this at GDDY for more feedback and testing by appropriate folks.

Haven't done deep testing, but did an initial review of PR 1492.

These look like great changes + checks to improve the reliability of the updater.

Thank you!!

This ticket was mentioned in Slack in #core-test by hellofromtonya. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-test by boniu91. View the logs.


3 years ago

#101 @Boniu91
3 years ago

  • Keywords has-testing-info added

Copying testing instructions from the PR:

  • If the wp-content/upgrade/temp-backup folder is not writable, there should be an error in the site-health screen.
  • If the server has less than 20MB available, there should be an error in the site-health screen that updates may fail. If the server has less than 100MB, it should be a notice that disk space is running low.
  • When updating a plugin, you should be able to see the old plugin in the wp-content/upgrade/temp-backup/plugins/PLUGINNAME folder. The same should apply for themes. Since updates sometimes run fast and we may miss the folder creation during testing, you can add return true; as the 1st line inside the WP_Upgrader->delete_temp_backup() method. This will return early and skip deleting the backup on update-success.
  • When a plugin update fails, the previous version should be restored. To test that, change the version of a plugin to a previous number, run the update, and on fail the previous version (the one where you changed the version number) should still be installed on the site. To simulate an update failure and confirm this works, you can use the snippet below:
add_filter( 'upgrader_install_package_result', function() {
   return new WP_Error( 'simulated_error', 'Simulated Error' );
});
Last edited 3 years ago by Boniu91 (previous) (diff)

#102 @Boniu91
3 years ago

Test report

Environment

  • WordPress 5.9-alpha-51272-src
  • WordPress-develop environment used

Tested

  1. wp-content/upgrade/temp-backup and wp-content/upgrade/temp-backup/plugins not writeable, the error is displayed in the Site Health
  2. The error and notice is displayed correctly when disk_free_space returns mentioned values.
  3. Worked as described
  4. Worked as described

Question related to 2nd point:
Is this function 100% reliable in all environments?

Question/suggestion related to 3rd point:
If we change permissions of /upgrade/temp-backup/plugins/plugin_name/ it won't be possible to update the plugin plugin_name anymore. It will end up with error:
Could not move old version to the temp-backup directory.
There's no message inside the Site Health section about that. I think we could add kind of check, that would display a notice when /upgrade/temp-backup/plugins/ or /upgrade/temp-backup/themes/ are not empty and error when their content is not writeable.

#103 in reply to: ↑ 98 @mai21
3 years ago

Replying to mikeschroder:

I surfaced this at GDDY for more feedback and testing by appropriate folks.

Haven't done deep testing, but did an initial review of PR 1492.

These look like great changes + checks to improve the reliability of the updater.

Thank you!!

Checked the following on https://github.com/WordPress/wordpress-develop/pull/1492:

  • If the wp-content/temp-backup folder is not writable, there should be an error in the site-health screen.(pass)
  • When updating a plugin, you should be able to see the old plugin in the wp-content/upgrade/temp-backup/plugins/PLUGINNAME folder. The same should apply for themes. Since updates sometimes run fast and we may miss the folder creation during testing, you can add return true; as the 1st line inside the WP_Upgrader->delete_temp_backup() method. This will return early and skip deleting the backup on update-success. (pass)
  • If the server has less than 20MB available, there should be an error in the site-health screen that updates may fail. If the server has less than 100MB, it should be noticed that disk space is running low. => Any ideas on how to control server size while using https://localwp.com/? (blocked)
  • For the failed update I tried the following: (Inprogress)

1- add return true;

as the 1st line inside the WP_Upgrader->delete_temp_backup()

2- upgrade theme
3- turn off the internet
4- notice the upgrade folder
5- turn on the network after awhile
===> update failed and the theme was never added to the upgrade folder in the 1st place which I think is fine unless if backup shall be there while processing then will be deleted /or restored in case of success /failure

Env: development version (5.8.1-alpha-51643)

Finally having 2 questions:

1- How to do the needed prequest of server size?
2- What is exactly expected with failure of upgrade regarding backup file? is it ever added to the upgrade folder?

Thanks!

Last edited 3 years ago by mai21 (previous) (diff)

#104 @SergeyBiryukov
3 years ago

  • Milestone changed from Future Release to 5.9

Moving to the milestone for more visibility and discussion of PR #1492.

#105 @afragen
3 years ago

Tested PR #1492.

I was unable to test the server memory conditions.

If both the wp-content/upgrade/temp-backup/plugins and wp-content/upgrade/temp-backup/themes are both un-writeable, only the plugins folder is noted to be un-writeable in Site Health.

If either is un-writeable the correct notice is displayed.

If wp-content/upgrade/temp-backups is un-writeable there is no notice in Site Health.

Updating and restoring a failed backup works great!

#106 @Boniu91
3 years ago

@mai21 @afragen I think you can simulate the available space by changing the value of $available_space to expected value in bytes here:
https://github.com/WordPress/wordpress-develop/blob/429ed3a8fe56d470ae5a8f4bdd3e46ce4046cf03/src/wp-admin/includes/class-wp-site-health.php#L1889

This ticket was mentioned in Slack in #core by francina. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-test by hellofromtonya. View the logs.


3 years ago

#109 @aristath
3 years ago

Replying to feedback/questions on https://core.trac.wordpress.org/ticket/51857#comment:102

Question related to 2nd point:
Is this function 100% reliable in all environments?

I searched and didn't find any gotchas for that function... I haven't tested it on all environments because I only have access to a limited pool of testing environments.
That being said, if the disk_free_space function can't get a proper result it returns false, in which case the message shown to users in site-health is Could not determine available disk space for updates.

If we change permissions of /upgrade/temp-backup/plugins/plugin_name/ it won't be possible to update the plugin plugin_name anymore. It will end up with error:
Could not move old version to the temp-backup directory.
There's no message inside the Site Health section about that. I think we could add kind of check, that would display a notice when /upgrade/temp-backup/plugins/ or /upgrade/temp-backup/themes/ are not empty and error when their content is not writeable.

The only problem with that is the fact that there is a chance an update is running at the same time... Maybe an automatic-update is triggered on the background, or we have 2 tabs open to do multiple tasks at once and we're updating plugins on another tab, or maybe the site has multiple admins and someone else is running an update at the same time.
I agree that ideally we'd have that message, but at this stage I don't see how we can safely do it without false-positives. It would be possible if there was a function like is_currently_updating() which would allow us to check if there's an update currently in progress, but so far I haven't found a way to do that efficiently.


Replying to questions from https://core.trac.wordpress.org/ticket/51857#comment:103

1- How to do the needed prequest of server size?

That one is tricky... The easiest solution would be to edit this line:

if ( 20 > $available_space_in_mb ) {

and replace the number 20 with something like 200000 (or some other value depending on how much disk space you have available on your machine) to make sure the error gets properly triggered when the defined number is larger than the available disk-space.

2- What is exactly expected with failure of upgrade regarding backup file? is it ever added to the upgrade folder?

If the failure happens while downloading the update, then the backup is never created, it fails before we get to the backup step
If the backup fails for some reason like file/folder permissions, low disk space etc, then the backup gets restored to the plugin's original location. So on failure the backup folder will be empty, but the original plugin will be restored (as opposed to the previous behavior which had the plugin removed or in some cases half-installed)

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-test by hellofromtonya. View the logs.


3 years ago

#112 @afragen
3 years ago

@SergeyBiryukov

I'm using the updated Rollback Update Failure plugin which now has similar code as this patch.

I discovered that my server has disabled the disk_free_space() function and now anytime I visit the Site Health page I get the following error.

PHP Warning disk_free_space() has been disabled for security reasons in /sites/example.com/files/wp-content/plugins/rollback-update-failure/plugin.php on line 257

which corresponds to the following line.

$available_space = (int) disk_free_space( WP_CONTENT_DIR . '/upgrade/' );

Perhaps a server check to ini_get( 'disabled_functions' ) and setting to false if the function is not available?

I'll see what I can test in the plugin.

Last edited 3 years ago by afragen (previous) (diff)

#113 follow-up: @afragen
3 years ago

The following change fixes this issue in the plugin.

$available_space = (int) disk_free_space( WP_CONTENT_DIR . '/upgrade/' );

changed to

$disabled = explode( ',', ini_get('disable_functions') );
$available_space = ! in_array( 'disk_free_space', $disabled, true ) ? (int) disk_free_space( WP_CONTENT_DIR . '/upgrade/' ) : false;

Currently testing in https://github.com/WordPress/rollback-update-failure/

Last edited 3 years ago by afragen (previous) (diff)

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 years ago

#115 @aristath
3 years ago

I backported the disk_free_space fix to the patch, so it looks like we're good to go 👍

#116 in reply to: ↑ 113 @SergeyBiryukov
3 years ago

Replying to afragen:

$disabled = explode( ',', ini_get('disable_functions') );
$available_space = ! in_array( 'disk_free_space', $disabled, true ) ? (int) disk_free_space( WP_CONTENT_DIR . '/upgrade/' ) : false;

I wanted to see whether we have a precedent of checking the disable_functions setting this way in core.

We do in apache_mod_loaded() for phpinfo() as of [29330] / #26772, but that appears to be the only instance. As noted in comment:3:ticket:42085, this seems a bit redundant and a simple function_exists() check should suffice.

Per the comments in the PHP manual, function_exists() should return false for functions disabled via disable_functions setting.

As also noted in comment:3:ticket:42085, it is possible for the function_exists() check to return true if Suhosin is in use. However, that would be an edge case, as Suhosin was only officially available for PHP 5.4 to 5.6, and its development was discontinued in 2015.

It's also worth noting that as of PHP 8, disabled functions can be redeclared. Unless the function is redeclared, function_exists() still returns false.

It looks like most of the other instances in core where we need to check for a function that might be disabled just use a simple function_exists() check:

  • wp_is_ini_value_changeable() takes this approach for ini_get_all() in [38431] / #37680.
  • Site Health also uses this approach for ini_get() in [48535] / #50038.

Related: #33112, #37680, #37978, #48693, #52226.

So I would suggest just using a function_exists() check here as well, optionally combined with the error suppression operator just in case:

// Sometimes `disk_free_space()` is disabled via the `disable_functions` option for "security purposes".
if ( function_exists( 'disk_free_space' ) ) {
	$available_space = (int) @disk_free_space( WP_CONTENT_DIR . '/upgrade/' );
} else {
	$available_space = false;
}

This works in my testing on PHP 5.6 to 8.1.

#117 @aristath
3 years ago

Thank you for the research @SergeyBiryukov!
Pushed the fix to the patch 👍

#118 @afragen
3 years ago

I can confirm that the updated patch works. Thanks @SergeyBiryukov and @aristath

Rollback Update Failure plugin updated.

Last edited 3 years ago by afragen (previous) (diff)

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

#120 @SergeyBiryukov
3 years ago

In 51815:

Upgrade/Install: Create a temporary backup of plugins and themes before updating.

This aims to make the update process more reliable and ensures that if a plugin or theme update fails, the previous version can be safely restored.

  • When updating a plugin or theme, the old version is moved to a temporary backup directory:
    • wp-content/upgrade/temp-backup/plugins/[plugin-slug] for plugins
    • wp-content/upgrade/temp-backup/themes/[theme-slug] for themes.
  • If the update fails, then the temporary backup kept in the upgrade/temp-backup directory is restored to its original location.
  • If the update succeeds, the temporary backup is deleted.

To further help troubleshoot plugin and theme updates, two new checks were added to the Site Health screen:

  • A check to make sure that the temp-backup directory is writable.
  • A check that there is enough disk space available to safely perform updates.

To avoid confusion: The temp-backup directory will NOT be used to "roll back" a plugin to a previous version after a completed update. This directory will simply contain a transient backup of the previous version of a plugin or theme being updated, and as soon as the update process finishes, the directory will be empty.

Props aristath, afragen, pbiron, dd32, poena, TimothyBlynJacobs, audrasjb, mikeschroder, a2hosting, hellofromTonya, KZeni, galbaras, richards1052, Boniu91, mai21, francina, SergeyBiryukov.
See #51857.

#121 @SergeyBiryukov
3 years ago

Thanks everyone! Keeping the ticket open for now just in case any follow-up adjustments are needed.

This ticket was mentioned in Slack in #core by sergey. View the logs.


3 years ago

#124 @TobiasBg
3 years ago

The falsey value in the function_exists() check https://core.trac.wordpress.org/browser/trunk/src/wp-admin/includes/class-wp-site-health.php?rev=51815#L1893 should probably be 0 instead of false.
Otherwise, we potentially have a type problem in the next line (boolean divided by a number) (ok, that's still 0), and we are potentially passing false to size_format() which is documented to accept ints or strings only.

#125 follow-up: @desrosj
3 years ago

Just wanted to report that this has been running without issue on my site since committing.

One thing I did notice is that the temp-backup directory remains in upgrade after an update completes. When a Core auto-update happens, the upgrades directory is completely emptied. Should the plugin/theme process just clean up the temp-backup directory?

This ticket was mentioned in Slack in #core-test by boniu91. View the logs.


3 years ago

#127 in reply to: ↑ 125 @aristath
3 years ago

Replying to desrosj:

Just wanted to report that this has been running without issue on my site since committing.

That's great to hear, thank you for verifying!

One thing I did notice is that the temp-backup directory remains in upgrade after an update completes. When a Core auto-update happens, the upgrades directory is completely emptied. Should the plugin/theme process just clean up the temp-backup directory?

The PR added an exception for the temp-backup folder so when an update runs, that folder does not get deleted (see https://core.trac.wordpress.org/browser/trunk/src/wp-admin/includes/class-wp-upgrader.php?rev=51815#L337). This was done to ensure that in case multiple updates run at the same time, the backups won't get deleted before the update finishes running. This also decreases the effect of #53705: The downloaded/to-be-installed files get deleted, but the backup remains and is restored on failure so there are no errors.

Last edited 3 years ago by SergeyBiryukov (previous) (diff)

#128 @peterwilsoncc
3 years ago

  • Keywords needs-patch needs-unit-tests added; has-patch removed

I'm seeing some bugs following [51815] on my local development environment:

  • Dev environment: Chassis
  • Config file for Chassis at https://gist.github.com/peterwilsoncc/53c197cb5ef3e3dd9d649fb1d2069d0f
  • Replaced default wp directory with a checkout of the build server
  • Runs in sub-directory mode (ie, site url: wp-build.local/wp, home url: wp-build.local, content: wp-build.local/content)
  • Plugins installed pre test: User Switching (latest), Query Monitor (latest), Gutenberg (old, 11.1.0)

When updating plugins that contain sub-directories (such as Gutenberg), the top level of the plugin extracts but none of the sub-directories contain files. See See Screen Shot 2021-09-23 at 10.55.44 am.png.

I've tested with various workflows to update the plugins, most cause a fatal error after the update fails. The results of the various workflows are:

Plugin screen

  • view version details, click install: Fails, stays active and triggers critical error
  • update now link: Fails, stays active and triggers critical error
  • checkbox, select update, apply: Fails, stays active and triggers critical error

Update screen:

  • checkbox, click update plugins: fails, stays active and triggers critical error
  • view version details, click install: fails, deactivates.

I've asked some other contributors to test on their local environments but the result is consistent for me with multiple plugins that contain a sub-directory structure using Chassis.

There are some plugin tests with hello dolly but it would be good if something a little more complex could be tested against. It's probably a poor choice to include all of Gutenberg in the tests but something with a slightly more complex structure than dolly would be helpful.

This ticket was mentioned in Slack in #core-auto-updates by peterwilsoncc. View the logs.


3 years ago

#130 @galbaras
3 years ago

@peterwilsoncc Any errors logged during the update? Is it possible debug/trace to catch where things go wrong?

It's interesting that the "view version details" results are different for the two screens. Shouldn't be doing exactly the same thing?

Can you test with a different hierarchical plugin, just to rule out some Gutenberg-specific issue?

Last edited 3 years ago by galbaras (previous) (diff)

#131 @afragen
3 years ago

@peterwilsoncc I tested locally using Akismet. I updated from plugins page using the view details iframe and using the shiny updates. I also tested from updates page and and tested from view details iframe and using checkbox.

Everything worked.

I wonder if something in your dev environment is causing an issue. Have you tried in the WP docker test environment?

#132 @peterwilsoncc
3 years ago

@galbaras No errors or notices during download, only the fatal error when the update procedure attempts to reactivate the plugin and the file doesn't exist.

@afragen Akismet worked for me too using Chassis. It's much smaller than Gutenberg so maybe something is timing out for larger plugins. I'll test with the following:

  • Small plugin but several sub-directories and nested sub-directories
  • Large plugin with a flat file structure

I just need to find some examples of each.

#133 @peterwilsoncc
3 years ago

@afragen As mentioned via Slack, a correction on the Akismet update.

The sub-directories remain empty but Akismet uses include rather than require so doesn't fatal, it only throws warnings.

#134 follow-up: @noisysocks
3 years ago

I updated Gutenberg via the Dashboard → Updates → View version details modal and, while the upgrade worked, there were some warnings. See upgrade-warnings.png. I'm running a semi-recent trunk version of WordPress in Apache on macOS.

#135 @aristath
3 years ago

For reference, the issue reported above now has a new ticket on trac: #54166

Last edited 3 years ago by SergeyBiryukov (previous) (diff)

#136 in reply to: ↑ 134 @SergeyBiryukov
3 years ago

Replying to noisysocks:

I updated Gutenberg via the Dashboard → Updates → View version details modal and, while the upgrade worked, there were some warnings. See upgrade-warnings.png. I'm running a semi-recent trunk version of WordPress in Apache on macOS.

Just noting that I'm also getting these warnings with the current Gutenberg version, but they appear to be unrelated to this ticket. I think they are being tracked in #53565.

#137 @aristath
3 years ago

I tested reverting [51815] and I'm still getting a timeout on my local installation, so I'm not sure the errors reported above are the result of this patch... @peterwilsoncc did you try reverting the patch and see if that fixes your issue? Maybe we should be looking elsewhere for the cause? 🤔

#138 follow-ups: @peterwilsoncc
3 years ago

@aristath I did a git bisect on a copy of the built version of WP and was getting the timeout with a checkout of [51815]'s equivalent (dcca9323 on git's built version) but not with [51814]'s equivalent (a13f7b31 on git's built version).

I just retested on the built version with [51815]'s equivalent dcca9323 reverted and wasn't getting any timeouts or fatal errors. I tested with gutenberg, akismet and wpforms-lite. I reconfirmed the timeouts & fatals with the most recent commit, [51876].

As with earlier testing, I was using the "Dashboard > Updates > view version ### details > install update now" workflow to update the plugins. My local environment is using the WP_Filesystem_Direct.

#139 in reply to: ↑ 138 @pbiron
3 years ago

@peterwilsoncc Here are few plugins that have been known to cause timeout errors because of their size are (nothing against these plugins, it's just that they are known to have large downloads which can sometimes result in timeouts):

  1. Woo (https://wordpress.org/plugins/woocommerce/)
  2. Yoast SEO (https://wordpress.org/plugins/wordpress-seo/)
  3. Mail Poet (https://wordpress.org/plugins/mailpoet/)

Can you try one (or all) of those without [51815] applied?

#140 @peterwilsoncc
3 years ago

@pbiron Thanks for the additional plugins to check Paul.

I've tested with each of those plugins on the built version, I also threw Jetpack in to the mix as another big plugin.

With [51815] reverted (dcca93232b on the git built version) all update and reactivate successfully.


With the latest commit [51876] all fail to update correctly:

  • WooCommerce: Failure reported on upgrade screen due to fatal error
  • Yoast SEO: False success reported on upgrade screen (including reactivation) but actually fails to reactivate
  • MailPoet: False success reported on upgrade screen (including reactivation) but actually fails to reactivate
  • Jetpack: Successfully reactivates but installation incomplete

It appears Jetpack, Yoast SEO and MailPoet have error catching routines that gracefully manage errors if the plugin update is incomplete.

Jetpack notice following update: "Your installation of Jetpack is incomplete. If you installed Jetpack from GitHub, please refer to this document to set up your development environment. Jetpack must have Composer dependencies installed and built via the build command."

MailPoet notice upon attempting to reactivate manually: "MailPoet cannot start because it is missing core files. Please reinstall the plugin."

Yoast SEO notice upon attempting to reactivate manually: "Activation failed: The Yoast SEO plugin installation is incomplete. Please refer to installation instructions."


Doing timing checks with Yoast SEO, there's no significant difference between how long the requests take to complete with the reverted vs unreverted versions.

When testing, there was an existing backup in the upgrade/temp-backup/plugins/wordpress-seo directory however, it appeared to be incomplete. The root directories files were not included. See Screen Shot 2021-09-30 at 10.34.40 am.png.

#141 @galbaras
3 years ago

@peterwilsoncc I'm guessing no errors shown anywhere in your testing. I haven't looked closely at the code, but this possibly points to a need for more detailed capture of information, at least when there are failures.

I see that WP_Error objects are being created by the new functions and returned to their calling functions, but I can't see that the errors are output anywhere. Am I missing something?

#142 @peterwilsoncc
3 years ago

That's correct, no errors are logged during the update process. Until reactivation, success is reported for each step for the plugins that fatal/are incomplete.

By slug:

  • woocommerce, gutenberg, wpforms-lite: fatal errors logged upon reactivation attempt only. Source is the plugin's require of files.
  • akisment: warnings logged upon reactivation and subsequent page loads (due to the use of include rather than require, the update is incomplete)
  • jetpack, wordpress-seo, mailpoet: plugin's error handling gracefully recovers, showing admin notice

#143 in reply to: ↑ 138 @aristath
3 years ago

Replying to peterwilsoncc:

@aristath I did a git bisect on a copy of the built version of WP and was getting the timeout with a checkout of [51815]'s equivalent (dcca9323 on git's built version) but not with [51814]'s equivalent (a13f7b31 on git's built version).

I just retested on the built version with [51815]'s equivalent dcca9323 reverted and wasn't getting any timeouts or fatal errors. I tested with gutenberg, akismet and wpforms-lite. I reconfirmed the timeouts & fatals with the most recent commit, [51876].

Using git-bisect I'm also getting the same results, but here's the weirdness I currently see on my localhost:

  • If I rollback to before this commit, updates succeed.
  • With this commit, updates fail
  • If I pull the latest changes from master and then revert this commit, updates fail.

That last bullet-point is what makes me believe this regression is not 100% caused by the commit on this ticket, and there's maybe more things to consider. If this commit was to blame, then reverting it on the current develop branch should fix it, and it does not. 🤔

#144 @afragen
3 years ago

I saw similar timeout issues in large plugins in the WP docker test env. I was able to solve this by adding rename() as the initial attempt in copy_dir().

I shared a testing solution with @peterwilsoncc

Perhaps it’s time to update how we move downloaded plugins from the download folder to the plugins folder.

Last edited 3 years ago by afragen (previous) (diff)

This ticket was mentioned in PR #1725 on WordPress/wordpress-develop by afragen.


3 years ago
#145

  • Keywords has-patch added; needs-patch removed

Updates copy_dir() to try a rename() before attempting to copy file by file. On large plugins with many files a file copy will result in a timeout on a slow system. The patch should revert to a file copy if the rename() fails.

I was able to timeout the installation of WooCommerce, Yoast SEO, and MailPoet on the Docker WP test env. With this patch it installs fine.

Trac ticket: https://core.trac.wordpress.org/ticket/51857

afragen commented on PR #1725:


3 years ago
#146

After some discussion with @aristath I think a new PR with a move_dir function and leave copy_dir intact.

Will close this PR once that one up.

#147 @afragen
3 years ago

Closing PR 1725 in favor or PR 1727

Related: #54166

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 years ago

#151 @afragen
3 years ago

  • Keywords needs-testing added

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

#153 @SergeyBiryukov
3 years ago

In 51898:

Site Health: Use an integer value as a fallback in the available disk space check.

This avoids a few type problems further in the code: boolean divided by a number, or passing false to size_format() which is documented to accept integers or strings only.

Follow-up to [51815].

Props TobiasBg.
See #51857.

#154 @SergeyBiryukov
3 years ago

In 51899:

Upgrade/Install: Introduce a move_dir() function.

This replaces the copy_dir() usage in WP_Upgrader::install_package() and aims to avoid PHP timeout issues when installing or updating large plugins on slower systems like Vagrant or the WP Docker test environment.

The new function attempts a native PHP rename() function first and falls back to the previous copy_dir().

Follow-up to [51815], [51898].

Props afragen, aristath, peterwilsoncc, galbaras, noisysocks, pbiron.
Fixes #54166. See #51857.

#155 follow-up: @johnbillion
3 years ago

  • Keywords needs-patch added; has-patch removed

There are some problems with the delete_temp_updater_backups cron event added in [51815]. It's getting scheduled multiple times and its action is not present when the event runs.

Ref: https://github.com/WordPress/wordpress-develop/blob/ad9de4547cdba36567eb609e716dcd4aac4e3e38/src/wp-admin/includes/class-wp-upgrader.php#L153-L154

  • Inside schedule_temp_backup_cleanup() we have a call to wp_schedule_event() which schedules the delete_temp_updater_backups event, and a call to add_action() for this event. The problem is the schedule_temp_backup_cleanup() method is only called when an upgrader class is initialised with init() which only happens when an actual upgrade process runs. This means the delete_temp_updater_backups event does not have an action associated with it when the cron event runs. The add_action() needs to move into default-filters.php.
  • Each time an upgrader class is initialised by calling its init() method it will schedule an additional weekly event for the delete_temp_updater_backups hook. This can be seen by performing a few updates, such as for plugins, themes, or translations, and then using a cron management tool such as WP Crontrol or wp cron event list in WP-CLI to inspect the list of cron events. An extra weekly event for delete_temp_updater_backups will be scheduled each time. The schedule_temp_backup_cleanup() method needs to only schedule this event if one doesn't already exist.

#156 @johnbillion
3 years ago

Also I recommend that the delete_temp_updater_backups event gets a wp_ prefix on its name.

#157 @SergeyBiryukov
3 years ago

In 51951:

Coding Standards: Correct alignment in WP_Site_Health::get_test_update_temp_backup_writable().

This fixes an Equals sign not aligned with surrounding assignments; expected 1 space but found 6 spaces WPCS warning.

Follow-up to [51815].

See #51857, #53359.

This ticket was mentioned in Slack in #core-test by justinahinon. View the logs.


3 years ago

This ticket was mentioned in Slack in #core-auto-updates by peterwilsoncc. View the logs.


3 years ago

#160 follow-up: @peterwilsoncc
3 years ago

As I mentioned in Slack, I think it's best if these changes are reverted as my continued experience is that they are making updates more likely to fail.

It's always in the same manner as described in my original comment in September.

Given this, I've come to the conclusion that the following commits need to be reverted and the reliability improvements added in a subsequent release:

@SergeyBiryukov as you committed this, can you please comment with either a verification or amendment to this list

As there is only one chance to get updater fixes right, I'd like to see this reworked and tested for the next major rather than risk breaking the feature.

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

#162 in reply to: ↑ 160 @SergeyBiryukov
3 years ago

Replying to peterwilsoncc:

Given this, I've come to the conclusion that the following commits need to be reverted and the reliability improvements added in a subsequent release:

@SergeyBiryukov as you committed this, can you please comment with either a verification or amendment to this list

The list of changesets looks correct to me. Would it be possible to continue investigating this during beta, and revert if there's still no solution at a later stage? Regrouping and trying again early next cycle seems like a safer option though.

This ticket was mentioned in Slack in #core-auto-updates by peterwilsoncc. View the logs.


3 years ago

#164 in reply to: ↑ 155 @pbiron
3 years ago

  • Keywords has-patch added; needs-patch removed

Replying to johnbillion:

There are some problems with the delete_temp_updater_backups cron event added in [51815]. It's getting scheduled multiple times and its action is not present when the event runs.

Thanx @johnbillion. I confirmed that the problems you reported are real.

51857-cron-fix.diff addresses them. However, instead of adding the action in wp-includes/default-filters.php, I added it in wp-includes/update.php and also move the WP_Upgrader::delete_all_temp_backups() method to a standalone wp_delete_all_temp_backups() function in wp-includes/update.php, to avoid having to change that method to be static and doing a require_once ABSPATH . 'wp-admin/includes/class-wp-upgrader.php'; on every request.

I'm not 100% sure on that new function name. As a standalone function maybe it should be wp_delete_all_update_temp_backups()??? Opinions?

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 years ago

#166 @pbiron
3 years ago

@SergeyBiryukov Any chance of getting 51857-cron-fix.diff committed beofre 5.9-beta1 is released today?

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by pbiron. View the logs.


3 years ago

#169 @SergeyBiryukov
3 years ago

In 52192:

Upgrade/Install: Correct the weekly cron event for clearing the temp-backup directory:

  • Make sure the wp_delete_temp_updater_backups event has an action associated with it when it runs.
  • Check if the cron event already exists before scheduling it, to avoid scheduling duplicate events.
  • Move the code for clearing the temp-backup directory to a standalone function.

Follow-up to [51815], [51898], [51899].

Props pbiron, johnbillion.
See #51857.

#170 follow-up: @galbaras
3 years ago

Possible long shot: Is it possible that @peterwilsoncc is experiencing the issue from https://core.trac.wordpress.org/ticket/53705 (Plugin upgrade deletes files from other in-progress upgrades)?

#171 in reply to: ↑ 170 @peterwilsoncc
3 years ago

Replying to galbaras:

Possible long shot: Is it possible that @peterwilsoncc is experiencing the issue from #53705 (Plugin upgrade deletes files from other in-progress upgrades)?

No, sorry. I saw that come up but it's unrelated to my issue as I've been updating plugins one at a time while testing this ticket.

#172 follow-up: @dlh
3 years ago

After [51815], a database error occurs when installing WordPress from wp-admin/install.php because there's an attempt to write the cron option.

A ! wp_installing() check inside \WP_Upgrader::schedule_temp_backup_cleanup() might be all that's needed, but I'm not familiar enough with the changes in this ticket to be sure whether it's the right approach.

@pbiron
3 years ago

#173 in reply to: ↑ 172 @pbiron
3 years ago

Replying to dlh:

After [51815], a database error occurs when installing WordPress from wp-admin/install.php because there's an attempt to write the cron option.

A ! wp_installing() check inside \WP_Upgrader::schedule_temp_backup_cleanup() might be all that's needed, but I'm not familiar enough with the changes in this ticket to be sure whether it's the right approach.

Good catch @dlh .

51857.7.diff fixes the problem.

#174 @SergeyBiryukov
2 years ago

In 52284:

Upgrade/Install: Check that WordPress is installed before scheduling cleanup of the temp-backup directory.

Trying to schedule cron jobs before WordPress is installed results in DB errors, which is suboptimal.

This addresses a Table 'wp_options' doesn't exist error when running the installation with WP_DEBUG enabled.

Follow-up to [51815], [51898], [51899], [51902], [52192].

Props dlh, pbiron.
See #51857.

#175 @SergeyBiryukov
2 years ago

In 52289:

Upgrade/Install: Make some adjustments to the move_dir() function:

  • Check for direct PHP flle access and only use rename() if true.
  • Check whether the destination directory was successfully created.
  • Clear the working directory so there is internal parity within the function between the results of a successful rename() and a fallback to copy_dir().
  • Use move_dir() in WP_Upgrader::move_to_temp_backup_dir() and ::restore_temp_backup().

Follow-up to [51815], [51898], [51899], [51902], [52192], [52284].

Props afragen, peterwilsoncc, dd32, SergeyBiryukov.
See #54166, #51857.

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


2 years ago

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


2 years ago

#180 @hellofromTonya
2 years ago

  • Milestone changed from 5.9 to 6.0

The Upgrade/Install team working on #54166, #54351, and this ticket decided to revert all of the changesets and move the work to early 6.0.

Why? An unidentified problem exists on virtualized systems using Vagrant and/or VirtualBox (with tools such as VVV, chassis, and potential others).

Once the 5.9 is branched and 6.0-alpha in trunk is open for business, the changesets will be recommitted to restore the work for continued investigation, refinement, and extensive testing.

This ticket was mentioned in Slack in #core-auto-updates by peterwilsoncc. View the logs.


2 years ago

#182 @peterwilsoncc
2 years ago

In 52351:

Upgrade/install: Revert upgrader rollback features.

Revert the rollback features introduced for theme and plugin upgrades during the WordPress 5.9 cycle. A bug (suspected to be in third party virtualisation software) causes the upgrades to fail consistently on some set ups. The revert is to allow contributors further time to investigate mitigation options.

Reverts [52337], [52289], [52284], [51951], [52192], [51902], [51899], [51898], [51815].

Props pbiron, dlh, peterwilsoncc, galbaras, SergeyBiryukov, afragen, costdev, bronsonquick, aristath, noisysocks, desrosj, TobiasBg, hellofromTonya, francina, Boniu91.
See #54543, #54166, #51857.

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


2 years ago

This ticket was mentioned in PR #2225 on WordPress/wordpress-develop by afragen.


2 years ago
#185

Adds back PRs for rollback that were reverted from 5.9 release.

Updates to fix issue with VirtualBox and updates move_dir() to return better values.

Invaluable help from @peterwilsoncc, @costdev, @bronsonquick, @aristath and many others

Trac tickets:
https://core.trac.wordpress.org/ticket/51857
https://core.trac.wordpress.org/ticket/54166
https://core.trac.wordpress.org/ticket/54543

afragen commented on PR #2225:


2 years ago
#186

Lots of testing with @costdev and @bronsonquick. Below is a summary.

In testing, VirtualBox was the only common denominator. We found the following issues with VB.
https://www.virtualbox.org/ticket/1393
https://www.virtualbox.org/ticket/17971

When we tested it seems the issue is entirely about VirtualBox it not working well with the rename(). In testing, skipping the rename() in move_dir() for VB eliminated the issue entirely as the fallback is copy_dir().

We thought the most reproducible method of determining whether or not VirtualBox was running, was for the user to set either a constant or an environmental variable.

A new method in WP_Filesystem_Direct to test for either the new constant ENV_VB or the environmental variable WP_ENV_VB and return a boolean would be the simplest. Obviously the variable names can be changed.

All testing was via the Rollback Update Failure feature plugin which contained the changes.

Pinging @SergeyBiryukov and @hellofromtonya

peterwilsoncc commented on PR #2225:


2 years ago
#187

A new method in WP_Filesystem_Direct to test for either the new constant ENV_VB or the environmental variable WP_ENV_VB and return a boolean would be the simplest.

If at all possible, my preference would be to try and determine if the install is running on VirtualBox without requiring a constant be defined. VVV, Chassis and other WP dedicated systems could add these, other VB based set ups may not know to add them.

Obviously the variable names can be changed.

I'd go with something more generic but naming things is hard so will ponder.

afragen commented on PR #2225:


2 years ago
#188

Part of the issue is there doesn't seem to be a simple test to determine if a system is running on VirtualBox. If we ever do figure that out we can always add it to the is_virtual_box() function.

This ticket was mentioned in Slack in #core by afragen. View the logs.


2 years ago

@pbiron
2 years ago

#190 @pbiron
2 years ago

For some reason, I'm not able to push changes to the latest PR (get a "permission denied" error). So, I'm uploading them here as patch.

Could @afragen or @SergeyBiryukov please update the PR accordingly.

The patch doesn't change the rollback functionality, it just fixes some things with the changes to site health in the PR:

  1. fixes minor HTML errors in 2 of the new site health tests
  2. adds WP_RUNTIME_ENVIRONMENT to the WordPress Constants section of the site health debug data screen

This ticket was mentioned in Slack in #core-auto-updates by pbiron. View the logs.


2 years ago

peterwilsoncc commented on PR #2225:


2 years ago
#192

I tested this with Chassis/Chassis and define( 'WP_RUNTIME_ENVIRONMENT', 'virtualbox' ); in my wp-config file.

Updating each of the plugins that were failing consistently during the 5.9 cycle was successful:

  • akismet
  • jetpack
  • mailpoet
  • woocommerce
  • wordpress-seo
  • wpforms-lite

It appears woocommerce failed and rolled-back successfully on my initial attempt (a bulk update from the plugins screen)

I'm heading to the vet's for a check up now, will answer any questions in the discussion/comments above this-afternoon.

peterwilsoncc commented on PR #2225:


2 years ago
#193

This is where I am on this at the moment.

General questions/concerns

  • Per above comment, it seems to work on VirtualBox with the constant defined in my wp-config file.
  • I'm still a little concerned for users that don't know the virtualization software in use, for example I learnt what my host uses an hour ago. Mitigation: make sure a post goes in make/hosting alerting hosts to the new variable.
  • increasing stability by creating more I/O operations remains a concern: it's counterintuitive but sometimes that's how fixes work

Blocker

  • there were some comments in the Slack channel last week about difficulty testing with filesystems other than WP_Filesystem_Direct. These difficulties need to be overcome and the results documented on the trac ticket.

This was new info to me last week and _fair enough_ too. Last cycle (5.9) we'd found a bug in the direct filesystem and it took those of us testing and working on this ticket weeks to figure that out. I'm sure you all remember ;)

As a bug has been found in the quickest filesystem, the slower filesystems need to be checked too.

afragen commented on PR #2225:


2 years ago
#194

Seems like all the other filesystems use the $wp_filesystem->move() linked to a rename of sorts. As none of them have a fallback we could just use $wp_filesystem->move() as first choice for all non-direct filesystems. Then we'd still have copy_dir() as a fallback.

afragen commented on PR #2225:


2 years ago
#195

We could always look at ☝️ for another PR after this is committed.

afragen commented on PR #2225:


2 years ago
#196

Thoughts on the above, allowing non-direct filesystems to use $wp_filesystem->move() as first choice and then copy_dir() as fallback?

Before this PR everything uses copy_dir(). Rollback uses move_dir() with has rename as first try and fallback to copy_dir(). Adding the following should give that same option to non-direct filesystems.

{{{php
if ( 'direct' !== $wp_filesystem->method() ) {

$wp_filesystem->move( $from, $to );

}
}}}

The non-direct filesystems use some other PHP variant of rename that returns true/false for success/failure. The do no have a fallback within the ->move() so if they fail they would use copy_dir()

@costdev @aristath @pbiron @peterwilsoncc @SergeyBiryukov what do y'all think?

afragen commented on PR #2225:


2 years ago
#197

@hellofromtonya I believe I have fixed all the anonymous function as callback issues as well as several of the other ones. Can you please re-review to ensure I did it correctly. Thanks.

This ticket was mentioned in Slack in #core by costdev. View the logs.


2 years ago

#199 follow-up: @costdev
2 years ago

  • Keywords dev-feedback added

Per the discussion in the bug scrub, I'm adding the dev-feedback keyword to get as much input on this as possible. Take a look at the PR and the comments and give your thoughts to help us land this in 6.0!

This touches several other tickets: #54166, #54543 and #54351. As such, this would greatly benefit from additional testing, particularly on different filesystems.

@pbiron has extensively tested in WP_Filesystem_ftpext, and quite a few have tested with WP_Filesystem_Direct on Windows, Mac, Linux, VirtualBox, Hyper-V etc.

Additional testing on WP_Filesystem_ftpsockets and WP_Filesystem_ssh2 to ensure that subtle differences don't cause any issues would be great!

#200 in reply to: ↑ 199 @pbiron
2 years ago

Replying to costdev:

@pbiron has extensively tested in WP_Filesystem_ftpext, and quite a few have tested with WP_Filesystem_Direct on Windows, Mac, Linux, VirtualBox, Hyper-V etc.

Additional testing on WP_Filesystem_ftpsockets and WP_Filesystem_ssh2 to ensure that subtle differences don't cause any issues would be great!

I'll be able to do testing on WP_Filesystem_ftpsockets and WP_Filesystem_ssh2 this weekend.

This ticket was mentioned in Slack in #core by afragen. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by costdev. View the logs.


2 years ago

This ticket was mentioned in Slack in #hosting-community by afragen. View the logs.


2 years ago

#204 @afragen
2 years ago

  • Keywords needs-dev-note added

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


2 years ago

#206 @peterwilsoncc
2 years ago

  • Milestone changed from 6.0 to Future Release

I'm bumping this from the 6.0 milestone as I don't think the PR close enough to ready for a ticket marked early.

There are a few architectural discussions taking place on the pull request. These ought to be moved here and discussed without the presence of code.

Once these have been decided, the needs-unit-tests keyword will need to be addressed: there are a few plugins in the tests/phpunit/data directory but you may need to add an "upgrade".

If unit tests prove difficult, you may be able to use them in combination with end-to-end tests.

#207 @afragen
2 years ago

We are currently working on another approach that is a combination of a filter and some file checks to that should identify VirtualBox.

Unfortunately there is no perfect coded solution.

SergeyBiryukov commented on PR #2225:


2 years ago
#208

I'm curious about 47933c8, what kind of issues does that solve?

I'm not sure we should remove the cleanup part. Otherwise, in the case of using the copy_dir() fallback, the function behavior does not match the documented description of _moving_ a directory and only does the _copying_. This was what r52289 tried to fix:

Clear the working directory so there is internal parity within the function between the results of a successful rename() and a fallback to copy_dir().

I was thinking that maybe a similar fragment in WP_Upgrader::install_package() could be removed instead, but it runs on a slightly different input: $remote_source vs. $source. The difference is that the latter also contains the plugin directory:

$source:
/wp-content/upgrade/performance-lab.1.0.0-rc.1/performance-lab/
$remote_source:
/wp-content/upgrade/performance-lab.1.0.0-rc.1

So it seems fine to keep both fragments, the one in move_dir() removes the plugin directory, and the one in WP_Upgrader::install_package() removes the package directory after that.

afragen commented on PR #2225:


2 years ago
#209

@SergeyBiryukov the clean up part still happens as it's in the functions after move_dir() is called. I found an issue where I was using move_dir() in another case where I had to move downloaded files into a cleanly named directory so the rename of the plugin would be correct.

move_dir() was essentially getting called twice and the second time was failing as the source was deleted. I remember initially we had discussed that removing the source inside of move_dir() might be an efficient thing to do. I found an edge case.

afragen commented on PR #2225:


2 years ago
#210

Somewhere along the way r52289 was reverted in d19c3fd10b70855e94f496d338b1f37c4045180c

afragen commented on PR #2225:


2 years ago
#211

In trying to be more clear. I have an expanded download of plugin files in wp-content/updates/<large hash>/*. I'm moving them into wp-content/updates/<large hash>/my-slug/*

In this instance clearing the source also clears the destination.

afragen commented on PR #2225:


2 years ago
#212

@SergeyBiryukov with the removal of the $working_dir parameter for move_dir() in https://github.com/WordPress/wordpress-develop/commit/d19c3fd10b70855e94f496d338b1f37c4045180c and it's replacement with a delete of the $source, this makes the function a bit less useful in situations where the fallback, copy_dir(), copies into a folder inside $source. With the $working_dir parameter you could leave this unset and therefore not delete everything you just copied.

This is more an issue in that copy_dir() isn't equivalent to rename(). While perfectly fine in this usage it fails in other usages. Having the ability to not delete the $working_dir would help make move_dir() more flexible in other use cases.

Does that make sense?

afragen commented on PR #2225:


2 years ago
#213

@SergeyBiryukov I added back the $working_dir parameter and clearing as it is more versatile allowing for the use of move_dir() in places that deleting the $working_dir prematurely causes errors. Yes I have such an edge case. I'm also not certain that deleting the rollbacks inside of move_dir() is a good idea.

afragen commented on PR #2225:


2 years ago
#214

If we can delete the rollback temp files/folders in move_dir() then we can switch it back. Still testing that.

afragen commented on PR #2225:


2 years ago
#215

Yes, it's a very bad idea to delete the $working_dir in the Rollback code that saves and restores.

afragen commented on PR #2225:


2 years ago
#216

Since the above issue seems to be more an issue with copy_dir(), I have a solution.

The primary issue with copy_dir() is that it cannot copy a directory into a subdirectory of the source. What happens is at the beginning a $dirlist is created that creates an additional destination folder causing an endless loop of nested folders. This is somewhat of an edge case, but that doesn't mean we shouldn't accommodate it.

I'm really just saying let's use a more versatile fallback to move_dir() than copy_dir().

https://github.com/WordPress/rollback-update-failure/blob/develop/wp-admin/includes/file.php#L85-L128

#217 @afragen
2 years ago

  • Milestone changed from Future Release to 6.1

This ticket was mentioned in Slack in #hosting-community by afragen. View the logs.


2 years ago

afragen commented on PR #2225:


2 years ago
#219

I found a much simpler solution to the above issue for potential looping of folder creation. Just add a $skip_list item to copy_dir().

This ticket was mentioned in Slack in #core by costdev. View the logs.


2 years ago

costdev commented on PR #2225:


2 years ago
#221

To answer some potential questions about the is_virtualbox() function:

  1. Is this reliable for developers?
  • Developers will commonly be using Composer in their project. As such, developers using VirtualBox should set the Composer environment variable COMPOSER_RUNTIME_ENV to virtualbox as Composer suggests and as this comment confirms.
  • Alternatively, the is_virtualbox filter can be used to correctly identify whether the environment is using VirtualBox.
  1. What influenced the checks in this function?
  • Composer's own detection function.
  • Manual testing of file owners/groups for additional detection not included in Composer.
  1. Is there no other way?
  • We could add a constant and/or environment variable to WordPress. This is the most reliable method of detecting a virtualbox environment. Hosts would set it for hosting plans, developers would set it for local environments. However, this made committers uncomfortable. This function is the next best alternative.
  • We attempted to implement sleep() calls where appropriate, as has been done in other codebases. This wasn't fruitful, and other codebases have discussions where the sleep() isn't consistent. In addition, this would mean using sleep() in every environment.
  1. Can't we just use exec() to detect VirtualBox?
  • No. While a call to exec() would be safe as it would be immune from user input, we do not implement exec() if it's not necessary. Additionally, hosts can disable the exec() function using the disable_functions php.ini directive.
  1. Why can't we just use copy_dir() for all environments?
  • copy_dir() causes serious issues on low resource systems when copying larger plugins. move_dir() was implemented to specifically avoid this situation.
  1. Does it work?
  • Yes. We have tested this extensively.
  1. An inline comment says that detection using the vagrant username isn't reliable. Why are we doing it then?
  • This is included in Composer's own VirtualBox detection. This has not raised real-world reports that influenced its removal.
  1. What if we get reports of false positives/negatives?
  • The is_virtualbox filter can be used to correctly identify whether the environment is using VirtualBox.
  1. Will this be included in the devnote for Rollback?
  • Yes.

This ticket was mentioned in Slack in #hosting-community by afragen. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by sergey. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by costdev. View the logs.


2 years ago

#226 @costdev
2 years ago

  • Keywords commit added; needs-testing dev-feedback removed

This ticket was discussed in the recent bug scrub. Adding commit for committers to take a look at this one.

For committers, is it fine for tests to be added during the cycle?

This ticket was mentioned in Slack in #core by afragen. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by chaion07. View the logs.


2 years ago

peterwilsoncc commented on PR #2225:


2 years ago
#229

On a quick review, there are a number of suggestions on this ticket that have been marked resolved without adding the change/explaining why the change is not needed. Could explanations be added where appropriate.

I'm still not convinced this will make updates more reliable. Testing locally a Jetpack update took 1.00 minutes on trunk and 1.55 minutes on this branch. An updating taking 50% longer and doing more operations seems more likely to fail.

When bulk updating there's not additional feedback so it's a very long time to be looking at a screen that looks like this.

https://i0.wp.com/user-images.githubusercontent.com/519727/172310308-7b1cfad5-4560-4b6e-9e60-477dc4c685e5.png

I'm testing with this line in my local config file define( 'WP_RUNTIME_ENVIRONMENT', 'virtualbox' );

afragen commented on PR #2225:


2 years ago
#230

Peter, anyone using VirtualBox will not see any performance improvements because they will continue to use copy_dir() and not move_dir()/rename.

I left the comments unresolved when we were waiting for further discussion. Is there a specific question about a resolved conversation? Most were resolved in code or discussion.

costdev commented on PR #2225:


2 years ago
#231

Peter, the constant no longer exists and was replaced with the detection methods in is_virtualbox().

Even if not using a virtualbox environment, you can trigger the virtualbox fallback via:

add_filter( 'is_virtualbox', '__return_true' );

or by setting the COMPOSER_RUNTIME_ENV to virtualbox.

The speeds you're getting seem much longer than in my testing. Can you elaborate on your testing environment?

SergeyBiryukov commented on PR #2225:


2 years ago
#232

I'm still not convinced this will make updates more reliable. Testing locally a Jetpack update took 1.00 minutes on trunk and 1.55 minutes on this branch. An updating taking 50% longer and doing more operations seems more likely to fail.

Thanks for raising this concern! It was discussed in today's Upgrade/Install component meeting.

A large plugin update may indeed take longer in a virtualized environment due to creating a temporary backup with copy_dir() and deleting it after a successful update. However, creating a backup (and restoring it in case of failure) is precisely what should make the update more reliable, and on a real-world hosting environment it should be near-instant due to using the rename() native PHP function instead of a slower copy_dir().

Restoring happens on PHP shutdown, which means it is not affected by the `max_execution_time` limit, so in case the failure was due to a PHP timeout, it will still be able to properly restore the previous version.

When bulk updating there's not additional feedback so it's a very long time to be looking at a screen that looks like this.

This seems unrelated to the temp backups functionality, but rather to slower updates in general. I think it should be explored in a separate ticket, maybe #WP34676 (Optimizing bulk updates).

afragen commented on PR #2225:


2 years ago
#233

@peterwilsoncc I just updated the same 6 plugins in a bulk update using Local, PHP8, MySQL 8 and PR2225. Otherwise clean install, no plugins active.

It took 49 seconds for all of them.

When I set the filter to run like I was using VirtualBox the total time for bulk update of those 6 plugins was 2 min 14 sec.

I don't really have any idea what's going on with your testing environment but I can't even come close to replicating your results.

afragen commented on PR #2225:


2 years ago
#234

In testing on a remote server running PHP8, linux, nginx, MySQL clean install with PR2225.

Bulk update the above 6 plugins, ~15 seconds

Build update with VB filter set, 6 plugins, ~40 seconds

costdev commented on PR #2225:


2 years ago
#235

Tests on Chassis

Environment

  • Windows 10
  • Chassis
  • Provider: VirtualBox
  • VM:
    • Memory: 2048
    • Cores: 4
    • N.B. This setup is slow in general. Times are not reflective of non-VirtualBox virtualized environments, and may not be reflective of all VirtualBox/Chassis setups on Windows 10.
  • Plugins:
    • akismet
    • jetpack
    • mailpoet
    • woocommerce
    • wordpress-seo
    • wpforms-lite

Test 1

  • Without Rollback: 2:08
  • With Rollback: 3:54
  • Difference: 1:46 (+44.91%)

Test 2

  • Without Rollback: 2:10
  • With Rollback: 3:55
  • Difference: 1:45 (+44.68%)

Test 3

  • Without Rollback: 2:08
  • With Rollback: 3:56
  • Difference: 1:48 (+45.76%)

Test 4

  • With Rollback code in Core, but disabled for VirtualBox: 2:08
  • Difference: 0:01-0:02 (+0%)
  • This adds ! is_virtualbox() to 5 locations:
    • wp-includes\update.php Line 1037 (wrap this line in a conditional)
    • wp-admin\includes\class-wp-upgrader.php
      • All of these are the final condition in the if statement.
      • Line 155
      • Line 538
      • Line 864
      • Line 886
      • Other lines may be necessary, but wp-content/upgrade was clean after upgrading, and activating the plugins produced no errors.

costdev commented on PR #2225:


2 years ago
#236

IMO, we either:

  • Accept the 44-45% increase due to VirtualBox having to use copy_dir() twice in order to update the plugin + create the temp backup (note: this extends to three times when rolling back). OR
  • Add the conditions in Test 4 to only bring Rollback to non-VirtualBox environments.
    • Caveat(s): VirtualBox environments will still need to be identified in order to avoid running temp backups etc., so the same caveats still apply to the modified version in Test 4.

SergeyBiryukov commented on PR #2225:


2 years ago
#237

Add the conditions in Test 4 to only bring Rollback to non-VirtualBox environments.

I was also leaning towards disabling the temp backups/rollback functionality on VirtualBox. To make it a bit more flexible though, would it make sense to introduce a specific (maybe filterable) function for that, e.g. wp_plugin_theme_temp_backups_enabled(), encapsulating the is_virtualbox() check? I think that would be more clear than multiple is_virtualbox() checks.

afragen commented on PR #2225:


2 years ago
#238

If this is a way forward can we commit and work out how we skip Rollback for VirtualBox in a follow up ticket?

peterwilsoncc commented on PR #2225:


2 years ago
#239

I don't think skipping rollbacks on virtualbox is a good idea, I certainly wouldn't want it committed:

  • as a group working on the ticket, we only know of problems in virtualbox -- there may be issues elsewhere but we don't know what we don't know
  • it's not good architecture, it hides the problem rather than solving it
  • anytime there are forks in code like this it complicates support and diagnostics.

As a rule, the internet runs on virtual servers these days so if this issue is wider then there are consequences. Especially if it prevents sites from receiving a forced update due to a major security issue.

costdev commented on PR #2225:


2 years ago
#240

as a group working on the ticket, we only know of problems in virtualbox -- there may be issues elsewhere but we don't know what we don't know

As a rule, the internet runs on virtual servers these days so if this issue is wider then there are consequences. Especially if it prevents sites from receiving a forced update due to a major security issue.

There is no indication of an issue existing in any other virtualized environment. This was traced to a specific bug in virtualbox, and one that has gone unresolved for many years. This is even more evident when you look at Composer having alternative behaviour for only one virtualized environment - virtualbox. In addition, we have tested this in other virtualized environments like WSL2 (HyperV) and Docker, without experiencing the bug.

costdev commented on PR #2225:


2 years ago
#241

@peterwilsoncc do you have any ideas regarding improving rollback performance for virtualbox?

SergeyBiryukov commented on PR #2225:


2 years ago
#242

It looks like move_dir() is once again not clearing the source directory after a fallback to copy_dir(). I see how that might seem redundant, as that cleanup currently happens in WP_Upgrader::install_package(). However, as the function can be used in other places, I think it should have a predictable and consistent behavior of _moving_ a directory, whether or not the copy_dir() fallback is involved. I think this was addressed in 493a99a but appears to be reverted in 47933c8. I would love some more details on the reasoning for the revert: if clearing the source directory after copy_dir() in move_dir() is problematic, wouldn't the rename() call have the same issue, as it does not leave the source directory either?

do you have any ideas regarding improving rollback performance for virtualbox?

Could it be any faster if we go back to creating a zip archive of the current plugin or theme version, as in some earlier iterations of this project, instead of copying it?

As a rule, the internet runs on virtual servers these days so if this issue is wider then there are consequences. Especially if it prevents sites from receiving a forced update due to a major security issue.

I might be missing something, but how would this prevent a site from receiving an update? Is it just because an additional operation of creating a backup is performed?

#243 @audrasjb
2 years ago

  • Keywords commit removed

Removing commit workflow keyword as it appears there's still some discussion on the implementation :)

afragen commented on PR #2225:


2 years ago
#244

@SergeyBiryukov we're testing the re-addition of clearing the working directory now. There have been a number of updates since I last tested and so far in preliminary testing it appears to be working just fine.

Using zip archives seemed to be overly resource intensive for hosting..

SergeyBiryukov commented on PR #2225:


2 years ago
#245

My concern is that the extra operations cause the process to take longer. Each time a request gets longer, the more likely it is to time out. While ini_set() can be used to extend the PHP execution time, it doesn't affect the hard limit imposed by web servers for HTTP requests.

The lack of visual feedback also makes it more likely people will leave the update screen. When testing for visual feedback, I almost left the screen during a bulk update thinking some thing had crashed.

I see, thanks! That makes sense to me.

Another idea that might be worth considering is creating backups for installed plugins and themes in advance, so that it does not affect the update process. That would require more disk space but should address the performance concerns. I think this could be achieved by creating a copy in the temp-backups directory at the same time the plugin or theme is installed, but on PHP shutdown, so that it does not block the UI and is not affected by max_execution_time. Then, by the time the update runs, we already have a backup and don't need to perform extra operations. If the update fails, we restore the backup, also on PHP shutdown, like we already do in this PR.

pbiron commented on PR #2225:


2 years ago
#246

Another idea that might be worth considering is creating backups for installed plugins and themes in advance, so that it does not affect the update process. That would require more disk space

But the extra disk space is likely itself to cause many more failures. We only have anecdotal eveidence on this, but I believe that a common cause of failures is lack of disk space! That why we opened https://core.trac.wordpress.org/ticket/51928 a long time ago...but that ticket has stalled as well.

afragen commented on PR #2225:


2 years ago
#247

This also wasn’t really an issue when creating zip files as the backup but hosting said it would be a resource issue.

I can resurrect that branch?!

afragen commented on PR #2225:


2 years ago
#248

In previous testing is seems that running echo 2 > /proc/sys/vm/drop_caches reliably dropped the correct VirtualBox caches such that rename properly functions.

As the use of VirtualBox is a hosting issue, we propose the host set up a cronjob that runs a "watcher" that would run this task. We could further specify when to run by adding pre_move_dir and post_move_dir action hooks and the host would use that to set an environment variable to test for before dropping the caches.

In this way no special changes would be required by the user.

SergeyBiryukov commented on PR #2225:


2 years ago
#249

But the extra disk space is likely itself to cause many more failures. We only have anecdotal eveidence on this, but I believe that a common cause of failures is lack of disk space! That why we opened https://core.trac.wordpress.org/ticket/51928 a long time ago...but that ticket has stalled as well.

Good point! Slightly different proposal then: when installing a plugin or theme, either from the WP.org directory or from a zip archive, keep the archive in the temp-backups directory, instead of the extracted copy. If an update fails, restore the backup from that archive.

That might be a good balance between the extra disk space and the performance concerns. And let's revive that ticket to have some data :) I'll put it on my list to review.

costdev commented on PR #2225:


2 years ago
#250

But the extra disk space is likely itself to cause many more failures. We only have anecdotal eveidence on this, but I believe that a common cause of failures is lack of disk space! That why we opened https://core.trac.wordpress.org/ticket/51928 a long time ago...but that ticket has stalled as well.

Good point! Slightly different proposal then: when installing a plugin or theme, either from the WP.org directory or from a zip archive, keep the archive in the temp-backups directory, instead of the extracted copy. If an update fails, restore the backup from that archive (on PHP shutdown, like we already do in this PR). If an update succeeds, replace the archive with a newer one.

That might be a good balance between the extra disk space and the performance concerns. And let's revive that ticket to have some data :) I'll put it on my list to review.

To summarize, based on some comments above, it looks like creating a backup during an update can critically slow it down in some cases, so I think creating it earlier (by not deleting the package) and having it ready by the time the update runs, so that we don't need to perform any extra operations, would be important for this to move forward.

Problem case:

  • A new version of a plugin is released. This version includes a bug fix.
  • While waiting for this fix, the site owner/developer modified the plugin to add a stopgap fix.
  • A failed update would restore the original archive, without the stopgap fix.

While we highly discourage direct editing of plugin files, we know people still do it, and we'd be breaking it because of an approach we chose for a feature that can't be disabled.

afragen commented on PR #2225:


2 years ago
#251

summarize, based on some comments above, it looks like creating a backup during an update can critically slow it down in some cases, so I think creating it earlier (by not deleting the package) and having it ready by the time the update runs, so that we don't need to perform any extra operations, would be important for this to move forward.

What happens for premium plugins that may be installed/updated by a slightly different method than uploading a zip?

SergeyBiryukov commented on PR #2225:


2 years ago
#252

Problem case:

  • A new version of a plugin is released. This version includes a bug fix.
  • While waiting for this fix, the site owner/developer modified the plugin to add a stopgap fix.
  • A failed update would restore the original archive, without the stopgap fix.

While we highly discourage direct editing of plugin files, we know people still do it, and we'd be breaking it because of an approach we chose for a feature that can't be disabled.

Yeah, I was thinking about that possibility too. In that case, how about a daily cron task, which creates zip backups for all installed plugins and themes on PHP shutdown? I think it could check the last modification date of a plugin or theme's directory to only refresh the backup if something has changed. Was going to suggest that initially, but thought restoring from the downloaded zip might be simpler :)

costdev commented on PR #2225:


2 years ago
#253

We have been working on an alternative implementation that brings the resolution closer to where the bug occurs - VirtualBox - rather than putting the onus on WordPress Core to find an alternative even remotely comparable in performance to rename().

This involves providing the ability for VirtualBox-based environments to correct VirtualBox's filesystem cache at the exact point needed. It does not involve WordPress having to detect VirtualBox in any manner. In testing on Chassis, this implementation performs the upgrade in 58-66 seconds for the six plugins mentioned earlier in this PR, and tests are consistently successful.

We are in discussions with VVV and Chassis maintainers for feedback that will be coming back to us soon. More to follow.

tomjn commented on PR #2225:


2 years ago
#254

I would note that an MU plugin means that users would need to commit this MU plugin to their sites git repo, which would then mean the MU plugin would appear in their production sites.

I consider the MU plugin solution to be a worst case scenario and the most user hostile.

It also ignores that WordPress hosted on any filesystem with micro-caches will run into this issue, namely shared network hosts. Or that:

  • not all VVV's are VirtualBox
  • not all VirtualBox installs use the same shared mounting system

There are also a lot of forks of VVV that will never adopt this or be aware of this, and the proposed watcher assumes only a single site is in the environment, which is false by default on VVV.

tomjn commented on PR #2225:


2 years ago
#255

I think for now the option to disable rollbacks via a constant is the most user friendly and performant option, as what was suggested to me doesn't look feasible, and requires a very high level of technical expertise to implement that most users are incapable of performing.

I am also fine with VirtualBox performance not being as good since it's a local environment not a production environment, and plugin updates do not happen on every request.

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


2 years ago

afragen commented on PR #2225:


2 years ago
#257

@SergeyBiryukov @peterwilsoncc

The current PR removes all reference to VirtualBox-specific code. We have been working on creating a VirtualBox-specific solution that only requires the hooks, pre_move_dir and post_move_dir, currently in the PR.

It is highly unlikely that the PR will need any additional adjustments to accommodate these users.

This ticket was mentioned in Slack in #core-auto-updates by hellofromtonya. View the logs.


2 years ago

#259 @hellofromTonya
2 years ago

  • Keywords needs-testing added

Summary of today's Upgrade/Install component meeting

Where: in Making WordPress slack #core-auto-updates channel
Link: https://wordpress.slack.com/archives/CULBN711P/p1655229327510739

tl;dr

👉 More testing and feedback is needed from web hosts and extenders before this lands in Core. 👈

Why? To surface unknowns and raise confidence.

Discussion

  • Had trouble getting web hosts to test and give feedback

👉 Without this feedback, how do we know the PR won't impact them or their customers? 👈

  • Core first? Or feature plugin?

Previous thinking:
Get it into Core trunk first.
Then (with being in Core as a catalyst) broader feedback and validation will come.

Note: This thinking was an evolution due to troubles getting broader web hosting feedback (as noted above).

Current thinking:
Get broader feedback and testing first.
Then when ready, land it in Core.

  • Add temporary code into Core's Upgrader to invoke the feature plugin's move_dir().

Sergey shared there is precedence for doing this: [25590], [26865], [32625]. He also noted

Should probably be removed from the final release after branching if that doesn't happen earlier.

Action Items

  • Develop a testing and feedback strategy
    • Use the featured plugin for folks to test it
    • Make it easy for people to participate in testing and giving feedback
    • Define what kind of feedback is needed / wanted
    • Include outreach to web hosts and extenders
  • Need a PR that adds code to the Upgrader to invoke the feature plugin's move_dir() when the plugin is activated.
  • Once strategy is set, then publish a Make post to ask to ask for testing and feedback.
Last edited 2 years ago by hellofromTonya (previous) (diff)

This ticket was mentioned in Slack in #core-auto-updates by hellofromtonya. View the logs.


2 years ago

This ticket was mentioned in Slack in #hosting-community by pbiron. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by pbiron. View the logs.


2 years ago

#263 @SergeyBiryukov
2 years ago

In 53578:

Upgrade/Install: Add a conditional to facilitate testing of the Rollbacks feature project.

The Rollback Update Failure feature project creates a temporary backup of plugins and themes before updating. This aims to make the update process more reliable and ensure that if a plugin or theme update fails, the previous version can be safely restored.

If the Rollback Update Failure plugin is installed, WP_Upgrader::install_package() will use the move_dir() function from there for better performance. Instead of copying a directory from one location to another, it uses the rename() PHP function to speed up the process, which is instrumental in creating a temporary backup without a delay. If the renaming failed, it falls back to copy_dir() WP function.

This conditional aims to facilitate broader testing of the feature. It is temporary, until the plugin is merged into core.

Props afragen, pbiron, costdev, davidbaumwald, audrasjb, jrf, SergeyBiryukov.
Fixes #56057. See #51857, #54166.

This ticket was mentioned in Slack in #hosting-community by javier. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by chaion07. View the logs.


22 months ago

#266 @chaion07
22 months ago

Thanks @pbiron for reporting this. We reviewed this ticket during a recent bug-scrub session. Here's a summary of the feedback received from the team:

  1. It would be helpful to respond to the Call for testing, especially if you represent a hosting company
  2. Both Testing information and Patches are provided. However, unit tests may possibly be difficult to achieve as there currently aren't any for any of the Upgrader classes
  3. A separate ticket #54245 reported for WP_Upgrader unit tests as mentioned in one of the previous comments. However we are doubtful anything specific to this PR will happen before time to consider it being committed. We would also hope, as there are no unit tests for this class that it may not matter.

Cheers!

Props to: @afragen @pbiron @costdev @SergeyBiryukov @aristath @hellofromTonya @davidbaumwald @jrf @audrasjb @thisisyeasin

Last edited 22 months ago by costdev (previous) (diff)

#267 @afragen
22 months ago

  • Keywords needs-unit-tests removed

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


22 months ago

This ticket was mentioned in PR #3044 on WordPress/wordpress-develop by costdev.


22 months ago
#269

  • Keywords has-unit-tests added

This PR expands on PR 1754 to:

  • [x] bring PR 1754 in line with modern Core/Test standards including:
    • [x] improved test method descriptions.
    • [x] clearer test method names.
    • [x] data_ prefixes for data providers per Trac 53119.
    • x $message parameters per [Core Handbook: Writing PHPUnit Tests - Using Assertions].
    • [x] Line length limits for contributor a11y consideration.
    • x Filter callbacks changed to closures + static where appropriate per [Core Handbook: Writing PHPUnit Tests - One-off Functions for Hooks].
  • x adjust a refactor in [PR 1754] to remove a now unnecessary condition and restore a separated if to elseif.
  • x remove an unnecessary constant added to tests/phpunit/includes/bootstrap.php in [PR 1754].
  • x remove tests related to [the Rollback feature], which is not yet committed.
    • These have also been updated with the above and retained in a separate branch for a later time.
  • [x] add missing generic strings to WP_Upgrader::generic_strings() that caused test failures.
  • [x] add more tests/datasets.

Line coverage is now 58.04% with trunk.

I believe this is as far as I can go for now with unit tests due to the use of general functions and the need to mock WPDB to ensure the tests are pure unit tests. Otherwise, integration tests can be added to cover the areas not covered by this PR.

Trac ticket:

#270 @costdev
22 months ago

  • Keywords has-unit-tests removed

Oops! Please disregard PR 3044 - which was only intended to reference this ticket. The PR's description has been updated to remove the link to this ticket.

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


22 months ago

#272 @hellofromTonya
22 months ago

  • Keywords early removed
  • Milestone changed from 6.1 to Future Release

Feature Current State

Summary of discussions from the core-auto-updates and core slack meetings:

@peterwilsoncc and I moved this feature to a featured plugin. Why? Because there are concerns that are too risky to mitigate after shipping in a major.

In a featured plugin, it can gain feedback, faster iteration, and when ready, then be scheduled for a future major release.

Q: What's the bar for determining when it's ready?
A: When the concerns are removed.

Q: What will remove the concerns?
A: Active real-world testing across multiple web hosted platforms as well as feedback from web agencies whose development workflows use VirtualBox environments.

Concerns:

  • What are the unknown impacts when doing larger bulk updates when multiple failures occur, especially when resources are throttled on shared hosting?
  • What state does it leave a site in when failures happen?
  • What is the impact to agencies and developers who use VirtualBox powered local development workflows such as VVV and others?

@davidbaumwald raised additional concern:

Yeah, the infra that makes me question whether this is ready are network-based volumes and such where there's additional latency. Unless you have $$$, I doubt anyone has recreated that yet


Testing / Feedback Status:

I've been actively reaching out to web hosting companies to coordinate testing. I'm currently working with wp.com, Bluehost, GoDaddy, and WP Engine. I've reached to others too (awaiting feedback). One the goals is to test the plugin across 1000s of sites. It'll take time to set up these tests.

Some of the criteria I've asked of these hosts are:

  • Setting up an environment with at least 3 large plugins such as WooCommerce, Jetpack, BuddyPress, etc.
  • Benchmark bulk updates the large plugins:
    • before activating the feature
    • after activating the feature
    • then with the feature activated, force 1 failure in a large plugins
    • and then force multiple large plugins to fail during bulk update
  • Benchmarking data to capture for all of these scenarios:
    • Time: How long does a bulk update?
    • Resources: How much resource usage is consumed?

As the testing completes for each hosts, I'll add the results to the Call for Testing post on Make/Core.

Ticket Updates:

As this feature is in the featured plugin phase, I've moved it to Future Release. Once the concerns are mitigated (see above), then please move it into a milestone.

I've removed the early keyword as this is not necessary since it'll be battled tested through a featured plugin process.

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


22 months ago

#274 @TJNowell
22 months ago

Noting I have a lot of testing I was hoping to do regarding VVV and Virtualbox but my local internet situation has been a worst case scenario with an ETA of April now being an ETA of September. Without a reliable fixed home internet connection I can only test with my M1 Pro Macbook but as it's an Apple Silicon device running Parallels it isn't helpful for this ticket.

Changes I wanted to test included various guest OS based changes in hopes that at least for VB users a recommendation to mitigate the issue could be made at a platform level if rollbacks was something they wanted. Some of the unwanted filesystem micro-caching behaviour is implemented in VB Guest add-ons, and the vboxsf filesystem.

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


22 months ago

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


22 months ago

This ticket was mentioned in Slack in #core-auto-updates by hellofromtonya. View the logs.


21 months ago

#278 @ignatggeorgiev
21 months ago

As discussed on WCUS Contributor's day, I'm doing some testing on this and will report, if any issues arise.

tomjn commented on PR #2225:


20 months ago
#279

I'd just like to reiterate again that keeping things working is more important than keeping things fast. Do not sacrifice VirtualBox users with the ultimatum that they must implement some complex worker scheme to get things fast (a scheme that does not work with VMs that contain multiple WP installs).

Even if all the major Virtualbox local dev environments implemented this worker scheme, most users won't see it as they don't update, a lot will turn it off, and it won't touch the majority who forked or rolled their own private projects.

This will also be true if testing managed to produce an easy configuration fix for the virtualbox + vagrant combination. If only because there are lots of people who do not use vagrant to manage their virtualbox

This ticket was mentioned in Slack in #meta by costdev. View the logs.


20 months ago

This ticket was mentioned in Slack in #core-upgrade-install by afragen. View the logs.


20 months ago

This ticket was mentioned in Slack in #core-upgrade-install by hellofromtonya. View the logs.


20 months ago

This ticket was mentioned in Slack in #core-upgrade-install by afragen. View the logs.


19 months ago

This ticket was mentioned in Slack in #core by pbiron. View the logs.


19 months ago

@costdev commented on PR #2225:


19 months ago
#285

I'd just like to reiterate again that keeping things working is more important than keeping things fast.

To clarify: The upgrader currently uses a recursive copy. This can't work for Rollback as it caused timeout issues on low-resource systems. rename() doesn't produce this timeout, and is a more appropriate method for moving files. The speed is a happy side effect, but wasn't the reason for using it.

Do not sacrifice VirtualBox users with the ultimatum that they must implement some complex worker scheme to get things fast (a scheme that does not work with VMs that contain multiple WP installs).

  1. I think suggesting we're sacrificing VirtualBox users isn't representative of the work we've done in developing multiple solutions and landing on this VirtualBox-specific one. I've put more time into exploring VirtualBox solutions than I should have as a WordPress Core contributor. This bug in VirtualBox requires resolution in the box, not in WordPress, and should be considered the responsibility of VirtualBox contributors or, failing that, of VirtualBox environment maintainers/VirtualBox users who roll their own. I'm happy to discuss any and all alternatives that others have done the groundwork on to test and prove feasible.
  2. The proposed worker scheme requires pasting one block of commands once, followed by a command block for a Must-Use plugin.
    • I don't see this as complex when major VirtualBox local dev environments can handle this in provisioning, and for those who roll their own, the commands for this worker scheme aren't complex compared to rolling your own.
    • For major VirtualBox local dev environments, the Must-Use plugin can be automatically added to .gitignore during provisioning.
    • For users who roll their own, this can be manually added to their global .gitignore,
    • For both, the plugin could be symlinked to only have one copy of the plugin in the box that serves all WordPress installs.
  3. The worker looks for a /tmp/wpclearcache file, which is created by the Must-Use plugin. Every WordPress install in the box will create this file, and therefore the worker scheme supports multiple WordPress installs.

    Even if all the major Virtualbox local dev environments implemented this worker scheme, most users won't see it as they don't update, a lot will turn it off, and it won't touch the majority who forked or rolled their own private projects.

We plan to have the worker scheme publicised on wordpress.org (dev notes, installation requirements, etc), Twitter, WP Tavern, and such, to maximise its visibility.

This will also be true if testing managed to produce an easy configuration fix for the virtualbox + vagrant combination. If only because there are lots of people who do not use vagrant to manage their virtualbox

The worker scheme doesn't require vagrant, and to my knowledge, a configuration fix for the VirtualBox + Vagrant configuration is still theoretical - I don't believe any solution has been discovered yet.

@TJNowell commented on PR #2225:


19 months ago
#286

@costdev I already understood how it worked and what it did. Trying to explain it again or re-assure is just patronising, I do not appreciate the dismissive tone to the large and very real issues the proposal brings. It is not the silver bullet you think it is. I know that the experience when it is implemented is superior, but implementation is not trivial, and not always possible. Any illusions that you can write some instructions and share it on twitter/tavern/etc/etc need to be shattered ASAP.

I don't see this as complex when major VirtualBox local dev environments can handle this in provisioning

This is not true. Dropping the MU plugin into place is easy for brand new sites, but for everything else it's a major minefield and logistical nightmare. Even figuring out the location of the mu-plugins folder reliably for an arbitrary site is a major challenge.

This is assuming the user is running the latest and greatest unmodified, which is rarely the case.

and for those who roll their own, the commands for this worker scheme aren't complex compared to rolling your own.

A lot of VVV users are non-technical, and won't be aware of the mu-plugins folder, many aren't even aware it's called VVV. While there are many developers who use VVV that can handle this easily, many will need hand holding. We spent years installing VVV at contributor days, they won't recognise any VirtualBox announcements as relevant to them.

For major VirtualBox local dev environments, the Must-Use plugin can be automatically added to .gitignore during provisioning.

Assuming the previous challenges were overcome, modifying the .gitignore automatically would be a violation of trust and a gross betrayal of users. Local dev environments don't mess with client git repos like this. This on its own could be more controversial than anything that has been proposed so far.

For both, the plugin could be symlinked to only have one copy of the plugin in the box that serves all WordPress installs.

This actually runs head first into a major Windows issue, this symlinking would need to happen on the host which is a challenge in and of itself, and many Windows installs require system administrator privileges to create a symlink. Many users have no privilege to do this on their system at all due to company enforced group policies, making this impossible to do.

A copy for each site is therefore mandatory for a huge number of people, purely because symlinks won't be possible, even on non-VirtualBox systems.

---

At the moment there are no plans to implement the worker proposal for VVV, and if we did it still wouldn't reach most users. Do not proceed on the assumption that anybody with issues can just apply the worker proposal. Degraded performance is an acceptable outcome, even excruciatingly slow performance is a bitter but acceptable pill to swallow, but breakage is not.

Likewise if I found a magical config option that made the entire problem go away tomorrow and applied it to the next VVV version, you would still have to deal with the breakage. The timeline for implementing that could be years, with many users never discovering it.

@costdev commented on PR #2225:


19 months ago
#287

@tomjn > I already understood how it worked and what it did. Trying to explain it again or re-assure is just patronising, I do not appreciate the dismissive tone to the large and very real issues the proposal brings. 

You previously said:

keeping things working is more important than keeping things fast

As our approach isn't about speed - using a recursive copy didn't work, I thought it was appropriate to clarify this. My intention was not to be patronising, or to dismiss, but to clarify and discuss.

To your other points, I'd suggest others on the Rollback team and Core committers respond to these. I'm withdraw from the conversation so that the discussion can continue without animosity.

@TJNowell commented on PR #2225:


19 months ago
#288

At some point this PR worked on VirtualBox but it worked slowly, as long as that doesn't change then it's fine and there is no issue.

The problem is the VBox proposal changing from an enhancement, to a mandatory requirement. It cannot and must never be a mandatory requirement.

This ticket was mentioned in Slack in #core-upgrade-install by afragen. View the logs.


19 months ago

@peterwilsoncc commented on PR #2225:


19 months ago
#290

As I have said before, I share Tom's concerns about mandating developers on particular environments run additional code, add constants or do other work to workaround a known issue.

Furthermore, there isn't a precedent in WordPress that I am aware of in which certain users are required to run special code in order to fix an issue in Core. It's not a precedent I'd be willing to sign off on creating either. To be clear, it's not something I'd be willing to sign off on in my role as a committer.


I'm also concerned about performance impacts. During an upgrade sites go in to maintenance mode so the longer an update takes, the longer the site remains in maintenance mode. However, if the effect is likely to be on local environments only I may (emphasis on _may_) be convinced provided some feedback is provided to the user. Without feedback, it looks like the updates have failed.


I'm going to circle back to my long held concern on this feature: if feels like the definition of success is to get it committed to WordPress, rather than to get it working. I and other committers have said this a number of times but still we're having the same conversation.

To a very large extent, it is WordPress's role to work around this issue in VirtualBox. That has precedent with functions such as wp_parse_url(), wp_json_encode() and, arguably, wp_strip_all_tags(). Additional examples of WP working around different systems can be found by searching for PHP_VERSION_ID, work around and workaround` in the source code.

This ticket was mentioned in PR #3805 on WordPress/wordpress-develop by @afragen.


17 months ago
#291

copy_dir() is a recursive file copy for directories and is used when updating plugins, themes, and core.

There are several tickets related to timeouts and optimization of the update process. As part of Rollback #51857, a new function move_dir() and a new filter #56057 were introduced. This ticket is an improvement upon #56057.

Optimizing the plugin/theme update process can be solved using the move_dir() function and a filter similar to #56057.

#57357 was opened to add move_dir() to core. This ticket is to add the new filter.

The new filter would easily allow the substitution of move_dir for copy_dir adding more efficiency to the plugin/theme update process by having the user add a simple filter. This could improve the efficiency and performance for 99+% of users who opt-in and will likely fix #53832, #54166, and #34676.

{{{php
add_filter( 'upgrader_copy_directory', function($callback) {

return 'move_dir';

}, 10, 1 );
}}}

I believe the use cases are sufficient to merit inclusion to core, additionally both of these have been tested rigorously in the Rollback PR/feature plugin.

Trac ticket:

This ticket was mentioned in Slack in #core by peterwilsoncc. View the logs.


17 months ago

This ticket was mentioned in Slack in #core-upgrade-install by peterwilsoncc. View the logs.


17 months ago

@afragen commented on PR #3805:


17 months ago
#294

PR now using switch filter.

{{{php
if ( ! apply_filters( 'upgrader_use_move_dir', false, ... ) {

copy_dir();

} else {

move_dir();

}
}}}

#295 @peterwilsoncc
16 months ago

Testing Notes

Environment

Plugin slugs (version being prior to update)

  • akismet 4.1.10
  • gutenberg 14.8.3
  • jetpack 11.3.2
  • mailpoet 4.0.1
  • woocommerce 7.0.0
  • wordpress-seo 19.7
  • wpforms-lite 1.7.0

Succeeded with each of the following methods

  • Update action within each item on the plugin screen's list table
  • Select each plugin and use bulk update action in the plugin screen's header
  • Select add and bulk update on Dashboard > Updates screen

Time

I only ran timing tests using the bulk update action on the Dashboards > Updates screen, it seems to be a great improvement:

  • With rollbacks and faster updates: 3m 34s
  • WordPress current system: 5m 50s

In both cases this is a long time without visual feedback but that may be a problem for another day.


I'd like to see some testing with VirtualBox 7 but don't really want to update myself just yet.

It would also be good to test without plugin dependencies installed (it's currently a dependency of the rollback plugin to force the faster-updates dependency).

This ticket was mentioned in Slack in #core-upgrade-install by peterwilsoncc. View the logs.


16 months ago

@afragen commented on PR #3805:


16 months ago
#297

Closing in favor of PR3911

#298 @richards1052
16 months ago

I wanted to report that the latest auto-update of Rollback failed. I updated it manually. But the reason I installed it originally was because Mailpoet was failing to update and deactivating itself. This week, for the first time since I installed Rollback, Mailpoet again decactivated itself.

I don't know why this is happening. But as I'm the OP who first reported the issue here, I wanted the folks concerned about, or working on this to know about the issue.

Last edited 16 months ago by richards1052 (previous) (diff)

#299 @afragen
16 months ago

@richards1052 I’ve responded on the forum issue you created.

@afragen commented on PR #860:


16 months ago
#300

Reference for how to rollback with zip files.

@afragen commented on PR #2225:


16 months ago
#301

I'm removing the auto-update functionality to focus on the core Rollback feature. I will create a new ticket for the auto-update feature.

#302 @afragen
16 months ago

A big hurdle has been overcome due to #57375 and #57557 having been committed.

The Rollback Update Failure plugin has been updated to take advantage of these tickets having been merged. I have modified it so that the plugin now requires WP 6.2-beta1.

Also, if you were testing the Rollback Update Failure plugin and needed the Faster Updates plugin, this is no longer a requirement and the Faster Updates plugin may be deleted.

This ticket was mentioned in Slack in #core-upgrade-install by afragen. View the logs.


14 months ago

#304 @azaozz
13 months ago

Following up on the discussions in Slack:

The recently added move_dir() is a lot faster than copy_dir(), and makes rollbacks possible. However move_dir() may fall back to copy_dir() in some rare cases. Been wondering how to handle this edge case.

For now it seems a good option would be to provide means for the users to retry a plugin or theme update without first creating a rollback backup. This could be achieved by not making backups when the users update by uploading a .zip file. This workaround will let anybody to fall back to the previous update code relatively easily.

It will also need a good explanation about what to do and how if plugins and themes updating keeps failing. Seems a suitable place would be at the top of the update screen.

In both cases this is a long time without visual feedback

Thinking this probably needs some user help/explanations too. At least a warning that updates may take several minutes would be nice. Another option would be to "fake it" with some JS although not sure if that's good enough :)

This ticket was mentioned in Slack in #core-upgrade-install by costdev. View the logs.


13 months ago

#306 @costdev
13 months ago

  • Keywords commit added; needs-testing removed
  • Milestone changed from Future Release to 6.3

Following further discussions in Slack:

For now it seems a good option would be to provide means for the users to retry a plugin or theme update without first creating a rollback backup. This could be achieved by not making backups when the users update by uploading a .zip file. This workaround will let anybody to fall back to the previous update code relatively easily.

Manual uploads of .zip files via Plugins > Add New > Upload Plugin counts as an install rather than an upgrade and will not trigger the backup/restore logic of Rollback, so this workaround is already available.

It will also need a good explanation about what to do and how if plugins and themes updating keeps failing. Seems a suitable place would be at the top of the update screen.

#58199 was created to address this and it was determined not to be a blocker for Rollback.

In both cases this is a long time without visual feedback

Thinking this probably needs some user help/explanations too. At least a warning that updates may take several minutes would be nice. Another option would be to "fake it" with some JS although not sure if that's good enough :)

As this was determined to be a pre-existing issue in Core, and that Rollback's use of move_dir() reduces the time without visual feedback, it was agreed that this isn't a blocker for Rollback. We should definitely address this issue in a separate ticket that documents reproduction steps. It appears that the lack of visual feedback only happens in some circumstances, which are unknown to me as yet.


In addition, once we get #51928 committed to Core, we'll have some data for plugin/theme update failures to help determine pain points during the upgrade process as a whole.


  • Given extensive testing and positive test reports above, I'm removing needs-testing.
  • Given that there are no blockers for Rollback and that @azaozz and @peterwilsoncc have both said they're happy for it to go forward to final commit consideration, I'm adding commit and milestoning for 6.3. @SergeyBiryukov, do you have time to do this?
  • For the upgrade process as a whole, we should definitely get #58199 and #51928 landed in 6.3.

#307 @peterwilsoncc
13 months ago

Just to note: I haven't had the chance to review the current PR -- and won't for a little while -- but my most recent testing of the plugin was successful on VirtualBox so I'm happy for Ozz and Sergey to take lead on review and commit.

#308 @SergeyBiryukov
13 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 55720:

Upgrade/Install: Create a temporary backup of plugins and themes before updating.

This aims to make the update process more reliable and ensures that if a plugin or theme update fails, the previous version can be safely restored.

  • When updating a plugin or theme, the old version is moved to a temporary backup directory:
    • wp-content/upgrade-temp-backup/plugins/[plugin-slug] for plugins
    • wp-content/upgrade-temp-backup/themes/[theme-slug] for themes.
  • If the update fails, then the backup kept in the temporary backup directory is restored to its original location.
  • If the update succeeds, the temporary backup is deleted.

To further help troubleshoot plugin and theme updates, two new checks were added to the Site Health screen:

  • A check to make sure that the upgrade-temp-backup directory is writable.
  • A check that there is enough disk space available to safely perform updates.

To avoid confusion: The temporary backup directory will NOT be used to “roll back” a plugin to a previous version after a completed update. This directory will simply contain a transient backup of the previous version of a plugin or theme being updated, and as soon as the update process finishes, the directory will be empty.

Follow-up to [55204], [55220].

Props afragen, costdev, pbiron, azaozz, hellofromTonya, aristath, peterwilsoncc, TJNowell, bronsonquick, Clorith, dd32, poena, TimothyBlynJacobs, audrasjb, mikeschroder, a2hosting, KZeni, galbaras, richards1052, Boniu91, mai21, francina, TobiasBg, desrosj, noisysocks, johnbillion, dlh, chaion07, davidbaumwald, jrf, thisisyeasin, ignatggeorgiev, SergeyBiryukov.
Fixes #51857.

@SergeyBiryukov commented on PR #2225:


13 months ago
#309

Thanks for the PR! Merged in r55720. Great job everyone! 🙌

This ticket was mentioned in Slack in #core-upgrade-install by afragen. View the logs.


13 months ago

#311 @afragen
11 months ago

  • Keywords has-dev-note added; needs-dev-note removed

@peterwilsoncc commented on PR #3044:


7 months ago
#312

Tests added in r56992 / https://github.com/WordPress/wordpress-develop/commit/b5392cc28c6f065227e9345cdb9ac266a0f9ee30 and subsequently backported to the 6.4 branch.

The src changes and the tests relating to those fixes were not included to allow for the backport, these will need to be done as a follow up during the 6.5 release cycle. As WordPress 6.4 is in the release candidate stage it's too late to include the bug fix.

Note: See TracTickets for help on using tickets.