WordPress.org

Make WordPress Core

Opened 12 months ago

Last modified 3 weeks ago

#51928 assigned task (blessed)

Provide plugin/theme update failure data to dot org

Reported by: afragen Owned by:
Milestone: 6.0 Priority: normal
Severity: normal Version:
Component: Upgrade/Install Keywords: has-patch
Focuses: Cc:

Description (last modified by afragen)

With plugin auto-updates in core there have been instances of update failures leaving the user's site without the update and without any idea why the update failed. We receive core auto-update failure data to dot org and receiving plugin/theme failure data would help significantly in determining the causes of these failures.

I'm mostly guessing from how class-core-upgrader.php sends failure data via wp_version_check( $stats ) and I've added similar data and a call to wp_update_{plugins|themes}( $stats ) in class-wp-upgrader.php Thanks @pbiron

If it actually does send the data to dot org it could be useful. This requires the return of WP_Errors at potential points of failure. I have added one of these in #51857.

Feedback from the dot org maintainers will be needed.

Attachments (9)

51928.diff (1.6 KB) - added by afragen 12 months ago.
51928.2.diff (1.8 KB) - added by afragen 12 months ago.
more universal failure reporting
51928.3.diff (2.2 KB) - added by pbiron 12 months ago.
51928.3.2.diff (2.4 KB) - added by afragen 12 months ago.
pass at identifying automatic vs manual update
51928.4.diff (4.0 KB) - added by afragen 12 months ago.
put into method and call at each failure point
51928.5.diff (5.8 KB) - added by afragen 12 months ago.
include ziparchive of plugin/theme being upgraded to rollback directory
51928.6.diff (7.4 KB) - added by afragen 12 months ago.
zip and unzip of rollback now working
51928.7.diff (4.0 KB) - added by afragen 12 months ago.
just the telemetry stuff for dot org
51928.8.diff (10.5 KB) - added by afragen 12 months ago.
adding so much data that Dion will either jump for joy or cry

Download all attachments as: .zip

Change History (65)

@afragen
12 months ago

@afragen
12 months ago

more universal failure reporting

#1 @afragen
12 months ago

  • Summary changed from Provide plugin update failure data to dot org to Provide plugin/theme update failure data to dot org

#2 @afragen
12 months ago

Updated to provide for more universal failure reporting. 51928.2.diff should now report failure data to dot org for either plugins or themes.

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


12 months ago

#4 @hellofromTonya
12 months ago

  • Keywords needs-unit-tests added

@pbiron
12 months ago

#5 @pbiron
12 months ago

51928.3.diff is the same as 51928.2.diff except it calls wp_update_plugins( $stats ) or wp_update_themes( $stats ) instead of wp_version_check( $stats ) to report the failure.

I'll have to check how we can report translation update failures...will update the patch when I've had a chance to do that.

#6 @afragen
12 months ago

  • Description modified (diff)

#7 @afragen
12 months ago

  • Description modified (diff)

#8 @dd32
12 months ago

Looking at 51928.3.diff:

'fs_method_direct' => ! empty( $GLOBALS['_wp_filesystem_direct_method'] ) ? $GLOB['_wp_filesystem_direct_method'] : ''

$GLOB variable truncated.

'wp_error' => $result,

I'm not sure how WP_Error will JSON serialize here, it might be better to simply add the error_code and error_data instead. It'd also be worthwhile checking what error_data actually holds in these cases, as for core we specifically "anonymised" it by removing ABSPATH from the data: https://core.trac.wordpress.org/browser/tags/5.5.1/src/wp-admin/includes/update-core.php?marks=1106#L1103

I can make the WordPress.org APIs store/process/etc the data that gets sent from this in whatever format is decided, it's just easier to be able to do less processing as it makes aggregating the data far simpler.

Another data point that might make sense adding here, is whether it was an auto-update or manually triggered update. It'll allow the resulting aggregated data to focus on failures that happen via autoupdates, to add protections to avoid autoupdates when that scenario is detected.

@afragen
12 months ago

pass at identifying automatic vs manual update

#9 @afragen
12 months ago

51928.3.2.diff changed to per recommendations by @dd32 .

Also passes data re:automatic vs manual update. Unfortunately the Automatic Upgrader Skin doesn't contain much specific data.

#10 @afragen
12 months ago

51928.4.diff puts the telemetry data into a method and calls that at each failure point in WP_Upgrader::run().

@afragen
12 months ago

put into method and call at each failure point

#11 @afragen
12 months ago

51928.5.diff adds the creation of a ZipArchive of the plugin/theme being upgraded to wp-content/upgrade/rollback

The rollback directory gets cleared at the next plugin/theme upgrade.

@afragen
12 months ago

include ziparchive of plugin/theme being upgraded to rollback directory

#12 @afragen
12 months ago

51928.6.diff this is an initial pass at zipping the upgrade from the source and if a WP_Error is encountered restoring the original package.

Requires patch in #51857 or WP_Error returned from copy_dir()

Last edited 12 months ago by afragen (previous) (diff)

@afragen
12 months ago

zip and unzip of rollback now working

This ticket was mentioned in Slack in #core by afragen. View the logs.


12 months ago

@afragen
12 months ago

just the telemetry stuff for dot org

#14 @afragen
12 months ago

51928.7.diff removes all the rollback stuff as it belongs in #51857

@afragen
12 months ago

adding so much data that Dion will either jump for joy or cry

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


12 months ago

#16 @audrasjb
12 months ago

  • Milestone changed from Awaiting Review to 5.7

This ticket was mentioned in Slack in #core by afragen. View the logs.


12 months ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


11 months ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


11 months ago

#20 @audrasjb
11 months ago

  • Milestone changed from 5.7 to 5.6.1

As per today's devchat, let's consider this for 5.6.1 inclusion so it could hoepfully give some insights for #51857 :-)

#21 @audrasjb
11 months ago

  • Type changed from enhancement to task (blessed)

This ticket was mentioned in PR #850 on WordPress/wordpress-develop by hellofromtonya.


11 months ago

  • Keywords has-unit-tests added; needs-unit-tests removed

Trac ticket: https://core.trac.wordpress.org/ticket/51928

  • Applies telemetry patch
  • Changes time() to microtime( true )
  • Adds unit tests for sending (or not sending) error report

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


11 months ago

#24 follow-up: @hellofromTonya
11 months ago

  • Keywords dev-feedback removed

The PR 850 is ready for review. It includes unit tests for:

  • Theme_Upgrader::install
  • Theme_Upgrader::upgrade
  • Theme_Upgrader::bulk_upgrade
  • Plugin_Upgrader::install
  • Plugin_Upgrader::upgrade
  • Plugin_Upgrader::bulk_upgrade

Hey @dd32, Andy and I were talking about whether to use time() or microtime(). Do you have a preference for the time_taken in the error data?

#25 in reply to: ↑ 24 @dd32
11 months ago

Replying to hellofromTonya:

whether to use time() or microtime(). Do you have a preference for the time_taken in the error data?

IMHO: time(); measuring anything less than seconds (or parts thereof) doesn't seem like it's useful to anyone to me. The only data points you would really want to track in bulk out of it are things like "90% of updates complete in 5s or less" or "1% of updates takes more than 30s, why?"

Measuring in microtime() would make sense for local logging inside WordPress though (of which there is none currently... ) but not on dotorg. Of course, if you were to use microtime() here it could still be stored as an int on the dotorg side in stats.

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


11 months ago

#27 @hellofromTonya
11 months ago

Thank you @dd32 for your explanation and insights. Reverted it back to time() in this commit.

This ticket was mentioned in Slack in #core-auto-updates by pbiron. View the logs.


11 months ago

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


11 months ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


11 months ago

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


11 months ago

#32 follow-up: @dd32
11 months ago

Sorry for the delays in getting back to y'all on this.

Reviewing PR 850 above, I have two main questions, neither of which is a blocker:

https://github.com/WordPress/wordpress-develop/pull/850/files#diff-d85d82fa0b14395e2edaaa49f3671189988bc3d7af07e3fccb6b93a00fc99480R982-R999

  1. Is it possible to send the Slug of the plugin/theme being updated, rather than it's Name?
  2. Like above, is it possible to pass through the Slug/Version of automatic updates too?

I suspect I know the answer here, being that it's hard to retrieve those details, but I'm thinking maybe the Automatic updater skin could act as a temporary store of what is currently being processed. This could be done in a follow up change I guess.

https://github.com/WordPress/wordpress-develop/pull/850/files#diff-d85d82fa0b14395e2edaaa49f3671189988bc3d7af07e3fccb6b93a00fc99480R965-R979

  1. Due to error_message being translated it's not really storable for aggregate stats, Is process & error_code enough of a unique error with error_data as context?

It can either be kept in the stats blob for easier debugging or removed, either is fine, it just won't get stored api-server-side.

I'm going to attempt to get some storage running for this today/tomorrow, but don't let that block this ticket proceeding. Nothing should break by sending this through (at worst, it'll be disregarded as junk input, at best, it'll be recorded as successful rather than a failure).

This ticket was mentioned in Slack in #core by afragen. View the logs.


11 months ago

#34 in reply to: ↑ 32 ; follow-up: @afragen
11 months ago

Replying to dd32:

Reviewing PR 850 above, I have two main questions, neither of which is a blocker:

https://github.com/WordPress/wordpress-develop/pull/850/files#diff-d85d82fa0b14395e2edaaa49f3671189988bc3d7af07e3fccb6b93a00fc99480R982-R999

  1. Is it possible to send the Slug of the plugin/theme being updated, rather than it's Name?
  2. Like above, is it possible to pass through the Slug/Version of automatic updates too?

I suspect I know the answer here, being that it's hard to retrieve those details, but I'm thinking maybe the Automatic updater skin could act as a temporary store of what is currently being processed. This could be done in a follow up change I guess.

Unfortunately that data isn't available in the Upgrader object. Depending upon the data passed in the WP_Error, it might show the slug by reference to the destination folder. I know, not necessarily the same thing.

https://github.com/WordPress/wordpress-develop/pull/850/files#diff-d85d82fa0b14395e2edaaa49f3671189988bc3d7af07e3fccb6b93a00fc99480R965-R979

  1. Due to error_message being translated it's not really storable for aggregate stats, Is process & error_code enough of a unique error with error_data as context?

It can either be kept in the stats blob for easier debugging or removed, either is fine, it just won't get stored api-server-side.

I'm going to attempt to get some storage running for this today/tomorrow, but don't let that block this ticket proceeding. Nothing should break by sending this through (at worst, it'll be disregarded as junk input, at best, it'll be recorded as successful rather than a failure).

I'm of the opinion that more data is better, but you're the one crunching the data so I'll defer to whatever you need/want here. Certainly something that can be adjusted in another ticket/later time.

I have made a small PR to @hellofromTonya's PR https://github.com/WordPress/wordpress-develop/pull/850 to include error reporting for copy_dir() errors.

#35 in reply to: ↑ 34 @dd32
11 months ago

I've been testing PR 850 and it's data storage on dotorg, and I can happily say the data is being saved correctly now, however..

Replying to afragen:

I suspect I know the answer here, being that it's hard to retrieve those details, but I'm thinking maybe the Automatic updater skin could act as a temporary store of what is currently being processed. This could be done in a follow up change I guess.

Unfortunately that data isn't available in the Upgrader object. Depending upon the data passed in the WP_Error, it might show the slug by reference to the destination folder. I know, not necessarily the same thing.

In my testing of this PR, it looks like it's actually very common for it to not have even the name available. The current way that it's retrieving the details isn't reliable enough and only covers a singular use-case (Plugin/Theme upgrade via the ajax upgrader).

It also incorrectly identifies most plugin upgrade/install attempts as automatic_plugin_update.

I have made a small PR to @hellofromTonya's PR https://github.com/WordPress/wordpress-develop/pull/850 to include error reporting for copy_dir() errors.

Currently the PR as is, actually sends two API pings, with your proposed changes, it sends 3 :)

I'm not sure I understand what the 'process' field is for, as the WP_Error objects should have a unique error_code value that specifies the issues being encountered... The inclusion of that is what was causing double api pings to be sent.

I've made some significant-ish changes via PR2 on PR850 (my branch) which simplifies the process a little, and ensures that slug/version context is always sent with the failure pings, which makes the data much more useful.

I've ditched the process key, relying upon the WP_Error error codes being unique enough.
I've also removed most of the pings from WP_Upgrader, relegating the failure catching to ::install(), ::upgrade(), and ::bulk_upgrade() instead.
I haven't tested this with Background auto-updates, but I think it should catch those upgrades correctly, and identify them as such.

I've ignored the unit tests, because I'm almost 100% certain it'll now be failing after these changes, and as far as I can tell was probably too decoupled from it anyway to catch anything useful.
The tests also include some platform-specific error codes, PCLZIP_ERR_MISSING_FILE which to me suggests the existing tests probably would've failed on a PHP with the zip extension enabled.
I'd highly support not including any unit testing what so ever of this change.

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


10 months ago

#37 @whyisjake
10 months ago

  • Milestone changed from 5.6.1 to 5.6.2

Just a heads up, I think we are still working on this, so I am bumping to the 5.6.2 milestone to clear up 5.6.1 for the release next week.

#38 @hellofromTonya
10 months ago

So sorry everyone, I didn't see that a PR was submitted to my forked copy of the repo. Doh. Just merged @dd32's changes into PR 850.

@dd32

  • Did you get a chance to test it out with Background auto-updates?

    I haven't tested this with Background auto-updates, but I think it should catch those upgrades correctly, and identify them as such.

  • Excluding the tests (which I can clean up_, is the implementation ready to go?

#39 @desrosj
10 months ago

  • Milestone changed from 5.6.2 to Future Release

5.6.2 RC is set to be packaged in a few hours, and this one needs another review. Going to punt to Future Release for now, but feel free to re-milestone appropriately after a review happens.

#40 @afragen
10 months ago

  • Keywords early added
  • Milestone changed from Future Release to 5.8

Updated milestone for 5.8 as this would really help provide update failure data to dot org. This will be especially important for the Rollback Update Failure feature plugin scope.

This ticket was mentioned in Slack in #core by afragen. View the logs.


9 months ago

#42 @hellofromTonya
9 months ago

  • Keywords needs-unit-tests added; has-unit-tests removed

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


9 months ago

#44 @desrosj
7 months ago

I think this one just needs a review from @dd32.

If everyone can review and is comfortable with the changes this week, I think this can make it in to 5.8. If it takes longer than that I think we'd pass a reasonable early cutoff.

#45 @dd32
7 months ago

While I haven't tested the patches here recently, assuming the shortcomings of my previous comment was taken into account, and the team working on updates still wants this data (I have no need for it personally since I'm not working on updates), I have no issue with it as-is.

After it's in core, I'll follow up with the team with the data we're seeing, and flagging any duplicate API requests / missing data points / etc along with the aggregated data generated.

(I don't see this as needing to be early as such, other than that it needs time for the team to evaluate the data and fix any bugs before it going out in a release)

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


6 months ago

This ticket was mentioned in Slack in #core by desrosj. View the logs.


6 months ago

#48 @desrosj
6 months ago

  • Keywords early removed
  • Owner set to desrosj
  • Status changed from new to assigned

#49 @desrosj
6 months ago

  • Milestone changed from 5.8 to 5.9

I was hoping to review this for 5.8, but unfortunately, we've just run out of time. The tests here are quite extensive (which is great), but not sure I'm comfortable committing without reviewing more thoroughly. Especially where it makes changes to the upgrade code.

#50 @desrosj
6 months ago

  • Keywords needs-unit-tests removed

This ticket was mentioned in Slack in #core-auto-updates by pbiron. View the logs.


4 months ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


3 months ago

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


3 weeks ago

#54 @audrasjb
3 weeks ago

  • Keywords early added
  • Milestone changed from 5.9 to 6.0

We're getting close to 5.9 beta 1, let's move this ticket to the next milestone

#55 @audrasjb
3 weeks ago

  • Keywords early removed

#56 @desrosj
3 weeks ago

  • Owner desrosj deleted

Removing myself as owner so other interested contributors do not hesitate to take this on. Won't have time to tackle this in the next 2-3 months.

Note: See TracTickets for help on using tickets.