WordPress.org

Make WordPress Core

Opened 7 weeks ago

Last modified 2 days ago

#51928 new task (blessed)

Provide plugin/theme update failure data to dot org

Reported by: afragen Owned by:
Milestone: 5.6.1 Priority: normal
Severity: normal Version:
Component: Upgrade/Install Keywords: has-patch has-unit-tests
Focuses: Cc:

Description (last modified by afragen)

With plugin auto-updates in core there have been instances of update failures leaving the user's site without the update and without any idea why the update failed. We receive core auto-update failure data to dot org and receiving plugin/theme failure data would help significantly in determining the causes of these failures.

I'm mostly guessing from how class-core-upgrader.php sends failure data via wp_version_check( $stats ) and I've added similar data and a call to wp_update_{plugins|themes}( $stats ) in class-wp-upgrader.php Thanks @pbiron

If it actually does send the data to dot org it could be useful. This requires the return of WP_Errors at potential points of failure. I have added one of these in #51857.

Feedback from the dot org maintainers will be needed.

Attachments (9)

51928.diff (1.6 KB) - added by afragen 7 weeks ago.
51928.2.diff (1.8 KB) - added by afragen 7 weeks ago.
more universal failure reporting
51928.3.diff (2.2 KB) - added by pbiron 7 weeks ago.
51928.3.2.diff (2.4 KB) - added by afragen 7 weeks ago.
pass at identifying automatic vs manual update
51928.4.diff (4.0 KB) - added by afragen 7 weeks ago.
put into method and call at each failure point
51928.5.diff (5.8 KB) - added by afragen 7 weeks ago.
include ziparchive of plugin/theme being upgraded to rollback directory
51928.6.diff (7.4 KB) - added by afragen 7 weeks ago.
zip and unzip of rollback now working
51928.7.diff (4.0 KB) - added by afragen 7 weeks ago.
just the telemetry stuff for dot org
51928.8.diff (10.5 KB) - added by afragen 6 weeks ago.
adding so much data that Dion will either jump for joy or cry

Download all attachments as: .zip

Change History (44)

@afragen
7 weeks ago

@afragen
7 weeks ago

more universal failure reporting

#1 @afragen
7 weeks ago

  • Summary changed from Provide plugin update failure data to dot org to Provide plugin/theme update failure data to dot org

#2 @afragen
7 weeks ago

Updated to provide for more universal failure reporting. 51928.2.diff should now report failure data to dot org for either plugins or themes.

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


7 weeks ago

#4 @hellofromTonya
7 weeks ago

  • Keywords needs-unit-tests added

@pbiron
7 weeks ago

#5 @pbiron
7 weeks ago

51928.3.diff is the same as 51928.2.diff except it calls wp_update_plugins( $stats ) or wp_update_themes( $stats ) instead of wp_version_check( $stats ) to report the failure.

I'll have to check how we can report translation update failures...will update the patch when I've had a chance to do that.

#6 @afragen
7 weeks ago

  • Description modified (diff)

#7 @afragen
7 weeks ago

  • Description modified (diff)

#8 @dd32
7 weeks ago

Looking at 51928.3.diff:

'fs_method_direct' => ! empty( $GLOBALS['_wp_filesystem_direct_method'] ) ? $GLOB['_wp_filesystem_direct_method'] : ''

$GLOB variable truncated.

'wp_error' => $result,

I'm not sure how WP_Error will JSON serialize here, it might be better to simply add the error_code and error_data instead. It'd also be worthwhile checking what error_data actually holds in these cases, as for core we specifically "anonymised" it by removing ABSPATH from the data: https://core.trac.wordpress.org/browser/tags/5.5.1/src/wp-admin/includes/update-core.php?marks=1106#L1103

I can make the WordPress.org APIs store/process/etc the data that gets sent from this in whatever format is decided, it's just easier to be able to do less processing as it makes aggregating the data far simpler.

Another data point that might make sense adding here, is whether it was an auto-update or manually triggered update. It'll allow the resulting aggregated data to focus on failures that happen via autoupdates, to add protections to avoid autoupdates when that scenario is detected.

@afragen
7 weeks ago

pass at identifying automatic vs manual update

#9 @afragen
7 weeks ago

51928.3.2.diff changed to per recommendations by @dd32 .

Also passes data re:automatic vs manual update. Unfortunately the Automatic Upgrader Skin doesn't contain much specific data.

#10 @afragen
7 weeks ago

51928.4.diff puts the telemetry data into a method and calls that at each failure point in WP_Upgrader::run().

@afragen
7 weeks ago

put into method and call at each failure point

#11 @afragen
7 weeks ago

51928.5.diff adds the creation of a ZipArchive of the plugin/theme being upgraded to wp-content/upgrade/rollback

The rollback directory gets cleared at the next plugin/theme upgrade.

@afragen
7 weeks ago

include ziparchive of plugin/theme being upgraded to rollback directory

#12 @afragen
7 weeks ago

51928.6.diff this is an initial pass at zipping the upgrade from the source and if a WP_Error is encountered restoring the original package.

Requires patch in #51857 or WP_Error returned from copy_dir()

Last edited 7 weeks ago by afragen (previous) (diff)

@afragen
7 weeks ago

zip and unzip of rollback now working

This ticket was mentioned in Slack in #core by afragen. View the logs.


7 weeks ago

@afragen
7 weeks ago

just the telemetry stuff for dot org

#14 @afragen
7 weeks ago

51928.7.diff removes all the rollback stuff as it belongs in #51857

@afragen
6 weeks ago

adding so much data that Dion will either jump for joy or cry

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


6 weeks ago

#16 @audrasjb
6 weeks ago

  • Milestone changed from Awaiting Review to 5.7

This ticket was mentioned in Slack in #core by afragen. View the logs.


6 weeks ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


4 weeks ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


4 weeks ago

#20 @audrasjb
4 weeks ago

  • Milestone changed from 5.7 to 5.6.1

As per today's devchat, let's consider this for 5.6.1 inclusion so it could hoepfully give some insights for #51857 :-)

#21 @audrasjb
4 weeks ago

  • Type changed from enhancement to task (blessed)

This ticket was mentioned in PR #850 on WordPress/wordpress-develop by hellofromtonya.


3 weeks ago

  • Keywords has-unit-tests added; needs-unit-tests removed

Trac ticket: https://core.trac.wordpress.org/ticket/51928

  • Applies telemetry patch
  • Changes time() to microtime( true )
  • Adds unit tests for sending (or not sending) error report

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


3 weeks ago

#24 follow-up: @hellofromTonya
3 weeks ago

  • Keywords dev-feedback removed

The PR 850 is ready for review. It includes unit tests for:

  • Theme_Upgrader::install
  • Theme_Upgrader::upgrade
  • Theme_Upgrader::bulk_upgrade
  • Plugin_Upgrader::install
  • Plugin_Upgrader::upgrade
  • Plugin_Upgrader::bulk_upgrade

Hey @dd32, Andy and I were talking about whether to use time() or microtime(). Do you have a preference for the time_taken in the error data?

#25 in reply to: ↑ 24 @dd32
3 weeks ago

Replying to hellofromTonya:

whether to use time() or microtime(). Do you have a preference for the time_taken in the error data?

IMHO: time(); measuring anything less than seconds (or parts thereof) doesn't seem like it's useful to anyone to me. The only data points you would really want to track in bulk out of it are things like "90% of updates complete in 5s or less" or "1% of updates takes more than 30s, why?"

Measuring in microtime() would make sense for local logging inside WordPress though (of which there is none currently... ) but not on dotorg. Of course, if you were to use microtime() here it could still be stored as an int on the dotorg side in stats.

This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.


3 weeks ago

#27 @hellofromTonya
3 weeks ago

Thank you @dd32 for your explanation and insights. Reverted it back to time() in this commit.

This ticket was mentioned in Slack in #core-auto-updates by pbiron. View the logs.


3 weeks ago

This ticket was mentioned in Slack in #core-auto-updates by afragen. View the logs.


12 days ago

This ticket was mentioned in Slack in #core by afragen. View the logs.


11 days ago

This ticket was mentioned in Slack in #core-auto-updates by audrasjb. View the logs.


5 days ago

#32 follow-up: @dd32
3 days ago

Sorry for the delays in getting back to y'all on this.

Reviewing PR 850 above, I have two main questions, neither of which is a blocker:

https://github.com/WordPress/wordpress-develop/pull/850/files#diff-d85d82fa0b14395e2edaaa49f3671189988bc3d7af07e3fccb6b93a00fc99480R982-R999

  1. Is it possible to send the Slug of the plugin/theme being updated, rather than it's Name?
  2. Like above, is it possible to pass through the Slug/Version of automatic updates too?

I suspect I know the answer here, being that it's hard to retrieve those details, but I'm thinking maybe the Automatic updater skin could act as a temporary store of what is currently being processed. This could be done in a follow up change I guess.

https://github.com/WordPress/wordpress-develop/pull/850/files#diff-d85d82fa0b14395e2edaaa49f3671189988bc3d7af07e3fccb6b93a00fc99480R965-R979

  1. Due to error_message being translated it's not really storable for aggregate stats, Is process & error_code enough of a unique error with error_data as context?

It can either be kept in the stats blob for easier debugging or removed, either is fine, it just won't get stored api-server-side.

I'm going to attempt to get some storage running for this today/tomorrow, but don't let that block this ticket proceeding. Nothing should break by sending this through (at worst, it'll be disregarded as junk input, at best, it'll be recorded as successful rather than a failure).

This ticket was mentioned in Slack in #core by afragen. View the logs.


2 days ago

#34 in reply to: ↑ 32 ; follow-up: @afragen
2 days ago

Replying to dd32:

Reviewing PR 850 above, I have two main questions, neither of which is a blocker:

https://github.com/WordPress/wordpress-develop/pull/850/files#diff-d85d82fa0b14395e2edaaa49f3671189988bc3d7af07e3fccb6b93a00fc99480R982-R999

  1. Is it possible to send the Slug of the plugin/theme being updated, rather than it's Name?
  2. Like above, is it possible to pass through the Slug/Version of automatic updates too?

I suspect I know the answer here, being that it's hard to retrieve those details, but I'm thinking maybe the Automatic updater skin could act as a temporary store of what is currently being processed. This could be done in a follow up change I guess.

Unfortunately that data isn't available in the Upgrader object. Depending upon the data passed in the WP_Error, it might show the slug by reference to the destination folder. I know, not necessarily the same thing.

https://github.com/WordPress/wordpress-develop/pull/850/files#diff-d85d82fa0b14395e2edaaa49f3671189988bc3d7af07e3fccb6b93a00fc99480R965-R979

  1. Due to error_message being translated it's not really storable for aggregate stats, Is process & error_code enough of a unique error with error_data as context?

It can either be kept in the stats blob for easier debugging or removed, either is fine, it just won't get stored api-server-side.

I'm going to attempt to get some storage running for this today/tomorrow, but don't let that block this ticket proceeding. Nothing should break by sending this through (at worst, it'll be disregarded as junk input, at best, it'll be recorded as successful rather than a failure).

I'm of the opinion that more data is better, but you're the one crunching the data so I'll defer to whatever you need/want here. Certainly something that can be adjusted in another ticket/later time.

I have made a small PR to @hellofromTonya's PR https://github.com/WordPress/wordpress-develop/pull/850 to include error reporting for copy_dir() errors.

#35 in reply to: ↑ 34 @dd32
2 days ago

I've been testing PR 850 and it's data storage on dotorg, and I can happily say the data is being saved correctly now, however..

Replying to afragen:

I suspect I know the answer here, being that it's hard to retrieve those details, but I'm thinking maybe the Automatic updater skin could act as a temporary store of what is currently being processed. This could be done in a follow up change I guess.

Unfortunately that data isn't available in the Upgrader object. Depending upon the data passed in the WP_Error, it might show the slug by reference to the destination folder. I know, not necessarily the same thing.

In my testing of this PR, it looks like it's actually very common for it to not have even the name available. The current way that it's retrieving the details isn't reliable enough and only covers a singular use-case (Plugin/Theme upgrade via the ajax upgrader).

It also incorrectly identifies most plugin upgrade/install attempts as automatic_plugin_update.

I have made a small PR to @hellofromTonya's PR https://github.com/WordPress/wordpress-develop/pull/850 to include error reporting for copy_dir() errors.

Currently the PR as is, actually sends two API pings, with your proposed changes, it sends 3 :)

I'm not sure I understand what the 'process' field is for, as the WP_Error objects should have a unique error_code value that specifies the issues being encountered... The inclusion of that is what was causing double api pings to be sent.

I've made some significant-ish changes via PR2 on PR850 (my branch) which simplifies the process a little, and ensures that slug/version context is always sent with the failure pings, which makes the data much more useful.

I've ditched the process key, relying upon the WP_Error error codes being unique enough.
I've also removed most of the pings from WP_Upgrader, relegating the failure catching to ::install(), ::upgrade(), and ::bulk_upgrade() instead.
I haven't tested this with Background auto-updates, but I think it should catch those upgrades correctly, and identify them as such.

I've ignored the unit tests, because I'm almost 100% certain it'll now be failing after these changes, and as far as I can tell was probably too decoupled from it anyway to catch anything useful.
The tests also include some platform-specific error codes, PCLZIP_ERR_MISSING_FILE which to me suggests the existing tests probably would've failed on a PHP with the zip extension enabled.
I'd highly support not including any unit testing what so ever of this change.

Note: See TracTickets for help on using tickets.