WordPress.org

Make WordPress Core

Opened 3 months ago

Closed 5 weeks ago

Last modified 8 days ago

#46130 closed defect (bug) (fixed)

Only pause extensions for users with recovery access

Reported by: flixos90 Owned by: TimothyBlynJacobs
Milestone: 5.2 Priority: high
Severity: normal Version: 5.1
Component: Bootstrap/Load Keywords: servehappy has-patch
Focuses: Cc:

Description

After introduction of the fatal error recovery mechanism in #44458, several follow-up tickets were opened. While some were regarding regular bugs, quite a few included security concerns, particularly about how a malicious attack could cause even safe plugins and themes to be paused. Security researchers outside of the WordPress space have already published about the problems as well: https://medium.com/websec/wordpress-serve-happy-ffd93c94f5ac

While we have since investigated reducing the attack vectors by pausing extensions less aggressively (see #45888, #45940, #46066), it appears that it is impossible to reliably prevent all loopholes there.

In today's PHP meeting, an alternative approach came up:

  • Instead of pausing extensions globally on certain protected endpoints, extensions should only be paused for specific users.
  • An admin user could receive a "recovery link" with a secret code, for example with an email.
  • When opening the recovery link, the secret code is verified, and then a "recovery cookie" is stored on the user's machine, containing a nonce.
  • Whenever such a recovery cookie is present and valid, extension pausing is enabled.

This approach allows reusing most logic that is already in place. Instead of pausing on protected endpoints, pausing will happen whenever the cookie is present, in any area. In other words, the protected endpoint logic can be removed and instead be replaced by the recovery mode check.

More concrete suggestions regarding the implementation:

  • To generate a new recovery link, use something like wp_generate_password( 20, false ), and store a hashed version of that value as recovery_key in the wp_user_meta table. Include the raw value in the recovery link as a query parameter to be given to the user.
  • When a user clicks the link, check the query parameter. Hash it in the same process, then query the user by that recovery_key meta value. If no user is found, the recovery key is invalid. If one is found, delete the key so that it is no longer valid and set a cookie on the device. The cookie should use some of the *_KEY constants from wp-config.php to have an unpredictable name.
  • After setting the cookie and deleting the recovery key, the user should be redirected to the admin. At this point, the cookie is already placed, so recovery mode is de-facto active.
  • Introduce wp_is_recovery_mode() or similar. This function should check if the cookie exists and is valid. We can then replace all checks for is_protected_endpoint() with checks for this new function.

Considerations:

  • What should we store in the cookie? Should the existence of the cookie limited to a few minutes/hours/days? Should there be a nonce in the cookie value (that itself restricts duration already)?
  • How would an administrator get a recovery link? Should the fatal error handler send an email when it detects an error? Should there be an area in the admin where a user can request one that is then displayed to them just once so that they can copy/paste it?

We need to find a good measure here that is both secure but also stable enough for users. For example consider that, when a site is broken and nobody has a valid recovery link, the site is still as inaccessible as before, so that should be mitigated.

Attachments (4)

46130.diff (27.2 KB) - added by TimothyBlynJacobs 3 months ago.
46130.1.diff (26.5 KB) - added by TimothyBlynJacobs 3 months ago.
46130.2.diff (87.5 KB) - added by TimothyBlynJacobs 5 weeks ago.
46130.3.diff (83.4 KB) - added by flixos90 5 weeks ago.

Download all attachments as: .zip

Change History (27)

This ticket was mentioned in Slack in #core-php by flixos90. View the logs.


3 months ago

#2 @flixos90
3 months ago

#46086 was marked as a duplicate.

#3 @flixos90
3 months ago

#46086 was opened for a similar reason and came before this. Let's continue work here, but I leave this comment here just to make sure we give props to @WFMattR eventually.

This ticket was mentioned in Slack in #core-php by flixos90. View the logs.


3 months ago

This ticket was mentioned in Slack in #core by timothybjacobs. View the logs.


3 months ago

#6 @TimothyBlynJacobs
3 months ago

I've uploaded a patch which requires users to enter into Recovery Mode via a link.

The Recovery Mode is broken down into a number of parts.

1) Requesting and Sending a link to enable recovery mode.
2) Validating the recovery link.
3) Handling the recovery cookie.

Part 1
When a fatal error occurs, we add a link to wp_die generated by get_recovery_mode_request_url().

When the user visits that URL, handle_recovery_mode_actions() sends an email to the admin_email via maybe_send_recovery_mode_email(). This function rate limits the email to only be sent at most once an hour. We then immediately wp_die() with a message indicating whether the email was sent or not.

The email generates a "recovery key" with generate_and_store_recovery_mode_key(). This generates a random string with wp_generate_password(). ( The length of the string is 20 to match the behavior in get_password_reset_key() but could be increased ). The string is then hashed by PasswordHash and that hash saved to the recovery_key site option along with the creation time.

This un-hashed key is added as a query arg to wp_login_url() along with an action set to begin_recovery_mode().

Part 2
When the user clicks the link, handle_recovery_mode_actions() verifies that the key is valid and not expired. The persistent recovery cookie is then set. We then wp_die() with a JavaScript based redirect to the login page.

This is done so that the cookie can actually be persisted on the client's machine. In testing, just setting a cookie and redirecting didn't work. Additionally, from the brief research I had time to do, it looks like different cache configurations might make persisting a cookie and doing a redirect at once not possible.

The user is redirect to the login page with an action set to begun_recovery_mode. This is so we can then print a message indicating that recovery mode was enabled successfully on the login page.

Part 3
The cookie value is generated by generate_recovery_mode_cookie(). The majority of the security related functions are defined in pluggable.php. When we load pluggable.php in other calls, we can immediately die() or wp_redirect. Since we have to validate this cookie on every page load, those functions aren't an option.

After discussing with @aaroncampbell, it looked like making a version of wp_salt that has its own fallbacks for invalid salts was a viable option.

A separate function, recovery_mode_hash() is introduced that performs a hash_hmac specific to recovery mode. This tries to use the AUTH_KEY and AUTH_SALT values, but if they aren't available, it will load pluggable.php and then generate random values and store them to site options, similarly to wp_salt().

The cookie value is an hmac with the data set to a string of recovery_mode|%1$s|%2$s. 1. is the current time, and 2. is a randomly generated value by wp_generate_password(). The signature generated by hash_hmac is then appended to the string.

Validating the cookie is done by recalculating the expected hash, and checking if the hashes are equal. Additionally, the code checks that the cookie has not expired. By default the cookie value is valid for 7 days. The cookie itself expires at the end of the session. This should be discussed.

The actual cookie implementation beyond using a version of wp_salt was not reviewed by @aaroncampbell. This should be reviewed and alternatives considered.

Some possible alternatives:

a) Use the PasswordHash library directly. The hash would be stored in a site option, with the creation time. This would require the secret to be stored in plain text in the cookie. This might not be terrible in this scenario since it's not a user's password, but a randomly generated value.
b) Just use a shared plaintext secret that is stored in the DB with the creation time there.
c) Same as b) but use recovery_mode_hash to hash the shared secret on either side so that the Recovery Mode sessions would expire when the AUTH* salts are changed.

Moving on, if the cookie is not validated, it is cleared and we wp_die with the reason why and a prompt to request a new link.

If the cookie is valid, set a constant WP_RECOVERY_MODE to true. The wp_is_recovery_mode() function checks for this constant, and allows filtering its output.

The cookie is cleared when the user logs out by wp_clear_auth_cookie().

Additional Changes

  • Plugins/Themes are not paused unless wp_is_recovery_mode()
  • The redirection mechanism to discover multiple plugins at fault in one go now only happens when wp_is_recovery_mode() otherwise no progress can be made since the plugins aren't being paused one by one.
  • All errors are now stored, regardless of if they occurred in a protected endpoint. Alongside the error we record if the error happened in a protected endpoint and adjust the language in the Plugins List Table.

Additionally, after discussion with @flixos90, we decided to drop Network Activated plugins from pausing. There are a number of things that WordPress loads between MU/Network Activated plugins being loaded and regular plugins being loaded. Most importantly to us, the cookie constants. Also from @flixos90, pausing network plugins open their whole own permission/scope can of worms and is multisite is generally run by more technical users.

When handling the error, the email is sent automatically. I'm not sure if this should stay or not, but I included it to demonstrate.

handle_recovery_mode_actions() is called immediately before loading regular plugins.

Initially, I tried to match this to a user and store the keys in user meta. After developing it further, it didn't appear workable because this code is executed before we can work with the current user and often without a user available ( such as when trying to login ).


After writing this, I think we can drop the recovery_mode_email_last_sent option and just use the creation time from generate_and_store_recovery_mode_key().

Last edited 3 months ago by TimothyBlynJacobs (previous) (diff)

#7 @TimothyBlynJacobs
3 months ago

Thinking some more, I think an alternative approach where we store some form of a secret on the server for validating the cookie would be better than the hmac setup. Which approach exactly would be great to get feedback on.

#8 @afragen
3 months ago

  • Keywords has-patch added; needs-patch removed

#9 @flixos90
3 months ago

  • Milestone changed from 5.1 to 5.2

#10 @flixos90
3 months ago

  • Priority changed from highest omg bbq to high

This ticket was mentioned in Slack in #core by flixos90. View the logs.


3 months ago

This ticket was mentioned in Slack in #core-php by flixos90. View the logs.


3 months ago

This ticket was mentioned in Slack in #core-php by flixos90. View the logs.


2 months ago

#14 @flixos90
6 weeks ago

  • Milestone changed from 5.2 to 5.3

#15 follow-up: @TimothyBlynJacobs
5 weeks ago

Uploading a patch from the latest version of the PR. At the least, this needs file header docs and a decision on whether we want to determine the location the error occurred on ( think wp-login.php or wp-admin/edit.php?post_type=whichever ) and include that in the email. Right now it is just stubbed as "TBD".

#16 @flixos90
5 weeks ago

  • Milestone changed from 5.3 to 5.2

This ticket was mentioned in Slack in #design by timothybjacobs. View the logs.


5 weeks ago

#18 @flixos90
5 weeks ago

In 44962:

Bootstrap/Load: Introduce fatal error handler.

This changeset introduces a WP_Fatal_Error_Handler class that detects fatal errors and displays a more user-friendly message about the site experiencing technical difficulties.

Websites that have custom requirements in that regard can implement their own fatal error handler by adding a fatal-error-handler.php drop-in that returns the handler instance to use, which must be based on a class that inherits WP_Fatal_Error_Handler. That handler will then be used in place of the default one. Alternatively, the fatal error handler feature can be completely disable through a constant WP_DISABLE_FATAL_ERROR_HANDLER.

Websites that would like to modify specifically the error template displayed in the frontend can add a php-error.php drop-in that works similarly to the existing db-error.php drop-in. For more granular customization, the fatal error handler also includes new filters wp_should_handle_php_error, wp_php_error_message and wp_php_error_args.

Props afragen, bradleyt, flixos90, ocean90, schlessera, SergeyBiryukov, spacedmonkey, timothyblynjacobs.
See #46130, #44458.

@flixos90
5 weeks ago

#19 @flixos90
5 weeks ago

  • Keywords 2nd-opinion removed

46130.3.diff is created from the latest state of the PR.

#20 @flixos90
5 weeks ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 44973:

Bootstrap/Load: Introduce a recovery mode for fixing fatal errors.

Using the new fatal handler introduced in [44962], an email is sent to the admin when a fatal error occurs. This email includes a secret link to enter recovery mode. When clicked, the link will be validated and on success a cookie will be placed on the client, enabling recovery mode for that user. This functionality is executed early before plugins and themes are loaded, in order to be unaffected by potential fatal errors these might be causing.

When in recovery mode, broken plugins and themes will be paused for that client, so that they are able to access the admin backend despite of these errors. They are notified about the broken extensions and the errors caused, and can then decide whether they would like to temporarily deactivate the extension or fix the problem and resume the extension.

A link in the admin bar allows the client to exit recovery mode.

Props timothyblynjacobs, afragen, flixos90, nerrad, miss_jwo, schlessera, spacedmonkey, swissspidy.
Fixes #46130, #44458.

#21 @flixos90
5 weeks ago

In 44974:

Bootstrap/Load: Fix WPCS violation in a new test following [44973].

See #46130.

#22 in reply to: ↑ 15 @kraftbj
4 weeks ago

Replying to TimothyBlynJacobs:

Uploading a patch from the latest version of the PR. At the least, this needs file header docs and a decision on whether we want to determine the location the error occurred on ( think wp-login.php or wp-admin/edit.php?post_type=whichever ) and include that in the email. Right now it is just stubbed as "TBD".

Is there a ticket to track updating the "TBD"?

#23 @flixos90
8 days ago

In 45211:

Bootstrap/Load: Allow more than one recovery link to be valid at a time.

While currently a recovery link is only made available via the admin email address, this will be expanded in the future. In order to accomplish that, the mechanisms to store and validate recovery keys must support multiple keys to be valid at the same time.

This changeset adds that support, adding an additional token parameter which is part of a recovery link in addition to the key. A key itself is always associated with a token, so the two are only valid in combination. These associations are stored in a new recovery_keys option, which is regularly cleared in a new Cron hook, to prevent potential cluttering from unused recovery keys.

This changeset does not have any user-facing implications otherwise.

Props pbearne, timothyblynjacobs.
Fixes #46595. See #46130.

Note: See TracTickets for help on using tickets.