Make WordPress Core

Opened 9 months ago

Closed 8 months ago

Last modified 8 months ago

#58998 closed enhancement (fixed)

Prevent races and stampedes when flush_rewrite_rules() is called

Reported by: icaleb's profile iCaleb Owned by: joemcgill's profile joemcgill
Milestone: 6.4 Priority: normal
Severity: normal Version:
Component: Rewrite Rules Keywords: has-patch needs-testing commit
Focuses: performance Cc:

Description (last modified by SergeyBiryukov)

This ticket is more or less a follow-up to #11384.

Currently when flush_rewrite_rules() is called, WP_Rewrite->flush_rules() will set the rewrite_rules option to an empty string first. And then afterwards it will go and generate the rules and save the results to the option.

On a high traffic site there could be potentially hundreds of requests all coming in during this time period where the option is empty (considerably more-so if it either takes a particularly long time to generate, or if the site is utilizing replica databases). Each request during the time period will find rewrite_rules empty and attempt to generate and save the rules. This is the first of the stampedes that result from this behavior. You now have all these frontend requests for a period of time consuming more processing time than usual, and each having to connect to the primary database to perform a database write (which can further delay replication lag).

Eventually that stampede should come to an end, but not before potentially triggering other stampedes. Since rewrite_rules is stored in alloptions, you have that cache key being invalidated and set quite frequently because all the requests are touching it. That introduces both some fun alloptions race conditions in general, but also makes it possible that the object cache (Memcached specifically) can enter into a stampede condition.

In short, each memcached add() has to find a correctly-sized slab and starts to write it's data into memory. A few things happen in between and then at the end the item is linked to the hash table so it becomes the official record for that key. However if there is a large amount of add() requests all coming in at the same time all wanting to be apart of the same memory slab - each will essentially be evicting each other so nobody is the winner. This can very quickly cascade the site into severe instability when alloptions is not being cached. This is of course a problem in it's own right and rewrite_rules is not directly to blame, but thought it was worth mentioning to add more context.

Lastly, because a request could be at any point in it's logic when rewrite_rules is made into an empty string, you open up to the possibility of just some weird race conditions with execution order and whatnot. A plugin could be doing something admittedly funky for example (WP_Rewrite leaves all it's properties open to being directly changed), resulting in a small percentage of requests possibly coming up with a different/partial result for rewrite_rules than others. I've seen this first-hand a number of times. It's not always obvious bugs either, sometimes just very particular load order / hooks / current cached state. At best you end up with a few 404s during the initial stampede. At worst, the last request to save was the faulty path and now you're stuck with incorrect rules until the next flush. And it's very tricky to narrow down what happened. We of course can't prevent all incorrect code, but we can give it less chances to mess things up.

Ideally, and with a seemingly easy fix with limited to no downside, we can eliminate these problems and make the rewrite_rules generation more consistent and predictable.

Change History (10)

This ticket was mentioned in PR #4972 on WordPress/wordpress-develop by WPprodigy.

9 months ago

  • Keywords has-patch added

Rather than deleting/setting the rewrite_rules to an empty string, which has an effect on all other incoming requests at the same time, only the request that asked to flush the rewrite rules should be responsible for generating them.

The only scenario I can think of where this will introduce problematic behavior is if a flush is requested and the wp_loaded hook as not been called and a non-traditional boot-up path is chosen where it will never request the rewrite rules. In that case, the rules won't actually be flushed. We could account for this, in a means similar to $do_hard_later in flush_rules(), but I'm not entirely sure that's really necessary/likely.

#2 @SergeyBiryukov
9 months ago

  • Description modified (diff)

#3 @mukesh27
8 months ago

  • Keywords needs-testing added
  • Version 6.3 deleted

#4 @swissspidy
8 months ago

  • Milestone changed from Awaiting Review to Future Release

WPprodigy commented on PR #4972:

8 months ago

I'm curious why the rules property needs to be set to an empty string here given that it will be replaced when the rewreite_rules() method is called.

Yeah I'm not sure either. The one caveat I mentioned above is the only thing that comes to mind, but wp_loaded not being hit seems outside the scope we need to be concerned with best I can tell?

An alternate approach would be to move all of the logic for repopulating the rewrite_rules option to a new private method

I like that idea! Adjusted the PR.

#6 @joemcgill
8 months ago

  • Keywords commit added
  • Milestone changed from Future Release to 6.4

Planning to commit this today after WCUS 2023 contributor day, so it has some time to soak this cycle. @iCaleb if you could test this further and give feedback about any side effects that might pop up, we can still adjust before we get into betas.

#7 @joemcgill
8 months ago

  • Owner set to joemcgill
  • Resolution set to fixed
  • Status changed from new to closed

In 56448:

Rewrite Rules: Prevent stampedes when flush_rewrite_rules() is called

This ensures that the rewrite_rules option is not emptied until the new value has been recalculated and the option is updated. The logic for refreshing the option value is moved to a new private method named WP_Rewrite::refresh_rewrite_rules which is used by both the flush_rules and refresh_rewrite_rules methods.

Props iCaleb, joemcgill, flixos90, mukesh27.
Fixes #58998.

This ticket was mentioned in Slack in #core-performance by joemcgill. View the logs.

8 months ago

This ticket was mentioned in Slack in #core by joemcgill. View the logs.

8 months ago

Note: See TracTickets for help on using tickets.