WordPress.org

Make WordPress Core

Opened 22 months ago

Last modified 17 months ago

#37958 new enhancement

Improve looping through sites and restoring

Reported by: tfrommen Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: Networks and Sites Keywords:
Focuses: multisite, performance Cc:

Description

As discussed in the recent multisite hours minutes, I would like to propose (and discuss) a better means to looping through a number of (or even all) sites and finally restoring to the state before the loop.

The naive approach looks like the following:

<?php
foreach ( $some_or_all_site_ids as $site_id ) {
    switch_to_blog( $site_id );

    // Do some (maybe expensive) things on the site.

    restore_current_blog();
}

This is not ideal as in all but the last case we don't really want to restore, but just switch to the next site.

An improved version then would look like this:

<?php
$__site_id  = get_current_blog_id();
$__stack    = $GLOBALS['_wp_switched_stack'];
$__switched = $GLOBALS['switched'];

foreach ( $some_or_all_site_ids as $site_id ) {
    switch_to_blog( $site_id );

    // Do some (maybe expensive) things on the site.
}

switch_to_blog( $__site_id );
$GLOBALS['_wp_switched_stack'] = $__stack;
$GLOBALS['switched']           = $__switched;

This, however, doesn't look nice. Moreover, you would have to do this for every loop again and again.

So, why not abstract out the logic into a statically creatable snapshot of the current network state?

My suggestion for such a class looks like the following (which is already in use in the wild):

<?php
/**
 * Save and restore the current network state.
 *
 * By using this class, you can avoid unnecessary
 * switch_to_blog()-restore_current_blog()-switch_to_blog()-... excesses.
 */
class NetworkState {

        /**
         * @var int
         */
        private $site_id;

        /**
         * @var int[]
         */
        private $stack;

        /**
         * @var bool
         */
        private $switched;

        /**
         * Constructor. Sets up the properties.
         */
        public function __construct() {

                global $_wp_switched_stack, $switched;

                $this->site_id = get_current_blog_id();

                $this->stack = $_wp_switched_stack;

                $this->switched = $switched;
        }

        /**
         * Returns a new instance representing the current network state.
         *
         * @return static Network state object.
         */
        public static function create() {

                return new static();
        }

        /**
         * Restores the saved network state.
         *
         * @return void
         */
        public function restore() {

                switch_to_blog( $this->site_id );

                $GLOBALS['_wp_switched_stack'] = $this->stack;

                $GLOBALS['switched'] = $this->switched;
        }
}

With this class, the foreach-loop code from before can become this:

<?php
$network_state = NetworkState:create();

foreach ( $some_or_all_site_ids as $site_id ) {
    switch_to_blog( $site_id );

    // Do some (maybe expensive) things on the site.
}

$network_state->restore();

No matter what happens in the loop (and thus in any called function or method), we restore to the exact same state the network was in before the loop.

The above implementation works with the (currently available and used) globals. As soon as there is some other way (see #37699), this can easily be adapted as it's internals only.

The naming is just a suggestion, and we can, of course, baptize all the things differently.

Core could make use of it, too. For example in the `wpmu_delete_user() function, during Network upgrade, or in the wp_admin_bar_my_sites_menu() function (!), which means: On. Every. Single. Page. Load.

I'd be happy to provide a full patch against trunk. I just wanted to propose this first and give opportunity to discuss this.

Attachments (1)

37958.patch (4.9 KB) - added by tfrommen 19 months ago.

Download all attachments as: .zip

Change History (17)

#1 @tfrommen
22 months ago

Also, if introducing a static class method might be seen as something too confusing for non-OOP developers (as discussed in #37699), we could (also?) provide a global function that just returns a new NetworkState instance.

<?php
/**
 * Returns a new network state instance.
 *
 * @return NetworkState A fresh network state object.
 */
function get_network_state() {

    return new NetworkState();
}

Or just do regular instatiation (i.e., $network_state = new NetworkState();).

#2 follow-up: @johnjamesjacoby
22 months ago

Neat idea. And IIRC, this is actually how switch_to_blog() originally worked in the early days, around WordPress 2.5. The adverse effect for the rest of core was noticed in places where switching happened on actions & filters, where it wouldn't/couldn't properly clean up after itself, leaving the environment in an unpredictable state, so the bulkier more-thorough implementation won out.

I've always felt this type of approach is most useful when you already know ahead of time an array of site ID's you want to quickly switch between. If we can count on that always being true, maybe the way to implement this is actually on-top of the existing API with a $quick parameter, and a stash_current_blog() helper that gets called before the loop happens.

// Fetch blog IDs
$blogs = get_sites( $whatever );
$quick = true;

// Stash
stash_current_blog();

// Loop
foreach ( $blogs as $blog ) {
    switch_to_blog( $blog->blog_id, $quick );
}

// Restore
restore_current_blog();
Last edited 22 months ago by johnjamesjacoby (previous) (diff)

#3 follow-up: @swissspidy
22 months ago

Given our recent work on #26511, does it make sense to make this class more similar to the one proposed there in terms of naming and functionality?

#4 in reply to: ↑ 2 @tfrommen
22 months ago

Replying to johnjamesjacoby:

I've always felt this type of approach is most useful when you already know ahead of time an array of site ID's you want to quickly switch between. If we can count on that always being true, maybe the way to implement this is actually on-top of the existing API with a $quick parameter, and a stash_current_blog() helper that gets called before the loop happens.

How would the stash_current_blog() and the quick switch_to_blog( $id, true ) functions work exactly?

The main difference between our approaches is that your code actively works with (and in) the global scope, while my implementation does not. If I create a NetworkState object and call its restore method, I know exactly what I get.

Your example breaks, if you call something in the loop that stashes the current (already switched) site on its own.

#5 in reply to: ↑ 3 @tfrommen
22 months ago

Replying to swissspidy:

Given our recent work on #26511, does it make sense to make this class more similar to the one proposed there in terms of naming and functionality?

How would that look like? I don't see any similarity - except from that there is some restoring method in each class.

If it's just naming, then suggest whatever you like. :)

#6 @flixos90
22 months ago

Reference: I added an experimental patch in #25293, which also takes this ticket into account.

#7 @swissspidy
22 months ago

#37983 was marked as a duplicate.

#8 @swissspidy
21 months ago

Opened a new ticket to discuss naming convention: #38098.

Last edited 21 months ago by swissspidy (previous) (diff)

@tfrommen
19 months ago

#9 @tfrommen
19 months ago

@swissspidy and I just had a longer discussion at WordCamp Geneva, and we agreed on removing the static factory method from the suggested class, but instead provide two functions to store the current site state as well as restore a specific site state.

37958.patch includes the new WP_Site_State class and the mentioned functions, and also makes use of the new functions in three core files.

In a further step one could also introduce a set of functions in order to switch to a specific site and a specific locale in one go.

Feedback welcome!

This ticket was mentioned in Slack in #core-multisite by tfrommen. View the logs.


17 months ago

#11 follow-ups: @johnjamesjacoby
17 months ago

See also: WP_Locale_Switcher for work that's gone into this.

https://core.trac.wordpress.org/browser/trunk/src/wp-includes/class-wp-locale-switcher.php

We'd basically want to make a base class out of this, and have this extend that.

#12 in reply to: ↑ 11 ; follow-up: @tfrommen
17 months ago

Replying to johnjamesjacoby:

See also: WP_Locale_Switcher for work that's gone into this.

https://core.trac.wordpress.org/browser/trunk/src/wp-includes/class-wp-locale-switcher.php

We'd basically want to make a base class out of this, and have this extend that.

Did you have a look at the patch? :) I don't see anything in storing and restoring a single (!) site state that really urges to be handled similar to stacking up locales.

Could you elaborate? Or, in other words, what is it that you don't like in the current patch (and thus idea)?

#13 in reply to: ↑ 12 @johnjamesjacoby
17 months ago

Replying to tfrommen:

Did you have a look at the patch? :)

Yes.

I don't see anything in storing and restoring a single (!) site state that really urges to be handled similar to stacking up locales.

There are 4 methods for switching:

  • switch_to_locale()
  • restore_previous_locale()
  • restore_current_locale()
  • change_locale()

Or, in other words, what is it that you don't like in the current patch (and thus idea)?

I like the idea, because it's what the Locale switcher already does. I don't like the patch because states are already a named thing in WordPress, typically meaning a combination of statuses. While sites aren't filterable yet, they should be soon (see #37684) and these names would conflict, creating more & different confusion.

It's also too ambiguous: state could mean visibility, status, availability, or any combination there-of, but actually means none of those things in this implementation.

Historically, what you're proposing is originally how the site-switching API worked (back in 2005) but as a whole we've decided long ago to always switch back to the current site as quickly as possible, for two reasons:

  • maintain code clarity
  • not accidentally avoid actions or filters running deep in the stack (that might not be obvious in the current scope)

We decided on stack because that's how it works. It's an ordered array of sites in a stack, and you can traverse as deep into that stack (and back out again) as you'd like for your given purpose.

There isn't anything wrong with "stack" other than it being foreign, and since developers aren't exposed to the name of the globals, it doesn't even really effect anyone other than people working on core, which makes the overhead for re-learning a 10 year old API high with a low return value because we still need to maintain the old global incase someone touches it manually.

Phew!

#14 in reply to: ↑ 11 @swissspidy
17 months ago

I agree with both of you :-)

Right now, 37958.patch is great when you want to switch to multiple blogs in a loop and want to easily go back to the pre-loop state in one step (restore_site_state() in the patch).

Making individual switches suck less (e.g. easy to use stack, no globals) could be achieved by having a class similar to WP_Locale_Switcher. That's where a base class makes sense.

It seems like one talks about the former, the other one about the latter. Why not have both?

#15 @johnjamesjacoby
17 months ago

WP_Locale_Switcher:: restore_current_locale() does exactly what restore_site_state() does.

If we do end up with a switcher class, I don't think we need a fourth layer of API's around globals, objects, and procedures.

Also, the proposed patch here is kinda doing-it-wrong. Why create a local variable to encapsulate the states of a few globals when you can already look at the globals directly or avoid them procedurally?

In the get_site_state() function, a new WP_Site_State object is created with each call, so we'd end up with many copies of those globals with different signatures in PHP that will consume an ever increasing amount of memory when used multiple times within the same scope.

I think this is trying to solve a problem that doesn't exist, and introduces more complexity than it adds value.


Consider also that this patch is not a backwards compatible replacement for what's in core, because the switch_blog action does not get fired on the way out of the iterator anymore, only at the beginning. And currently, the switched stack is only really ever 2 items deep, not several, though it's certainly possible to do so on your own.

Not gonna lie, it's gonna be hard to win me over on this one. :)

Last edited 17 months ago by johnjamesjacoby (previous) (diff)

#16 @flixos90
17 months ago

I agree with @swissspidy that we're talking about two things here.

In my opinion, creating a WP_Site_Switcher class that works similarly to WP_Locale_Switcher should be the first iteration. It would allow us to get rid of the globals (needs-dev-note still, just for the edge-case someone directly accesses these). As far as this thread goes, I don't think there's anything in the way of doing that.

For the second step, I see @tfrommen's idea just as valuable. While switching and restoring a site are fortunately rather inexpensive operations, there's still unnecessary overhead when always switching back (restoring).

Backward-compatibility is a concern as @johnjamesjacoby mentioned, but I think it only is a concern if we change the core code to use it in areas where it already exists. To be a 100% safe, we could only apply the new idea to new code. But even if we use it elsewhere, it might not be that terrible in my opinion. The switch_blog hook only fires when a site is switched. Whether it actually switches or just restores, shouldn't matter. I know it has been a convention to always restore immediately afterwards, but I generally don't see any harm in going deeper into the stack. As we would need a dev-note for any of these changes anyway, I think we can carefully explain the changes and make people aware. @johnjamesjacoby Do you have a specific use-case where this would be a problem?

Note: See TracTickets for help on using tickets.