Make WordPress Core

Opened 11 years ago

Closed 11 years ago

#27003 closed task (blessed) (fixed)

Introduce wp_get_network() and begin cleanup of multisite load

Reported by: jeremyfelt's profile jeremyfelt Owned by:
Milestone: 3.9 Priority: normal
Severity: normal Version: 3.9
Component: Bootstrap/Load Keywords: needs-testing
Focuses: multisite Cc:

Description

During the multisite load process, we use wpmu_current_site() to process the requested URL into a matching network. The access to this function is marked as private, and it really does perform the one task—setting the $current_site global.

It wouldn't hurt to have a network retrieval function separate from this load process. Secondarily, the logic in wpmu_current_site() could use some cleanup.

The attached patch introduces wp_get_network() and reduces some of the cruft that is in wpmu_current_site().

Some notes:

  • I dropped the $sites = query for sites that we use to check if only one network has been installed. I'm not convinced this saves any time, and from what I can tell we don't use the $sites global for anything else in core. This would clear out #26942
  • Unfortunately, wp_get_network() must remain in ms-load.php for now, as ms-functions.php is not available yet. Not sure if this is something we should think about reorganizing.
  • I have not had a chance to test this with multiple networks yet. A multisite, single network appears to work fine. More testing and feedback is very much necessary.

Attachments (25)

27003.diff (5.8 KB) - added by jeremyfelt 11 years ago.
27003.2.diff (5.6 KB) - added by jeremyfelt 11 years ago.
27003.3.diff (9.8 KB) - added by nacin 11 years ago.
27003.4.diff (10.5 KB) - added by nacin 11 years ago.
27003.5.diff (11.9 KB) - added by nacin 11 years ago.
Reduces cookie_domain to nada
27003.6.diff (12.2 KB) - added by nacin 11 years ago.
Adds wp_load_core_site_options() calls to missing places
27003.7.diff (12.5 KB) - added by nacin 11 years ago.
27003.8.diff (12.6 KB) - added by jeremyfelt 11 years ago.
27003.9.diff (12.8 KB) - added by jeremyfelt 11 years ago.
27003.10.diff (12.7 KB) - added by nacin 11 years ago.
27003-docs-fix.patch (584 bytes) - added by TobiasBg 11 years ago.
Fix docs for wp_get_network()
27003.11.diff (13.2 KB) - added by nacin 11 years ago.
27003.12.diff (14.3 KB) - added by nacin 11 years ago.
27003.13.diff (23.1 KB) - added by nacin 11 years ago.
Adds hypothetical filters for short-circuiting, for governing how many path segments to consider, etc.
27003.14.diff (23.1 KB) - added by nacin 11 years ago.
27003.15.diff (460 bytes) - added by johnjamesjacoby 11 years ago.
$networks is not keyed by column ID in wpmu_current_site()
27003.16.diff (23.6 KB) - added by jeremyfelt 11 years ago.
27003.17.diff (23.1 KB) - added by jeremyfelt 11 years ago.
Built on 27003.14.diff, fixes single site path issue - via nacin in IRC
27003.18.diff (23.4 KB) - added by markjaquith 11 years ago.
refresh 27003.17.diff after last commit
27003-docs-cleanup.patch (1.7 KB) - added by TobiasBg 11 years ago.
Docs fixes and cleanup.
27003.19.diff (599 bytes) - added by jeremyfelt 11 years ago.
Missing data parameter in wp_cache_set
27003.20.diff (1.7 KB) - added by jeremyfelt 11 years ago.
Deprecate get_current_site_name()
27003.21.diff (852 bytes) - added by jeremyfelt 11 years ago.
Proper cookie domain
27003.22.diff (3.9 KB) - added by jeremyfelt 11 years ago.
Documents a dark area of ms-settings
27003.23.diff (1.1 KB) - added by jeremyfelt 11 years ago.

Download all attachments as: .zip

Change History (80)

@jeremyfelt
11 years ago

#1 @jeremyfelt
11 years ago

  • Keywords has-patch needs-testing dev-feedback added

#2 follow-up: @Denis-de-Bernardy
11 years ago

Slightly off topic, but since the ticket mentions "cleanup of multisite load" in its title: there's a lot of code in wp-includes/ms-*.php that runs very early and assumes that WP is already installed. When it isn't and you're hoping to get redirected to wp-admin/install.php, it ultimately fails with an extremely misleading database connection error.

Would it be desirable (in a separate ticket, of course) to make it so that adding the relevant multisite defines in advance, when installing WP, would make the installer just work and set things up with multisite already up and running?

Desirable or not, I raised the question here because a few isset() calls here and there are currently needed to avoid E_NOTICE errors when loading wp-admin/install.php with these defines around, so might as well add them in advance during the cleanup.

Last edited 11 years ago by Denis-de-Bernardy (previous) (diff)

#3 in reply to: ↑ 2 ; follow-up: @jeremyfelt
11 years ago

Related: #25316 (to this ticket, not the comment)

Replying to Denis-de-Bernardy:

Would it be desirable (in a separate ticket, of course) to make it so that adding the relevant multisite defines in advance, when installing WP, would make the installer just work and set things up with multisite already up and running?

I thought there was a ticket for this, but I can't find it now. I'd be open to cleaning that up.

#4 in reply to: ↑ 3 @Denis-de-Bernardy
11 years ago

Replying to jeremyfelt:

I thought there was a ticket for this, but I can't find it now. I'd be open to cleaning that up.

I've created #27012.

This ticket was mentioned in IRC in #wordpress-dev by jeremyfelt. View the logs.


11 years ago

@jeremyfelt
11 years ago

#6 @jeremyfelt
11 years ago

Ok. 27003.2.diff takes another approach via get_network_by_path(). This function accepts only $domain and $path as arguments and walks through some possibilities in an attempt to find a matching network.

On a request for http://foo.bar.network.com/path/page-title/, the matching occurs like this:

  1. foo.bar.network.com/path/
  2. foo.bar.network.com/
  3. bar.network.com/path/
  4. bar.network.com/
  5. network.com/path/
  6. network.com/

If no matching network is found from those possibilities, false is returned. I've tested this a bit in a multinetwork setup and it seems to be doing the trick thus far. I got rid of my sunrise.php and have been flipping between networks. My favorite part is that it ditches is_subdomain_install().

#7 @nacin
11 years ago

Yes please. Let's unit test this function (I can write some tomorrow if you think fresh eyes will help) and drop this in.

@nacin
11 years ago

This ticket was mentioned in IRC in #wordpress-dev by nacin. View the logs.


11 years ago

@nacin
11 years ago

@nacin
11 years ago

Reduces cookie_domain to nada

@nacin
11 years ago

Adds wp_load_core_site_options() calls to missing places

@nacin
11 years ago

@jeremyfelt
11 years ago

#9 @jeremyfelt
11 years ago

27003.8.diff adds another foreach() to walk through the possible domains from the originally passed domain. We were missing foo.bar.network.com when bar.network.com was the network.

I'll take a look at updating tests to match this scenario as well.

@jeremyfelt
11 years ago

#10 @jeremyfelt
11 years ago

  • Milestone changed from Awaiting Review to 3.9

27003.9.diff adds additional assertions for subdomains.

  • site1.wordpress.net/ should match the wordpress.net/ domain
  • site1.wordpress.net/four/ should match the wordpress.net/four/ domain

This ticket was mentioned in IRC in #wordpress-dev by jeremyfelt. View the logs.


11 years ago

@nacin
11 years ago

#12 @nacin
11 years ago

In 27178:

Multisite: Add get_network_by_path() and wp_get_network() to begin cleanup of multisite load.

Tries to get network detection under control by simplifying wpmu_current_site(). It now also pops off each subdomain to find a more general match. Adds unit tests for get_network_by_path() and a new network factory for unit tests.

Much of this is likely to change in 3.9 as more of ms-load.php and ms-settings.php gets hacked to bits.

props jeremyfelt.
see #27003.

@TobiasBg
11 years ago

Fix docs for wp_get_network()

#13 follow-up: @TobiasBg
11 years ago

27003-docs-fix.patch corrects inline docs param types and name.

Then, in this line, we allow $networks to be an empty array (happens, in theory, if there are no rows in $wpdb->site), but in the next line we access $networks[0]. I'm not sure how likely that is to happen in practice, but we should probably avoid it to not trigger a notice.

#14 in reply to: ↑ 13 @jeremyfelt
11 years ago

Replying to TobiasBg:

(happens, in theory, if there are no rows in $wpdb->site)

Would it make sense to use another wp_die() here if the count is 0 to explain that no networks are setup?

@nacin
11 years ago

@nacin
11 years ago

@nacin
11 years ago

Adds hypothetical filters for short-circuiting, for governing how many path segments to consider, etc.

#15 @jeremyfelt
11 years ago

I like where 27003.13.diff is going. To sum up:

  1. If the network and site is defined, use it.
  2. If additional sites are paths of a primary network domain, lookup the network before finding the site.
  3. If additional sites are different domains (or subdomains), lookup the site before finding the network. (The site tells us what its parent network is)

So much cleaner.

get_site_by_path() and the changes to get_network_by_path() look great so far. I'm still parsing in detail. Seems it would be fairly straightforward to add caching to each of these based on the requested args. The filters are going to go a ways toward reducing the use of sunrise entirely for domain lookups.

I wonder if there's a place for a "every thing you're looking for" cache key after step 1 and before steps 2/3 that is able to retrieve site and network information based on the requested domain/path or if the processing required to extract the requested domain/path makes it not worth the effort.

Version 0, edited 11 years ago by jeremyfelt (next)

@nacin
11 years ago

#16 @nacin
11 years ago

Yeah, so, 27003.13.diff is a pretty sharp departure from how multisite site detection works now, which starts with identifying the network. The patch tries hard to find the site first, then use the site's network ID to set the current network. To find a site, it takes the domain plus the first path segment. Now, that doesn't always work:

  • If we're dealing with a "subdirectory" install, our network may have a path. (A "subdomain" install, which really is an "anything goes" install, ideally does not have a network path.) So we need to identify the network first, then use the first path segment following the network's path segment to identify the site.
  • If constants are defined for the network (as they are per instructions), then we can simply use those first. Note the patch doesn't currently then restrict the resulting site to that network. In fact in 3.8 if the located site isn't on the declared network, then it resets the network object anyway. (There was actually no reason to find the network first in 3.8.) I've updated that in 27003.14.diff, otherwise the same as 13.diff.

It also tries to optimize for some common situations. Some notes:

  • If it's a "subdirectory" situation where constants are not defined, it queries the DB to see if there are even two networks installed. If there's just one, then that's obviously our network, and it caches it (just as it did in 3.8). While constants are going to be in place almost always, when they're not, jeremyfelt and I had agreed in IRC that it's going to be more common for a single network to be used.
  • If you're running an external object cache, it caches whether any network has a path. If none do, then it can bypass querying the networks table for paths all together, which results in a much cleaner (and much more easily cached) query.
  • It's probable that these should be network transients.
  • Filters are in place to not only short-circuit the site-finding code all together, but to also specifically declare how many path segments are in play. So an install could say that domain X actually has sites for two path segments, while domain Y has none. If we're dealing with a "subdomain" only install, then trying to query for every single path is a terrible query. To avoid that, a future patch should use the external object cache to see if any sites at all have a path. This is more or less necessary to maintain the same performance of subdomains pre-patch.
  • It requires additional caching, and needs a review from pento for how to best make these queries. I went for fewest queries possible when in reality it might make more sense to add a few more queries but be sure they are being cached to a T.
  • Because this code opens up the possibility that *any* initial path segment is actually a site, this can cause some performance issues. At least with subdomains you have DNS to prevent essentially random strings in a URL from causing a boatload of queries — and, yes, even flooding memcache. (If using a wildcard DNS, then the whole redirect-to-signup kicks in.) These queries are not the absolute cheapest and haven't been tested against a massive blogs table, yet.

There are likely further optimizations to make (not to mention a number of declared @todo's), there are likely some assumptions made that will need to be challenged (for example — can a subdomain install have a network path, as it exists now?), and it needs a lot of unit tests. Also, the *_by_path_segments_count filter names aren't great. But it's a great step forward.

#17 @nacin
11 years ago

At the time of these new filters, plugin.php and functions.php are loaded — and little else. jeremyfelt points out that these filters mean you won't need to do much domain/path/network/site manipulation in sunrise anymore, but you'll need to use a sunrise file to actually register your plugin callbacks. I find that to be OK. It ends up being much easier in some respects.

Say make.wordpress.org was our network and /core/ was our site, and then we wanted to add a /core/handbook/ site. (I had to do this at one point via a nice hack; though they're merged now.) It'd be easy for me to use the site_by_path_segments_count filter to change the 1 segment to a 2 when the requested path starts with /core, or more so /core/handbook. Or I could use the pre_* filter to actually return a site object (we'll need a wp_get_site() to match wp_get_network() and to replace basic usage of get_blog_details()) for exact matches.

Of course, you could also just do $current_site = wp_get_network( X ) and $current_blog = wp_get_site( X ). It really depends what kind of lifting you want to do and where you want it done. If you had a second level of path segments you wanted considered on a dynamic basis (so not simply "handbook") then passing this off to find_site_by_path() will make your life easier.

And then there's the question: why limit it to just one path segment by default? Mainly because it's so dang hard to predict and optimize for all of the setups out there. You might have a network with thousands of sites, all with a path of / (so, domains or subdomains), yet you'll need to take the entire URL (/2012/01/05/something/page/2/feed/whatever/) and check to see if any of those path segments add up to a site. That means a unique query for every single URL hitting the install, which is a nightmare to cache.

Now, you could query for all sites on that domain, cache those results, and then simply see if any of the paths match up, but how do I know if it's one site with a path of /, or if *every* one of those thousands of sites are on that particular domain?

There are only so many "large" networks out there (networks with > 10,000 sites) that would run into performance issues; it's possible we could have different optimization "tracks" depending on the kind and size of install you're running. That's kind of what the proposed filters start to get at, by letting you control the path segments.

I do think we can get smarter here, and these patches make some headway, but it's not straightforward.

#18 @nacin
11 years ago

Some thoughts on function naming:

wp_get_network() without any arguments could replace get_current_site(). It would be lovely to ditch "$current_site" absolutely everywhere. Even in this file; we could possibly make it a reference to a new variable.

wp_get_site() is probably a bit too broadly named. Though we don't use it all that much, the "ms" prefix here could help. but, this function isn't yet a must-have. We could wait for some cleanups and deprecations for when "site" actually refers to "network". We already have is_main_site() but also get_current_blog_id()... we're getting there.

#19 @jeremyfelt
11 years ago

We should take #26403 into consideration while determining the requested path. I don't think a strtolower() would hurt anything here.

This ticket was mentioned in IRC in #wordpress-dev by jeremyfelt. View the logs.


11 years ago

@johnjamesjacoby
11 years ago

$networks is not keyed by column ID in wpmu_current_site()

#21 follow-up: @johnjamesjacoby
11 years ago

27003.15.diff fixes the barrage of debug notices currently fired off from trunk when multisite is enabled and the possibility of multiple networks exists.

#22 in reply to: ↑ 21 @jeremyfelt
11 years ago

Replying to johnjamesjacoby:

27003.15.diff fixes the barrage of debug notices currently fired off from trunk when multisite is enabled and the possibility of multiple networks exists.

This is needed against current trunk. The changes introduced in 13.diff and 14.diff will effectively deprecate wpmu_current_site() if/when merged.

@jeremyfelt
11 years ago

#23 @jeremyfelt
11 years ago

27003.16.diff builds on Nacin's 27003.14.diff and adds caching for requests made to get_network_by_path(). Keys are an md5 hash of the requested domain and path, along the same lines of the key creation in WP_Theme.

$request_cache_hash = md5( $domain . '/' . $path ); 
$network_id = wp_cache_get( 'network_' . $request_cache_hash, 'networks' );

I'm not sure where invalidation would be needed here. This is a cache of a specific request and for it to become invalid, the network ID matched with the domain and path would have to change. This doesn't seem too common, though we can clear or reset the cache value way down the road when some sort of network management appears in the admin. Maybe?

#24 follow-up: @nacin
11 years ago

I think everything (even the old stuff) should actually be wp_cache_add(). There's no reason for an unconditional set. If it already exists somewhere, we should bail. (set() forces a cache update, even across multiple servers, which isn't necessary when we are simply caching a value that wasn't already cached, versus updating an invalid value.)

While a 'networks' bucket is good to think about, it's not a declared global bucket. Let's keep using site-options to keep it clean.

The one problem with this is it'll result in a new cache key for every unique URI request. That's a lot of cache writes. The second thing to think about is that get_network_by_path() is actually the cheaper function. The networks table will without a doubt smaller than the blogs table, and it's often sped up or bypassed via a constant, or there being only one network (this is a cached state, though populate_network() should clear that cache), or by their being no networks with paths. The function is also only used in one situation, and only because we're only searching one path level (past the network's path, which is why we need it). Caching the get_site_by_path() request is going to be more useful.

If networks_have_paths is false, then we only need the domain in the cache key. Or even better, this is when networks_have_paths comes in useful. Now that I think about it, if no networks have a path, then we don't actually need to run get_network_by_path(), and can instead go right to get_site_by_path().

We can actually leverage networks_have_paths also outside of get_network_by_path() as a way to replace the is_subdomain_install() check. As it is now, if you're running a "subdomain" install and add a network with a path, you'd have to add your own handling (well, adjust one of the new filters) to make sure any deeper-pathed sites off that network are handled. Instead, we can simply use networks_have_paths as a barometer. No current subdomain install would be adversely affected, as none of them have a network with a path. Presumably.

Not going to lie, I kind of just want to load all networks into memory and bypass that whole mess.

#25 in reply to: ↑ 24 @jeremyfelt
11 years ago

Replying to nacin:

The one problem with this is it'll result in a new cache key for every unique URI request.

Not going to lie, I kind of just want to load all networks into memory and bypass that whole mess.

Good point(s). There's something in this combo. I was looking at adding caching to wp_get_network() and the path stuff was making less sense. My guess is that the number of multi network installs where the number of networks is larger than 10, 20, 50 (?) is very low. Storing that into memory with the ability to handle larger setups via sunrise makes sense.

#26 @jeremyfelt
11 years ago

27003.14.diff is having trouble with an initial single network install with one site.

  • src.wordpress-develop.dev loads fine, as does src.wordpress-develop.dev/wp-admin/
  • src.wordpress-develop.dev/2014/01/07/hello-world/ initiates a get_site_path() call with a segment count of 2, only passing /2014/01/ and /2014/ as arguments to the site query. The / path is not found and results in ms_not_installed().

@jeremyfelt
11 years ago

Built on 27003.14.diff, fixes single site path issue - via nacin in IRC

#27 follow-up: @johnjamesjacoby
11 years ago

Can we get 27003.15.diff committed while we think this through, so trunk stops throwing notices?

#28 in reply to: ↑ 27 @jeremyfelt
11 years ago

Related #21143 - a hook for when no site is found.

Replying to johnjamesjacoby:

Can we get 27003.15.diff committed while we think this through, so trunk stops throwing notices?

+1

#29 @markjaquith
11 years ago

In 27275:

Fix a sometimes notice in wpmu_current_site()

props johnjamesjacoby. see #27003.

@markjaquith
11 years ago

refresh 27003.17.diff after last commit

#30 @markjaquith
11 years ago

Not going to lie, I kind of just want to load all networks into memory and bypass that whole mess.

Considering that having a huge number of networks is rare, this might be a good optimization.

#31 @nacin
11 years ago

Things before commit:

  • When testing this on WordPress.org I found a few issues related to the use of get_site_option() prior to setting $wpdb->siteid, so site_id = 0 was getting queried. So we're now going to pass the network ID to wp_load_core_site_options() and wait to set the site_name property until after $wpdb is init'd.
  • If path segments are > 2, only the longest two are used. There needs to be a loop to query /a/b/c/d/, , /a/b/c/, /a/b/, /a/, and /.
  • get_site_by_path() needs unit tests.

Hope to commit this in the next few hours or so, once I'm stationary and can fix inevitable WP.org breakage.

#32 follow-up: @nacin
11 years ago

In 27359:

Introduce get_site_by_path() and further rewrite the site detection process for multisite.

This is the first big step to supporting arbitrary domains and paths. In this new approach, sites are detected first where possible, then the network is inferred. Allows filtering for arbitrary path segments, smooths out some weirdness, and removes various restrictions. A sunrise plugin could do much of its work by adding filters, if those are even needed.

see #27003.

#33 @nacin
11 years ago

  • Keywords has-patch dev-feedback removed
  • Type changed from enhancement to task (blessed)

Leaving this to jeremyfelt for any additional tweaks or cleanups.

@TobiasBg
11 years ago

Docs fixes and cleanup.

#34 @TobiasBg
11 years ago

  • Keywords has-patch added

27003-docs-cleanup.patch is a refresh of 27003-docs-fix.patch plus some white space cleanup.

#35 @nacin
11 years ago

In 27381:

Doc fixes for wp_get_network().

props TobiasBg.
see #27003.

#36 follow-up: @jeremyfelt
11 years ago

  • Keywords has-patch removed

General caveat...

The default path segments number for both networks and sites is 1. If you have a network of foo.bar/network1/ and a site of foo.bar/network1/site1/, use of the site path segments filter is required. If 2 is not specified via filter, then get_site_by_path() will try to match network1 for your site rather than /network1/site1/.

Some extra web server config for wp-* to work and likely a sanity check is required as well.

We *could* think about automatically upping the site path count if networks have paths, but this seems like something to let lie for a bit.

#37 in reply to: ↑ 36 ; follow-up: @nacin
11 years ago

Replying to jeremyfelt:

The default path segments number for both networks and sites is 1. ... We *could* think about automatically upping the site path count if networks have paths, but this seems like something to let lie for a bit.

Actually, the default is all paths, but the functions receive 1 in most cases. If a network has a path (which only happens for "subdirectory" installs), then get_network_by_path() is run first with a segment number of 1. Then get_site_by_path() is run with a segment number of 1 + however many segments are in the network's path. Look for the is_subdomain_install() call in [27359]. So, your caveat was accounted for. :-)

#38 in reply to: ↑ 32 @gradyetc
11 years ago

Replying to nacin:

In 27178:

Multisite: Add get_network_by_path() and wp_get_network() to begin cleanup of multisite load.

Tries to get network detection under control by simplifying wpmu_current_site(). It now also pops off each subdomain to find a more general match. Adds unit tests for get_network_by_path() and a new network factory for unit tests.

Much of this is likely to change in 3.9 as more of ms-load.php and ms-settings.php gets hacked to bits.

props jeremyfelt.
see #27003.

This commit introduced a bug in wp-includes/ms-settings.php:

if ( ! $current_site = wp_cache_get( 'current_network', 'site-options' ) ) {
	// Are there even two networks installed?
	$one_network = $wpdb->get_row( "SELECT * FROM $wpdb->site LIMIT 2" ); // [sic]
	if ( 1 === $wpdb->num_rows ) {
		$current_site = wp_get_network( $one_network );
		wp_cache_set( 'current_network', 'site-options' );
	} elseif ( 0 === $wpdb->num_rows ) {
		ms_not_installed();
	}
}

The call to wp_cache_set on line 78 is missing the $data argument.

It should be:

wp_cache_set( 'current_network', $current_site, 'site-options' );
Last edited 11 years ago by gradyetc (previous) (diff)

@jeremyfelt
11 years ago

Missing data parameter in wp_cache_set

#39 @jeremyfelt
11 years ago

Via @gradyetc's comment, 27003.19.diff adds the missing data to wp_cache_set().

@jeremyfelt
11 years ago

Deprecate get_current_site_name()

#40 @jeremyfelt
11 years ago

27003.20.diff deprecates get_current_site_name() now that it is no longer used (and is private). wp_get_network() provides some replacement functionality, though we may be able to get by without specifying a replacement.

Related #27199, which was fixed in [27359].

#41 in reply to: ↑ 37 @jeremyfelt
11 years ago

Replying to nacin:

Replying to jeremyfelt:

The default path segments number for both networks and sites is 1. ... We *could* think about automatically upping the site path count if networks have paths, but this seems like something to let lie for a bit.

Actually, the default is all paths, but the functions receive 1 in most cases. If a network has a path (which only happens for "subdirectory" installs), then get_network_by_path() is run first with a segment number of 1.

Ahh, I was mixing subdomains and subdirectories. Caveat Empty. :)

@jeremyfelt
11 years ago

Proper cookie domain

#42 @jeremyfelt
11 years ago

In 27003.21.diff, if we have found a site and a network and the site's domain differs from the network domain, use the site domain for cookies.

This is a little ugly because we're setting the network's cookie domain with site information, but that's how we set COOKIE_DOMAIN.

27003.19.diff and 27003.20.diff are separate from this.

#43 @nacin
11 years ago

In 27406:

Multisite load: Properly call wp_cache_set().

props gradyetc.
see #27003.

#44 follow-up: @nacin
11 years ago

Deprecating something in load.php or ms-load.php is tough because these files are available earlier than deprecated.php or ms-deprecated.php would be. If someone is calling get_current_site_name() in sunrise.php this would turn into a fatal error.

get_current_site_name() is for populating the object with the site_name property. While it accepts a $current_site, it is before $wpdb->siteid is actually set, which means get_site_option() doesn't actually work as a replacement. That's why it does these weird direct DB queries and cache calls, and that's why I side-stepped it and chose to set site_name after the fact.

If wp_get_network() returned a WP_Network object we could lazy-load site_name and it wouldn't be a problem. But we have no proper way to "switch" to a network so it's not like we could make get_site_option() do what we wanted it to do.

get_current_site()->site_name is used in a decent number of places. However, the changes I referenced earlier also mean that site_name will be set even if sunrise had set $current_site and $current_blog (it got pulled out of the conditional), thus it doesn't matter if it doesn't do anything.


Seems like our default handling of cookie domains was always such that we assumed the current network would completely encompass all domains on that respective network. That, obviously, is wrong. I imagine if a site doesn't match a network now, with no special handling (a plugin, etc.), COOKIE_DOMAIN would be set to the network, and the site wouldn't have functioning cookies? If so, then we could aim to think outside the box a bit here.

#45 @nacin
11 years ago

In 27407:

Deprecate get_current_site_name(). see #27003.

#46 in reply to: ↑ 44 @jeremyfelt
11 years ago

Replying to nacin:

I imagine if a site doesn't match a network now, with no special handling (a plugin, etc.), COOKIE_DOMAIN would be set to the network, and the site wouldn't have functioning cookies?

Exactly. 27003.21.diff (very weak) forced the site domain rather than the network domain, but that only works when the paths on each are the same.

We need to determine the best way to handle COOKIE_DOMAIN, COOKIEPATH, SITECOOKIEPATH, and ADMIN_COOKIE_PATH, all which are set in ms_cookie_constants() if not previously set elsewhere.

One option is to continue with each page load resulting in one set of cookies. This would be relatively straight forward and requires that ms_cookie_constants() have access to current site and network information. Cookie constants would then be set based on the site that loaded, not the network.

Another option is to consider cases when setting cookies for multiple domains might be desirable. Though a user with access to one site doesn't necessarily need (or want) cookies for the network's primary admin site. This is probably best left to developers with sunrise or elsewhere. We could add filters in ms_cookie_constant() to provide a nice way in.

Also possible alongside either approach—wrapping setcookie().

@jeremyfelt
11 years ago

Documents a dark area of ms-settings

#47 @jeremyfelt
11 years ago

Some thoughts.

  • Should a requested path be considered case sensitive for finding site/network? #26403 has a strtolower() patch.
  • Some sort of site_not_found hook (see #21143) can be useful as a redirect opportunity.
  • Some sort of hook in ms_not_installed() may also be useful. This is less of a redirect opportunity and more of an "oh, crap. now what." opportunity.
  • ms_not_installed() docs are misleading. We mostly call on it when a network cannot be found, but the errors seem centered around missing sites. The errors are also (purposely?) vague. We could probably be a bit more explicit about what has happened when is_admin(). This is probably worth a new ticket for 4.0.
  • I think the current cookie domain stuff works for current configurations. Everything can already be (and is being) overridden in sunrise. It is worth looking at in 4.0 to try and support more arbitrary domains and paths.

27003.22.diff moves the site/network not found block up a bit and joins it with the empty $current_site check. This can all happen before the cookie domain is set and before we try to find a network's main site.

The main goal for this patch is to add documentation to some of the redirect behavior and to identify where we can hook in for customization.

#48 @nacin
11 years ago

In 27439:

In get_site_by_path(), avoid passing $paths through prepare(). If a path contains a %, we end up with problems. see #27003.

#49 follow-up: @jeremyfelt
11 years ago

  • In get_network_by_path(), we use wp_get_network() to retrieve the same object we just retrieved. It may be worth dropping that additional call for now until we know it's needed.
  • Actions for found_site_by_path and found_network_by_path when sites and networks are being returned by their respective functions could be useful as a way for sunrise to hook in and set custom caching (or other functionality) for repeat requests.
  • nacin made mention of using wp_cache_add() in a previous comment. We're still using wp_cache_set() during the bootstrap in ms-settings.

@jeremyfelt
11 years ago

#50 @jeremyfelt
11 years ago

  • 27003.23.diff caches $current_site->blog_id when found the first time under the 'network_' . $current_site->id . '_blog_id' key.
  • 27003.22.diff is still on the table.
  • Thinking through a filter for setting the cookie domain to allow sunrise to hook in there as well in advance of any future cookie handling that we do.

#51 @nacin
11 years ago

In 27663:

Introduce a ms_site_not_found filter to replace NOBLOGREDIRECT.

Move some processing down below the point where we bail if there's no site. Add more documentation.

props jeremyfelt.
fixes #21143, see #27003.

#52 @nacin
11 years ago

In 27664:

Multisite load: Cache the main site lookup query.

props jeremyfelt.
see #27003.

#53 @nacin
11 years ago

In 27665:

Use wp_cache_add() instead of wp_cache_set() when we don't want an unconditional set.

props jeremyfelt.
see #27003.

#54 in reply to: ↑ 49 @nacin
11 years ago

I'm closing this out. New tickets for any other issues. Great work, jeremyfelt, I'm loving this. :-)

Replying to jeremyfelt:

  • In get_network_by_path(), we use wp_get_network() to retrieve the same object we just retrieved. It may be worth dropping that additional call for now until we know it's needed.

I figure that wp_get_network() will eventually start returning a proper object, so I wanted to go ahead with this abstraction now. Also, originally, there was some decoration that occurred in wp_get_network(). It doesn't matter for the moment. That said, emphasizing that a DB row should be passed through to wp_get_network() sets a good standard for others to do so, which means the migration to a proper object might be smoother.

  • Actions for found_site_by_path and found_network_by_path when sites and networks are being returned by their respective functions could be useful as a way for sunrise to hook in and set custom caching (or other functionality) for repeat requests.

Seems OK; I imagine you're referring to then leveraging that information in a pre_* hook. Let's spin off a separate ticket for this, though.

  • Thinking through a filter for setting the cookie domain to allow sunrise to hook in there as well in advance of any future cookie handling that we do.

Since cookie handling is a separate can of worms, let's also make that a separate ticket.

#55 @nacin
11 years ago

  • Resolution set to fixed
  • Status changed from new to closed

I'm closing this out. New tickets for any other issues.

Note: See TracTickets for help on using tickets.