WordPress.org

Make WordPress Core

Opened 22 months ago

Closed 21 months ago

Last modified 2 months ago

#25030 closed defect (bug) (fixed)

Overlapping upload paths with multiple networks

Reported by: ddean Owned by: nacin
Milestone: 3.7 Priority: normal
Severity: normal Version:
Component: Multisite Keywords: has-patch
Focuses: Cc:

Description

When ms_files_rewriting is disabled on a WP install with multiple networks, wp_upload_dir() assigns the "main site" for each network the same upload path.

Expected: Every site other than the main site of the first network will be assigned a sites/[id] upload path.

Observed: The function does not distinguish between the main site on the primary network and the main site on a secondary network, so multiple sites are assigned wp-content/uploads.

Attachments (8)

25030.diff (1.3 KB) - added by jeremyfelt 22 months ago.
25030.2.diff (1.7 KB) - added by jeremyfelt 22 months ago.
25030.3.diff (1.5 KB) - added by jeremyfelt 22 months ago.
25030.4.diff (1.6 KB) - added by jeremyfelt 22 months ago.
25030.5.diff (1.8 KB) - added by jeremyfelt 22 months ago.
25030.6.diff (1.8 KB) - added by jeremyfelt 21 months ago.
25030.7.diff (2.7 KB) - added by nacin 21 months ago.
25030.8.diff (2.7 KB) - added by jeremyfelt 21 months ago.

Download all attachments as: .zip

Change History (40)

comment:1 @ddean22 months ago

Changing is_main_site() to return false for all sites on secondary networks would break signup and Network Update on those networks.

A new function, is_main_network(), could be a solution as could checking the $current_site global within wp_upload_dir().

comment:2 follow-up: @nacin22 months ago

  • Milestone changed from Awaiting Review to 3.7

Hmmmm.

Yeah, we could definitely benefit from an is_main_network().

comment:3 in reply to: ↑ 2 @ryan22 months ago

Yeah, we could definitely benefit from an is_main_network().

Sure.

@jeremyfelt22 months ago

comment:4 follow-up: @jeremyfelt22 months ago

  • Keywords has-patch added

25030.diff introduces is_main_network() and makes use of it in wp_upload_dir() to avoid overlapping upload paths.

comment:5 in reply to: ↑ 4 ; follow-up: @ddean22 months ago

Replying to jeremyfelt:

25030.diff introduces is_main_network() and makes use of it in wp_upload_dir() to avoid overlapping upload paths.

Awesome patch! That is really handy and should work seamlessly on most existing installs.

It does lock SITE_ID_CURRENT_SITE into designating the primary network, though, which is somewhat at odds with its name.

comment:6 in reply to: ↑ 5 ; follow-up: @jeremyfelt22 months ago

Replying to ddean:

It does lock SITE_ID_CURRENT_SITE into designating the primary network, though, which is somewhat at odds with its name.

Yeah, I debated that, but from all I can tell that's what the constant is there for. Of course, it would probably confuse the wp_upload_dir() if it were changed to another existing site ID all of a sudden.

comment:7 in reply to: ↑ 6 @ddean22 months ago

Replying to jeremyfelt:

Yeah, I debated that, but from all I can tell that's what the constant is there for. Of course, it would probably confuse the wp_upload_dir() if it were changed to another existing site ID all of a sudden.

It's not used for anything on a multiple network install so in that respect it's free for the taking. :) And you're right that changing to a different site ID after deployment would wreck things a bit, but I think wp-config.php is probably the most stable place to store a value like that.

comment:8 @nacin22 months ago

SITE_ID_CURRENT_SITE is actually designed for the current site of that pageload. It, along with DOMAIN_CURRENT_SITE, PATH_CURRENT_SITE, and BLOG_ID_CURRENT_SITE, are used to avoid any database queries for site information. If you have those defined unconditionally, you actually won't be able to have a multi-network install. You'll see how they short-circuit things in wpmu_current_site(). (It's also something you can define in sunrise.php if you wanted to.)

So, bigger problem. I'm actually not sure how we are to detect the main network. I suppose it is site_id = 1. If the site_id is not one, then we need to ensure that the site_id of 1 exists — and if not, we're dealing with the main network for whichever site_id is highest (least). So essentially, the main network is SELECT id FROM $wpdb->site ORDER BY id LIMIT 1. Yes, it would absolutely make sense for there to be some kind of constant that defines the primary network. I don't think we need to define it as we're nearly always going to be single-network with a site_id of 1. (Related: I think populate_network() needs to change to always force the value into site_id, as in #16568.)

wpmu_current_site() also does something really weird. It does a SELECT * FROM $wpdb->site but then only cares if there is > 1 or not, in which case we could do a LIMIT 2. Or, we could keep that query and then do things in PHP rather than the additional WHERE clauses we potentially use for $wpdb->site. Clearly, our multi-network support is weak in core. WordPress.org uses multiple networks but we actually define the above constants based on HTTP_HOST.

comment:9 @jeremyfelt22 months ago

Focusing on the is_main_network() check.

How about this order:

  1. Introduce new constant for PRIMARY_SITE_ID that is optional via wp-config.php, sunrise.php, or later. Check $current_site->id against this first.
  2. Check if 1 === $current_site->id;. If so, we can assume that site_id 1 still exists and that it is most likely the primary network, if not the only network.
  3. Else, query for the smallest site_id, cache the result, and compare with the $current_site->id.

comment:10 @nacin22 months ago

Yep — that seems to jibe perfectly with my comment. I would come up with a constant that uses NETWORK instead of SITE, just to try to make things less confusing. PRIMARY_NETWORK_ID? Might be confusing if paired with SITE_ID_CURRENT_SITE but, well, I guess this will be confusing for a long, long time.

comment:11 @jeremyfelt22 months ago

I really wanted to use PRIMARY_NETWORK_ID, so I'm game for that. I do wonder if it makes it more confusing only because that's how we roll with our blogs as sites and sites as networks.

comment:12 follow-up: @nacin22 months ago

I don't have a grandmaster plan*, but we've tried really hard to use network wherever possible, at the API and UI levels. We definitely no longer introduce "site" things when they mean network as it is just too damn confusing.

We haven't quite made the switch to always going with site over blog at the API level, though we have at the UI level, but that's just because it is more difficult for "site" to mean two different things, than it is to start using "network". Either way you still need to figure out whether a particular "site" means "blog" or "network", so we might as well use "network" wherever possible and lessen the number of ambiguous "site" instances (especially those that refer to the old MU definition). Hope that makes sense.

I think as long as we keep consistently moving toward "network", it may be easier to redefine "site" to mean "blog" in the long run. There are also quite a few functions to rename, though I have been hesitant renaming functions for the sake of simply renaming them, and would rather do things like completely replace their internals in the process. Three plus years later, we're still finding oddities in old MU code, so I've been trying to bide our time and only undergo renames when they let us shed dead weight.

(*) Well, I guess I do now. See this comment.

@jeremyfelt22 months ago

comment:13 in reply to: ↑ 12 ; follow-up: @jeremyfelt22 months ago

Replying to nacin:

(*) Well, I guess I do now. See this comment.

Love it.

25030.2.diff is a temporary refresh with a couple open ends.

  1. What are the repercussions of returning true when ! is_multisite()?
  2. Do we account for a scenario where $current_site->id is 1 and PRIMARY_NETWORK_ID is not?
  3. Is is_main_network() the best place to cache a backup query for PRIMARY_NETWORK_ID?

comment:14 in reply to: ↑ 13 @nacin22 months ago

Replying to jeremyfelt:

  1. What are the repercussions of returning true when ! is_multisite()?

None, since this is a new function. Given that is_main_site() returns true, it would make sense for is_main_network() to be consistent in that regard. If you must know whether you are running multisite, then is_multisite() is the proper and really only valid check.

  1. Do we account for a scenario where $current_site->id is 1 and PRIMARY_NETWORK_ID is not?

If PRIMARY_NETWORK_ID is defined, we should return $site_id == PRIMARY_NETWORK_ID;.

  1. Is is_main_network() the best place to cache a backup query for PRIMARY_NETWORK_ID?

Actually, no, because a switch_to_blog() could hypothetically affect $current_site. It doesn't currently, but I imagine we'll upend switching madness at some point. The site-options cache bucket is probably good.

Let's do site_id => network_id for our local variables here, same with all of the inline documentation. The only use of "site" should be in the $current_site variable name, is_multisite(), and $wpdb->site.

@jeremyfelt22 months ago

@jeremyfelt22 months ago

comment:15 follow-up: @jeremyfelt22 months ago

Oh wow.

Please ignore 25030.4.diff.

25030.3.diff is a temporary refresh.

  1. If PRIMARY_NETWORK_ID is not defined, and $current_site->id is 1, then we can assume 1 is the primary network and check against it.
  2. Still not entirely sure if the query for the lowest site ID should be stored in cache in the is_main_network() function? Probably, because this will be pretty authoritative. Even if switch_to_blog is used, as long as we store the result of the query then it shouldn't be affected.

comment:16 in reply to: ↑ 15 @jeremyfelt22 months ago

Replying to jeremyfelt:

  1. Still not entirely sure if the query for the lowest site ID should be stored in cache in the is_main_network() function? Probably, because this will be pretty authoritative. Even if switch_to_blog is used, as long as we store the result of the query then it shouldn't be affected.

I guess the real question is - should this happen here or in another function like get_primary_network_id()?

comment:17 @cgrymala22 months ago

I love this discussion. It's going to make a huge difference for the way we're using things at UMW (where we have a multi-network install with ~250 sites spread across ~50 networks), and I would love it if plugin authors began accounting for this type of system (of course, we also still need to train some plugin authors to even account for the fact that we have multisite).

This may be something to bring up in a separate ticket, but since we've begun discussing is_multisite() in the discussion about is_main_network(), it would be great to also have a core function to check to see if there is more than one network in the installation (is_multinetwork() maybe?). I have a custom is_multinetwork() function that we're using at UMW, which does a lot more checks than a core function would probably need to do (in addition to counting the rows in the site table, my function currently checks to see if any of the 3 known multi-network plugins are active, etc.), but I would imagine a core function probably wouldn't need to do much more but to check to see if there is more than one row in the site table. That could look something like:

if ( ! function_exists( 'is_multinetwork' ) ) {
    function is_multinetwork() {
        if ( ! is_multisite() )
            return false;
        global $wpdb;
        $networks = $wpdb->get_var( "SELECT COUNT(site_id) FROM {$wpdb->site}" );
        return is_numeric( $networks ) && $networks > 1;
    }
}

comment:18 @cgrymala22 months ago

  • Cc curtiss@… added

comment:19 follow-up: @nacin22 months ago

I don't really have a problem with keeping things limited to is_main_network() for now. Sure, we could add get_primary_network_id(), but its usefulness is also extremely limited at this time.

is_multi_network() would probably make sense when we actually have better overall support for multiple networks. Un-breaking upload paths is, I think, different from actually redoing a lot of the multisite/network/subdirectory/subdomain/mapping paradigms and then formally introducing new concepts then.

Side note, little trick, rather than doing a COUNT, just do: SELECT id FROM $wpdb->site LIMIT 2 then see whether you get back merely one row, or two. We do this in is_multi_author() too.

comment:20 in reply to: ↑ 19 @ddean22 months ago

Replying to nacin:

I don't really have a problem with keeping things limited to is_main_network() for now. Sure, we could add get_primary_network_id(), but its usefulness is also extremely limited at this time.

is_multi_network() would probably make sense when we actually have better overall support for multiple networks. Un-breaking upload paths is, I think, different from actually redoing a lot of the multisite/network/subdirectory/subdomain/mapping paradigms and then formally introducing new concepts then.

This. AFAIK, the upload path issue is the first case in a long time where WP core has actually needed to know whether the install had multiple networks to work as expected. I think this is a great opportunity to introduce an is_main_network() function, but not if that's going to open a can of worms or delay fixing upload paths.

@jeremyfelt22 months ago

comment:21 @jeremyfelt22 months ago

25030.5.diff caches primary_network_id if is_main_network() is required to check the database.

This seems to be working pretty smoothly in my local multi network setup right now.

@jeremyfelt21 months ago

comment:22 @jeremyfelt21 months ago

Updated 25030.6.diff changes the $network_id param to a default of false instead of an empty string. Also checks that PRIMARY_NETWORK_ID isn't defined as false for some reason.

@nacin21 months ago

comment:23 @nacin21 months ago

25030.7.diff enforces integers and strict checks everywhere and updates phpdoc. Seems to work.

comment:24 @jeremyfelt21 months ago

+1, 25030.7.diff works great in my current setup.

@jeremyfelt21 months ago

comment:25 @jeremyfelt21 months ago

Rereading... what should the expected behavior be when the cast to integer occurs on PRIMARY_NETWORK_ID?

if ( defined( 'PRIMARY_NETWORK_ID' ) ) 
    return $network_id === (int) PRIMARY_NETWORK_ID;

Anything not an integer evaluates as 0 and will most likely return as false. It may make more sense to fallback on our assumption that 1 is the main network at that point by passing it to the next logic block.

if( defined( 'PRIMARY_NETWORK_ID' ) && 0 !== (int) PRIMARY_NETWORK_ID )
    return $network_id === PRIMARY_NETWORK_ID;

if( 1 === $current_network_id )
    return $network_id === $current_network_id;

25030.8.diff does this and changes the param hint from string to int for $network_id

comment:26 follow-up: @nacin21 months ago

My thought was, if PRIMARY_NETWORK_ID is defined, you better get it right. But, we should let it be a numeric string. Which makes me want to keep 25030.7.diff, aside from the @param hint.

comment:27 in reply to: ↑ 26 @jeremyfelt21 months ago

Replying to nacin:

My thought was, if PRIMARY_NETWORK_ID is defined, you better get it right.

That sounds fair to me. 25030.7.diff looks good. Assuming @param change on commit rather than a new patch.

comment:28 @nacin21 months ago

In 25147:

Introduce is_main_network().

By default, a network ID of 1 is assumed to be the main network.
Otherwise, it is the first network listed in the wp_site table.

If PRIMARY_NETWORK_ID is defined, it is considered main network.

props jeremyfelt.
see #25030.

comment:29 @nacin21 months ago

  • Owner set to nacin
  • Resolution set to fixed
  • Status changed from new to closed

In 25148:

The main site of a secondary network should not use the original wp-content/uploads upload path.

props jeremyfelt.
fixes #25030.

comment:30 @nacin21 months ago

My thought was, if PRIMARY_NETWORK_ID is defined, you better get it right. But, we should let it be a numeric string.

Part of that stems from the possibility that this constant is defined based on a $wpdb->get_var() of $wpdb->site, which would be a string by default.

comment:31 @nacin20 months ago

In 25827:

Avoid a notice in is_main_network() when called in single site. see #25030.

comment:32 @slackbot2 months ago

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.

Note: See TracTickets for help on using tickets.