WordPress.org

Make WordPress Core

Opened 14 months ago

Last modified 10 days ago

#37923 assigned feature request

Introduce shared wp_blogmeta database table for multisite installation

Reported by: johnjamesjacoby Owned by: spacedmonkey
Milestone: 5.0 Priority: normal
Severity: normal Version:
Component: Networks and Sites Keywords: has-patch has-unit-tests needs-testing ms-roadmap
Focuses: multisite Cc:

Description

In a multisite installation, it would be nice to be able to query for sites using some kind of other arbitrary method. This currently is not possible because each site has it's own _options database table. (This is totally fine, and makes sense considering the huge number of options each site will usually have.)

However, with the introduction of WP_Site_Query and now having the ability to query for sites in a multitude of ways, a dedicated blogmeta database table would pair nicely with the WP_Site object.

This may not be necessary for core right away, but eventually, some multisite-specific options could get moved here to avoid polluting the per-site options table. For example, the new_user_ options when inviting existing users to an existing site, are a multisite specific option that is auto-loaded, causing a potentially costly recalculation of alloptions just to invite a user to your site.

A few other neat potential core usages off-the-top-of-my-head:

  • Store for the total number of users in each site, and query by most popular
  • Store for the total number of published posts in each site, and query by most active
  • Store the Blog Name, and query for sites alphabetically
  • Store a site_owner user ID, and query for sites the user "owns" relatively efficiently (compared to the LIKE query on usermeta)

Attachments (16)

sitemeta.php (2.7 KB) - added by spacedmonkey 9 months ago.
sitemeta1.php (3.6 KB) - added by spacedmonkey 8 months ago.
37923-phase1.diff (39.9 KB) - added by flixos90 7 months ago.
37923.diff (35.1 KB) - added by flixos90 7 months ago.
37923.2.diff (19.8 KB) - added by spacedmonkey 3 months ago.
37923.3.diff (25.9 KB) - added by spacedmonkey 3 months ago.
37923.4.diff (17.3 KB) - added by spacedmonkey 2 months ago.
37923.5.diff (19.0 KB) - added by spacedmonkey 2 months ago.
37923.6.diff (18.7 KB) - added by spacedmonkey 2 months ago.
37923.7.diff (19.8 KB) - added by spacedmonkey 2 months ago.
37923.eachnetwork.diff (19.4 KB) - added by spacedmonkey 2 months ago.
37923.8.diff (36.6 KB) - added by flixos90 2 months ago.
37923.9.diff (29.1 KB) - added by spacedmonkey 2 months ago.
37923.10.diff (29.0 KB) - added by spacedmonkey 2 months ago.
37923.11.diff (42.2 KB) - added by spacedmonkey 2 months ago.
37923.12.diff (42.2 KB) - added by spacedmonkey 2 months ago.

Download all attachments as: .zip

Change History (85)

#1 @johnjamesjacoby
14 months ago

A first-pass proof of concept on GitHub: https://github.com/stuttter/wp-blog-meta

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.


9 months ago

#3 @iamfriendly
9 months ago

A list of all options created, by default, (4.7.1) when a new site is added to a network. Additionally, suggested candidates for what might move from wp_#_options to wp_blogmeta are indented and in bold.

https://docs.google.com/document/d/1bMajucj7vVLP2xFIoTC5BdOTzF5jEfnYcrxrVEdVH5I/edit?usp=sharing

Namely:

admin_email - getting a list of all site's default admins seems useful (we do it regularly, for example)
blog_public - finding out which sites are public more easily seems like something that would speed up other queries for the REST API for example
blogdescription - seems to go in parallel with other info currently in my-sites
blogname - used in my-sites
db_version - I could be dissuaded on this, but seems useful to check if all sites on a network are up-to-date
Edit: Ignore. finished_splitting_shared_terms - same logic as above, can be dissuaded (I dissuaded myself)
home - my-sites
siteurl - my-sites

Whilst I can think of ways in which many of the other options would be useful, I've taken a somewhat minimal approach here to try and prevent a huge amount of data being transferred during the initial migration as well as simply polluting this new table.

Last edited 9 months ago by iamfriendly (previous) (diff)

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.


9 months ago

#5 @mnelson4
9 months ago

So to clarify, the main problem right now is that if you want to query for all sites' names, you would need to do an SQL join to ALL site option tables (or loop through them all). Yeah, that makes it pretty well impossible doesn't it (especially once you have a network with over 100 sites or so).

So this site_meta table would only be used for options you know you want to query, right?

Admittedly, it would be a little confusing having two places for this site-specific data (is it in the options table or in the site meta table? Depends on how much we want to do a multi-site query involving it). This could probably be alleviated by having get_site_option magically check in the blog meta table for entries it knows are stored there and not in the options table. Eh?

#6 @flixos90
9 months ago

@mnelson4 Indeed that's what this table would be for. The name blogmeta is only since sitemeta is unfortunately already taken for network meta. We'd love to find a better name, but currently we don't have a better idea.

The data would probably be stored in one or the other table only. Our plan is to have the *_option() functions check whether the requested option is in the whitelist for blogmeta (only when multisite is activated of course) and choose the table depending on that. We'll have to be careful how we approach this though as we of course will also have a migration path to go, and it might happen that a blogmeta option is not actually in that table just because it has not been migrated yet.

This ticket was mentioned in Slack in #core-multisite by flixos90. View the logs.


9 months ago

#8 @MaximeCulea
9 months ago

I was also thinking about dates, for querying and cache purpose, by adding to this list the registration_date and last_modification_date.
What do you think about this ?

#9 @mnelson4
9 months ago

We'd love to find a better name, but currently we don't have a better idea.

I like blogmeta. If it were up to me the two terms in use would just be blogs (entries in the wp_blogs table) and networks (entries in the wp_sites table, which would ideally be renamed wp_networks), and we'd entirely avoid the now-rather-ambiguous term sites (I always need to double-check what that means, especially seeing how I only deal with multisite stuff periodically). But that's a bit off-topic.

So that plan sounds good to me!

Last edited 9 months ago by mnelson4 (previous) (diff)

#10 @spacedmonkey
9 months ago

  • Summary changed from Introduce shared wp_blogmeta database table for multisite installations to Introduce shared wp_blogmeta database table for multisite installation

Blogmeta has to be the name, as the meta api needs that naming conversation to work. It is also a better naming, make everything be meta. The name is really unimportant for a table.

So I don't think that it should be a copy of options table. Storing transient and rubbish options plugins add in this table seems a little pointless. We should have an array of key names (option names) that is filter able that are copied over. We can only copy over useful data that way, but plugin creates add data where desired. The list of option should be heavily vetted, but home URL, site URL and blog title are a sure thing for me.

Keeping the data betweeen the data two tablessynced is going to be interesting. But we if hook in on the add_option / edit_option / delete_option actions in the option functions on multisite, you can use get_current_blog_id and also save data blog meta table. This means that unless there are plugins that are touching options table directly (which I am sure there are), if the correct functions are used, everything is fine.

Roll out for this functionality is going to be hard. We can create the table on upgrade, but the first time sync would be pretty painful on a massive multisite. Maybe it would have to work the term split. Creating a single cron event for each site to sync the data over. The roll out will need a lot of thought. I willing to help with this ticket..

This ticket was mentioned in Slack in #core-multisite by spacedmonkey. View the logs.


9 months ago

#12 @flixos90
9 months ago

As a reminder of Tuesday's multisite conversation and also to possibly discuss this more: The general consent on Tuesday was to call the database table blogmeta, but name all the functions interacting with it *_site_meta(). This would allow for better parity if we at some point handle network meta to (the table is sitemeta, and the functions should then be called *_network_meta()).

While far from optimal, I think it might be an acceptable pattern to call DB stuff the old way (since there's no easy way to rename) while trying to use new naming conventions in code.

#13 @spacedmonkey
9 months ago

Looking into this again, I noticed if we have the follow options in the blogmeta table

blogname
siteurl
post_count
home
allowedthemes

we could remove a tonne of switch_to_blog in core. Switch_to_blog is a pig, so removing it in some key places (specially in the WP_Site class) would a performance win.

#14 @spacedmonkey
9 months ago

  • Keywords has-patch added

I have put together a patch of my ideas for the blog meta functions. This patch is very simple. It does the following

  • add the *_site_meta functions
  • Creates a new function, which returns a white list of options that will be synced with the options
  • hook into the add / update / delete actions in the options functions and copy the data into the blog meta table.

This patch is missing a create table and migration of the data. But it is good first start and is a simple clean implementation.

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.


8 months ago

#16 @spacedmonkey
8 months ago

I have updated my "patch" . Here are the changes

  • Better type checking of an array
  • Fallback to options if database isn't updated
  • Add db_version to list of options
  • Added a start to the migration of data

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.


8 months ago

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.


7 months ago

#19 @flixos90
7 months ago

  • Keywords dev-feedback added

I have spent some time on a first patch ready to apply to trunk, based on our discussions.

As discussed, the site meta project will have 3 (or 4) stages:

  1. Introduce wp_blogmeta database table, its surrounding site meta API and related functionality like meta query support in WP_Site_Query, however none of this should be used by core at this point. The list of core site meta keys should be used as a blacklist so that developers cannot store their own site meta values for these keys.
  2. Migrate the core site meta keys from their options tables to the wp_blogmeta table and keep them in sync (*_site_meta() should also update *_site_option() and vice-versa). There's still some discussion to do regarding whether we should sync at all or only migrate (delete options immediately and only provide a fallback so that get_option() calls get_site_meta()). If we end up syncing, it will only be a temporary solution to make sure things work, before we actually remove the options a few versions later.
  3. Start making use of the site meta query functionality in core.
  4. If in 2. it was decided to sync the core keys instead of only moving them from options to site meta, we should at this point stop syncing and actually perform the migration.

For the next version we'll likely only need stage 1, and that is what I implemented in https://github.com/jeremyfelt/wordpress-develop/pull/3. I also provided it as a regular patch with 37923-phase1.diff, so that you can easily apply it to trunk.

The stages 2-3 (or 4) will likely end up in follow-up tickets later.

This ticket was mentioned in Slack in #core-multisite by flixos90. View the logs.


7 months ago

#21 @spacedmonkey
7 months ago

Good first patch. First I think we should break out the meta query into another ticket. I think it is to much to put in on this. Debug will become a mess.

I noticed that the reserved keys array is not filterable. I think this is a good move, but we should have an filter to add more keys to the end.

There is nothing in there to sync. I think the best option is to hook into the add/update/delete option action and sync data on option save. A one way sync to option -> blogmeta. If users write to blogmeta, it is there responsibility.

I think we should also be using the meta function in place of many of the switch_to_blog in core. I will create another ticket for this.

#22 @flixos90
7 months ago

@spacedmonkey

Regarding meta query, let's continue to work within the GitHub repo. The separate ticket #40229 can be used later when we get closer to commit to break this feature into smaller pieces.

Making the array filterable might make sense, but I tend to say let's not add a filter to keep this function core-only. If a filter is introduced, it should only be able to add keys, not remove.

As stated in my comment above, this patch is only for our planned stage 1 of site meta, which does not have any sync or does not even migrate options. It only introduces the functionality, which I think is what we should focus on for now code-wise.

As above, actually using site meta in core will be part of a later stage that is likely going to be introduced 1 or 2 releases after the initial functionality.

#23 @flixos90
7 months ago

Bookmarking #37928, #37929 and #37930 here which would need to be implemented in one way or the other for the sync/fallback to blogmeta table part. Not important yet, but once we start getting into detail with migration and sync implementation.

#24 follow-up: @andy
7 months ago

it would be nice to be able to query for sites using some kind of other arbitrary method

There is no need to move the primary source of truth into a global table. What you need is secondary index that covers all of the data you are interested in and supports an acceptable level of consistency. You can build that using filters and any data store you like: mysql, elasticsearch, etc.

There is prior art in multisite search via external indexes. Proponents of a core patch should explore the needs, solutions and problems already understood by those who have done it. Use cases might differ too much to make a unified solution viable.

I have browsed 37923-phase1.diff and it looks like quality code but I'm not convinced any of this is better than adding or improving the API for plugins.

#25 in reply to: ↑ 24 @flixos90
7 months ago

Replying to andy:

it would be nice to be able to query for sites using some kind of other arbitrary method

There is no need to move the primary source of truth into a global table. What you need is secondary index that covers all of the data you are interested in and supports an acceptable level of consistency.

From my POV there is a difference to be made between a site's metadata and site options:

  • I'd describe metadata as two things: first, it's miscellaneous data that is associated with the site (for example post_count or db_version); second, it's what we'd put into the main database table if we had the possibility to always adjust it to our needs (which we obviously don't) (for example blogname and blogdescription) - only the first is what I would actually consider meta even if we had the possibility to put it into the main table. The latter is actually data that describes the site in a network context (core also treats these in a special way, see the 4 extra "properties" in WP_Site or get_blog_details()).
  • Options however are what their name implies, they define how a piece of functionality on the site works rather than providing some kind of identification for it.

My approach would therefore be to migrate (i.e.) move the "almost-site" data and site metadata into blogmeta as the source of truth. Everything else is options and thus can remain as is. Proxying *_option() calls to blogmeta for the moved values would only need to happen for backward-compatibility. So I don't consider that table an implementation of a secondary index, although I have to admit we started discussing that because of the lack of a possibility for querying by these values. But I assume the reason this has not come up before is that we rather not touch things if we don't actually need a new functionality from them.

There is prior art in multisite search via external indexes. Proponents of a core patch should explore the needs, solutions and problems already understood by those who have done it. Use cases might differ too much to make a unified solution viable.

I have browsed 37923-phase1.diff and it looks like quality code but I'm not convinced any of this is better than adding or improving the API for plugins.

What I stated above is obviously my opinion, maybe the general concensus ends up in this being a means to a secondary index only. I'm certainly open to introducing the feature in a flexible manner, for example by supporting override via a drop-in. I think however the underlying functionality will be essential for parts of multisite, therefore core should provide a default solution to the problem.

#26 @spacedmonkey
7 months ago

Even if the blogmeta isn't used to sync data from the options table doesn't mean we shouldn't add it. Adding blogmeta, would be extremely useful. For two reasons.

  1. Currently many multisite plugins (domain mapping being a good example) add tables to store data. Blog meta will be a wordpressy way of extending the wp_site object.
  2. It means we can removing the blog_versions table and replace it with a much more useful table.

We should go ahead with adding this new table and api. Syncing the data from options is another issue.

#27 @andy
7 months ago

There is nothing unWordPressy about plugins creating tables. WordPress never claimed to have a one-size-fits-all database schema.

Another question came up in IRL chat last night: what about export/import?

#28 @johnjamesjacoby
7 months ago

Another caveat: core's existing _site_transient() functions make _site_meta() function names confusing.

Yes... we all pretty much understand functions with _site_ in their name could mean blogs or networks, not pointing at the same object or database table. But transients, meta, and options traditionally are saved in the same place, and if we introduce a _blogmeta database table, it's plausible we might want _blog_transient() functions also.

  • Do we want _blog_meta() functions instead, to match the _blogmeta table?
  • Do we want to easily enable transients to exist in _blogmeta vs (or in addition to) _options? If yes, does this finally more easily enable multisite cron handling without resorting to building an entire external job queue?
  • My plugin uses _blog_ for function names instead of _site_ and I'd like to update it to whatever direction we think is best for WordPress core going forward, before adoption rises and backwards compatibility becomes a concern.
  • Should we add a wp_ prefix to any new transient functions, to allow both site & network transients? Then we could deprecate the old _site_transient() functions with new wp_*_network_transient() ones?
Last edited 7 months ago by johnjamesjacoby (previous) (diff)

#29 @flixos90
7 months ago

@johnjamesjacoby Good point. I think we shouldn't introduce new blog functions unless it would be totally confusing to call something site (for parity regarding related functionality).

In this case I think:

  • Use site_meta and network_meta (if we do this) since those are completely new APIs.
  • We should introduce network_transient functions to match the network_option functions.
  • We should introduce blog_transient functions to match blog_option functions. We can't call it site_transient because of the naming collision. I don't think it's necessarily expected that this function name matches the new site_meta since it is not apparent where blog_transient data is stored, sometimes in object cache if available. I actually would expect blog_transient to work similarly to blog_option since this is where those transients have been stored. I think you're talking about storing transients in the new table, but blog_transient would be confusing as a name for that functionality. I don't have a better name, but I also don't know yet whether this is necessary to have.
  • Prefixing with wp_ should be the final step only if we really can't figure this out.

@flixos90
7 months ago

#30 follow-up: @flixos90
7 months ago

As discussed during today's office-hours, we're probably going for introducing the blogmeta table unrelated to our original plans for an options index, which needs to be investigated independently, since blogmeta is probably not the right approach, as determined during recent discussions.

Main benefits:

  • Enables developers to store arbitrary data about a site in the multisite scope.
  • Opens up the future /sites REST API endpoint for developers so that they can register_meta() for sites that is handled by the API (both read and write access, will be implemented in the actual endpoint for which no ticket has yet been created).
  • Allows more flexible querying in WP_Site_Query.
  • Brings parity with other core objects, all of which have metadata tables.

Related: We are likely going to introduce *_network_meta() API functions as well (see #37181), the table is already in place.

37923.diff is an updated patch that only contains the blogmeta table and its surrounding API. The PR at https://github.com/jeremyfelt/wordpress-develop/pull/3 has been updated as well; however, since the introduction of this table is a much more straightforward process and since there were some merge conflicts with the previous patch that I already resolved in 37923.diff, I'd prefer if we continued to work on this through patches from now on.

#31 in reply to: ↑ 30 @jeremyfelt
7 months ago

Replying to flixos90:

I'd prefer if we continued to work on this through patches from now on.

+1

This ticket was mentioned in Slack in #core-multisite by flixos90. View the logs.


7 months ago

This ticket was mentioned in Slack in #core-multisite by streetlamp. View the logs.


5 months ago

#34 @jeremyfelt
5 months ago

  • Keywords has-unit-tests needs-testing added; 2nd-opinion removed
  • Milestone changed from Awaiting Review to Future Release

Moving this to future release so that we can keep progress going. The objectives that @flixos90 lays out above are good.

I'm not sure about _site_meta() vs _blog_meta() since the table name in this case is wp_blogmeta and a wp_sitemeta table exists for network meta.

This ticket was mentioned in Slack in #core-multisite by flixos90. View the logs.


5 months ago

#36 @flixos90
5 months ago

Pasting my comments from Slack here:

<slack>
I think we should be talking about site and network meta together. While they don't need to happen at a similar time (maybe not even in the same release), we should have a clear fixed path going forward for both before we start finalizing/merging one of them.

My opinion is that both should be introduced. Site meta for the reasons I outlined in #37923, and network meta because its API is more structured, standardized throughout core, and the db table was made for meta after all.

Also I think their functions should be *_site_meta() and *_network_meta(). We have discussed that a couple times, and as far as I can remember we agreed on using current naming conventions for new functionality. The mixture is obviously terrible, and introducing new functions with the old naming will make it even harder to work our way away from it. The database, options etc can and should continue to have their legacy names, but all APIs should reflect what at some point was decided to be the terminology for... well, sites and networks.
</slack>

Also adding @jeremyfelt's comments from Slack:

I've been treating network meta as lower priority behind the /sites and /users endpoints and work to support that. I think we need to land those features before we start introducing other big changes.

I agree with that statement, site meta should be introduced in the same release or soon after /sites as the endpoint will be another argument for introducing the table. Network meta could follow a bit later, as the change would also be more justifiable with a site meta API in place.

Let's put this on our list of things to discuss in office-hours next week. From my POV the goal would be to also put #25344 in the "Future Release" milestone, but the primary objective should be determining a roadmap for both projects considered as a whole.

This ticket was mentioned in Slack in #core-multisite by spacedmonkey. View the logs.


4 months ago

This ticket was mentioned in Slack in #core-multisite by spacedmonkey. View the logs.


3 months ago

#39 @spacedmonkey
3 months ago

  • Milestone changed from Future Release to 4.9
  • Owner set to spacedmonkey
  • Status changed from new to assigned

#40 @flixos90
3 months ago

  • Keywords ms-roadmap added

These tickets belong to our planned roadmap (a few of them not per final decision), so flagging with a keyword for better overview.

#41 follow-up: @spacedmonkey
3 months ago

  • Keywords changed from has-patch, dev-feedback, has-unit-tests, needs-testing, ms-roadmap to has-patch dev-feedback has-unit-tests needs-testing ms-roadmap

In 37923.2.diff I made some changes to @flixos90 patch.

  • Removed references to meta_query. Works on this functionality should continue here #40229
  • Added a new network option when site meta is installed. This is similar to how ms_files_rewriting was handled in core.
  • Moved *_site_meta function into the option.php file. This means plugin developers can use the functions on none multisite sites with getting function exists errors. We may think about falling back to options table on non multisite.
  • Added update_site_meta_cache to wp_site_cache
  • Moved formatting.php and meta.php earlier in bootstrap. See #37181
  • Changed _prime_site_caches to work similar to other prime functions.
  • Updated doc to 4.9.

#42 @spacedmonkey
3 months ago

37923.3.diff adds in missing tests.

This ticket was mentioned in Slack in #core-multisite by flixos90. View the logs.


2 months ago

#44 in reply to: ↑ 41 @flixos90
2 months ago

  • Keywords dev-feedback removed

Replying to spacedmonkey:

I reviewed 37923.3.diff. It looks good for the most part, I just have a few notes about fixes/suggestions:

Removed references to meta_query. Works on this functionality should continue here #40229

The changes from blog_id to $wpdb->blogs.blog_id can be removed too, plus there's some incorrect indentation in that file.

Added a new network option when site meta is installed. This is similar to how ms_files_rewriting was handled in core.

Let's further think about this. It's a good initial approach, but this is the most complex problem that we need to solve. One thing that should be changed is that the *_site_meta() functions should use get_network_option() instead of get_site_option(). get_site_option() is outdated and confusing especially being used there, where site_meta means site meta and site_option actually means network option. If we're really precise, we actually also need to pass the correct network ID for the site for which metadata is requested, but I don't see a reasonable way to do that. I'm wondering whether we should put a higher priority on our global storage discussion from last week. It would help a lot with this too. Let's bring it up in next office-hours.

Moved *_site_meta function into the option.php file. This means plugin developers can use the functions on none multisite sites with getting function exists errors. We may think about falling back to options table on non multisite.

I'm strongly against this. What benefit does it have to move the functions to option.php? They are multisite-only, so they should reside in a multisite file. For now it should be located in ms-blogs.php, and later be moved into ms-site.php, as suggested in #40647, which would bring the location in line with the other meta functions (all of which are located in the file specific to their object type).

Moved formatting.php and meta.php earlier in bootstrap. See #37181

I'm fine with this being included in the patch, since it's needed, however as a reminder, this should be committed separately in advance, as part of #40948. Why is formatting.php moved too? Is that required somewhere?

Another thing that needs to be fixed is that get_site_meta() must not return false, but an empty array or string, depending on the $single parameter. This is weird, but get_metadata() works like that, so it should use that same return type.

A random question, not related to the patch: Where do the database versions actually come from? Is that a changeset number? Just highlighting since we're randomly using 40001 in here.

Last edited 2 months ago by flixos90 (previous) (diff)

#45 follow-up: @spacedmonkey
2 months ago

37923.4.diff
Key changes

  • No longer move formatting.php in wp-settings.php. Only move sanitize_key as it is used in meta api
  • Moved *_site_meta to ms-blogs.php next to *_blog_option
  • Fixed formatting
  • Remove blog_id references from wp_site_query as not needed until meta query.

I don't agree with the use of get_network_option. I personally prefer network_options of course, but it is a function that will remain in core forever and can now just be considered a helper function.

As for get_site_meta returning false if not supported, this is how get_term_meta works. Forcing false is a nice way to return that site meta is disabled and not result in false positives.

I do not know where the datatbase version numbers come from. @jeremyfelt can you shine some light on this?

#46 in reply to: ↑ 45 @flixos90
2 months ago

Replying to spacedmonkey:

Thanks for updating the patch! A small detail I just noticed is that the @access tag in wp-db.php should be removed (see #41452). I have some responses to your latest comments:

No longer move formatting.php in wp-settings.php. Only move sanitize_key as it is used in meta api

Ah, I just wasn't sure why you moved the formatting.php file loading order, but as sanitize_key() is used, I think we should leave it in that file and just load it earlier (like you did in 37923.3.diff). Sorry for the confusion there.

Fixed formatting

What do you mean with this?

I don't agree with the use of get_network_option. I personally prefer network_options of course, but it is a function that will remain in core forever and can now just be considered a helper function.

Although *_site_option() will probably remain in core for a long time, or even forever, I think we should use the latest versions of functions in core code itself, so I'm leaning towards *_network_option(). But that is only a small detail, so I don't have a strong opinion on that.

As for get_site_meta returning false if not supported, this is how get_term_meta works. Forcing false is a nice way to return that site meta is disabled and not result in false positives.

I noticed that get_term_meta() does that, but I assumed that happened mistakenly - @boonebgorges Can you weigh in on this? I consider false an unexpected return value for that function (although the docs state "mixed"). If someone runs get_site_meta() with $single = false like the default, they should safely be able to assume it's an array. I think we should rather fix that in get_term_meta() and do it right here. Returning a different value just to indicate the table is not installed sounds like a hacky workaround for me, we should rather introduce something like is_site_meta_supported() if necessary.

#47 @boonebgorges
2 months ago

I noticed that get_term_meta() does that, but I assumed that happened mistakenly - @boonebgorges Can you weigh in on this? I consider false an unexpected return value for that function (although the docs state "mixed"). If someone runs get_site_meta() with $single = false like the default, they should safely be able to assume it's an array. I think we should rather fix that in get_term_meta() and do it right here. Returning a different value just to indicate the table is not installed sounds like a hacky workaround for me, we should rather introduce something like is_site_meta_supported() if necessary.

From my recollection, I/we considered returning WP_Error in the case that the tables weren't yet installed, but decided not to because there's already a convention for returning false from meta functions on "error" conditions. See https://core.trac.wordpress.org/browser/tags/4.8.1/src/wp-includes/meta.php?marks=463,468#L461. Speaking personally, I have done if ( false === get_*_meta() ) checks to distinguish between failures and null results. I don't particularly like this pattern, and I wish that more descriptive return values were returned, but that was the reasoning. Changing existing functions to return something other than false in case of failures is going to be very problematic, so I lean in favor of following the pattern with new functions, but I could probably be convinced otherwise.

A random question, not related to the patch: Where do the database versions actually come from? Is that a changeset number? Just highlighting since we're randomly using 40001 in here.

Yes, the committer should bump it to the relevant commit number at the time that the code is committed.

#48 @spacedmonkey
2 months ago

In there 37923.5.diff there are a number of key changes.

  • There is now a check to see if the table exists before write flag to database.
  • Default network options now populate value of site_meta_supported
  • Use get_network_option, even through I think it looks a little ugly.
  • I have kept the change of sanitize_key in load.php. I think it is has deeper effects of moving the formatting.php file might be larger than we thought. It is 6k of code, loading it early might have performance effects.
  • Do check for network option in delete_site_meta_by_key and update_sitemeta_cache

I agree with @boonebgorges that, where I do not like the return false pattern, it is what was decided and should either be changed everywhere and kept the same.

#49 @flixos90
2 months ago

@boonebgorges Ah okay, I didn't have in mind that get_metadata() can already return false. In that case, I agree we should follow the pattern for "invalid" requests. For consistency it may be better to always return a result so that the code calling these functions wouldn't have to consider this false special return value - on the other hand we'd need another way to check if a meta request was successful or not. But that could be part of another discussion.

@spacedmonkey I'm good with the current state of the patch.

  • A minor thing is the @access private in _prime_site_caches() needs to remain there, I previously only referred to the other access modifiers, as access modifiers should only be used in core, if it's a regular function that should be indicated as private (note that this is a pattern we shouldn't ever use again, but there are several legacy functions that do this).
  • The other is that we'd probably need to check for site_meta_supported before deleting site metadata in wpmu_delete_blog().
  • Let's discuss the approach of using a network option to detect the table in today's office hours. We can also briefly ask for feedback on whether we should load formatting.php earlier or move the function out of it.

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.


2 months ago

#51 @spacedmonkey
2 months ago

37923.6.diff

Key highlights

  • Introducted is_site_meta_supported() function. This is a wrapper to the network option for the main network. It stores an int value so type checking can be used to check the value isn't in the database. The function can be found in functions.php, similar to global_terms_enabled.
  • Populate network now uses is_site_meta_supported(). This is so when multisite installed, the setting is populated.

I was wonder if we should look at #40418 at part of this ticket. What are your thoughts @johnjamesjacoby ?

#52 @flixos90
2 months ago

@spacedmonkey Some notes on the latest patch:

  • I think you forgot to upload the actual is_site_meta_supported() function itself. :)
  • As discussed yesterday, let's not populate the setting in populate_network() as this would cause the setting to be populated on every network. Instead let's ensure the setting is populated during the multisite bootstrap process. I think adding add_action( 'ms_loaded', 'is_site_meta_supported', 1, 0 ) would be a simple way to do so. This ensures the "global" setting is set at all times, as it populates it on-the-fly if it isn't set.
  • With the suggested $force parameter of is_site_meta_supported(), the logic in wp-admin/includes/upgrade.php can be simplified as well: The two lines currently there could be replaced with a simple call to is_site_meta_supported( true ).
  • I'm not sure the function should be in functions.php just because global_terms_enabled() is. Site meta is entirely multisite functionality, so it may be better in ms-blogs.php. @jeremyfelt @johnjamesjacoby What's your thought on this?

Regarding the other ticket, I'd rather deal with it separately. We probably don't want that discussion to delay the merge of site meta. :)

#53 follow-up: @spacedmonkey
2 months ago

37923.7.diff
Key features

  • Added back in missing is_site_meta_supported function.
  • Wraps delete site meta in site delete in is_site_meta_supported conditional.

Regarding feedback.

  • Saving the value in popular network, stops the need for it checking in the bootstrap process.
  • is_site_meta_supported is designed to lazy detect if table exists. The first to access that function will populate the value and the response is cached in network options forever more.

I still think it much better to have the setting on each network. It means you can disable site meta per network, which something you might want to do if are lots of networks, such as WordPress.com.

#54 @spacedmonkey
2 months ago

As a talking point I create 37923.eachnetwork.diff.

How it works is the following. Core is updated. One of the networks runs the database upgrade, doesn't have to be main. This then installs the table. Now because the detect is lazy, the next time any of the site on a network tries and access site meta, the detect will be run locally and the setting set. This means, that if you have code running on all networks that touch site meta, the setting will spread to all networks quickly. If for some reason, there is a race condition and the setting is set to 0 on network then later the table is installed, all that means to be done is to run the upgrade on the network and the setting will be set correctly.

Having a per network setting, it is gives each network that ability to disable blog meta on some or all networks, if required.

#55 in reply to: ↑ 53 @flixos90
2 months ago

Replying to spacedmonkey:

Saving the value in popular network, stops the need for it checking in the bootstrap process.

The issue with this is that multi-network setups out there possibly use populate_network() to fill their additional networks with settings, and populate_network() actually supports that (you can pass a network ID, and if the setup is not yet a multisite, the new network will copy some settings from the current network. Therefore your current patch will essentially enable the option on every network for those setups. We should either wrap it in a conditional if ( ! is_multisite() ) to only do it for the first network, or remove it completely (I'm in favor of the latter). The performance benefit of looking up the table directly just once is really negligible IMO.

is_site_meta_supported is designed to lazy detect if table exists. The first to access that function will populate the value and the response is cached in network options forever more.

Right. That actually means that we don't even need to populate the main network option anywhere. Just wait until someone calls is_site_meta_supported() for the first time, and it's set.

I still think it much better to have the setting on each network. It means you can disable site meta per network, which something you might want to do if are lots of networks, such as WordPress.com.

I don't think there should be a way to disable it per network. Either you install the table, or you don't - the table is global, so the setting (which I do not consider a setting, but just a flag) should be global too (of course it isn't actually global here, but the main network idea is the workaround we figured out). Working with cases where site meta is not supported may be painful enough in the future, and allowing to disable it per network would overcomplicate that even further.

#56 @spacedmonkey
2 months ago

In 37923.eachnetwork.diff the network option is only set in two places.
First on a network upgrade, which is a reasonable thing to do. Second in the is_site_meta_enabled function.

It also fixes the following condition.

  1. Upgrade to WP 4.9+
  2. Code calls get_site_meta
  3. Check is run, table is not found, option is set to 0.
  4. All following calls fail because check is set to 0.
  5. Upgrade is run, table is installed, network option is set 1.
  6. Following calls work, as table exists and check is correctly set.

Site meta can be disabled by the filter site_meta_supported. Having it as a network option, just makes that a little easier. But if you want to disable the function on network, you can just do this.

add_filter('site_meta_supported', function($enabled){
   $network = get_network();
   if($network->id == 2){
     $enabled = false;
   }
   return $enabled;
});

Worth noting that global terms and ms_files_rewriting can be disabled per network.

It also allows you to write a custom install script that installs the table and set the value to 1 on all networks. This makes it is easier for people running a large multisite setup.

Like term meta, we will never to able to truly know that this table is installed.

@flixos90
2 months ago

#57 @flixos90
2 months ago

37923.8.diff makes the following changes:

  • Introduce an optional $force parameter for is_site_meta_supported() to allow developers to bypass the option lookup and run a direct query to the database. This parameter should only be used to detect whether the database exists after specific changes that could possibly have installed it have occurred.
  • Return a boolean instead of integer from is_site_meta_supported().
  • Run the is_site_meta_supported filter before the actual logic to allow short-circuiting. Developers can use that filter to bypass the database lookups completely and also to prevent the network setting from being populated at all.
  • Add verbose documentation to is_site_meta_supported() so that developers know how its internals work and how they can modify it.
  • Do not set the site_meta_supported network option in populate_network(), since it should only be set on the main network. This furthermore allows setups using the is_site_meta_supported filter to not have to deal with that main network option at all.
  • In upgrade_network(), call is_site_meta_supported() with the $force parameter set to true, so that the database is checked directly, and the main network option is set accordingly. Similar to the above, when using the filter, the main network option is not even populated.
  • Keep the sanitize_key() function in formatting.php, and load the file earlier. People using the meta API may possibly use other functions from that file, and since it's loaded a bit later anyway, I don't think there are any performance implications for loading it earlier.
  • Load WP_Meta_Query and WP_Metadata_Lazyloader earlier as well, since they are part of the meta API. Note that this should eventually be committed through #40948.
  • Fix spacing in WP_Site_Query class to account for the new parameter. Makes a lot of code churn, but I thought we should include it sooner than later.
  • Bring back the original tests for site meta from 37923.diff, and adjust a few things to account for the new is_site_meta_supported() function. The actual site meta tests are skipped in case the table is not available. This should not commonly happen, but will ensure the tests are stable and do not produce unexpected errors.

#58 @flixos90
2 months ago

@spacedmonkey

It also allows you to write a custom install script that installs the table and set the value to 1 on all networks. This makes it is easier for people running a large multisite setup.

That is why I don't like the idea of storing it per network: For almost every setup, it would be the same value in every network.
In 37923.8.diff it is possible to say add_filter( 'is_site_meta_supported', '__return_true' ), and this will ensure that the setting is never even set anywhere, which is the cleanest way of dealing with this change IMO. Of course we cannot expect the common setup to use that strategy, and that's what we'd use the setting on the main network for, as a fallback flag basically.

#59 follow-up: @spacedmonkey
2 months ago

Relying on the main network as a global store is bad. Imagine if you will, the following seniro.

I provide multi network as a service to my clients. Each client has their own network and each network is unrelated to each other. I haven't read the release documents and don't know site meta is coming. I upgrade to 4.9 all the sites. Network 2 does upgrade and installs the table. But until network 1 logs in does the upgrade, all other networks can't use site meta. What if that user never logs in?

For record I don't like having to store that value in the database, that is really is that way to get all network using site meta.

#60 in reply to: ↑ 59 @flixos90
2 months ago

Replying to spacedmonkey:

I provide multi network as a service to my clients. Each client has their own network and each network is unrelated to each other. I haven't read the release documents and don't know site meta is coming. I upgrade to 4.9 all the sites. Network 2 does upgrade and installs the table. But until network 1 logs in does the upgrade, all other networks can't use site meta. What if that user never logs in?

In 37923.8.diff, it's always the main network option that is checked/refreshed, no other one. So if you for some reason only upgrade network 2, it will still update the main network option for site_meta_supported. Regardless of which network is the main network, once one of them runs the upgrade and this results in the blogmeta table being installed, site meta will work on every network.

As we've discussed on Tuesday, all of this is surely just a big hack, since we don't have a global storage. But introducing a global storage is a whole different discussion that shouldn't be a priority right now and would delay more important projects like this one significantly. I'd rather think about global storage when it's time, and until then, figure the least hacky way that is best to migrate over at some point.

#61 @spacedmonkey
2 months ago

37923.10.diff

Only a couple of changes in this one.

  • Removed $force param on is_site_meta_supported. It is bad to have a flag that changes what logic runs in a function. Deleting the network option on upgrade has the same effect and is much clearer.
  • There is no need to move meta query and the lazy load meta in this ticket. It is not required. For a clearer commit history, it should done in another ticket.

#62 @spacedmonkey
2 months ago

37923.12.diff

  • Returns tests.
  • Make int unsigned on table creation. See #9422

#63 @spacedmonkey
2 months ago

I think there is a performance issue around loading from a different network's option. In standard bootstrap process, the current networks options are primed in memory. This is so even if you are not using object caching, some of the network options are loaded into memory. See
wp_load_core_site_options. Calling a network option from a different network will result in another query / cache get. This is made worse by #37181 as the meta api, saves all of it's data in one cache key. The result would mean, calling an network option, it would load the lot into memory. There are network transient that can be bulky.

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.


7 weeks ago

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.


7 weeks ago

#66 @spacedmonkey
7 weeks ago

I have created another ticket so discuss the idea of a global data store for WordPress at #41771 . I think we should continue with per network option for now. We should work on a global data config management solution. There are enough filters in network options, that implementing global store would be simple and not work effect backward compatibility.

This ticket was mentioned in Slack in #core-multisite by jeremyfelt. View the logs.


4 weeks ago

#68 @flixos90
4 weeks ago

  • Keywords early added
  • Milestone changed from 4.9 to Future Release

This should go in in early 5.0.

#69 @flixos90
10 days ago

  • Keywords early removed
  • Milestone changed from Future Release to 5.0
Note: See TracTickets for help on using tickets.