Running get_space_used() on multisite can be dangerously slow on large installs
Reported by: wpdavis | Owned by:
We've finally upgraded off 3.4 after a few frustrating attempts. You see, we have a staging environment that is an almost exact replica of production (database, caching, etc.), and everything worked fine when we tested the upgrade on it. But when we tried to push the upgrade live, things kept timing out inexplicably. We chased a few false-flag errors and were pulling our hair out until we finally realized the one thing that isn't replicated on staging: our media.
Turns out, running get_dirsize() on a few hundred gigs' worth of media can take a little while. Starting in 3.4, that function is called any time the media options are included, and even with caching it could read the assets as often as once an hour.
I understand why this is done, but I also think running filesize() on every file in a WordPress install can cause some scary situations. The filter added in #21181 is a great enhancement, but I still think WordPress could be more proactive to avoid some potentially crippling scenarios. A few options:
- For new installs, make upload usage tracking opt-in for network admins rather than opt-out. I'd be interested to see stats from the WordPress survey, but anecdotally, most people running networks seem to be enterprise users rather than operators of open hosting networks. The option to turn off tracking requires serious digging and may not be apparent to admins if the scanning causes problems.
- Run a timer in recurse_dirsize() and kill the function if time > x seconds, then disable size checks and alert the network admin. Could be helpful for network admins to track inefficient file systems.
- Do an initial size scan and store it in the options table, then increment the option during file upload (or delete). WordPress already stores term counts in the term_taxonomy table, and it's worth discussing how precise the storage scan really needs to be. And, you could always revert using a filter.
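To make the timer idea concrete, here is a minimal sketch of a time-bounded directory scan. It is not core code: `bounded_dirsize()` and its `$max_seconds` parameter are hypothetical names, and it uses only plain PHP filesystem calls, so it can be tried outside WordPress.

```php
<?php
/**
 * Hypothetical helper (not part of WordPress core): sums file sizes under
 * $directory, but gives up once the time budget is spent.
 *
 * @param string     $directory   Directory to scan.
 * @param float      $max_seconds Time budget for the whole scan, in seconds.
 * @param float|null $deadline    Internal: absolute deadline shared by recursive calls.
 * @return int|false Total bytes, or false if the budget ran out or a directory was unreadable.
 */
function bounded_dirsize( $directory, $max_seconds, $deadline = null ) {
	if ( null === $deadline ) {
		$deadline = microtime( true ) + $max_seconds;
	}

	$handle = @opendir( $directory );
	if ( false === $handle ) {
		return false;
	}

	$size = 0;
	while ( false !== ( $entry = readdir( $handle ) ) ) {
		if ( '.' === $entry || '..' === $entry ) {
			continue;
		}

		// Check the clock on every entry so one huge directory cannot overrun the budget.
		if ( microtime( true ) > $deadline ) {
			closedir( $handle );
			return false;
		}

		$path = $directory . '/' . $entry;
		if ( is_file( $path ) ) {
			$size += filesize( $path );
		} elseif ( is_dir( $path ) ) {
			// Pass the same deadline down so the budget covers the whole tree.
			$sub = bounded_dirsize( $path, $max_seconds, $deadline );
			if ( false === $sub ) {
				closedir( $handle );
				return false;
			}
			$size += $sub;
		}
	}

	closedir( $handle );
	return $size;
}
```

A caller could treat a `false` return as the signal to disable size checks and alert the network admin, as suggested above. How that state is stored and surfaced is left open here.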