#54221 closed defect (bug) (fixed)
_transient_dirsize_cache is set to autoload=yes and kills db performance if it grows (20MB in our case)
Reported by: | archon810 | Owned by: | spacedmonkey |
---|---|---|---|
Milestone: | 6.4 | Priority: | normal |
Severity: | major | Version: | 2.8 |
Component: | Database | Keywords: | has-patch commit |
Focuses: | administration, performance | Cc: |
Description (last modified by )
Hi there, folks. After months of perf issues on a fairly large site (10s of millions of monthly PVs), I finally dug in properly and discovered an absolutely massive _transient_dirsize_cache
option which was set to autoload=yes... currently sitting at 207221 entries worth 20MB of data. That's 20MB every
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'
loads each time. The query takes 1s+ on 32GB RAM 8-CPU db servers.
From the looks of it, it was added in 5.6.0 https://core.trac.wordpress.org/ticket/19879#comment:52.
For now, I set the transient to autoload=no, which considering its potential size, it should be set by default to no.
If this functionality is only used in multisite and only in some specific quota codepaths, it should definitely be set to autoload=no
.
I'm also wondering if shortcircuitting it with
add_filter( "transient_dirsize_cache", function() {return; } );
and deleting the transient to avoid such a huge 20MB db entry would have any side effects. Or if it'd be a better idea to disable the cache altogether?
See https://wordpress.slack.com/archives/C02RQBWTW/p1633393189369400.
Attachments (8)
Change History (66)
#3
@
3 years ago
Hi Ian,
Correct, we're not using multisite, don't have object caching turned on via a plugin of some sort (though considering something like W3 Total Cache), and have probably over 2x that amount of folders by my calculation in the uploads dir.
It's now been a day since I've checked on the transient and set it to autload=no, and its size is still the same 207221 elements. I sort of expected it to move by now, given that there are hundreds such directories created daily, so I'm not sure what's up with that, or if it'll flip back to autoload=yes next time it decides to update. When does that happen, by the way? How frequently or in what scenarios?
Based on your other observations, what is the plan to fix this permanently? Should we add autoload=no support to set_transient
?
#4
@
3 years ago
I also looked at why our mysqldump backups broke a couple of days ago:
Error 2020: Got packet bigger than 'max_allowed_packet' when dumping table `wp_options` at row: 459
Now I understand why... the option is probably so big that it actually breaks mysqldump or mysql.
This got me thinking, and not that we're in danger here yet, but if this field reaches 4GB, it'll actually hit the longtext max. WP will probably explode long before then though.
#5
@
3 years ago
if it'll flip back to autoload=yes next time it decides to update. When does that happen, by the way? How frequently or in what scenarios?
I think it will if the transient is deleted and recreated, but it's possible that'll never happen in practice.
Since it doesn't have an expiration, it shouldn't get cleared when upgrading WP. Since there's no object cache, it won't get cleared when rebooting the server. Since it's not Multisite, clean_dirsize_cache()
won't be called.
To be safe, you could do something like this until a fix is available:
- Add a
pre_set_transient_dirsize_cache
callback that sets the value to an empty array - Delete the transient
That would probably result Site Health, the Dashboard, and file upload pages showing wrong data, but IIRC some of that is only for Multisite.
what is the plan to fix this permanently? Should we add autoload=no support to set_transient?
I'm guessing this only affects sites w/ a very large number of subfolders, so it may be a bit of an edge case, and therefore a lower priority. I'm curious to hear what others think, though.
I think there are two problems:
set_transient()
makes sketchy assumptions about whether or not a transient should be autoloaded. Adding a new$autoload
param to the function could allow us to specify one in cases like this. I'm guessing there are other transients that are only used on a few pages despite being autoloaded. More research into the original intent, benchmarks to verify assumptions, etc might be a good next step.dirsize_cache
can become unwieldy on sites with tons of folders. Given its limited usage, one option would be to just disable it on those sites (alawp_is_large_network
). I'm assuming that the admins on those types of sites will usually capable of doing something custom if they actually need the functionality. Alternatively, the value could be sharded across multiple options, moved onto disk, etc. There could be better ideas too.
have probably over 2x that amount of folders ... its size is still the same 207221 elements [even though] there are hundreds such directories created daily
the option is probably so big that it actually breaks mysqldump
Huh, maybe those are related? Maybe there's an error when trying to add more entries to the value too? I'd assume MySQL could handle that just fine, but maybe there's a config option that limits it?
#6
follow-up:
↓ 7
@
3 years ago
Another idea that might be a better solution for the dirsize_cache
expansion: instead of storing the raw size of every folder, we could just store the total size (for each site in Multisite). AFAIK that's the only thing we need, and storing the individual sizes is for performance.
Instead of calculating the size synchronously (but with the data partially cached), we could do a full calculation asynchronously, and cache the result. That way the user always sees cached data, which is faster and avoids storing megabytes of data in an option.
It would get stale when files are uploaded/removed, but we could hook into those processes and manually add/subtract the file size from the cached total size. That should keep the size accurate enough until the cron job runs again.
That would need some more research to make sure it'd meet all the current requirements that dirsize_cache
serves, including #19879 and #46645.
#7
in reply to:
↑ 6
@
3 years ago
Replying to iandunn:
Instead of calculating the size synchronously (but with the data partially cached), we could do a full calculation asynchronously, and cache the result. That way the user always sees cached data, which is faster and avoids storing megabytes of data in an option.
That sounds like a good approach.
This ticket was mentioned in Slack in #core-test by boniu91. View the logs.
3 years ago
#10
@
3 years ago
Hi all!
Just faced the same issue!
Mine is smaller, though:
mysql> SELECT option_name, length(option_value) AS option_value_length FROM wp_options WHERE autoload='yes' ORDER BY option_value_length DESC LIMIT 200;
+---------------------------------------------------------------+---------------------+ | option_name | option_value_length | +---------------------------------------------------------------+---------------------+ | _transient_dirsize_cache | 515393 |
However, this is the biggest wp_option!
#11
@
3 years ago
I have a client site with _transient_dirsize_cache size 1,222,157 -- by far the largest row in the table.
#12
@
3 years ago
- Keywords needs-patch added
- Milestone changed from Awaiting Review to Future Release
#13
@
3 years ago
- Focuses administration added; multisite removed
I'm having a similar issue, got a huge uploads directory, so the _transient_dirsize_cache stands at 1379775 which is obviously slowing down the DB a load, is it safe to clear the record, or change the autoload, or should we wait for the patch?
#14
@
3 years ago
So far changing it to autoload=no has worked, and the option hasn't reverted to yes.
Ours is now up to 1,019,740 entries, 5-fold from when I originally reported the issue. It's a total waste of resources.
There should be a way to turn this off and avoid the strain on resources entirely, especially since our uploads dir sits on block storage with gluster on top, which is much slower than local disk access.
#15
@
3 years ago
Interesting read, I have a client with:
_transient_dirsize_cache 1344451
However only 551 directories in uploads:
[~/public_html/wp-content/uploads]# ls -lR | grep "^d" | wc -l
551
Strangely, I see all directories inside public_html listed. And when I pipe to wc this number is higher yet not the 1344451...
wp option get _transient_dirsize_cache | wc -l
11367
Will set to autoload=no since it's not a multisite...
#16
@
3 years ago
I also have the same issue on a brand new site so there can not be that much on it!
It is the biggest option! (1161500)
Following this thread for support.
#17
@
3 years ago
I also have the same issue on a brand new site so there can not be that much on it!
It is the biggest option! (1161500)
Following this thread for support.
#18
@
2 years ago
Any updates? Oddly, I'm still seeing the same 1,019,740 entries in the _transient_dirsize_cache
option as 5 months ago. It stopped growing, it seems, but still occupies a massive amount of space.
If I empty it, will it not grow anymore? Did any recent versions of WP change some logic, or is it not growing for some other reason?
#19
follow-up:
↓ 21
@
2 years ago
@iandunn should the version be 2.8? Seems like the previous 5.8.2 or newer is affected... At least that is what I was running when I noticed the problem.
#20
@
2 years ago
I deleted this wp_option row weeks ago and it hasn't returned with the latest version of WordPress. In scanning the wp-includes folder I see dirsize being called for Multisite as well as WP REST API upload functions. It does not appear to be triggering from the Media Library nor the Site Health tool.
Here's a clue I ran into with the client who had this issue. I could tell they used to have a Subdirectory installation (where WordPress is served from a folder within the site root), because there was an entire copy of WordPress plugins and all in a sub-folder that was no longer bring used. So the dirsize cache grew large based on all of those folders being left behind.
Object caching is enabled and working fine now. WP Engine recommends an 800K limit on autoload size and this site used to be several times that! They cap the buffer size at 1MB. No wonder object caching was turned off until we fixed this. Working great now.
#21
in reply to:
↑ 19
@
2 years ago
should the version be 2.8?
I think it's supposed to represent the version where the bug was introduced.
This doesn't seem directly related to r49744; it looks like it goes all the way back to when the concept of expiration was added to transients in r10603. ... dirsize_cache was added for Multisite instances when Multisite was merged to Core in r12603.
#22
@
2 years ago
Hello, I hope it's okay for me to "chime in". I have also found this in my database.
I am not running multisite and do not have Object Cache happening.
Mine is at a bit over 0.8MB, which is still a lot, pushing my overall wp_options "autoload" well over the recommended limit.
From my table:
_transient_dirsize_cache 823714
I'm afraid to delete it because I'm unsure what will happen, but I'm really needing to reduce my autoload and this is by far the largest entry.
Is is safe to delete? I read through this thread but am still unclear what to do. Thank you for your time.
#24
@
2 years ago
Thought I'd update. I deleted this transient on 2022-07-01. It was at the size of 823714 bytes
Today (2022-07-13) I noticed it was back, though I'm unsure when it came back. It was 588465 bytes this time.
I've deleted it again, but would love for this to stop being created. Again, I am not running multisite and do not have Object Cache happening.
Thank you!
#26
@
2 years ago
Same problem here. Is there a plan to fix it?
There simply should not be an entry that grows potentially indefinitely.
#27
@
2 years ago
@iandunn @SergeyBiryukov Acknowledging that the size of this transient is a problem, my take here would be that the even larger issue is that it is being autoloaded, thus affecting every basically every request.
How about we break the work into two parts?
- Change the autoload behavior so that this value is not autoloaded. That could happen as part of this ticket, since that is also the primary point of the ticket based on the title.
- Open a separate ticket to further explore how the calculation can be optimized.
For 1.: How about we change set_transient()
to determine the autoload value based on a certain threshold of the value? I'm not exactly sure which value that should be, but I think it's fair to say that once a value goes beyond a certain threshold in size, it should never be autoloaded as it would notably slow down that entire query.
Related: #55969 would add an
$autoload
param toset_transient()
.
I would advise against this for now. While today this could be useful, options and transients misusing (or probably, for the most part, developers not thinking about) autoloading and what that means is a large problem in general. We are currently exploring an overhaul of that entire approach via https://github.com/WordPress/performance/issues/526, and part of that would potentially be to deprecate that parameter. Probably the biggest mistake with it was that it's optional - resulting in many people to ignore it and maybe not even knowing it exists. Anyway, long story short, I'm not convinced adding that parameter to set_transient()
is the right answer.
#28
@
2 years ago
Question:
(NOT using Multisite; YES using object cache)
Is there a downside to disabling the _transient_dirsize_cache autoload?
I'm getting a caution in Site Health that I have 1293 autoloaded options (size: 1 MB) in the options table, which could slow my site. The largest (though under 800KB) is _transient_dirsize_cache. I do have Redis/ object cache.
What is the potential consequence of disabling that autoload? Any reason not to?
Thanks!
#29
@
23 months ago
Hello, I wonder why we need to store all the sizes of all directories. Why don't we store just the total size? It would be a number, not a so huge array.
I don't see anywhere in the core the need to know all the sizes, but only the global size.
When you upload or delete a file you can update that single number, and that's it.
Maybe you can calculate again the total size every time you upload or delete a file, but I would never save all the details in a transient. For what reason?
Probably, I missing something. Where do you need so much detailed information?
As I see in the core files the function get_dirsize() is used only wp-includes/ms-functions.php.
Here are the lines of code where it is used:
<?php $upload_dir = wp_upload_dir(); $space_used = get_dirsize( $upload_dir['basedir'] ) / MB_IN_BYTES;
Then the function get_dirsizes() calls the function recurse_dirsize() which is defined in wp-includes/functions.php. Inside this last function, we have get_transient( 'dirsize_cache' ).
In no other place in the core, do I see anything that needs to get the transient 'dirsize_cache'.
So we have this huge option only to easily get the size of the upload directory? One for one single value?
Are we boating the database and slowing down every page because of information that nobody needs? And this option is even autoloaded?
And this is still not fixed?
If I'm not missing anything I would only store the total size of the upload directory, and nothing else.
Or if someone knows that in some situations we need all the details, please let us know.
For me it's not a matter of autoloading or not autoloading, it's a matter of whether we need or we don't need all the details. And if we need them, they must not be autoloaded, no doubt about that. This is a serious issue for performance and should be very urgent.
#30
@
22 months ago
Still an issue on 6.1.1 Just found this field in one of our sites has grown to nearly 1MB.
#31
@
21 months ago
I think we have two options here to tackle the first issue according to @flixos90 's comment:27
- Change the transient
_transient_dirsize_cache
autoload option tono
in WordPress update.
- Add a temporray fix in WP using
query
filter so that wheneverwp_load_alloptions
function's query run to load options with autoload set to yes, we can exclude certain options from loading.
#32
@
21 months ago
I think the second approach would be useful because of following reasons.
- There can be some other options like
_transient_dirsize_cache
which can have performance impacts.
- There need to be a general provision to filter out such options from
alloptions
query which can hamper the performance.
- Changing the transient autoload value to
no
can have some unknown side-effects.
Adding the patch here which adds a function wp_filter_autoloaded_options
and a filter exlude_autoloaded_options
to filter out the undesirable options from alloptions
query.
#33
@
15 months ago
Pretty much like @archon810, we suffered months of bad performance, and database bandwidth usage without finding what was the cause.
I am on a single site, not multisite.
We had 15Mb of data in _transient_dirsize_cache
.
45250 folders in wp-content/uploads
.
One of our custom features created many folders in the uploads folder and slowly the perf decreased.
I just put autoload=no
like many of you.
Difference is huge, night and day.
I don't see why is it autoloaded it seems "useful" only in the backend, and especially for Multisite.
Update: below I added an expiration (1 month) to the set_transient call. This way the dirsize_cache
entry is autoload=no
`.
@
15 months ago
Adding expiration to dirsize_cache transient so that it can be updated with autoload=no
#34
@
14 months ago
- Milestone changed from Future Release to 6.4
Adding this to the 6.4 milestone for consisderation.
#35
@
13 months ago
I like the solution from @nicomollet .
The only thing I would recommend is using the year in seconds constant.
#37
follow-up:
↓ 38
@
13 months ago
- Keywords has-patch changes-requested added; needs-patch removed
@nicomollet can you address the comment:36 please?
#38
in reply to:
↑ 37
@
13 months ago
Replying to oglekler:
@nicomollet can you address the comment:36 please?
Sure I will publish a new patch on Monday.
@
13 months ago
Adding year expiration to dirsize_cache transient so that it can be updated with autoload=no
#39
@
13 months ago
@oglekler @spacedmonkey I added a new patch, with just a change in expiration duration (month to year).
#40
@
13 months ago
Looking into this ticket more. I think we should do the following.
At the moment, if you have using a persistent object cache, this change would be a regression, as the transient would expire after a year instead of forever. Let's do something like this
$expiration = ( wp_using_ext_object_cache() || wp_installing() ) ? 0 : 10 * YEAR_IN_SECONDS;
I also recommend upping cache time to 10 years, that is much longer.
What do you think @nicomollet ?
@
13 months ago
Add 10 years expiration to dirsize_cache transient so that it is not autoloaded (except persistent object cache and installing)
#41
@
13 months ago
@spacedmonkey Sure, I trust you on the exception for object cache.
Already submitted a patch with the change.
I had to duplicate the expiration condition for this, not sure if can be avoided without creating a new static function.
#42
follow-up:
↓ 43
@
13 months ago
I like your latest suggestion @spacedmonkey and dirsize_expiration.patch that you implemented @nicomollet. Two follow-up comments:
- Why do we set the expiration to
0
(none) whenwp_installing()
? - The
$expiration
variable definition in theclean_dirsize_cache()
function should be moved below the early return if clause.
I'd say this is almost good to commit :)
#43
in reply to:
↑ 42
@
13 months ago
Replying to flixos90:
- Why do we set the expiration to
0
(none) whenwp_installing()
?
It may not be needed but it was designed follow these lines. It feels a little edge case, so may not be needed.
The $expiration variable definition in the clean_dirsize_cache() function should be moved below the early return if clause.
I agree with this feedback @nicomollet .
Question, do we need to clean the transient / update the autoload value on upgrade? Say in wp_upgrade
?
@
13 months ago
Adding expiration to dirsize_cache transient so that it can be updated with autoload=no (fix no wp_install, expiration after early return)
#44
@
13 months ago
@spacedmonkey I added my new patch.
Took care of the expiration after early return.
It doesn't include wp_installing()
condition.
Question, do we need to clean the transient / update the autoload value on upgrade? Say in wp_upgrade?
I don't think this transient needs cleaning on upgrade. It only contains list of folders sizes in WP.
#45
@
13 months ago
- Keywords commit added; changes-requested removed
This patch looks good. I did some testing and it works well. Marking as ready to commit.
This ticket was mentioned in Slack in #core-performance by spacedmonkey. View the logs.
13 months ago
#47
@
13 months ago
Thank you for the updates @nicomollet @spacedmonkey, the latest patch LGTM.
@spacedmonkey Regarding your data, what kind of site did you use for this testing? Almost 3MB memory usage difference is huge, was that for a basic WP site with the initial setup or did you make any specific configuration changes to simulate something?
#48
@
13 months ago
@flixos90 I updated the value of the transient as a text file to this ticket. I have a number of popular plugins installed, like woo and yoast. These have lots of files and subdirectories. The more files, uploads and plugins, the biggest the transient.
#49
@
13 months ago
Thanks @spacedmonkey! That's a good list of plugins, it's impressive to see how much memory usage reduction this change brings.
#50
@
13 months ago
I have 30 plugins installed for testing. Only a number of which are active, like performance lab and query monitor.
This many plugins is a valid use case, I have worked on sites with 30+ plugins active, less along sites with plugin installed by not active..
The memory improvement is a big win for sure. But unserizialing such a large option on every page request is also wasteful as well.
#53
follow-up:
↓ 56
@
13 months ago
- Resolution set to fixed
- Status changed from assigned to closed
In 56522:
#56
in reply to:
↑ 53
@
11 months ago
Replying to spacedmonkey:
In 56522:
How can I fix this issue? We have a total autoload data size of more than 2 MB, and the biggest autoload data is from the "_transient_dirsize_cache" row in the "wp_options" table. How can I optimize it and increase the performance of the website? Since the autoload data is larger than 800 KB, the loading and other functionalities on the WP admin side are very slow.
Do we need to do the fix manually which is mentioned here in the patch? When can we expect the fix for all? In the meantime can we make the "_transient_dirsize_cache" autoload value to "no"? When can we expect a release for this issue update from WordPress?
#57
follow-up:
↓ 58
@
8 months ago
I also encountered this issue with the dirsize_cache transient option growing in size (it was pointed out by my host, WP Engine). After seeing this fix and upgrading to WordPress 6.4, I noticed the issue persisted, with the variable still set to autoload='yes' in the database. I think it's important for future visitors of this thread to understand that the fix introduced in this update is not retroactive, as previously mentioned. If you need an immediate fix, you can:
- Manually delete the transient from the database (it will be recreated later with the expiration date implemented in this patch).
- Alternatively, set autoload to 'no'.
These steps were crucial for the 6.4 patch to work.
#58
in reply to:
↑ 57
@
6 months ago
Replying to chvillanuevap:
I also encountered this issue with the dirsize_cache transient option growing in size (it was pointed out by my host, WP Engine). After seeing this fix and upgrading to WordPress 6.4, I noticed the issue persisted, with the variable still set to autoload='yes' in the database. I think it's important for future visitors of this thread to understand that the fix introduced in this update is not retroactive, as previously mentioned. If you need an immediate fix, you can:
- Manually delete the transient from the database (it will be recreated later with the expiration date implemented in this patch).
- Alternatively, set autoload to 'no'.
These steps were crucial for the 6.4 patch to work.
@chvillanuevap, I have faced this issue too and solved it today just by clearing domain cache in WPEngine, as the main problem was that we were not able to see changes in frontend resources from time to time, not even after deleting object, network, page and local cache on server side and ours, respectively. But regarding the record at WP_OPTIONS table, it was bloated too in dev environment; clearing it helps but definitely I would not change the autoload option, however I deleted that record just in case, and at this moment it has not caused any issues so far.
Thanks for reporting this!
I'm assuming the following about your site:
dirsize_cache
has 207k folders in its upload directorydirsize_cache
values, which are proportional to the # of foldersIs all of that correct?
At first glance, it does look like
dirsize_cache
can grow indefinitely based on the # of folders. The transient isn't set with an expiration, which seems intentional/correct, given the purpose. It also seems like it shouldn't be autoloaded, since it's only used on the Dashboard, file upload/deletion, and Site Health.It looks like
set_transient()
sets$autoload
toyes
for this situation, and it doesn't allow the caller to override that. This doesn't seem directly related to r49744; it looks like it goes all the way back to when the concept of expiration was added to transients in r10603.I don't see a ticket for that commit, though, or any documentation in the code. It's not obvious to me why
autoload
would be tied to expiration; it seems like it should be tied whether the average performance is better when it's loaded w/ all the options, or only when needed. Maybe expiration was thought to be a good proxy for that?dirsize_cache
was added for Multisite instances when Multisite was merged to Core in r12603. It was repurposed for Site Health (both single- and Multisite instances) in r45104.