WordPress.org

Make WordPress Core

Opened 2 years ago

Last modified 3 months ago

#19879 new enhancement

Better caching for get_dirsize

Reported by: batmoo Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version: 3.3.1
Component: Filesystem API Keywords: 2nd-opinion
Focuses: multisite, performance Cc:

Description

In a multisite install, when trying to determine whether a site has exceeded its storage quota, WordPress will scan through a blog's upload directory and sum up the file sizes, by running filesize against each one. With a large number of files, this can significantly slow down the upload process or certain portions of the Dashboard.

get_dirsize has transient caching in place but this is a single cache entry for all folders. It might be better if WordPress has a separate cache entry for each folder and was invalidated based on context so that get_dirsize does not need to be run constantly on older, unchanged directories as frequently.

Change History (14)

comment:1 DJPaul2 years ago

  • Cc djpaul@… added

comment:2 danielbachhuber2 years ago

  • Cc wordpress@… added

comment:3 DJPaul2 years ago

We worked around this bug by making sure that upload_space_check_disabled was set; this resulted in an immediate speed-up.

comment:4 in reply to: ↑ description kurtpayne2 years ago

  • Cc kpayne@… added
  • Type changed from defect (bug) to enhancement

My first thought would be to change recurse_dirsize() like so:

} elseif (is_dir($path)) {
	$handlesize = recurse_dirsize( $path );
	if ($handlesize > 0)
		$size += $handlesize;
}

to:

} elseif (is_dir($path)) {
	$handlesize = get_dirsize( $path );
	if ($handlesize > 0)
		$size += $handlesize;
}

This change would check the cache as it's doing the recursive iteration. This is very optimistic, though. Cache misses would result in a database SELECT and UPDATE/INSERT. For most sites, this change would probably slow things down. Further, this could result in a lot of extra rows in the db that wouldn't serve a useful purpose.

comment:5 jamescollins2 years ago

See http://mu.trac.wordpress.org/ticket/1175 for the origins of the get_dirsize() caching.

comment:6 nacin2 years ago

In [19747]:

Remove copy-pasted comment from upload_is_user_over_quota()'s phpdoc. It came from recurse_dirsize(). see #19879.

comment:8 follow-ups: jeremyfelt8 months ago

  • Keywords 2nd-opinion close added

I think (???) this is a non issue now.

In [21387], get_space_used() was introduced to allow for an alternate method of checking used space to work via filter. I believe this takes care of the original issue.

When that was introduced, it started passing the directory to check via wp_upload_dir() rather than BLOGUPLOADDIR, which theoretically should mean nice things for checking directory sizes in smaller pieces.

Most likely worksforme, but further input is welcome.

comment:9 in reply to: ↑ 8 SergeyBiryukov8 months ago

Replying to jeremyfelt:

When that was introduced, it started passing the directory to check via wp_upload_dir() rather than BLOGUPLOADDIR

Related: [21823] (first I tried to find that in the original changeset, [21387]).

comment:10 in reply to: ↑ 8 nacin8 months ago

Replying to jeremyfelt:

In [21387], get_space_used() was introduced to allow for an alternate method of checking used space to work via filter. I believe this takes care of the original issue.

While it allows someone to do their own caching, this is still a bit lame.

One thing we could do would be to remove the timeout. The dirsize_cache transient is invalidated in wp_handle_upload() (though not wp_handle_sideload()) and wp_delete_attachment(). If we're confident in this, we should be able to use the transient as long as it exists.

It doesn't help with repeated invalidations followed by repeated recursive directory searches, but it'll at least lessen the number of times the directory is recursed for no reason at all.

When that was introduced, it started passing the directory to check via wp_upload_dir() rather than BLOGUPLOADDIR, which theoretically should mean nice things for checking directory sizes in smaller pieces.

BLOGUPLOADDIR contained a path to a site's individual /files/ folder, rather than all sites. So no change here.

comment:12 jeremyfelt5 months ago

  • Keywords close removed
  • Milestone changed from Awaiting Review to Future Release

We should see what we can do with this in 3.9.

comment:13 jeremyfelt3 months ago

  • Component changed from Multisite to Filesystem
  • Focuses multisite added

comment:14 jeremyfelt3 months ago

  • Focuses performance added
Note: See TracTickets for help on using tickets.