Make WordPress Core

Opened 6 years ago

Closed 6 years ago

Last modified 5 years ago

#46645 closed defect (bug) (fixed)

Site health: timeout on the "Info" tab

Reported by: azaozz's profile azaozz Owned by: azaozz's profile azaozz
Milestone: 5.2 Priority: normal
Severity: normal Version: 5.2
Component: Site Health Keywords: site-health has-patch
Focuses: multisite Cc:

Description

Getting:

Warning: An unexpected error occurred. Something may be wrong with WordPress.org 
or this server’s configuration. If you continue to have problems...

Fatal error: Maximum execution time of 30 seconds exceeded in 
[...]build\wp-admin\includes\class-wp-debug-data.php on line 955

when the number of files in one of the tested directories is very large. Tested with having the full source of Gutenberg (with node_modules, etc.) in the plugins dir.

Attachments (6)

46645.diff (7.4 KB) - added by pento 6 years ago.
46645.1.diff (13.7 KB) - added by azaozz 6 years ago.
46645.2.diff (13.7 KB) - added by azaozz 6 years ago.
Uh, disabled caching in get_dirsize() to better test it and forgot to enable it in 46645.1.diff. Disregard that diff, this is the proper one to test :)
46645.3.diff (13.7 KB) - added by azaozz 6 years ago.
46645.4.diff (13.9 KB) - added by azaozz 6 years ago.
46645.4_clean_run.jpg (112.8 KB) - added by xkon 6 years ago.

Download all attachments as: .zip

Change History (42)

#1 @azaozz
6 years ago

The $max_execution_time also includes the time spent connecting to wordpress.org which is added to the time spent in WP_Debug_data::get_directory_size(), so we end up exceeding ini_get( 'max_execution_time' ).

Doing $max_execution_time -= 5; in WP_Debug_data::get_directory_size() solves it here, but 5 seconds is an arbitrary value. Is there a better way to reduce $max_execution_time by the time spent connecting to the wp.org API?

Also, perhaps we can subtract 1 more second off $max_execution_time in case of any other server lag.

Last edited 6 years ago by azaozz (previous) (diff)

#2 follow-up: @pento
6 years ago

  • Keywords site-health added

I'm finding this page super slow to load at the best of times, with occasional timeouts. It definitely needs some performance work done.

#3 @Clorith
6 years ago

We can actually start tracking the execution time from the debug data starts getting gathered, which would help avoid arbitrary numbers, but I agree we should look at improving performance of this page.

Potential slowdown areas that stand out:

  • If multisite, we check blog count
  • Count users
  • Looking up if we can access WordPress.org
  • Getting directory sizes
  • Getting database sizes

We can transient the WordPress access I think, as it should be a pretty given constant once checked, we can even transient it from the Site Health check to speed things up, as it's done there too.

Blog and user counts should be handled by cache mechanics if available, if you have enough users to make an impact there's a near non-existent chance you're not using cache services.

Directory and database sizes though, we can keep them in a transient once collected, but collecting them is the complication now. One alternative is to do this collection as a WP_Cron instance, and do them in execution-time appropriate chunks. I've done this in other similar scenarios, it's not super elegant though, as it may not be done running through things, and won't be as current as one might like so open to input on this.

I would argue that large node_modules directories are an edge case that the average user wouldn't hit often, but I may be mistaken as we are moving more and more into a JS driven world. (I've so far only had one report of a timeout on the debug page for the plugin, although not solid statistics, it's at least a quick indicator)

#4 in reply to: ↑ 2 @azaozz
6 years ago

Replying to pento:

Yeah, ideally we should (be able to) get the sizes directly from the filesystem. Wondering if glob() may be faster, at least no need to loop in PHP to get each file size. If not, perhaps can try using exec() or shell_exec() or system() but these will probably be disabled on many hosts.

If none of the above works, perhaps we can set a "sensible" value to limit that loop. As far as I see the size of the plugins and themes dirs would be nice to have, but is not strictly essential debug info :)

#5 @azaozz
6 years ago

I would argue that large node_modules directories are an edge case

Yep, I agree. But a production site may have git or svn checkouts of plugins. These tend to have few thousands extra files in them too. Imagine that times 50 :)

Potential slowdown areas that stand out

Was just thinking the same. Even if we get the actual delay of making that wp.org API call, there are numerous other delays/slowness that we will need to account for to be able to catch all edge cases like DB on different server, other API calls, etc.

  1. Best would be to get the $start_execution_time as soon as WP starts to load, perhaps at the top of wp_initial_constants(). Thinking it would be better to have a new constant instead of reusing the $timestart global.
  1. Currently we test the max time in the middle of the loop $runtime = ( microtime( true ) - $start_execution_time );, but in one test I saw it failing just after that at $size += $file->getSize(); (since that's somewhat slow filesystem operation). The runtime test will need to account for the time needed to finish running PHP. Don't see a good way to "predict" how long that may take. Seems 1-2 seconds should be plenty.

@pento
6 years ago

#6 follow-up: @pento
6 years ago

  • Component changed from General to Administration
  • Focuses multisite added
  • Keywords has-patch needs-testing added

I did some performance testing, and found that the two major slowdowns were the WordPress.org check (~1s), and the filesystem size check (~10s).

Multisite already has the get_dirsize() function, which is faster on the initial run than WP_Debug_data::get_directory_size() (6s), and due to caching, takes negligible time on subsequent runs.

46645.diff switches to using get_dirsize(). Performance enhancements should probably land in get_dirsize() (related: #19879, #26135).

@jeremyfelt: Any concerns about making get_dirsize() and recurse_dirsize() available on all sites?

#7 @azaozz
6 years ago

Testing 46645.diff, it's a very nice speed bump, almost twice as fast :)

Basic test in PHP 7.3 default build (no additional themes or plugins):

RecursiveDirectoryIterator ABSPATH size 50 times: t = 8.6705460548401, size = 50768266
get_dirsize() ABSPATH size 50 times: t = 4.9799430370331, size = 47831434

The size is different as filesize( $path ) gets the exact size in bytes, $file->getSize() when using RecursiveDirectoryIterator seems to get the "size on disk", however directory size as reported by the OS is different:

Size:           45.6 MB (47,831,434 bytes)
Size on disk:   46.7 MB (48,984,064 bytes)
Contains:       2,162 Files, 256 Folders

However having the Gutenberg source in the plugins dir still gets it to time out. This time the fatal is in \wp-includes\functions.php on line 6960. Thinking we'll need to add the "timeout protection" to get_dirsize(). That will also be useful to multisite as suggested in #26135.

Fixing the cache/transient as in #19879 would also be needed, but is somewhat out of scope here.

IMHO the next steps are:

  • Add a microtime constant at the top of wp_initial_constants() so we have a more precise "start of running" time and it's not changeable by plugins;
  • get_dirsize() will need a "sensible default" of how long it's allowed to run before giving up. Add another param to it so it's easy to set.

Patch coming up :)

#8 @afercia
6 years ago

Testing on my VVV (v2.5.2) – PHP 7.2: 49 plugins installed (all deactivated). Some of them, including Gutenberg, Yoast SEO, and a few others, have node_modules and vendor directories:

  • without this patch: the Info page loads after ~25 seconds
  • with this patch: timeouts after 30 seconds

I'd agree node_modules directories are an edge case, however doesn't appear to be faster on my local env.

That said, I'd suggest to consider there's also an usability issue. Very likely, users will 1) click the link 2) see nothing happens 3) click again after a few seconds... no one will wait more than 10 seconds. They will click again.

Not sure what is the best option but I'd tend to think something should be displayed immediately to warn users the analysis is running,

Last edited 6 years ago by afercia (previous) (diff)

@azaozz
6 years ago

#9 @azaozz
6 years ago

In 46645.1.diff:

  • Continue from 46645.diff.
  • Fix and simplify creating the array of output strings.
  • Add another param $max_execution_time to get_dirsize() and recurse_dirsize(). Default to the current ini_get( 'max_execution_time' ); to not change the functions behavior.
  • Add timeout protection to get_dirsize() and recurse_dirsize().
  • Add WP_START_TIMESTAMP constant set to microtime( true ) before WP loads.

Still TODO:

  • We need to output some "sign" that the server is working while the user waits for the size calculation. Currently the timeout for that is 20 seconds, and that's a looong time to wait :)
  • May need better naming/wording of error messages.

Please test :)

#10 @azaozz
6 years ago

@afercia the code I tested with is here https://gist.github.com/azaozz/3c30573f78b319e0ec6ce641b086651c (paste in an existing plugin or make it into a mini-plugin, warning: it prevents any other output). Not particularly good, but does the job :)

When running it I get:

RecursiveDirectoryIterator ABSPATH size 50 times: t = 8.8491477966309, size = 50770653
get_dirsize() ABSPATH size 50 times: t = 5.0800640583038, size = 47833821

@azaozz
6 years ago

Uh, disabled caching in get_dirsize() to better test it and forgot to enable it in 46645.1.diff. Disregard that diff, this is the proper one to test :)

#11 @xkon
6 years ago

I've run the latest 46645.2.diff along with the gist provided and this is the result on my end still running on 20s limit:

RecursiveDirectoryIterator ABSPATH size 50 times: t = 20.003628015518, size = -1
get_dirsize() ABSPATH size 50 times: t = 0.00084877014160156, size = 70494190

To be more precise it's the /src/ from trunk but I've added various plugins that I had available and the plugins folder counts 25,751 Files, 5,221 Folders and 378 MB, the total src folder is 28,313 Files, 5,527 Folders and 443 MB

This is the same but by removing everything from plugins leaving an /src/ as is:

RecursiveDirectoryIterator ABSPATH size 50 times: t = 3.8340690135956, size = 71711733
get_dirsize() ABSPATH size 50 times: t = 0.00084996223449707, size = 70494190

edit: I'm thinking that most likely I'll have to remove caching as well and retest just to get the numbers right if needed.

Last edited 6 years ago by xkon (previous) (diff)

#12 @afercia
6 years ago

With all my 49 plugins, I get similar results with both patches .1 and .2 but please ignore me :) I don't have a standard setup: my plugins directory is moved out from VVV and synced via Vagrant (can't even get the dirs size):

RecursiveDirectoryIterator ABSPATH size 50 times: t = 20.033728122711, size = -1
get_dirsize(): timeouts

Removing all plugins:

patch .0
RecursiveDirectoryIterator ABSPATH size 50 times: t = 20.031002044678, size = -1
get_dirsize() ABSPATH size 50 times: t = 0.00058889389038086, size = 155625102

patch .1
RecursiveDirectoryIterator ABSPATH size 50 times: t = 20.029039144516, size = -1
get_dirsize() timeouts

patch .2
RecursiveDirectoryIterator ABSPATH size 50 times: t = 20.028195858002, size = -1
get_dirsize() ABSPATH size 50 times: t = 0.00058102607727051, size = 155625102

#13 @azaozz
6 years ago

@xkon @afercia thanks for testing! Please feel free to reduce the number of passes down to something more... suitable. I used 50 as that seemed to make it run for several seconds here (clean install), but that can be 20 or even 10 for test installs with plugins/themes :)

I'm thinking that most likely I'll have to remove caching as well and retest just to get the numbers right if needed.

Right, to be able to compare running times the caching in get_dirsize() has to be turned off. Here is another gist with 20 passes that uses recurse_dirsize() directly (no caching) if you want to test it again :) https://gist.github.com/azaozz/7d29a2505557668ed0e948139cf2b58b.

With this one I get:

RecursiveDirectoryIterator ABSPATH size 20 times: t = 4.2132198810577, size = 50770644
get_dirsize() ABSPATH size 20 times: t = 2.7172060012817, size = 47833812
Last edited 6 years ago by azaozz (previous) (diff)

#14 follow-up: @Clorith
6 years ago

For the timing, apparently core has a timer_start() that sets a global from wp-settings.php, should we use that instead, or is that initiated too late?

This ticket was mentioned in Slack in #core by azaozz. View the logs.


6 years ago

#16 @xkon
6 years ago

Thanks for the 2nd gist @azaozz .

I gave it a bit more time to gather more data + re-tested with the first gist as well and no-cache just to see the difference on calls themselves as well.

I've also run tests x2-3 times each one right after the other as a "stress test" and depending on what my pc was doing in parallel I've seen my CPU / php burning at some points haha. That as well shows on the occasion of a recurse_dirsize that went up to t=131 but then droped down to 60. Always using 46645.2.diff .

Hopefully this can give better info overall.

Here goes:

my-test-dir-size-2.php ( recurse_dirsize() / no-cache )

RecursiveDirectoryIterator ABSPATH size 2 times: t = 14.121313095093, size = 508766358
recurse_dirsize() ABSPATH size 2 times: t = 10.926609992981, size = 471668886
RecursiveDirectoryIterator ABSPATH size 2 times: t = 7.2513349056244, size = 508766358
recurse_dirsize() ABSPATH size 2 times: t = 20.991071939468, size = 471668886
RecursiveDirectoryIterator ABSPATH size 5 times: t = 11.580253839493, size = 508766358
recurse_dirsize() ABSPATH size 5 times: t = 46.954424142838, size = 471668886
RecursiveDirectoryIterator ABSPATH size 5 times: t = 20.000507831573, size = -1
recurse_dirsize() ABSPATH size 5 times: t = 25.045972108841, size = 471668886
RecursiveDirectoryIterator ABSPATH size 10 times: t = 20.000966072083, size = -1
recurse_dirsize() ABSPATH size 10 times: t = 131.0886080265, size = 471668890
RecursiveDirectoryIterator ABSPATH size 10 times: t = 20.001437902451, size = -1
recurse_dirsize() ABSPATH size 10 times: t = 78.639563083649, size = 471668890
RecursiveDirectoryIterator ABSPATH size 10 times: t = 20.000829219818, size = -1
recurse_dirsize() ABSPATH size 10 times: t = 60.278604984283, size = 471668890

my-test-dir-size.php ( get_dirsize() / no-cache )

RecursiveDirectoryIterator ABSPATH size 2 times: t = 7.0754978656769, size = 508766338
get_dirsize() ABSPATH size 2 times: t = 22.463485956192, size = 471668866
RecursiveDirectoryIterator ABSPATH size 2 times: t = 5.9533631801605, size = 508766338
get_dirsize() ABSPATH size 2 times: t = 22.320623874664, size = 471668866
RecursiveDirectoryIterator ABSPATH size 5 times: t = 15.180299043655, size = 508766338
get_dirsize() ABSPATH size 5 times: t = 44.412132024765, size = 471668866
RecursiveDirectoryIterator ABSPATH size 5 times: t = 16.001207113266, size = 508766338
get_dirsize() ABSPATH size 5 times: t = 55.863088846207, size = 471668866
RecursiveDirectoryIterator ABSPATH size 10 times: t = 20.0008020401, size = -1
get_dirsize() ABSPATH size 10 times: t = 74.52948307991, size = 471668870
RecursiveDirectoryIterator ABSPATH size 10 times: t = 20.00087594986, size = -1
get_dirsize() ABSPATH size 10 times: t = 59.835005044937, size = 471668870
RecursiveDirectoryIterator ABSPATH size 20 times: t = 20.00112080574, size = -1
get_dirsize() ABSPATH size 20 times: t = 198.7551150322, size = 471668870

my-test-dir-size.php ( get_dirsize() / cached )

RecursiveDirectoryIterator ABSPATH size 2 times: t = 5.9581201076508, size = 508766329
get_dirsize() ABSPATH size 2 times: t = 0.00055980682373047, size = 471668866
RecursiveDirectoryIterator ABSPATH size 2 times: t = 5.1203339099884, size = 508766329
get_dirsize() ABSPATH size 2 times: t = 0.00056695938110352, size = 471668866
RecursiveDirectoryIterator ABSPATH size 5 times: t = 12.329923868179, size = 508766329
get_dirsize() ABSPATH size 5 times: t = 0.00058197975158691, size = 471668861
RecursiveDirectoryIterator ABSPATH size 5 times: t = 12.478226900101, size = 508766329
get_dirsize() ABSPATH size 5 times: t = 0.00058507919311523, size = 471668861
RecursiveDirectoryIterator ABSPATH size 10 times: t = 20.000779151917, size = -1
get_dirsize() ABSPATH size 10 times: t = 0.0005500316619873, size = 471668861
RecursiveDirectoryIterator ABSPATH size 10 times: t = 20.000670909882, size = -1
get_dirsize() ABSPATH size 10 times: t = 0.00067806243896484, size = 471668861
RecursiveDirectoryIterator ABSPATH size 20 times: t = 20.00090098381, size = -1
get_dirsize() ABSPATH size 20 times: t = 0.00065708160400391, size = 471668866

#17 @azaozz
6 years ago

Thanks @xkon for the testing :)

Looks like RecursiveDirectoryIterator gets faster compared to get_dirsize() when there are more files. I've run the tests on what should be a "typical" install, about 1GB total size, about 35 plugins and 10 themes.

The results for RecursiveDirectoryIterator and get_dirsize() are pretty much the same (tested on PHP 7.3, earlier versions may possibly have some difference). Thinking we should keep using get_dirsize() for now as it has caching. If more tests are done and one is clearly faster on an "average" install, lets switch to it.

#18 in reply to: ↑ 14 @azaozz
6 years ago

Replying to Clorith:

Yeah, don't want to use timer_start() as it runs a bit late and uses a global. It is/was used to show PHP runtime on the front-end and plugins might be doing something with the global :)

A constant is preferable as this is somewhat critical to prevent fatal errors from timeout.

@azaozz
6 years ago

#19 @azaozz
6 years ago

In 46645.3.diff: fix/simplify the timeout protection in recurse_dirsize() and when looping through directories in WP_Debug_Data::debug_data().

#20 @afercia
6 years ago

We need to output some "sign" that the server is working while the user waits for the size calculation. Currently the timeout for that is 20 seconds, and that's a looong time to wait :)

What about using a query arg and checking for that to run the analysis? Navigation to the Info page would be immediate, the analysis would run only when users intentionally press the "Run" link. Would it be a start to improve things?

http://cldup.com/utFImpSg53.png

Note: the counter animation runs indefinitely, would need a mechanism to hide it in the initial view.

#21 @xkon
6 years ago

There are various checks running via ajax on the Status page. Maybe we can ajaxify just this informational section in here as well and only enable the copy ability when that's done?

This would help the page load and most likely not give any other hiccups i.e. the counter animation as @afercia mentions.

#22 @azaozz
6 years ago

Yeah, both approaches will work. With the AJAX we can even be "greedy" and run the size calculations in parallel. That would be four AJAX requests at the same time... However each will have 30 seconds to complete so chances of success are higher. Thinking to try that first :)

However this is somewhat out of scope here. Opened #46707 as a follow-up.

Last edited 6 years ago by azaozz (previous) (diff)

@azaozz
6 years ago

#23 @azaozz
6 years ago

  • Keywords needs-testing removed

In 46645.4.diff: also fix total size calculation when plugins, themes and/or uploads have been moved outside of the WP install dir.

@jeremyfelt @pento thinking this is ready. Any thoughts/suggestions, mostly about moving recurse_dirsize() and adding "timeout protection" to it? The UI/UX issues will be fixed in #46707 after this lands.

#24 @xkon
6 years ago

46645.4_clean_run.jpg shows just a final test on this run without the gists this time. The first run ( no transients etc - left side of screenshot ) bailed properly on folders to keep the page from hanging and the 2nd refresh finished the job almost in no time in my case ( still using the same setup as all my previous tests ).

#25 in reply to: ↑ 6 ; follow-up: @jeremyfelt
6 years ago

Replying to pento:

46645.diff switches to using get_dirsize(). Performance enhancements should probably land in get_dirsize() (related: #19879, #26135).

@jeremyfelt: Any concerns about making get_dirsize() and recurse_dirsize() available on all sites?

No concerns here. It's been a good while since I looked at those, but I don't think it hurts to have them available everywhere.

I haven't looked closely yet, but logic may need to be changed to account for non-uploads directory checking. The is_main_site() check in get_dirsize() assumes you're looking at an uploads directory and excludes /sites from recursive checks. This *could* be an issue if (in a strange world) a plugin or theme slug was "sites". /shrug

It does seem like we'll get quick feedback from people running into #26135, though I think a timeout makes sense as a first step.

#26 in reply to: ↑ 25 ; follow-up: @jeremyfelt
6 years ago

Replying to jeremyfelt:

This *could* be an issue if (in a strange world) a plugin or theme slug was "sites". /shrug

Oh, and I guess even more likely (though probably not much): a plugin or theme with a sites directory inside it.

#27 follow-up: @afercia
6 years ago

Question: see https://core.trac.wordpress.org/browser/trunk/src/wp-admin/site-health-info.php?rev=45068&marks=64-68#L63

		WP_Debug_Data::check_for_updates();

		$info         = WP_Debug_Data::debug_data();
		$english_info = '';
		if ( 0 !== strpos( get_locale(), 'en' ) ) {
			$english_info = WP_Debug_Data::debug_data( 'en_US' );
		}

Does this means with a locale other than English all the checks will run twice? Maybe not so... ideal? If so, all the performance calculations made here should be doubled for non-English locale :)

Aside: this will be changed to check for the user locale in #46694.

#28 in reply to: ↑ 27 @azaozz
6 years ago

Replying to afercia:

Opened #46726 as a follow-up.

#29 in reply to: ↑ 26 @azaozz
6 years ago

Replying to jeremyfelt:

Oh, and I guess even more likely (though probably not much): a plugin or theme with a sites directory inside it.

Hmm, should we make that /sites exclusion conditional on is_multisite()? There's no point having it on single installs.

// On multisite exclude individual site directories from the total when checking the main site,
// as they are subdirectories and should not be counted.
if ( is_multisite() && is_main_site() ) {
	$dirsize[ $directory ]['size'] = recurse_dirsize( $directory, $directory . '/sites', $max_execution_time );
	........

#30 @azaozz
6 years ago

  • Owner set to azaozz
  • Resolution set to fixed
  • Status changed from new to closed

In 45104:

Site health:

  • Prevent fatal errors from timeouts on the Tools => Site Health => Info tab.
  • Use the get_dirsize() and recurse_dirsize() functions to calculate directory sizes. The results are cached.
  • Introduce "timeout protection" in recurse_dirsize().

Props pento, Clorith, xkon, afercia, jeremyfelt, azaozz.
Fixes #46645.

#31 @SergeyBiryukov
6 years ago

In 45106:

PHPCS: Fix WPCS violations in [45104].

See #46645.

#32 @SergeyBiryukov
6 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Unit tests are failing:

1) Tests_Multisite_Get_Space_Used::test_get_space_used_switched_site
Failed asserting that 0 matches expected 4.00543212890625E-5.

/home/travis/build/WordPress/wordpress-develop/tests/phpunit/tests/multisite/getSpaceUsed.php:47

https://travis-ci.com/WordPress/wordpress-develop/builds/106823707

#33 @SergeyBiryukov
6 years ago

The new $max_execution_time parameter for get_dirsize() and recurse_dirsize() needs a @since entry.

#34 @azaozz
6 years ago

Replying to SergeyBiryukov:

PHPCS: Fix WPCS violations in [45104].

ARGGGHHH, since when WPCS forces random white spaces at mostly random locations!? That's so not compliant with the WordPress Coding Standards...

#35 @azaozz
6 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 45111:

Site health: fix usage of max_execution_time. When PHP is run from the cli the default is 0.

Fixes #46645.

#36 @spacedmonkey
5 years ago

  • Component changed from Administration to Site Health
Note: See TracTickets for help on using tickets.