Make WordPress Core

Opened 13 years ago

Closed 6 years ago

Last modified 6 years ago

#18738 closed enhancement (fixed)

Improving cron spawning and other non-blocking HTTP requests

Reported by: johnbillion's profile johnbillion Owned by:
Milestone: 5.1 Priority: normal
Severity: normal Version:
Component: HTTP API Keywords: has-patch wpcom-hack has-dev-note
Focuses: Cc:

Description

The order of preference for transport methods in the HTTP API is cURL, streams, fsockopen. However cURL and streams cannot perform non-blocking requests, but fsockopen can. Therefore, fsockopen should be the highest priority transport method for non-blocking HTTP requests.

Here's an example. I have a script at http://ctftw.com/sleep.php which sleeps for 5 seconds.

$start = microtime( true );
wp_remote_get( 'http://ctftw.com/sleep.php', array(
	'blocking' => false
) );
$end = microtime( true );
var_dump( $end - $start );

When the cURL or streams transports are used, this request blocks the page for 5 seconds (the default request timeout is 5 seconds).

Let's disable the cURL and streams transports (leaving only fsockopen) and try again:

add_filter( 'use_curl_transport',    '__return_false' );
add_filter( 'use_streams_transport', '__return_false' );

$start = microtime( true );
wp_remote_get( 'http://ctftw.com/sleep.php', array(
	'blocking' => false
) );
$end = microtime( true );
var_dump( $end - $start );

This request does not block the page because fsockopen returns immediately after sending the request.

Cron Spawning

This is a benefit to core because it improves the cron spawner (and can potentially fix #8923). The cron spawner uses a timeout of 0.01 seconds and a non-blocking request, but actually takes longer than 0.01 seconds.

Example:

$cron_url = get_option( 'siteurl' ) . '/wp-cron.php?doing_wp_cron';
$start = microtime( true );
wp_remote_post( $cron_url, array(
	'timeout' => 0.01,
	'blocking' => false
) );
$end = microtime( true );
var_dump( $end - $start );

This request takes around 1.1 seconds on the three servers I've tested it on.

Let's disable cURL and streams again (leaving only fsockopen) and see what we get:

add_filter( 'use_curl_transport',    '__return_false' );
add_filter( 'use_streams_transport', '__return_false' );

$cron_url = get_option( 'siteurl' ) . '/wp-cron.php?doing_wp_cron';
$start = microtime( true );
wp_remote_post( $cron_url, array(
	'timeout' => 0.01,
	'blocking' => false
) );
$end = microtime( true );
var_dump( $end - $start );

On each of my three servers I see a time of around 0.001 seconds.

We can therefore improve the cron spawner by setting fsockopen as the preferred transport method for non-blocking HTTP requests.

In an attempt to address #8923, we can change the cron request timeout to 1 second. If fsockopen is used, the request is lightning fast at ~0.001 seconds. If it's not available and the HTTP API falls back to cURL or streams then it takes ~1.1 second, which is the same time it takes currently. (Hopefully that makes sense.)

Patch coming up for those who want to test it.

Attachments (8)

18738.patch (1.4 KB) - added by johnbillion 13 years ago.
testing-script.php (796 bytes) - added by dd32 13 years ago.
18738-curl_multi.patch (1.8 KB) - added by kurtpayne 12 years ago.
Use curl_multi functions for non-blocking requests
18738-timeout_ms.patch (1.2 KB) - added by mestovar 12 years ago.
Uses ms timeout instead of seconds
class-http.php.patch (1.4 KB) - added by shanept 11 years ago.
wp_cron-fastcgi_finish.diff (380 bytes) - added by tellyworth 10 years ago.
wp_cron-side optimization
18738-wp_cron-fastcgi_finish.2.diff (440 bytes) - added by tellyworth 6 years ago.
Refresh older patch with conditional constant
18738-wp_cron-fastcgi_finish.3.diff (506 bytes) - added by tellyworth 6 years ago.
Drop the constant and use a PHP version check instead

Download all attachments as: .zip

Change History (51)

@johnbillion
13 years ago

#1 @johnbillion
13 years ago

  • Keywords has-patch added

Patch

#2 @dd32
13 years ago

Not sure what's going on here, but it needs some further investigation.

Fsockopen writes the data, and then fclose()'s on it.

Streams writes the data, uses the stream_set_blocking() function to create a non-blocking request, and then fclose()'s on the handle

Curl uses curl_exec() followed by curl_close() - this could be causing it to wait, we "recently" changed the headers handling of the call, that might've affected it.

At one point in time, we had a different order for blocking/non-blocking as the PHP HTTP Extension didn't support non-blocking requests, when we removed that, the requests appeared to be working then, so I'm wondering if something changed in the way commands are executed. After reading a few paged on curl, it appears that DNS timeouts might come into it as well, as well as, long connection times (ie. 300ms of latency whilst waiting for the host to respond, might cause the curl_Exec() to hang for up to 900ms).

If non-blocking requests weren't working at all, I'd expect to see a lot of reports of slow blogs, given it's in use by the cron system..

#3 @dd32
13 years ago

Running the benchmarks against the 5second sleep script above:

Using curl 
6.569
5.703

Using fsockopen 
5.753
0.344

Using streams 
6.068
6.105

I see similar speed issues against local resources (ie. google.com.au returns in 1.5seconds usually, except for fsockopen with a non-blocking request which does it in 0.08 seconds).

Looking at the stream_set_blocking() function, it only operates on _socket_ and _local file_ resources, HTTP resources can't use non-blocking mode by itself. Streams would need to use stream_socket_create( 'tcp://...') or 'ssl://...' and send HTTP headers manually in order to be able to use non-blocking requests (Effectively turning it into a Sockets class rather than a "streams" class). This would explain why 0.01s timeouts for cron doesn't work for some users, as the connection needs to be made, which often takes longer.

Looking at curl, It supports Asynchronous requests through cURL's multi_exec functionality: curl_multi_init() - however, it appears that you still need to call curl_multi_exec() in a loop to ensure that the request actually takes place "in the background".

#4 @dd32
13 years ago

Even when using curl multi functions, there's still a few caveats, the following items are run in a blocking mode:

  • Name resolves on non-windows unless c-ares is used
  • GnuTLS SSL connections
  • NSS SSL connections
  • Active FTP connections
  • HTTP proxy CONNECT operations
  • SOCKS proxy handshakes
  • file:// transfers
  • TELNET transfers

@dd32
13 years ago

#5 @johnbillion
13 years ago

The curl_multi functions aren't non-blocking either (well, they are, but they don't actually allow you to perform a non-blocking request). Your script needs to wait for a result from curl_multi_exec in a do while loop. If you call curl_multi_exec once then carry on, the handle will be killed by PHP during cleanup and you risk killing the handle before it's completed its request (eg. it might be in the middle of sending data).

#6 @johnbillion
13 years ago

Sorry dd32, just noticed you had already pointed that out.

#7 follow-up: @dd32
13 years ago

If you call curl_multi_exec once then carry on, the handle will be killed by PHP during cleanup and you risk killing the handle before it's completed its request (eg. it might be in the middle of sending data).

Yeah, It needs to be called a few times whilst it processes the connection request at least. with multi_exec I'd want a shutdown hook to finalise the connection.

Pushing fsockopen to the start of the queue for non-blocking requests is a no-brainer short term, and probably better moved to the start longer term as well (for the simple fact that it's controlled by us - and as a result, has a few bugs of it's own.. but fixable ones).

@kurtpayne
12 years ago

Use curl_multi functions for non-blocking requests

#8 in reply to: ↑ 7 @kurtpayne
12 years ago

  • Cc kpayne@… added

Replying to dd32:

Yeah, It needs to be called a few times whilst it processes the connection request at least. with multi_exec I'd want a shutdown hook to finalise the connection.

Excellent idea. 18738-curl_multi.patch does this. I've run this against the HTTP unit tests and there are no failures. There are no tests that use the blocking flag, but this patch doesn't break anything. And the output from the test script is:

Using curl 
5.303
0.013

Using fsockopen 
5.308
0.162

Using streams 
5.438
5.448

#9 @dd32
12 years ago

  • Milestone changed from Awaiting Review to 3.5

Pulling this up into 3.5, I just noticed curl taking up a number of seconds to spawn cron.

I'd like to suggest that:

  1. We change the request order GLOBALLY to fsockopen / cURL / streams or potentially cURL / fsockopen / streams (from the current cURL / streams / fsockopen )
  2. We add the non-blocking patch here to cURL

Reasoning:

  • We control the way the fsockopen class works in every respect, we can catch when it doesnt work
  • It'd be nice to have 2 classes support non-blocking correctly, especially the ones given priority
  • fsockopen was originally added last as the builtin libraries were supposed to do it better than we would in PHP.. ultimately, bug reports have always pointed to the opposite, often with server configs we just can't work around ( safe mode issues and curl not being able to do DNS lookups, off the top of my head)
  • Streams (fopen) is also great at failing to spawn cron on setups using .local dns, Often seen in mac's, fopen() fails to open a loopback connection, also seen in Ubuntu installs occasionally.. this is what prompted me to come find this ticket again.
Last edited 12 years ago by dd32 (previous) (diff)

#10 @nacin
12 years ago

  • Owner set to dd32
  • Status changed from new to assigned

Dion, any desire for any of this right now?

#11 @dd32
12 years ago

  • Keywords 3.6-early added
  • Milestone changed from 3.5 to Future Release
  • Priority changed from normal to high
  • Status changed from assigned to accepted

It's been broken so long that I don't really care for rushing it in, I also don't have the ability to test all the different HTTP environments correctly right now.

Pushing to future release for now, but I'd like to get these changes into 3.6 as early as possible to give it a decent amount of time on development environments.

#12 @nicola.peluchetti
12 years ago

Hi, two things.

1) You should update the documentation of

WP_Http_Fsockopen

because as it's now it's a little bit misleading as it says

/**
 * Send a HTTP request to a URI using fsockopen().
 *
 * Does not support non-blocking mode.

While this is not true.
2) As of now if cUrl is disabled i can't get the CRON to function even if it points to an empty call. This is because i think that the call that spawns the cron

 
wp_remote_post( $cron_url, array( 'timeout' => 0.01, 'blocking' => false, 'sslverify' =>  apply_filters( 'https_local_ssl_verify', true ) ) );

Always fail if the transport used is WP_Http_Streams, which is the default if cUrl is disabled.
I know i can use ALTERNATIVE_WP_CRON but lot's of user complains ( and i can reproduce it on my dev server ) that it blocks the request, and i also reproduced that in develop

#13 follow-up: @mestovar
12 years ago

I spent a while tracking this one down and fixing it before I copped on and checked your bug tracking system.

My solution was simply to fix the timeout issue where is turns the passed timeout into ceil(timeout) in seconds. Since php 5.2.3 you can pass the timeout to curl as miliseconds, so I just altered it to use that method instead. I verified that the cron does get run, but even though the original code used the same timeout for both, changing the connect timeout to 10ms may have repercussions.

@mestovar
12 years ago

Uses ms timeout instead of seconds

#14 in reply to: ↑ 13 ; follow-up: @kurtpayne
12 years ago

Replying to mestovar:

Since php 5.2.3 you can pass the timeout to curl as miliseconds

The documentation comes with this caveat:

If libcurl is built to use the standard system name resolver, that portion of the connect will still use full-second resolution for timeouts with a minimum timeout allowed of one second.

#15 in reply to: ↑ 14 @mestovar
12 years ago

  • Cc mestovar added

Replying to kurtpayne:

The documentation comes with this caveat:

If libcurl is built to use the standard system name resolver, that portion of the connect will still use full-second resolution for timeouts with a minimum timeout allowed of one second.

That's just the name resolver though, which shouldn't be an issue for the cron anyway. Using the milliseconds option ends up being much more like the expected action, even if you combine it the the curl_multi.

#16 @aaronholbrook
11 years ago

  • Cc aaron@… added

#18 @shanept
11 years ago

Update to 18738-curl_multi.patch. WP 3.7 has some error checking in the blocking section that was obviously not present at the time of the original patch.

#19 @Rarst
11 years ago

  • Cc contact@… added

As additional consideration is there any reason to spawn cron smack in the middle of the init where it can seriously hold up page load? I would think shutdown might be more appropriate hook, but not sure if I am missing reasons it got stuffed into init in first place.

@tellyworth
10 years ago

wp_cron-side optimization

#20 follow-up: @tellyworth
10 years ago

wp_cron-fastcgi_finish.diff implements an optimization used on wp.com: it causes wp-cron.php to end the HTTP request immediately, rather than when it finishes processing.

That helps when a blocking cron spawner is used, but only on fastcgi obviously.

Props @vnsavage.

#21 in reply to: ↑ 20 @johnbillion
10 years ago

  • Keywords 3.6-early removed

Replying to tellyworth:

wp_cron-fastcgi_finish.diff implements an optimization used on wp.com: it causes wp-cron.php to end the HTTP request immediately, rather than when it finishes processing.

This makes sense.

This ticket was mentioned in Slack in #core by netweb. View the logs.


9 years ago

#23 @obenland
9 years ago

  • Keywords wpcom-hack added
  • Milestone changed from Future Release to 4.5

#24 @johnbillion
9 years ago

  • Owner changed from dd32 to johnbillion

#25 @kirasong
9 years ago

@johnbillion: Are you still planning on tackling this in 4.5?

This ticket was mentioned in Slack in #core by jorbin. View the logs.


9 years ago

#27 @jorbin
9 years ago

  • Milestone changed from 4.5 to Future Release

#28 @johnbillion
9 years ago

  • Keywords 4.6-early added
  • Priority changed from high to normal

#29 @johnbillion
8 years ago

  • Keywords 4.6-early removed
  • Owner johnbillion deleted
  • Status changed from accepted to assigned

#30 @johnbillion
7 years ago

  • Milestone Future Release deleted
  • Resolution set to wontfix
  • Status changed from assigned to closed

I'm nervous about concerns around fastcgi_finish_request(): https://stackoverflow.com/questions/14191947/php-fpm-fastcgi-finish-request-reliable

Accidental output can cause the script to exit, and potential issues with not being able to write to the error log.

#31 @jnylen0
7 years ago

I was able to reproduce and fix the accidental output + exit issues mentioned at that link. However, I didn't see any problems with writing to the error log. See ticket:41358#comment:4 for details.

#32 @barry
6 years ago

  • Resolution wontfix deleted
  • Status changed from closed to reopened

Hi @johnbillion - do you think we can get this in trunk early for testing for 5.1? Even if its use is gated by a constant that defaults to falsem it would be good to get some more widespread testing instead of punting because of a SO post. Without the changes being in core I think that testing is unlikely to happen. If it works reliably, even for some hosting configurations, I think it could have a significant positive performance impact.

#33 @barry
6 years ago

  • Milestone set to 5.1

@tellyworth
6 years ago

Refresh older patch with conditional constant

#34 @tellyworth
6 years ago

@barry is attachment:18738-wp_cron-fastcgi_finish.2.diff the sort of thing you had in mind?

Is cron the most important case, or elsewhere?

#35 @barry
6 years ago

This diff looks pretty good. Thanks! wp-cron seems to be a good place to start and probably has the least chance of breaking stuff. There are some other related tickets though that suggest using it more places:

https://core.trac.wordpress.org/search?q=fastcgi_finish_request

Currently there is a popular plugin which hooks some slow actions on shutdown and having an easy way to do this asynchronously would be nice.

#36 follow-up: @pento
6 years ago

I'm not wild about adding a new constant for this, I'd be fine with just calling it as in wp_cron-fastcgi_finish.diff.

As @jnylen0 discovered, calling ignore_user_abort( true ) will fix the problem with output flushing causing the script to exit.

Also, as this will throw an error on PHP < 7.0.16, I'm also fine with restricting it to newer PHP versions.

#37 in reply to: ↑ 36 @barry
6 years ago

Replying to pento:

I'm not wild about adding a new constant for this, I'd be fine with just calling it as in wp_cron-fastcgi_finish.diff.

You're more adventurous than I am. I think we could remove the constant in 5.2 (or default to true) if there aren't any major bugs reported in 5.1. We should urge hosts running FPM and recent PHP versions to set the constant to true once 5.1 is released.

#38 @pento
6 years ago

  • Keywords needs-dev-note added

Let's commit it without a constant for now, and write a dev-note to inform people that it's there. We can ask hosts to specifically test it during beta/RC, too. If it turns out to be problematic, we can turn it off during beta.

@tellyworth
6 years ago

Drop the constant and use a PHP version check instead

#40 @pento
6 years ago

@tellyworth: Looks good. For coding standards, add curly braces to the if block. Or, run composer run format src/wp-cron.php. 🙂

#41 @tellyworth
6 years ago

In 44488:

Cron API: Make wp-cron.php non-blocking where possible.

This should make cron spawning faster by ensuring requests to wp-cron.php return immediately regardless of transport method. It is enabled only on recent PHP versions with fastcgi, due to historical bugs and availability of fastcgi_finish_request(). This needs testing on a range of platforms, to help determine if it's safe to use in other contexts also.

Props vnsavage, johnbillion, jnylen0.
See #18738, #41358

#42 @pento
6 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

#43 @desrosj
6 years ago

  • Keywords has-dev-note added; needs-dev-note removed
Note: See TracTickets for help on using tickets.