Make WordPress Core

Opened 15 years ago

Closed 8 months ago

#8923 closed defect (bug) (worksforme)

cron timeout is too short

Reported by: hailin Owned by:
Milestone: Priority: normal
Severity: normal Version:
Component: Cron API Keywords: needs-patch reporter-feedback dev-feedback
Focuses: Cc:

Description

Many users have reported that the 2.7 cron sometimes fails to publish future posts.
It is further reported that this depends on the number of jobs in the cron queue: when the queue is too long, some are missed.

I believe this is because we set the timeout value too short:

wp_remote_post($cron_url, array('timeout' => 0.01, 'blocking' => false));

A request to wp-cron.php won't return until all cron jobs have executed, and 0.01 seconds is simply too short. It doesn't hurt to give it sufficient time, such as 10 minutes. It will return as soon as the due cron jobs have been fired and executed (in most cases, at most a few seconds).

Attachments (1)

8923_cron.diff (517 bytes) - added by hailin 15 years ago.
patch


Change History (31)

@hailin
15 years ago

patch

#2 @ryan
15 years ago

Such a long timeout would defy the point of spawning cron.

#4 @hailin
15 years ago

In the old code, where we used

fsockopen( $parts['host'], $_SERVER['SERVER_PORT'], $errno, $errstr, 0.01)
fputs( $argyle, "GET {$parts['path']}?check=" . wp_hash('187425'). ...)

0.01 only controls the fsockopen TCP handshake phase. It was saying "the timeout for the 3-way TCP handshake is only 0.01 sec", which is enough since the host is local. Then we shovel the GET into the pipe and return. All good.

The new code changed the semantics:

wp_remote_post($cron_url, array('timeout' => 0.01, 'blocking' => false));

Here the timeout is an application-layer timeout, meaning "If HTTP doesn't get anything back in 0.01 sec, close the connection". When there are many cron jobs in the queue, it takes longer to fire them up. After 0.01 sec, the client closes the connection, either by sending RST or by other means. This interrupts the cron jobs even though we have ignore_user_abort(true); at the beginning of wp-cron.php.

So if we want to retain the "close without waiting" intent, it appears only the old code achieves it. Otherwise, if we stick to wp_remote_post, 0.01 is going to cause issues.
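To make the "connect, write, return" semantics concrete, here is a minimal sketch in plain PHP. It is not the old core code: the function names, the HTTP/1.0 request layout, and the default timeout are assumptions for illustration. The point is only that the timeout passed to fsockopen() bounds the TCP connect, and that nothing is read back afterwards.

```php
<?php
// Hypothetical helper: builds the raw request that gets pushed into the socket.
function build_spawn_request(string $host, string $path, string $check): string
{
    return "GET {$path}?check={$check} HTTP/1.0\r\n"
         . "Host: {$host}\r\n\r\n";
}

// Hypothetical helper: connect, write the request, and return without reading
// a response. $connect_timeout applies only to the TCP connect phase.
function spawn_fire_and_forget(
    string $host,
    int $port,
    string $path,
    string $check,
    float $connect_timeout = 0.01
): bool {
    $errno  = 0;
    $errstr = '';
    $socket = @fsockopen($host, $port, $errno, $errstr, $connect_timeout);
    if ($socket === false) {
        return false; // connect failed or timed out
    }
    fwrite($socket, build_spawn_request($host, $path, $check));
    fclose($socket); // do not wait for the remote script to finish
    return true;
}
```

Whether the remote script keeps running after the socket is closed still depends on the server side (for example ignore_user_abort), which this sketch does not cover.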

#5 @jacobsantos
15 years ago

It appears that not all of the transports are working as they should. The timeout should probably be increased to a second, since not all of the transports support floats or values below 1 second.

#6 follow-up: @hailin
15 years ago

wp_remote_post is going to wait until all the cron jobs are done, or timeout at the proposed 1 sec. 1 sec should be enough for most publish_future_posts, but pings can take longer time...

#7 in reply to: ↑ 6 @westi
15 years ago

Replying to hailin:

wp_remote_post is going to wait until all the cron jobs are done, or timeout at the proposed 1 sec. 1 sec should be enough for most publish_future_posts, but pings can take longer time...

As Ryan said the whole point of the cron is to go async.

If we have to block the request that spawns it by 1 second to ensure it works we might as well do it in-line :-(

I think we need to step back and look at how we can achieve the desired goal easily in PHP4/5/6 rather than just keep tinkering with the timeouts.

#8 @thehalogod
15 years ago

I changed the code as suggested from 0.01 to 10*60 and the post still did not post as scheduled. It just said "missed timeline" (or schedule) I can't remember the exact phrase.

Any other ideas on why scheduling posts doesn't seem to work?

#9 @DD32
15 years ago

Any other ideas on why scheduling posts doesn't seem to work?

Because your WordPress cannot make a request to itself.

Generally the solution is unique to the server. While one way might work for one person, it might not work for another. The method in WordPress Core tends to work for most people.

I'm not sure of all the options. You might be best off looking back at cron-related tickets and support forum posts (that would really be the best place for this discussion).

#10 follow-up: @Otto42
15 years ago

  • Milestone 2.8 deleted
  • Resolution set to invalid
  • Status changed from new to closed

The timeout does not matter. This is just sending the request to cause the wp-cron to execute on the server. We don't care what the results are, it's just making it start running.

Marking as invalid, because raising that timeout does nothing.

#11 @hailin
15 years ago

  • Resolution invalid deleted
  • Status changed from closed to reopened

I still think the reasoning stated in the ticket was correct.

In fact, I changed the timeout to 2 seconds and tested it at WordPress.com,
it's been very smooth and didn't see missed future post issues for over a month.

#12 @DD32
15 years ago

  • Milestone set to Unassigned

In fact, I changed the timeout to 2 seconds and tested it at WordPress.com, it's been very smooth and didn't see missed future post issues for over a month.

Which transport does WordPress.com use?

Remember that the timeout here is only used as the connection timeout, not for waiting on the page to load.

There was an issue with the cURL transport where a 0.01 timeout would cause it to never create the connection. A fix was put in for that: http://core.trac.wordpress.org/browser/trunk/wp-includes/http.php#L1191

#13 @hailin
15 years ago

WordPress.com uses the same transport implemented in wp-includes/http.php
Even if 0.01 is the cURL transport timeout, I don't see any harm in increasing it to 1 or 2 sec, because in most cases it's going to take only a few milliseconds.

Since cron is fired only once, in the worst case one page view pays the 1 sec penalty while reliability improves, which doesn't sound too bad.

#14 @DD32
15 years ago

WordPress.com uses the same transport implemented in wp-includes/http.php

That's the thing: HTTP has 5 different transports defined :) and only a few were affected. I was just curious as to which had the problem.

And you're right, a slight delay shouldn't be a problem if it only happens once in a while. The problem is that on some setups, cron ends up being fired on every page load because it's having cron-firing issues.

Either way, I do support changing it, and maybe adding a define override or a filter (which is already present, I think).

#15 @Otto42
15 years ago

I disagree. If there's a problem with a specific transport not firing the request off because the timeout is too short, then the transport itself needs to be fixed. Upping the timeout is only a bandaid solution, because the whole point of this request is that we don't care what the response is. Ideally, the timeout would be zero or nonexistent, because we want this function to a) fire the request off and b) return. Thus completely ignoring the response and having no delay in waiting for it.

If increasing this timeout "fixes" anything, then we need to determine what transport is being used on that system and then why that transport is not firing the http request off when the timeout is so low.

#16 @ryan
15 years ago

wpcom uses curl. The curl transport bumps the 0.01 timeout to 1 since it doesn't handle floats. As far as I know that was working fine.

#17 in reply to: ↑ 10 @Denis-de-Bernardy
15 years ago

  • Resolution set to invalid
  • Status changed from reopened to closed

Replying to Otto42:

The timeout does not matter. This is just sending the request to cause the wp-cron to execute on the server. We don't care what the results are, it's just making it start running.

Marking as invalid, because raising that timeout does nothing.

+1 to that. If it's the transport, the transport should be fixed (as was done with curl).

#18 @Denis-de-Bernardy
15 years ago

  • Milestone Unassigned deleted

#19 follow-up: @bueltge
15 years ago

  • Milestone set to 2.8.1
  • Resolution invalid deleted
  • Status changed from closed to reopened

Many people have also this problem after update to 2.8:
wp_remote_post( $cron_url, array('timeout' => 0.01

#20 @Denis-de-Bernardy
15 years ago

  • Component changed from General to Cron

#21 in reply to: ↑ 19 @westi
15 years ago

  • Keywords reporter-feedback added

Replying to bueltge:

Many people have also this problem after update to 2.8:
wp_remote_post( $cron_url, array('timeout' => 0.01

Can we have more information here please.

What problem?
What fix are you suggesting?

#22 @bueltge
15 years ago

The actual fix only changes the timeout value to 1. On all blogs with this problem, the ping then works fine. Other people have fopen deactivated on their hosting, so the ping function cannot be used. Maybe the affected installs have limited web space or little memory?

#23 @scribu
14 years ago

  • Keywords needs-patch added; reporter-feedback removed

#24 @westi
14 years ago

  • Keywords reporter-feedback added
  • Milestone changed from 2.9 to Future Release

Moving to Future until we have a clearly defined test case.

  • What transport fails with a timeout of 0.01?

#25 @techecatocom
14 years ago

  • Cc techecatocom added

#27 @chriscct7
9 years ago

  • Keywords dev-feedback added

#28 @BellaBerlin
9 years ago

I described a somewhat related issue, together with my fix, here:
https://wordpress.org/support/topic/save-a-full-second-on-cron-execution

As ryan pointed out, the timeout is bumped up to 1 second when using cURL. This causes an unnecessary 1 second delay on every cron execution, which, depending on the setup, can be quite significant. For example, my blog checks one search engine keyword rank every 2 seconds.

My solution is to use stream transport instead, if a fractional timeout or a non-blocking request is used. For a proper fix I would also recommend using the new milliseconds feature from cURL 7.15.5 if supported, and only if not, fall back to stream.
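The selection rule described above could look roughly like the following sketch. It is only an illustration: the function name and the returned labels are hypothetical, and the caller is assumed to detect millisecond support itself (one possible check in PHP is defined('CURLOPT_TIMEOUT_MS'), though that alone does not prove the linked libcurl honours it).

```php
<?php
// Hypothetical selection logic, not code from WordPress core.
// Returns an illustrative label for the transport to use.
function pick_transport(float $timeout, bool $blocking, bool $curl_supports_ms): string
{
    $is_fractional = floor($timeout) != $timeout;

    // Whole-second, blocking requests are fine with plain cURL.
    if (!$is_fractional && $blocking) {
        return 'curl';
    }

    // Fractional or non-blocking: prefer cURL with millisecond timeouts,
    // otherwise fall back to the stream transport.
    return $curl_supports_ms ? 'curl_ms' : 'streams';
}
```

For the cron spawn (timeout 0.01, non-blocking) this would pick the millisecond path when available and streams otherwise, which avoids the 1 second bump in both cases.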

#29 @rmccue
8 years ago

In 37694:

HTTP API: Update Requests.

This introduces a minimum value of 1 second for timeouts passed to cURL.

Internally, cURL uses alarm() for interrupts, which accepts a second-resolution timeout. Any values lower than 1 second are instantly failed rather than being rounded upwards. While this makes the experience worse for those using asynchronous DNS lookups, there's no way to detect which DNS resolver is being used from PHP.

See #33055, #8923.
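
As a rough illustration of the "minimum value of 1 second" rule, a clamp like the one below would have the described effect. This is a sketch under that single assumption, not the code from the changeset; the function name is hypothetical.

```php
<?php
// Hypothetical helper: never hand cURL a timeout below 1 second.
function clamp_curl_timeout(float $timeout): float
{
    return max(1.0, $timeout);
}
```

With this rule the 0.01 timeout used when spawning cron becomes 1 second on the cURL transport, while larger values pass through unchanged.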

#30 @johnbillion
8 months ago

  • Milestone Future Release deleted
  • Resolution set to worksforme
  • Status changed from reopened to closed

Closing this off as there's been no real movement in 7 years.
