#47223 closed defect (bug) (fixed)
Site Health Check: "last missed cron" test too aggressive
Reported by: |
|
Owned by: |
|
---|---|---|---|
Milestone: | 5.3 | Priority: | normal |
Severity: | normal | Version: | 5.2 |
Component: | Site Health | Keywords: | site-health has-patch has-unit-tests commit |
Focuses: | Cc: |
Description
The WP_Site_Health::has_missed_cron()
test will fail if there is a single cron event with a date more than 0 seconds in the past (the check done is if ( ( $cron->time - time() ) < 0 ) {
). This is much too aggressive relative to the way that WP Cron works, relying upon incoming HTTP connections. A cron job running not at the precise time it was expected is normal *and documented* behaviour of WP Cron and not something to worry the user about (https://developer.wordpress.org/plugins/cron/ - "Though it may not run at a specific time, WP-Cron will get your tasks done in a timely manner").
A more sensible test to indicate that your cron queue is under-serviced would be something like something 5 minutes overdue; that would indicate that the number of visits the cron system is getting isn't keeping up with the number of jobs. (You'll still see false positives in various situations; just, a lot less of them).
Attachments (8)
Change History (49)
#1
@
6 years ago
- Keywords needs-patch site-health added
- Milestone changed from Awaiting Review to Future Release
#2
follow-up:
↓ 3
@
6 years ago
I run wp-cron.php from an O/S cron job once every 15 minutes, so jobs can be as late as that. All my sites are now showing failures, although jobs are getting processed regularly and the sites work well.
There should be a check to see if cron is disabled or an alternative is used, because these are valid options.
This is still true in WP 5.2.1.
#3
in reply to:
↑ 2
;
follow-up:
↓ 4
@
6 years ago
Replying to galbaras:
There should be a check to see if cron is disabled or an alternative is used, because these are valid options.
Sure, but point is to check that the jobs are executed in time, or with a reasonable delay, like 60 minutes or so (IMHO).
Why should the method (alternative way to kick it), or disabled, matter?
#4
in reply to:
↑ 3
@
6 years ago
Replying to knutsp:
Why should the method (alternative way to kick it), or disabled, matter?
I only run wp-cron.php once every 15 minutes. Anything shorter than that may be flagged as overdue, but really isn't, and the health checker has no way of knowing, because that's outside of WP.
Honestly, this can be just a matter of wording. Instead of saying "failed", just say "late", with some possible explanations depending on whether the way cron runs.
#5
@
6 years ago
How about this if DISABLE_WP_CRON === true
:
- Assume site is using crontab or similar to process events
- wp-cron tasks are considered late if they haven't run for 15 minutes,
15 * MINUTE_IN SECONDS
- Only consider wp-cron tasks failed after one hour, ie
HOUR_IN_SECONDS
Otherwise:
- The wp-cron loopback or redirect ought to have run when the user logged in to the dashboard
- wp-cron tasks are considered late if they haven't run for five minutes
- Any later and they are considered failed
This will require some additional conditions be added to the site health component.
#6
@
6 years ago
Another option is to provide a setting for "lateness" when DISABLE_WP_CRON === true
, but I totally agree with @peterwilsoncc . When it comes to background tasks, there's no point to rush. After all, site health checks will only be run occasionally anyway.
#7
@
6 years ago
Related: my WP Crontrol plugin performs an actual cron spawn request, minus the non-blocking flag, and surfaces its result if it's anything erroneous. Takes into account the cron constants too.
#8
@
6 years ago
Good thoughts, I'm curious what might be the best approach here, I'd not experienced a failed cron report my self, so I hadn't thought of it as being too strict, but I definitely see the potential as you've all mentioned.
The check is clearly going to need a few more conditions, which isn't a problem at all, we want this to be as accurate as possible, or else there's little use to it :)
Are there other scenarios the cron check needs to be aware of than the ones outlined already?
#9
@
6 years ago
- Keywords has-patch needs-testing added; needs-patch removed
My patch is a 'first version'. It implements the timings mentioned by @peterwilsoncc
This ticket was mentioned in Slack in #core-php by richard_korthuis. View the logs.
6 years ago
#11
@
6 years ago
- Component changed from Administration to Site Health
Moving Site Health tickets into their lovely new home, the Site Health component.
#12
@
6 years ago
@rockfire Thanks for uploading the patch.
I've uploaded 47223.2.diff which includes a few minor coding standards fixes and modifies the docblock slightly.
As the function is public
and changes the return type, it is a break of backward compatibility which is something WordPress aims to maintain. I'm going to consider a way forward without breaking it.
The return value is still truthy but a plugin or theme author using a strict type check on the function's return value will get either false positives or negatives.
#13
@
6 years ago
Wouldn't a complimentary new function of has_late_cron()
that needs to be truthy before we check for has_missed_cron()
be an idea here? Yes it adds an extra layer, but it feels cleaner than to introducing a string here as well, we already may return a boolean or a WP_Error in that function call as is.
#14
@
6 years ago
- Keywords needs-patch added; has-patch removed
@rockfire ,
Thank you for your help. Does your PHP code fix the issue that many of us are experiencing when we disable WP_CRON and, instead, use a Real Cron Job?
Issue: We're getting "A scheduled event has failed" (i.e., cron) messages when using a Real Cron Job. Click here: http://prntscr.com/oam9nj
The cron errors (from affected plugin, etc) vary depending on when and how often we run WP's Site Health check. Not a biggie for us, but hoping WP will issue a patch to fix this soon.
Again, thank you.
#15
@
6 years ago
- Keywords has-patch added; needs-patch removed
@peterwilsoncc I have fixed the backwards compatibility break, by introducing a new function has_late_cron
as @Clorith suggested.
@jetxpert Yes this will fix it, when you disable WP_CRON a 'late' message will only appear if the cron is more than 15 minutes late (with WP_CRON enabled, it displays the 'late' message when the cron is even 1 second late). And a 'failed' message will be displayed if the cron is more than 1 hour late (with WP_CRON enabled that is after 5 minutes).
#16
@
6 years ago
Why should the determination of whether a cron job is "late" depend upon the method being used to invoke cron jobs? This is over-complicating things. A job is either sufficiently late for it to be worth advising the user about it, or it isn't. If I turn up late for work, that's all the boss cares about - "but I decided to walk, slowly" doesn't mean that late is no longer late.
As a plugin developer who uses cron in several plugins, I see that lots of people disable the "loopback" cron mechanism and use something else, and set up that something else to call wp-cron.php
far too infrequently relative to the number of cron jobs on their site. We shouldn't add allowances for that. If a job is sufficiently late to advise of at all, the user should be advised. And correspondingly, if it's decided that 15 minutes late is OK for some users, then it's OK for all. Or alternatively, if it really is intended that different cron methods are meant to work differently, this should be documented.
#17
@
6 years ago
@rockfire Thanks for the refreshed patch, it's greatly appreciated.
@DavidAnderson The presence of jobs in the wp-cron queue is being used as a proxy for determining whether cron is being run on the server.
For sites using the default set up a loopback request to /wp-cron.php will be fired as the dashboard is opened. If the loopback request is working, any due jobs will be processed.
For sites using crontab due jobs won't run until the next cron event is fired. The presence of due jobs does not indicate the site is not operating as the site administrator intends. The administrator may have switched to crontab because they have long running jobs that timeout over HTTP so must be run on the CLI.
You are right, a recommended frequency should be documented alongside the constant to disable wp-cron loopback requests. I will contact the docs team.
#18
@
6 years ago
@peterwilsoncc Why not make it a constant, given that WP CRON is disabled via wp-config.php?
Simple idea
define( 'WP_CRON_INTERVAL', 15 ); // Minutes
or
define( 'CLI_CRON_INTERVAL', 15 ); // Minutes
Radical idea
Repurpose DISABLE_WP_CRON to define( 'DISABLE_WP_CRON', 15 );
.
The logic can be like this:
- If
DISABLE_WP_CRON
is some positive integer (maybe within a reasonable range), the default WP_CRON will be disabled and Site Health will use it as the interval.
- If
DISABLE_WP_CRON === false
, the normal WP_CRON will be used and Site Health will use its default interval for that.
- Otherwise, WP_CRON will be disabled and Site Health will use a (different?) default value for the interval.
Doing this may actually help plugins and custom functions tailor various intervals to the specific sites. For example, admins will not be offered 1-minute and 5-minute intervals when they don't make sense for the site.
#19
@
6 years ago
@galbaras As a rule, WordPress is avoiding both defining and expanding the use of wp-config constants. I'll be expanding the documentation of the constant to recommend a frequency when I next get the chance.
To keep this ticket focused on the Site Health component, if you wish to discuss this further please open a ticket against the cron component with additional details on the use case.
#20
@
6 years ago
That's cool. I'll take a new constant, too :) Feel free to name something else, even, as long as the respective check makes sense in the context of sites like mine.
#21
follow-up:
↓ 25
@
6 years ago
Component maintainers,
Are you happy with the approach in the most recent patch? It'd be dandy to get this in to 5.3 if possible.
My rough thoughts/questions are:
- Are
$timeout_late_cron
and$timeout_missed_cron
required as properties or can they be defined in the functions - Unit tests would be dandy, keep in mind
DISABLE_WP_CRON === true
in the test suite, affecting the timing rules - I'm not going to add/change constants, are filters required for the timing rules?
Thanks team!
cc @Clorith @miss_jwo @spacedmonkey @afragen
This ticket was mentioned in Slack in #core-site-health by dd32. View the logs.
6 years ago
#23
@
6 years ago
@peterwilsoncc yes I'm happy with the approach.
My rough thoughts/questions are:
- Are $timeout_late_cron and $timeout_missed_cron required as properties or can they be defined in the functions
It seems that $timeout_late_cron
could be defined in the function has_late_cron()
but $timeout_missed_cron
is used in has_late_cron()
and has_missed_cron()
.
- Unit tests would be dandy, keep in mind DISABLE_WP_CRON === true in the test suite, affecting the timing rules
Yes they would be. Alas, I've never actually written one 😔
- I'm not going to add/change constants, are filters required for the timing rules?
I don't really think constants or filters are necessary for this module specific to the Site Health page. If the admin is sufficiently capable it's simple enough to just disable the test and prevent it from showing by using the add_filter( 'site_status_tests' ... )
and filter out scheduled_events
from the array.
Also,
- Must we test for
true === DISABLE_WP_CRON
? I thought the value was a boolean if defined. If so the patch should be updated in the constructor. - Isn't the WPCS
elseif
and notelse if
? - Couldn't the conditionals in
get_test_scheduled_events()
be changed from
if ( is_wp_error( $this->has_missed_cron() ) ) { ... } else { if( $this->has_missed_cron() ) { ... } else if ( $this->has_late_cron() { ... }
to
if ( is_wp_error( $this->has_missed_cron() ) ) { ... } elseif( $this->has_missed_cron() ) { ... } elseif ( $this->has_late_cron() { ... }
I've added a diff with my thoughts. Please ensure my logic with the restructured conditional.
#24
@
6 years ago
Oh heck, I stopping now. I can't even tell if there's a difference between my last 2 patches and I'd really love to be able to delete one. I just don't know which one. Good night everyone.
Can someone just check my indentation.
#25
in reply to:
↑ 21
;
follow-up:
↓ 26
@
6 years ago
- Milestone changed from Future Release to 5.3
Replying to peterwilsoncc:
Are you happy with the approach in the most recent patch? It'd be dandy to get this in to 5.3 if possible.
Haven't fully reviewed the patch yet, just noted that comment:13 suggests checking for has_late_cron()
first and then for has_missed_cron()
, while the latest patch does it vice versa. Is that intentional?
#26
in reply to:
↑ 25
@
6 years ago
Replying to SergeyBiryukov:
Haven't fully reviewed the patch yet, just noted that comment:13 suggests checking for
has_late_cron()
first and then forhas_missed_cron()
, while the latest patch does it vice versa. Is that intentional?
Yes that is intentional. It first checks the worst-case-scenario (a missed cron) and if that is the case there is no need to check for the late cron. Only if there is no missed cron we check for a late cron.
#27
follow-up:
↓ 29
@
6 years ago
I agree with comment:26 that it's better to look for worst cases, before checking for intermediary ones, so let's look for missed ones before seeing if any are late.
The approach in 47223.5.diff looks reasonable to me, I would make one change though, for the has_late_cron()
check, store the ( $cron->time - time() )
segment to a single variable, it saves having to re-calculate it twice per if statement (and avoids time-shifts between the two in some cases), and also improves readability.
#28
@
6 years ago
I wasn't sure what to name the variable but this can be easily changed in the commit.
#29
in reply to:
↑ 27
@
6 years ago
- Keywords needs-unit-tests added
Replying to Clorith:
I agree with comment:26 that it's better to look for worst cases, before checking for intermediary ones, so let's look for missed ones before seeing if any are late.
This makes sense.
These are the unit tests I'd like to see before this is committed:
- Everything is dandy: Add a scheduled task to run in +3 minutes, ensure both the late and missed return
false
- Late cron jobs: Add a scheduled task to run in -16 minutes, ensure late returns
true
, missed returnsfalse
- Missed cron jobs: Add a scheduled task to run in -61 minutes, ensure both late and missed return
true
- No cron jobs: Ensure both late and missed return
WP_Error
#30
@
6 years ago
- Keywords has-unit-tests added; needs-unit-tests removed
47223.8.diff is 47223.7.diff but now with unit tests.
#31
@
6 years ago
- Keywords commit added; needs-testing removed
47223.8.diff looks solid to me, the added unit tests are nice and thorough as well, thank you for those.
#32
@
6 years ago
- Owner set to peterwilsoncc
- Resolution set to fixed
- Status changed from new to closed
In 45801:
#34
@
5 years ago
Good Day!
Hope y'all are in good spirits, coding, and enjoying your daily cup of java.
Hey, can the wonder-folks at WordPress provide an update on this? Site Health is still reporting "failed to run" cron messages when a Real Cron is used - or - a plugin's cron job is out of sync with WordPress' Site Health checks.
What is the estimated date for releasing a fix for this issue? (i.e., WordPress Revision Level)
Thank you!
#35
@
5 years ago
@jetxpert
This was fixed in version 5.2. On sites using alternative cron, tasks are only considered late if they haven't run within 15 minutes of the scheduled time, and missed if they haven't run within on hour of the scheduled time.
If you're still seeing alerts, it's likely your cron task is failing or you need to run it more frequently. Every 10 minutes is recommended.
#36
@
5 years ago
@peterwilsoncc,
Thanks for the quick, stellar reply. Concise and helpful.
Please know that many hosting companies (e.g., SiteGround) are discouraging their clients (especially those on shared plans) from setting up Real Crons less than 30 minutes. Based on your input, therefore, it appears we're going to continue to have these "failed to run" cron messages on our dashboard and other "trigger" issues as well.
With all due respect, I would like to ask the wonder-folks at WordPress consider adjusting the cron timing to 30 minutes -- or -- in a future WP revision, allow users to manually adjust the timing for cron tasks (e.g., in 10 minute increments up to one hour).
Quite frankly, 10 minutes seems too short and may cause undue server load -- with the implication of a temporary host shutdown -- until the cron is adjusted back to 30 minutes or greater.
Last, can you explain again where the 10 minutes came from? Justification? Would like to understand better the rationale.
Again, thank you and have an awesome day!
#37
@
5 years ago
Use a plugin like WP Crontrol to examine the list of scheduled tasks running on your site, and how often they run. You can then run the maths to see how often you need a cron task to run to service them all on your specific site. It's likely you'll find that 10 minutes is not often enough.
Any hosting company that discourages running one more often than 30 minutes has crossed well over the line into customer-hostile - I say this as someone who's run a hosting company, not as a freeloader who thinks resources don't cost money. If someone's infrastructure can't process the equivalent of an HTTP request every few minutes on a website and you have to ask people to cut down to one HTTP request per half an hour, you're effectively saying either "our platform really stinks and is about to fall over if any websites are used" or "we have no sense of proportion", neither of which is a good sound from someone you're entrusting with the running of a website that matters in any way.
#38
@
5 years ago
@DavidAnderson,
You must be on your 3rd cup of coffee :) Thanks for the quick reply.
Click here for a glimpse of what SiteGround (and other hosts) are telling their clients. How would you propose solving this gap?
Again, thank you. Cheers!
#39
@
5 years ago
@jetxpert Since WordPress provides a way to run jobs when visitors load pages, why would you settle for once every 30 minutes?
You can run WP CRON from an O/S cron job as a backup, in case you don't get enough traffic, and then, every 30 minutes may be OK.
It's likely that you can use the following workaround: instead of setting up one cron job every 30 minutes, set up 3 identical ones, and run them 10 minutes apart. Basically, if your hosting company is unreasonable, chear ;P
#40
@
5 years ago
Should WP work overtime to overcome crazy things that hosting companies say or do? Well, to some extent, yes, it's good to make software that "just works" no matter how hostile the environment. But ultimately, you can't square a circle. Two cron jobs per hour, and any non-trivial WordPress site's list of cron jobs that it wants to run, are mathematically incompatible. One plus one cannot be made to equal however-many-dozen.
The kindest intepretation is that Siteground have just followed a piece of popular "folk knowledge" about WP Cron and its supposed performance impact if run externally. But in fact, it has no detectable performance difference compared to when run via the normal mechanism, assuming that your website actually has visitors.
#41
@
5 years ago
@galbaras, @DavidAnderson, @peterwilsoncc
Thank you so much. Useful info.
Bottom line is ... we have a choice, no doubt. We can use either WP Cron or Real (Alternate) Cron to optimize a website's responsiveness and/or performance.
We have changed our Real Cron to run every 10 minutes. Lets hope the pesky "fail to run" messages will no longer be triggered and our website will run better. If not, we'll either switch back to WP Cron or adjust our Real Cron based on data obtained via WP Crontrol.
Have a wonderful day.
Cheers!
Good spot.