Opened 16 years ago

Closed 16 years ago

#9005 closed enhancement (fixed)

Cron spawning improvements

Reported by: azaozz
Owned by:
Milestone: 2.8
Priority: normal
Severity: normal
Version:
Component: General
Keywords:
Focuses:
Cc:

Description

Cron fails to start on some servers because the server either cannot connect back to itself or cannot resolve its own name with DNS. Spawning it with an AJAX request fixes this.

Also the processing can be optimized: instead of using an (evil) serialized array, each cron job can be a separate entry that is deleted when the job is completed.
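
As a rough illustration of that second point, the sketch below keeps each job as its own entry and removes it once it has run. The array layout and the function name `run_due_jobs` are assumptions made for this example, not what the patch does; a real version would presumably store each entry as its own database row.

```php
<?php
// Hypothetical sketch: one entry per cron job instead of one serialized array.
// Each entry holds a due time (Unix timestamp) and a callback to run.
function run_due_jobs(array $jobs, int $now): array
{
    foreach ($jobs as $id => $job) {
        if ($job['time'] <= $now) {
            call_user_func($job['callback']); // run the job
            unset($jobs[$id]);                // delete the entry once it has completed
        }
    }
    return $jobs; // entries that are still pending
}
```

Jobs that are not yet due stay in the returned array, so only completed work is ever deleted.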

Attachments (1)

9005.patch (4.5 KB) - added by azaozz 16 years ago.

Change History (22)

#2 @azaozz
16 years ago

(In [10474]) Spawn cron with AJAX request, see #9005

#3 follow-up: @Denis-de-Bernardy
16 years ago

I think it should be very late on wp_footer, and on admin_footer, rather than just wp_head. On wp_head, the job would need to complete before the actual page gets loaded, due to the fact that it's in the head area -- much like google analytics when inserted in that area.

#4 in reply to: ↑ 3 @azaozz
16 years ago

Replying to Denis-de-Bernardy:

... On wp_head, the job would need to complete before the actual page gets loaded...

No, it does a "head" AJAX request that is asynchronous (doesn't stop the execution of the rest of the scripts) and additionally has a timeout of 10ms (0.01 sec.), so the browser doesn't wait for returned content. "wp_footer" would have been a good place for it too; however, many themes don't have the wp_footer() call.

Also many hosts seem to give access to the "real" cron through their control panels. The changes to wp-cron make it easier to run it from crontab as a scheduled task either through wget, lynx or CLI if available.

#5 @westi
16 years ago

  • Cc westi added

#6 follow-up: @Denis-de-Bernardy
16 years ago

k... one quick question: when the cron is called, do pages get cached? because if they do, that may lead to quite a few misses.

also, I vaguely recall that part of the reason that cron wasn't on an ajax query was to "profit" from spammers and spiders on low traffic blogs. this might be worth a revisit.

#7 @filosofo
16 years ago

I don't think we should do it this way.

  • The people who are helped by this--those whose servers can't resolve domain names--are in a small minority, and they are going to have lots of other problems that can't be solved: they can't send pingbacks/trackbacks, can't ping search engines, can't check for core / theme / plugin updates. So instead let's fall back for them to making a direct call to wp_cron() on shutdown.
  • A large proportion of most web traffic is bots, which we should be able to take advantage of for cron purposes, as Denis suggests. Consider a small blog, which may not get any human traffic in the middle of the night. Relying on the JS approach could still mean missed scheduled future posts.
  • It just seems hacky to have to count on website visitors with JavaScript enabled in order for cron to work.

#8 in reply to: ↑ 6 ; follow-up: @azaozz
16 years ago

Replying to Denis-de-Bernardy:

k... one quick question: when the cron is called, do pages get cached? because if they do, that may lead to quite a few misses.

There's no output from wp-cron unless there are php errors, so there's nothing to cache. Also the AJAX call has a different query string every time.

also, I vaguely recall that part of the reason that cron wasn't on an ajax query was to "profit" from spammers and spiders on low traffic blogs. this might be worth a revisit.

Sure, that can be done. We could try looking at the request headers for the more common bots and run wp-cron then. Another idea was to run wp-cron together with wp-comments-post so all bots that post spam do something useful too...

Replying to filosofo:

The people who are helped by this--those whose servers can't resolve domain names--are in a small minority...

In many cases the problem seems to be that the site domain name is mapped to 127.0.0.1 in the hosts file and the web server won't connect back to itself.

... So instead let's fall back for them to making a direct call to wp_cron() on shutdown.

This was the first thing I tested. Unfortunately that would slow down page load considerably in some cases.

A large proportion of most web traffic is bots, which we should be able to take advantage of for cron purposes, as Denis suggests. Consider a small blog, which may not get any human traffic in the middle of the night. Relying on the JS approach could still mean missed scheduled future posts.
It just seems hacky to have to count on website visitors with JavaScript enabled in order for cron to work.

Perhaps we can try running wp-cron directly for all common bots and also together with wp-comments-post. Would rarely need the AJAX then as it's only added when there are cron jobs pending.
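
The request-header check floated here might look something like the sketch below. The function name `is_common_bot` and the signature list are assumptions for illustration only; the list is nowhere near complete, and nothing like this is confirmed to be in the patch.

```php
<?php
// Hypothetical helper: guess from the User-Agent header whether a request
// comes from a common crawler. The signature list is illustrative only.
function is_common_bot(string $user_agent): bool
{
    $signatures = ['googlebot', 'bingbot', 'slurp', 'baiduspider'];
    foreach ($signatures as $signature) {
        // Case-insensitive substring match against the header value.
        if (stripos($user_agent, $signature) !== false) {
            return true;
        }
    }
    return false;
}
```

It could be called with `is_common_bot($_SERVER['HTTP_USER_AGENT'] ?? '')`, running wp-cron directly only when it returns true.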

#9 in reply to: ↑ 8 ; follow-up: @Denis-de-Bernardy
16 years ago

Perhaps we can try running wp-cron directly for all common bots and also together with wp-comments-post. Would rarely need the AJAX then as it's only added when there are cron jobs pending.

I smell trouble. Picture a super-cached site. PHP, let alone WP, doesn't even get loaded in that case. If there happened to be no pending cron jobs when Google crawls the entire site, then one is pretty much certain that no cron job will get done for ages.

#10 @Denis-de-Bernardy
16 years ago

Imo, you could safely add the ajax call to old posts' permalinks (i.e. is_single()), regardless of the user agent, and regardless of pending cron jobs. Based on the blog stats I've seen, old posts only get so much traffic, and it's usually from visitors who arrive from search engines.

More recent posts are a different matter of course -- you wouldn't want an ajax call around when your site's on Digg.
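
A minimal sketch of such an age check, assuming Unix timestamps; the helper name `post_is_older_than` and the threshold passed to it are hypothetical, and the right cut-off is a judgment call.

```php
<?php
// Hypothetical helper: true when a post is older than the given number of days.
// Both timestamps are Unix seconds; the threshold is an assumption.
function post_is_older_than(int $post_timestamp, int $now, int $min_age_days): bool
{
    $age_in_seconds = $now - $post_timestamp;
    return $age_in_seconds > $min_age_days * 86400; // 86400 seconds per day
}
```

The AJAX call would then be added only on single-post pages for which this returns true.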

#11 @filosofo
16 years ago

Replying to azaozz:

... So instead let's fall back for them to making a direct call to wp_cron() on shutdown.

This was the first thing I tested. Unfortunately that would slow down page load considerably in some cases.

Could you elaborate? wp_ob_end_flush_all() is called on shutdown with priority 1, so all text should already be sent to the browser when wp_cron() does its thing. I would think that while the request might still be open, the complete DOM should already be there, and all the JS, for example, would be able to do its thing. And of course, this should only occur occasionally for a few blogs.

#12 in reply to: ↑ 9 ; follow-up: @azaozz
16 years ago

Replying to Denis-de-Bernardy:

I smell trouble. Picture a super-cached site. PHP, let alone WP, doesn't even get loaded...

Actually the opposite case, when a page with the AJAX JS is cached, would be worse.

More recent posts are a different matter of course -- you wouldn't want an ajax call around when your site's on Digg.

Exactly, will have to add the AJAX only on posts that are a couple of days old.

Replying to filosofo:

Could you elaborate? wp_ob_end_flush_all() is called on shutdown with priority 1...

Yes, in theory the browser should display the page before wp-cron is called. However, while testing, that wasn't always the case. The test: added a 5-second delay to wp-cron and ran it constantly. While accessing the site with two browsers concurrently, one would sometimes wait for 5 sec. before showing the post.

Think that has been tried before too but didn't make it into core.

#13 in reply to: ↑ 12 @Denis-de-Bernardy
16 years ago

Replying to azaozz:

Exactly, will have to add the AJAX only on posts that are a couple of days old.

I'd personally suggest a month at the very least. ;-)

Yes, in theory the browser should display the page before wp-cron is called. However, while testing, that wasn't always the case. The test: added a 5-second delay to wp-cron and ran it constantly. While accessing the site with two browsers concurrently, one would sometimes wait for 5 sec. before showing the post.

Think that has been tried before too but didn't make it into core.

+1. If Apache is configured to gzip the output before sending it to the browser, the PHP script would then need to complete before anything gets sent, and the page would take forever to load.

#14 @azaozz
16 years ago

A possibly better, more compatible way to spawn cron: wp-cron.php is included early and a redirect header is sent to the browser to load the page again, without exiting from PHP.

Works well on Apache but will need testing with the other web servers especially when the server is set to compress output.
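
The idea can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: the helper names and the `doing_wp_cron` query marker are hypothetical, and output compression on the server may still hold the response back until the script ends.

```php
<?php
// Hypothetical helper: mark the reloaded URL so that request does not start cron again.
function add_cron_marker(string $request_uri, string $marker = 'doing_wp_cron'): string
{
    $separator = (strpos($request_uri, '?') === false) ? '?' : '&';
    return $request_uri . $separator . $marker;
}

// Hypothetical helper: redirect the browser, then keep running jobs in this same request.
function redirect_and_continue(string $request_uri, callable $run_jobs): void
{
    if (isset($_GET['doing_wp_cron'])) {
        return; // this is already the reloaded request
    }
    ignore_user_abort(true); // keep going after the browser moves on
    header('Location: ' . add_cron_marker($request_uri), true, 302);
    echo ' '; // emit a byte of body so there is something to flush
    while (ob_get_level() > 0) {
        ob_end_flush(); // push buffered output, and with it the headers
    }
    flush();
    $run_jobs(); // no exit here: the jobs run after the redirect has been sent
}
```

The browser follows the redirect and loads the page normally, while the original request stays alive on the server to do the cron work.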

#15 @filosofo
16 years ago

+1 to the redirect trick.

#16 @Denis-de-Bernardy
16 years ago

won't the redirect trick conflict with some cache plugins? there are some that actually cache empty pages that send a redirect...

#17 @azaozz
16 years ago

(In [10521]) Cron spawning improvement, see #9005

#18 @azaozz
16 years ago

Tested the redirect trick with WP Super Cache and it seems to work well. It's needed only for hosts that have problems running cron the "usual" way.

The latest patch adds two new switches: define('ALTERNATE_WP_CRON', true) to spawn cron with redirect and define('DISABLE_WP_CRON', true) to disable spawning when using crontab.
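
For illustration, the two switches might be set as below. Placing them in wp-config.php is an assumption, as are the example.com URL and the 15-minute interval in the commented crontab line; normally only one of the two switches would be needed.

```php
<?php
// Sketch of a config fragment (assumed location: wp-config.php).

// Option 1: spawn cron with the redirect method, for hosts where the
// "usual" way of running cron fails.
define('ALTERNATE_WP_CRON', true);

// Option 2: disable spawning on page loads and run wp-cron.php from crontab.
// define('DISABLE_WP_CRON', true);
//
// A matching crontab entry could look like this (hypothetical URL and interval):
// */15 * * * * wget -q -O /dev/null http://example.com/wp-cron.php
```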

#19 @cookeal
16 years ago

Have posted some changes to cron.php and wp-cron.php developed by my hosting company which have solved my problems related to their firewall not allowing the server to connect back to itself (ticket #9118).

This change has resolved my cron issues with all blogs I am hosting on restrictive shared hosting. I don't pretend to understand why but hope that it is useful input!

I would add that the changes suggested so far in this thread did not solve the problem in my case.

#20 @jidanni
16 years ago

  • Cc jidanni@… added

Erm, you guys are erm, bananas for tangling cron with AJAX.

Take my (Mom's) blog,
http://abj.jidanni.org/articles/ ,
http://jidanni.org/comp/wordpress_jidanni_theme.zip
not a speck of JavaScript or even CSS!

So it's not only "how many of my users have JavaScript turned off" but
also "how many of my designers too!"

Anyway if I were in charge, I would have the cron be external, using
the Unix crontab(1) command... at least it would save a check at each
browse... (though "no browse, no cron" I admit is neat.)

Also see #8927 cron comments.

#21 @ryan
16 years ago

  • Resolution set to fixed
  • Status changed from new to closed