Make WordPress Core

Opened 5 years ago

Last modified 10 months ago

#15148 new enhancement

Cron Storage Abstraction

Reported by: ryan Owned by:
Milestone: Future Release Priority: normal
Severity: major Version:
Component: Cron API Keywords: has-patch dev-feedback
Focuses: Cc:


Abstract cron storage to allow pluggable storage schemes.

Attachments (4)

15148.diff (4.7 KB) - added by ryan 5 years ago.
15148.2.diff (12.2 KB) - added by ryan 5 years ago.
18084.patch (12.3 KB) - added by mintindeed 5 years ago.
Updated WP_Cron_Store
cron.php (8.9 KB) - added by mintindeed 5 years ago.
Proposed improved cron store (as a drop-in)

Download all attachments as: .zip

Change History (13)

5 years ago

#1 @westi
5 years ago

  • Cc westi added

5 years ago


#2 @ryan
5 years ago

Moved to WP_Cron_Store class that can be inherited by a custom class or overridden. It is loaded in the same fashion as wpdb. I started by using a custom class filter as is done with the xmlrpc server class and other classes, but core WP (and thus probably many plugins) was calling cron functions prior to init. I moved the core places that were doing this into a function hooked onto init. If we are willing to suffer a back compat hit, we could move to the class filter approach which has the advantage of not requiring files living directly in wp-content that aren't part of auto updates.

#3 @markjaquith
5 years ago

  • Milestone changed from Awaiting Review to Future Release

5 years ago

Updated WP_Cron_Store

#4 @mintindeed
5 years ago

I've attached an updated version of Ryan's patch. I've tested it locally and fixed a few minor bugs, but haven't tested it in production yet.

I'm also attaching a cron.php drop-in that would use a separate cron job table. This drop-in won't handle concurrent wp cron processes any better than the current wp cron, but it will handle prevention of duplicate jobs and prevent race conditions that cause jobs to get lost. Concurrent instances of wp cron could be minimized by updating executing jobs to the "processing" state and checking the job state prior to executing. They could be completely eliminated by spawning 1 cron process to handle each queued task, but I didn't want to get too ambitious without feedback.

Some of the modifications include:

  • Private methods are handled entirely within the cron store class, cron API no longer uses private methods but instead passes arguments to the cron store class
  • Changed WP API to pass functions directly to the cron store class. The cron store can figure out how it wants to handle those params.
  • Ryan's previous patch removed the procedural versions of _get_cron_array, _set_cron_array and _upgrade_cron_array, this patch takes the next step and actually labels them as protected methods. Not having a publically-accessible version of these methods may break some plugins that used them in spite of being labelled with @access private. If that's the case, may need to make the methods public and create a public procedural function that also invokes doing_it_wrong and _deprecated.
  • I wonder if it's cheaper for *_Cron_Store->get_event() to first check if the event exists, if not then immediately query the event from the database rather than looping through the current cron array?
  • Can't bail out of *_Cron_Store->get_event() early if _cron_array is empty, because there may be events stored in the database. Don't want to query all pending events because of large job queues, e.g. Disqus on sites with a lot of comments.
  • If using the PMC_Cron_Store class, wp-cron.php could be improved by updating in-progress jobs to "processing" status. Then it would be possible to go back and see what jobs failed/never completed, and potentially have an automated process that would retry them and/or alert the admin that the jobs never completed. It also paves the way for having wp cron spawn 1 process to handle each pending job it tries to process, rather than trying to procedurally complete jobs. This would theoretically negate the need for the "doing cron" transient.

#5 @mintindeed
5 years ago

  • Cc gabriel.koen@… added

5 years ago

Proposed improved cron store (as a drop-in)

#6 @mintindeed
5 years ago

  • Keywords has-patch added

Oops, the cron.php drop-in I originally attached depended on a function only available in upgrade.php. Fixed that.

#7 @agupta_pmc
5 years ago

  • Cc agupta_pmc added

#8 @chriscct7
14 months ago

  • Keywords dev-feedback added

#9 @archon810
10 months ago

  • Severity changed from normal to major

Hello gents,

Any movement here? Let me tell you a wonderful tale that made me stay up till 8am while the site and all servers were on fire, all due to cron, as I found out eventually. It proved once again that using a single serialized array in wp_options is a terrible, horrible, evil idea.

The sequence of events was as follows.

  1. On a busy site (15+mln pageviews a month), cron is used relatively extensively, but within reason.
  2. Heavy load leads to some db performance issues.
  3. A sneaky bug in one of our plugins results in a single cron entry being scheduled by probably 10% of the requests. Usually, the only side effect would be that this cron entry would stick around in the cron job list because it gets reinserted over and over.
  4. These db performance issues result in both slow writes and reads, which in turn means that due to a race condition, wp-cron gets read and written over and over, overwriting anything the jobs that are in progress and potentially even already complete and putting them back in. It's easy to see how that turns into a nightmare. But it gets worse. Btw, I have to mention that due to cron performance issues in the past, I've long switched off the built-in wp cron and enabled a Linux cron-initiated wp-cron.
  5. The heightened load and constant cron overwriting grows the cron job list to about 450 jobs that have no chance to get flushed, at least that was the state I found it in. At this point, just the cron array is 1.8MB, and the servers are burning way up.
  6. Did I mention how things manage to get more fun? The db server is actually a master in a master-slave configuration, and it's starting to spit out binlogs like it's nobody's business. I'm talking a gig per minute at its worst. Network is burning up too, but... then the master runs out of space after eating through 100GB of space in a matter of 2-3 hours.
  7. Oh, and additionally, the huge increases in cron array size (i.e. the call to get_options()) ends up in tons of OOM errors in the database, at which point Wordpress does wonderful things like considering the query result as returning empty. At that point, the site may switch to the "Hiii, it's my first install, here's an install page where you can create an admin user" message, various plugin settings get reset, and other fun stuff. It's really really fun. I can't recommend enough trying this once, but don't forget to take some speed first.
  8. At this point replication breaks, but the slave remains online, which was the only saving grace and together with W3 Total Cache kept the site mostly up, as long as pages remained in the cache.

It took me many hours to finally figure out what's going on, fix the cron bug, delete the ballooned cron job list, re-initiate replication, restore plugin settings from a backup, and watch the sun rise.

Yes, there was a bug that was writing a cron entry too frequently. But if the cron system in WP used a dedicated table, none of this would have happened. It would be way more robust and not easily broken by race conditions.

I sincerely hope this ticket gets some action for a release this year. It would seriously make my day/month/year.

Thanks for reading.

Last edited 10 months ago by archon810 (previous) (diff)
Note: See TracTickets for help on using tickets.