Make WordPress Core

Opened 9 years ago

Closed 9 years ago

Last modified 9 years ago

#33423 closed defect (bug) (fixed)

Arguments switched in wp_batch_split_terms Cron Job in 4.3

Reported by: mechter's profile mechter Owned by: dd32's profile dd32
Milestone: 4.3.1 Priority: high
Severity: critical Version: 4.3
Component: Taxonomy Keywords: fixed-major
Focuses: Cc:

Description

Hi all,

I've upgraded to 4.3 and there seems to be an issue with the cron job wp_batch_split_terms.

Line 4448 of wp-includes/taxonomy.php calls:

wp_schedule_single_event( 'wp_batch_split_terms', time() + MINUTE_IN_SECONDS );

I think it meant to call

wp_schedule_single_event( time() + MINUTE_IN_SECONDS, 'wp_batch_split_terms' );

Note that there is now a cron job with the hook name for a timestamp in the database, that might need dealing with.

Attachments (2)

33423.diff (891 bytes) - added by Otto42 9 years ago.
Suggested patch to avoid massive DB operations
33423.2.diff (1.2 KB) - added by peterwilsoncc 9 years ago.

Download all attachments as: .zip

Change History (55)

#1 @dd32
9 years ago

  • Milestone changed from Awaiting Review to 4.3.1

Thankfully the affected call is in _wp_check_for_scheduled_split_terms() which acts as a backup - so this won't affect many people, the main job scheduled during the database upgrade (and during the cron job itself) is correct.

We'll need to clean up the old schedule, as it won't be purged automatically AFAIK.

#2 @dd32
9 years ago

  • Owner set to dd32
  • Resolution set to fixed
  • Status changed from new to closed

In 33646:

Term Splitting: Fix a reversal of parameters to wp_schedule_single_event() introduced in [33621].
The existing invalid cron entries will not be purged automatically (as the 'timestamp' is never matched) so we do this ourselves.

Props mechter for noticing!
See #30261.
Fixes #33423 for trunk.

#3 @dd32
9 years ago

In 33647:

Term Splitting: Fix a reversal of parameters to wp_schedule_single_event() introduced in [33621].
The existing invalid cron entries will not be purged automatically (as the 'timestamp' is never matched) so we do this ourselves.

Merges [33646] to the 4.3 branch.
Props mechter for noticing!
See #30261.
Fixes #33423 for the 4.3 branch.

#4 @swissspidy
9 years ago

#33462 was marked as a duplicate.

#5 @riseoflex88
9 years ago

Yes, this is a major issue for me,

I have three sites now on wp4.3,

Site 1, csgohub.gg - http://www.gp-hub.com/wp-content/uploads/2015/08/Screen-Shot-2015-08-20-at-11.49.33.png
Site 2, gp-hub.com - http://www.gp-hub.com/wp-content/uploads/2015/08/Screen-Shot-2015-08-20-at-11.49.46.png
Site 3, csgo24.gg - http://www.gp-hub.com/wp-content/uploads/2015/08/Screen-Shot-2015-08-20-at-11.49.39.png

All three running crons in varying frequencies.

Site 1 traffic is around 300 onsite now, so crons are scheduled every second of the day. Site 2, has a much lower frequency so crons are every few seconds, same for site 3.

As advised above, changing taxonomy.php line 4448 to

wp_schedule_single_event( time() + MINUTE_IN_SECONDS, 'wp_batch_split_terms' );

Fixed the issue for me

#6 @dd32
9 years ago

#33463 was marked as a duplicate.

#7 @samuelsidler
9 years ago

  • Keywords fixed-major added

#8 follow-up: @dd32
9 years ago

Just to note it here, as mentioned above and in #33463, there appears to be a rare case which is much worse than what I initially encountered here.

If term splitting hasn't completed on a site, and the cron isn't queued up by the Database upgrade OR the cron fails to work correctly, it gets stuck into a state where it adds an extra cronjob (which will never run) to the database, causing the option to fill up.

This is likely to affect few people, as the cron will do it's job for vast majority of cases from what I can tell..

#9 in reply to: ↑ 8 ; follow-up: @Kent Brockman
9 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Replying to dd32:

Just to note it here, as mentioned above and in #33463, there appears to be a rare case which is much worse than what I initially encountered here.

If term splitting hasn't completed on a site, and the cron isn't queued up by the Database upgrade OR the cron fails to work correctly, it gets stuck into a state where it adds an extra cronjob (which will never run) to the database, causing the option to fill up.

This is likely to affect few people, as the cron will do it's job for vast majority of cases from what I can tell..

Hello, I can confirm 12 websites with this exact same problem. All in different servers, with different Apache, PHP, MySQL version combinations. But all of them working on WP 4.3. This fix (https://core.trac.wordpress.org/changeset/33646/trunk/src/wp-includes/taxonomy.php) solves the problem for me. The main problem is how to massively delete all of the currently useless entries this bug created. You can see a screen capture in this forum thread:
https://wordpress.org/support/topic/how-can-i-massively-delete-cron-tasks-from-outside-wp?replies=2#post-7330175

Sorry for bugging you, but this bug turned my day in a mayhem, because sites with high traffic have the cron list flooded with thousands of entries. One of my sites had 12k cron tasks created by this bug. How to cleanup all that mess? be my guest :)

#10 in reply to: ↑ 9 ; follow-up: @ocean90
9 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

Replying to Kent Brockman:

One of my sites had 12k cron tasks created by this bug. How to cleanup all that mess? be my guest :)

[33646] includes an upgrade routine for 4.3.1 which unschedules the invalid cron entries, see https://core.trac.wordpress.org/changeset/33646#file1.

#11 @Otto42
9 years ago

Yeah, this created hundreds of entries on my sites as well. Perhaps this is a candidate for the Hotfix plugin, because this is causing real load problems in a lot of cases.

#12 in reply to: ↑ 10 @Kent Brockman
9 years ago

Replying to ocean90:

Replying to Kent Brockman:

One of my sites had 12k cron tasks created by this bug. How to cleanup all that mess? be my guest :)

[33646] includes an upgrade routine for 4.3.1 which unschedules the invalid cron entries, see https://core.trac.wordpress.org/changeset/33646#file1.

Great! Any ETA on when the 4.3.1 will be released? Today? Tomorrow?

#13 follow-up: @Otto42
9 years ago

Kent, if you need an immediate fix, I left a post here with instructions on how to repair your site and prevent this, until 4.3.1 is released:
https://wordpress.org/support/topic/high-cpu-load-after-update-to-v43?replies=17#post-7330770

#14 @riseoflex88
9 years ago

[edit: ignore this message] - see otto's above for a proper fix.

Last edited 9 years ago by riseoflex88 (previous) (diff)

#15 @Otto42
9 years ago

@riseoflex88 That will delete all the cron entries, which can cause problems with other plugins or anything that relies on the cron system and lacks a rescheduling mechanism.

The instructions I gave in the link above are more focused, and will only fix the problem, without potentially causing new ones. Please recommend those instructions instead.

#16 @boonebgorges
9 years ago

#33463 was marked as a duplicate.

#17 in reply to: ↑ 13 @Kent Brockman
9 years ago

Replying to Otto42:

Kent, if you need an immediate fix, I left a post here with instructions on how to repair your site and prevent this, until 4.3.1 is released:
https://wordpress.org/support/topic/high-cpu-load-after-update-to-v43?replies=17#post-7330770

Thanks Otto, it works. But I'm unable to apply it on more than 60 sites. I hope things don't get ugly before the update is released..

#18 @mobiFlip
9 years ago

One of my sites had thousands cron tasks created by this bug. -.- and fix kills the mysql-server because of the high load

Last edited 9 years ago by mobiFlip (previous) (diff)

#19 @Compute
9 years ago

What is the status for a 4.3.1 release? This is really critical and can break alot of sites..

#20 follow-up: @mikenolan
9 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

The cleanup code on this causes us serious issues - we have binary logging enabled for replicating our database and in attempting to clean up the old cron jobs it generates many, many GB of binary logs in a couple of minutes. Think this is because it's calling wp_unschedule_event thousands of times in the foreach loops, each of which is writing up to a MB back to the binary log.

#21 in reply to: ↑ 20 @Kent Brockman
9 years ago

Replying to mikenolan:

The cleanup code on this causes us serious issues - we have binary logging enabled for replicating our database and in attempting to clean up the old cron jobs it generates many, many GB of binary logs in a couple of minutes. Think this is because it's calling wp_unschedule_event thousands of times in the foreach loops, each of which is writing up to a MB back to the binary log.

Maybe asking to recode the wp_unschedule_event function to accept an array of task names to be unscheduled could be a good solution to reduce amount of function calls.

This ticket was mentioned in Slack in #core by jeichorn. View the logs.


9 years ago

#23 @Otto42
9 years ago

I agree, calling wp_unschedule_event repeatedly for thousands of entries takes a long time and may also not finish within a 30 second window.

Attached a patch 33423.diff which basically performs the same operation in one pass, with one query. Essentially just copies the operation from wp_unschedule_event into the upgrade function directly, then uses _set_cron_array once only.

@Otto42
9 years ago

Suggested patch to avoid massive DB operations

#24 @dd32
9 years ago

Yeah, this was based on me only finding ~20 items on all the sites I checked, before reports of thousands of entries were reported.

The change can be simplified to this instead:

	$cron_array = _get_cron_array();
	if ( isset( $cron_array['wp_batch_split_terms'] ) ) {
		unset( $cron_array['wp_batch_split_terms'] );
	}
	_set_cron_array( $cron_array );
Version 0, edited 9 years ago by dd32 (next)

#25 @Otto42
9 years ago

That seems like a good and quick simplification, since those entries should not exist anyway.

This ticket was mentioned in Slack in #core by peterwilsoncc. View the logs.


9 years ago

#27 @peterwilsoncc
9 years ago

  • Keywords fixed-major removed

An incorrectly named task was being scheduled in admin_init catch.

In all instances, the task to be scheduled should be: wp_split_shared_term_batch
The erroneously scheduled event that needs to be cleared used the timestamp: wp_batch_split_terms

This fix and the changes from comment:24 are in 33423.2.diff.

#28 @dd32
9 years ago

In 33727:

Term Splitting: Switch to a faster cron unschedule process to benefit sites with thousands of affected jobs. Fix the cron hook name in the failsafe rescheduler.

Props Otto42, dd32, peterwilsoncc
See #33423

#29 @dd32
9 years ago

  • Keywords fixed-major added

#30 @Asif2BD
9 years ago

I noticed it first 2 days ago, when one of my server ran out of space, about 100GB was used up, very weird. After some digging I find out its all because of MySQL bin log fines, each 100MB, thousand of those file, causing about 100GB. I removed those log file, and temporary stopped logging in mysql.

This ticket was mentioned in Slack in #core by peterwilsoncc. View the logs.


9 years ago

#32 follow-up: @peterwilsoncc
9 years ago

To cover databases expanding in size each time a user hits an admin page, I've written a hotfix plugin.

The plugin will deactivate itself once WordPress core upgrades to version 4.3.1.

#33 in reply to: ↑ 32 ; follow-up: @vi3t
9 years ago

@peter, How to run plugin after installed?

#34 in reply to: ↑ 33 @peterwilsoncc
9 years ago

Replying to vi3t:

@peter, How to run plugin after installed?

Answered in the associated support thread.

This ticket was mentioned in Slack in #core by alm. View the logs.


9 years ago

#36 @karpstrucking
9 years ago

For what it's worth, of the 385 sites we tested, we found that 40 were affected by this. Most were only mildly affected, with a dozen or less of these cronjobs, but a few had several hundred and one with ~93k.

Last edited 9 years ago by karpstrucking (previous) (diff)

#37 @SergeyBiryukov
9 years ago

  • Priority changed from normal to high
  • Severity changed from normal to critical

#38 follow-up: @pavelevap
9 years ago

I found two websites which were blocked because of Fatal error (insufficient memory), I heard that some people bought higher price plan to increase memory_limit and many other websites are really slow (and wp_options table is huge). This should landed a week ago :-(

BTW: Why are some websites affected and some not?

#39 @worcesterwideweb
9 years ago

I found this issue on a site of mine as well, what a pain to find. thanks for the hotfix it cleared it up. hope to see an update soon

#40 in reply to: ↑ 38 @SergeyBiryukov
9 years ago

Replying to pavelevap:

BTW: Why are some websites affected and some not?

Apparently it's caused by unfinished term splitting combined with cron issues, see comment:8.

#41 @pavelevap
9 years ago

Thank you, Sergey, but I am still not sure why/how term splitting should be unfinished. I searched one affected website and there is only 40 categories, 100 tags and no other taxonomies. Another website has many more terms and no problems...

#42 @DrewAPicture
9 years ago

#33681 was marked as a duplicate.

#43 @dd32
9 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 33877:

Term Splitting: Switch to a faster cron unschedule process to benefit sites with thousands of affected jobs. Fix the cron hook name in the failsafe rescheduler.

Merges [33727] to the 4.3 branch
Props Otto42, dd32, peterwilsoncc
Fixes #33423 for trunk

#44 @dd32
9 years ago

In 33936:

Cron: Reject events when the provided $timestamp is not a valid timestamp.

Invalid timestamps were previously accepted by the scheduling functions but would never be run due to our implementation which caused the cron option to forever contain the events.
This rejects such events which most likely only occur due to developer error.

Props utkarshpatel, wonderboymusic, SergeyBiryukov.
See #33423, Fixes #33475

#45 @ocean90
9 years ago

#33807 was marked as a duplicate.

#46 @luciole135
9 years ago

Hi,

Yesterday I had this error on my site: Fatal error: Allowed memory size of 67108864 bytes exhausted (tried to allocate 1726875 bytes) in .../wp-includes/functions.php on line 384

After I removed this option in wp_options, the admin page was accessible again normally.

Today, in this cron option, wp_batch_split_terms to him has only 214 lines like this:
{s:8:"schedule";b:0;s:4:"args";a:0:{}}}i:1442263593;a:1:{s:32:"..........";a:2:

Should I disable the cron?

#47 @Otto42
9 years ago

@luciole135:

The instructions I posted here will allow you to fix the cron entry and prevent the problem from recurring. Follow those instructions:

https://wordpress.org/support/topic/high-cpu-load-after-update-to-v43?replies=17#post-7330770

#48 @luciole135
9 years ago

Thank you, it works perfectly!

#49 follow-up: @Asif2BD
9 years ago

I am confused, this bug is not fixed in 4.3.1? Its not shipped with 4.3.1?

#50 @Otto42
9 years ago

This bug is indeed fixed in 4.3.1.

#53 in reply to: ↑ 52 @Otto42
9 years ago

Replying to Asif2BD:

Thanks, I got confused after seeing this link from 4.3.1 release note, they dont list this ticket.

It does, actually. Changeset [33877], right at the bottom of that list. Description: "Term Splitting: Switch to a faster cron unschedule process to benefit …"

Note: See TracTickets for help on using tickets.