Make WordPress Core

Opened 16 years ago

Closed 8 years ago

#6698 closed defect (bug) (maybelater)

Editing a published post causes excessive pings / closing comments on old posts causes trackbacks

Reported by: lapcat's profile lapcat Owned by:
Milestone: Priority: normal
Severity: normal Version: 2.8.1
Component: Pings/Trackbacks Keywords: needs-patch
Focuses: Cc:

Description

I moderate all comments, and I was tired of spam comments on old posts sometimes slipping by Akismet and getting into my moderation queue, so I decided to close comments on a number of old posts. Steps to reproduce: (1) Click the "Manage" tab in the admin page for my blog. (2) Click "Posts" under "Manage". (3) Find a post. (4) Click "Edit" for that post. (5) Uncheck "Allow Comments" under "Discussion". (6) Click "Save".

As a result of doing this, I immediately got a number of trackbacks in my moderation queue. The trackbacks were from the posts whose comments I had just closed. The trackbacks were to other posts in my blog that were linked from those posts. Note that when I unchecked "Allow Comments", I did not uncheck "Allow Pings". I left "Allow Pings" checked.

Under "Options", "Discussion", I currently have "Attempt to notify any blogs linked to from the article" checked. However, I believe that at some point in the past, that option was unchecked, so the old posts whose comments I closed may have never attempted to send trackbacks before.

I believe that this is a bug. Simply closing comments for a post should not cause it to send trackbacks.

Attachments (6)

6698.diff (1.2 KB) - added by Denis-de-Bernardy 15 years ago.
6698.2.diff (1.5 KB) - added by ryan 15 years ago.
Send generic pings only for newly published posts
6698.3.diff (1.6 KB) - added by Denis-de-Bernardy 15 years ago.
wordpress_generic_ping_ticket_6698.patch (1.4 KB) - added by VoxPelli 15 years ago.
Patch for changing one hour delay to one minute and instead having a minimum of half an hour wait between pings
6698.4.diff (915 bytes) - added by Denis-de-Bernardy 15 years ago.
throttling with immediate pinging
wordpress_generic_ping_ticket_6698_2.patch (1.2 KB) - added by VoxPelli 15 years ago.
Switched from pure options to transients and removed the 60 seconds throttle

Download all attachments as: .zip

Change History (62)

#1 follow-up: @jacobsantos
15 years ago

  • Milestone changed from 2.7 to 2.8

This needs to be reproduced in order to find where the mistake is located. Not going to be a fun process.

#2 in reply to: ↑ 1 @vladimir_kolesnikov
15 years ago

  • Keywords 2nd-opinion dev-feedback added

Replying to jacobsantos:

This needs to be reproduced in order to find where the mistake is located. Not going to be a fun process.

:-) I think it is easy. When we save the post, edit_post() invokes wp_update_post() which finally calls wp_insert_post(). If wp_insert_post() succeeds, it will fire 'publish_post' event (we are talking about published posts which gets modified later).

publish_post event is handled by _publish_post_hook. If you take a look at it, it does this:

	if ( get_option('default_pingback_flag') )
		$wpdb->insert( $wpdb->postmeta, $data + array( 'meta_key' => '_pingme' ) );
	$wpdb->insert( $wpdb->postmeta, $data + array( 'meta_key' => '_encloseme' ) );
	wp_schedule_single_event(time(), 'do_pings');

So if default_pingback_flag is on, WP will schedule pings. It doesn't matter what we do - turning comments off or changing post title - this hook will be triggered.

The only workaround I see right now (maybe need to think more), is modify wp_insert_post() from

	if ( isset($to_ping) )
		$to_ping = preg_replace('|\s+|', "\n", $to_ping);
	else
		$to_ping = '';

to

	if ( isset($to_ping) && get_option('default_pingback_flag') )
		$to_ping = preg_replace('|\s+|', "\n", $to_ping);
	else
		$to_ping = '';

What do you think?

#3 @ryan
15 years ago

  • Component changed from Administration to Comments
  • Owner anonymous deleted

#4 @janeforshort
15 years ago

  • Milestone changed from 2.8 to Future Release

Punting to be evaluated in next release cycle due to timeframe.

#5 @Denis-de-Bernardy
15 years ago

  • Severity changed from normal to major

it's more related to a workflow issue. it could be argued that _publish_post_hook should only get triggered on publish, rather than whenever a published post gets saved.

#6 @Denis-de-Bernardy
15 years ago

  • Keywords has-patch added
  • Milestone changed from Future Release to 2.8
  • Summary changed from Closing comments on old posts causes trackbacks to Editing a published post causes pings / closing comments on old posts causes trackbacks

#7 @Denis-de-Bernardy
15 years ago

  • Keywords needs-testing added

#8 @Denis-de-Bernardy
15 years ago

  • Keywords 2nd-opinion dev-feedback removed

#9 @Denis-de-Bernardy
15 years ago

  • Summary changed from Editing a published post causes pings / closing comments on old posts causes trackbacks to Editing a published post causes excessive pings / closing comments on old posts causes trackbacks

#10 @ryan
15 years ago

We run that hook when saving published posts in case content is edited and we have different pings and enclosures to handle.

#11 @ryan
15 years ago

Shouldn't _publish_post_hook() check pings_open($post_id) instead of checking default_pingback_flag?

#12 @Denis-de-Bernardy
15 years ago

I don't think so. pings_open would be for incoming pings, rather than outgoing pings. That function is for outgoing pings.

The other point you raise makes more sense, but as it currently stands, any change to a post, even the most meaningful one such as changing a typo, triggers pings all over the place.

On my end, I'm constantly dealing with users who are pitched by the likes of:

http://www.maxblogpress.com/plugins/mpo/

#13 @ryan
15 years ago

We could pull generic_ping() out of do_all_pings() and schedule it only when publishing a post for the first time.

@ryan
15 years ago

Send generic pings only for newly published posts

#14 @ryan
15 years ago

There's a patch. Of course, some will argue that we should send generic pings for updates, but with a throttle. A patch that uses a transient to throttle generic_ping() might be a better approach.

#15 @Denis-de-Bernardy
15 years ago

  • Owner set to Denis-de-Bernardy
  • Status changed from new to accepted

I'll take care of it before this week-end.

#16 @Denis-de-Bernardy
15 years ago

this one should do the trick

#17 @Denis-de-Bernardy
15 years ago

also of interest on pinging: #2933

#18 @ryan
15 years ago

  • Resolution set to fixed
  • Status changed from accepted to closed

(In [11410]) Throttle generic pings to no more than once per hour. Props Denis-de-Bernardy. fixes #6698

#19 @rtz
15 years ago

  • Cc rtz@… added

I wonder if this patch is really the correct thing to do, it appears to delay all pings for up to an hour, even when publishing a new post.

Is this the intended purpose?

#20 @Denis-de-Bernardy
15 years ago

It is the intended purpose, yes. That way, if you edit a coma a minute after publishing your post, it'll not re-ping on the spot.

#21 @rtz
15 years ago

I understand that you want to delay unnecessary pinging on edits and such, but the problem is that the ping for the original publish action, for a completely new post, is also delayed for an hour. That's probably not going to be popular, people expect their new posts to be pinged immediately, even if their minor edits are queued up for an hour.

If I start editing a new post now, and publish immediately, I can see in Crontrol that a do_pings is scheduled for immediate execution, and a do_generic_ping for one hour into the future. The do_all_pings function no longer calls generic_ping, even for new posts.

Ideally, for a completely new post, the ping should happen immediately, or at least be scheduled to happen one hour after the last ping, which in my case would be yesterday.

I'm using version 2.8-beta2, rev 11440 on trunk.

#22 follow-up: @VoxPelli
15 years ago

  • Cc VoxPelli added
  • Keywords has-patch needs-testing removed
  • Milestone changed from 2.8 to 2.8.2
  • Resolution fixed deleted
  • Status changed from closed to reopened
  • Version changed from 2.3.2 to 2.8.1

I agree with rtz - although it might be good to have a minimum delay it should certainly not be one hour - if there should be one at all, to grab all of the small changes done right after a post is published, it should only be a few minutes long.

The hour that a ping is delayed now should instead be the minimum wait between two pings - but even then one hour is far too long - I think a minimum wait of 30 minutes is more appropriate.

So - I suggest to change the functionality into: If a ping has been made in the last 30 minutes - wait until 30 minutes has passed since then before a ping is made. If not a ping has been made in the last 30 minutes schedule the ping to be made immediately or after a few minutes.

(Btw - I'm not very used to the WordPress ticket system - perhaps a new ticket should be opened to fix this issue, but since it has already been talked about here I'm reopening this one instead - feel free to correct it if you feel it's wrong)

#23 in reply to: ↑ 22 @christianbolstad
15 years ago

  • Cc christianbolstad added
  • Priority changed from normal to high
  • Summary changed from Editing a published post causes excessive pings / closing comments on old posts causes trackbacks to This fix create more problem than it solve and halt the evolution towards the real time web

I agree with both rtz and VoxPelli that this fix create more problem than it solves. The concept with a "blocking timer" is good, but this solution is flawed. Ideally pings should be sent instantly when a new post is published. After that the timer can be started.

Instant announcement to other services is critical for the emerging real time web.

#24 @ryan
15 years ago

  • Summary changed from This fix create more problem than it solve and halt the evolution towards the real time web to Editing a published post causes excessive pings / closing comments on old posts causes trackbacks

#25 @VoxPelli
15 years ago

I btw also fail to see how [11410] fixes the original issue reported in this ticket - it doesn't change the behaviour of trackbacks - only the behaviour of general pings was changed.

#26 @Denis-de-Bernardy
15 years ago

@VoxPelli: Yawn. Make yourself useful and offer a better patch instead of ranting on your blog:

http://www.karamell.net/2009/07/19/wordpress-putting-the-real-time-web-to-a-halt/

Plus, it's trivial to override in a plugin:

if ( wp_next_scheduled('do_generic_ping') ) {
  wp_clear_scheduled_hook('do_generic_ping');
  do_action('do_generic_ping');
}

By comparing the return value of wp_next_scheduled() you could even throttle it differently to suit whichever behavior makes the most sense to you -- instant, after a minute, after a half hour, whatever.

#27 @Denis-de-Bernardy
15 years ago

Oh, sorry, it's another guy. But still... patch welcome.

#28 @Denis-de-Bernardy
15 years ago

The reverse throttling could be done using a transient that expires after a certain amount of time.

@VoxPelli
15 years ago

Patch for changing one hour delay to one minute and instead having a minimum of half an hour wait between pings

#29 @Denis-de-Bernardy
15 years ago

Patch isn't good. Use this to throttle:

set_transient('last_ping', '1', time() + 3600);

It'll return false as soon as the transient expires.

@Denis-de-Bernardy
15 years ago

throttling with immediate pinging

#30 @Denis-de-Bernardy
15 years ago

  • Keywords has-patch tested added

#31 @Huvet
15 years ago

Denis-de-Bernardy: In my opinion, excessive pinging on edit and closing of a post, shouldn't prompt a wordpress change that changes the user experience for everyone. I rely on pinging to get my posts indexed by (several different) search engines, and since I report on current news events, being indexed one hour late, means someone else's article will be indexed instead. This is serious, and I believe a serious hit to wordpress as a platform.

Now, you're saying that this can be overridden with a plugin, but I believe the default wordpress installation should be as good as possible. It shouldn't demand installation of plugins to work for a news blog.

The only good reason to change this that I see (default behaviour), is if ping services would complain to wordpress that they are getting too much traffic. But I highly doubt this. Can't Google handle the load? Of course they can, and they can handle eventual (rare) dublicates that we send them.

Have anyone initiated contact with pinging services, and asked them what they think of this change? My guess is that they will be furious, that we take away their chance to get updates directly when someone posts them (Google will have a harder time to pick up blog posts).

To me, there are two options:

1) Change the patch to only affect *edits* of a post, not the initial publish event.
2) Make throttling of all posts a plugin instead (add hooks or what you need to make it happen). The reason for this is that I think that excessive pinging is a less important problem than delayed crawling.

No matter what change you decide to go forward with, please think this through.

#32 @Denis-de-Bernardy
15 years ago

@Huvet: see 6698.4.diff, which was added an hour ago.

And here's code to add in a plugin until it gets checked in:

if ( ( wp_next_scheduled('do_generic_ping') > time() + 60 ) && !get_transient('last_ping') ) {
	wp_clear_scheduled_hook('do_generic_ping');
	wp_schedule_single_event(time(), 'do_generic_ping');
	set_transient('last_ping', time(), time() + 3600);
}

#33 @Denis-de-Bernardy
15 years ago

@Huvet - also, keep in mind that there are scaremongers on the web. see:

http://core.trac.wordpress.org/ticket/6698#comment:12

#34 follow-up: @VoxPelli
15 years ago

  • Component changed from Comments to Pings/Trackbacks
  • Keywords needs-testing added; tested removed

@Denis: Your suggested patch will not queue up pings during the wait period - if I create a new post during the wait period it should be queued up and a ping sent as soon as the minimum wait time between ping calls has passed.

Why would a transient be better than a pure option? I haven't come across the transient function that much before - but giving them a brief look now I think they only add excessive database interaction by having that timeout - because it doesn't delete a timed out option prior to it being accessed at which time it will be updated immediately as well.

I would still suggest using my patch - perhaps with a change to transient if there are any good reason for that that I now fail to see and perhaps removing the now forced minimum 60 seconds throttling.

#35 in reply to: ↑ 34 ; follow-up: @christianbolstad
15 years ago

  • Keywords needs-testing removed

I have tried the patch submitted by VoxPelli against the nighly build and it work as described. It is a quick fix that is "good enough" to handle the problems with notifications to ping services in a short perspective. In the longer run the ping system has to be rewritten a bit more. I am gathering recommendations from the different services and would love to co-operate with other developers to build the optimal solution.

http://core.trac.wordpress.org/attachment/ticket/6698/wordpress_generic_ping_ticket_6698.patch

#36 in reply to: ↑ 35 @VoxPelli
15 years ago

  • Keywords tested added

Replying to christianbolstad:

I have tried the patch submitted by VoxPelli against the nighly build and it work as described. It is a quick fix that is "good enough" to handle the problems with notifications to ping services in a short perspective.

I'm marking this as tested then

#37 follow-up: @Denis-de-Bernardy
15 years ago

Transients and options work in much the same way except that a) they can expire (hence no need for your options-related logic) and b) they're not stored in the database when using a persistent object cache.

Anyway, if you feel a 60s throttle before any pinging occurs is in order, base your patch on 6698.4.diff, and schedule the ping to occur at time() + 60 instead of time().

@VoxPelli
15 years ago

Switched from pure options to transients and removed the 60 seconds throttle

#38 in reply to: ↑ 37 @VoxPelli
15 years ago

  • Keywords needs-testing added; tested removed

Replying to Denis-de-Bernardy:

Transients and options work in much the same way except that a) they can expire (hence no need for your options-related logic) and b) they're not stored in the database when using a persistent object cache.

When saved directly to database it seems like pure options would be better in this case - but if transients works better with the persistent object cache as you say I should use them instead. I've updated the patch with that change.

Anyway, if you feel a 60s throttle before any pinging occurs is in order, base your patch on 6698.4.diff, and schedule the ping to occur at time() + 60 instead of time().

I actually removed the throttle now - if a real cron job hasn't been set up the website has to actually be visited to trigger a cron job and we don't know when that will happen so even a 60 seconds throttle can turn in to a very long throttle for a blog not visited very often.

I based my new patch on my old because I think your patch is broken in at least three ways.

First it ignores all ping sources during the waiting period between two pings - it should queue a ping instead as my patch does.

Secondly it sets the timer in _transition_post_status() when it's only in generic_ping() we actually know the correct timestamp.

Thirdly a minor thing - the third argument of set_transient() is not a timestamp - it's the lifetime of the option.

#39 follow-up: @Denis-de-Bernardy
15 years ago

Err...

the expiration parameter is checked as such:

if ( get_option($transient_timeout) < time() )

So it's a timestamp that is needed, and your patch isn't working because of this -- it always returns false.

Your other two points you raise are equally invalid. The value I put in the transient could be 1, or "about to do a ping", or whatever -- it doesn't matter, its time() value is only there in the event a plugin might want to know when it was set.

Re the second point, it's just a different approach. With the patch I suggested we ping once per hour at most (and based on my testing since yesterday, it's working exactly as documented).

With the one you're suggesting makes it ping once per hour per post. But if that's what we want, we should be tossing an ID parameter in generic_pings and in the cron.

As to which is better, I honestly don't care myself. Keep in mind that pinging once per hour per post, however, is bait for marketers who scare poorly informed users into buying pricey solutions to "WP pings too often" problems. (See comments above, re the reason the throttling was introduced.)

#40 @Denis-de-Bernardy
15 years ago

  • Keywords tested commit added; needs-testing removed

Re-suggesting 6698.4.diff as a commit candidate, as it fixes the real-time issue raised by christian. And suggesting we open a separate ticket in the event we want per-post pinging (could be worth a discussion during the next IRC meet-up).

#41 in reply to: ↑ 39 @VoxPelli
15 years ago

  • Keywords needs-testing added; tested commit removed

Replying to Denis-de-Bernardy:

So it's a timestamp that is needed, and your patch isn't working because of this -- it always returns false.

Did you actually try my patch? It's true that it checks for a timestamp - but at line 731 in functions.php it adds the current timestamp to the expiration time, which leaves your options expiring sometime in februari 2049:

add_option($transient_timeout, time() + $expiration, , 'no');

Your other two points you raise are equally invalid. The value I put in the transient could be 1, or "about to do a ping", or whatever -- it doesn't matter, its time() value is only there in the event a plugin might want to know when it was set.

Yes - I realize now that that point is invalid because of my first point - my bad

Re the second point, it's just a different approach. With the patch I suggested we ping once per hour at most (and based on my testing since yesterday, it's working exactly as documented).

I'm not arguing that your patch doesn't fix that a ping is made at most once an hour - I'm saying that I think it's wrong that it doesn't queue a ping for changes made during that hour since that leaves ping services with outdated information.

Let me give you an example:

At 10.00 AM: I'm posting a new post and my blog pings with both our patches.

At 10.10 AM: I discover a serious error in that post and changes it - with my patch a ping is queued to take place at 10.30 AM since I don't want to ping services more than once every 30 minutes - but with your patch this change will not be communicated to the ping services at all.

At 10.20 AM: I'm really creative this hour so I'm posting yet another post - with my patch since a ping is queued already nothing further happens - with your patch this post will also not be communicated to ping services at all.

At 10.25 AM: I'm moving away from the computer for some hours to do some administrative work.

At 10.30 AM: It was 30 minutes since the ping services was notified of changes from my blog and we can feel safe to send a new ping. With my patch, since changes that should be pinged has already happened during that time, the system will now immediately ping them to alert them that they should check my blog for updates - with your patch nothing happens.

At 12.00 AM: Since nothing has changed on the blog - I'm doing administrative work - the ping services will believe they are up to date on my blog. With my patch they will be - with your patch they won't be until I'm done with my administrative work and returning to do some more blogging.

Ping services want to be up to date - ping services don't want to be spammed. By limiting a ping to once every half an hour but also making sure that every change is actually followed by a ping we can ensure to have as up to date ping services as we possibly can. A maximum of two pings per hour and a maximum delay of 30 minutes for a ping service to be notified of a change.

Btw - how can you say that your patch is tested and committed? I have reviewed it and I find it broken. Don't you need a third party review to label your patch as tested?

I'm btw obviously still suggesting my patch as the solution here.

#42 @VoxPelli
15 years ago

Just to clarify - I'm not suggesting per post pings - I'm suggesting that pings shouldn't be lost just because we don't want to spam ping services.

#43 @Denis-de-Bernardy
15 years ago

Woops, I stand corrected on that timeout... My bad, you're correct.

Re the workflow, that would be ideal, if we had more contributors.

#44 @ryan
15 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [11732]) Process waiting pings a half hour after the last ping to avoid spamming ping sites. Don't make a ping wait if the last ping was more than half an hour ago. Props VoxPelli. fixes #6698 for trunk

#45 @ryan
15 years ago

(In [11733]) Process waiting pings a half hour after the last ping to avoid spamming ping sites. Don't make a ping wait if the last ping was more than half an hour ago. Props VoxPelli. fixes #6698 for 2.8

#46 @ryan
15 years ago

Committed to trunk and the 2.8 branch. This will be in 2.8.3 if and when it comes out. Thanks VoxPelli.

#47 @ryan
15 years ago

(In [11734]) Pings to the people. Return to pinging for updates to published posts. see #6698

#48 @ryan
15 years ago

(In [11735]) Pings to the people. Return to pinging for updates to published posts. see #6698 for 2.8

#49 @ryan
15 years ago

After further discussion, reverting to the old behavior. Some will be unhappy with that, but it seems to fit expectations the best these days.

#50 follow-up: @Otto42
15 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Reading through this, I'm not sure why there was all this patching, none of which solved the original problem.

To solve the actual original problem: Can we detect when the post_content changed and only generate a ping event when it has actually changed (and not some other field has been altered)?

This keeps the real-time aspect (which is indeed critical, IMO), while solving the original problem of not forcing pings when the content remains unaltered.

#51 @azaozz
15 years ago

  • Keywords needs-patch added; has-patch needs-testing removed
  • Priority changed from high to normal
  • Severity changed from major to normal

Could probably do something like:

$old = normalize_whitespace( $old_post['post_content'] );
$new = normalize_whitespace( $new_post['post_content'] );

if ( $old != $new || $old_post['post_title'] != $new_post['post_title'] || $old_post['post_excerpt'] != $new_post['post_excerpt'] )
    do_pings...

#52 @Denis-de-Bernardy
15 years ago

  • Owner Denis-de-Bernardy deleted
  • Status changed from reopened to assigned

#53 @demetris
14 years ago

  • Cc dkikizas@… added

#54 @ryan
14 years ago

  • Milestone changed from 2.9 to Future Release

#55 in reply to: ↑ 50 @jeremyfelt
10 years ago

Only pinging when a post has truly been modified makes sense, but proves difficult.

I started going down the road of removing the default publish_post hook to _publish_post_hook() if post content was different. That alone disables xmlrpc_publish_post, which may still be desirable. We could instead move the scheduling of the event to wp_insert_post() to get around that.

However... where do we draw the line for a post being updated? Comparing various WP_Post properties is pretty easy, but there can be a variety of post meta attached that may or may not indicate a valid change. Even an administrative task like closing comments could be considered helpful in notifying others that comments are no longer accepted on a piece of content.

After poking around, I'm starting to think that the current content based behavior makes sense - always ping. If there is something specific that should be avoided, a hook on transition_post_status can remove the default action before it fires.

I'm not entirely sure that we should consider this a 'bug'. It may make sense to close as maybelater to reconsider if demand mounts in the future.

#56 @chriscct7
8 years ago

  • Milestone Future Release deleted
  • Resolution set to maybelater
  • Status changed from assigned to closed
Note: See TracTickets for help on using tickets.