WordPress.org

Make WordPress Core

Opened 4 years ago

Closed 2 months ago

#36824 closed enhancement (fixed)

do_all_pings function queries all posts

Reported by: spacedmonkey Owned by: boonebgorges
Milestone: 5.3 Priority: normal
Severity: normal Version: 2.1
Component: Pings/Trackbacks Keywords: needs-patch
Focuses: performance Cc:
PR Number:

Description

In the do_all_pings function, the following sql query runs.

SELECT ID FROM $wpdb->posts WHERE to_ping <> AND post_status = 'publish'

Performance monitor tool (new relic) has flagged this has a bad query. It said the following

This table was retrieved with a full table scan, which is often quite bad for performance, unless you only retrieve a few rows.
The table was retrieved with this index:
No index was used in this part of the query.
You can speed up this query by querying only fields that are within the index. Or you can create an index that includes every field in your query, including the primary key.
Approximately 114860 rows of this table were scanned.

I don't understand why this call isn't going through the wp query object. If it did you could then cache the result.

Attachments (16)

36824.patch (3.3 KB) - added by spacedmonkey 4 years ago.
First pass at a patch
36824.1.patch (3.5 KB) - added by spacedmonkey 4 years ago.
36824.3.diff (4.2 KB) - added by dshanske 3 years ago.
Attempt to Add Unit Tests
36824.4.diff (4.4 KB) - added by spacedmonkey 3 years ago.
36824.5.diff (1.8 KB) - added by dshanske 3 years ago.
Refresh of the Pingme and Encloseme improvements
36824.4.1.patch (2.3 KB) - added by janw.oostendorp 2 years ago.
Updated the query to include nopaging
36824.5.2.diff (5.0 KB) - added by dshanske 2 years ago.
36824.6.diff (5.0 KB) - added by mrmadhat 2 years ago.
36824.7.diff (5.4 KB) - added by dshanske 18 months ago.
36824.8.diff (5.7 KB) - added by boonebgorges 18 months ago.
36824.9.diff (6.1 KB) - added by dshanske 18 months ago.
36824.10.diff (14.9 KB) - added by birgire 18 months ago.
36824-pings.diff (1.0 KB) - added by boonebgorges 3 months ago.
36824-enclosures.diff (12.2 KB) - added by boonebgorges 3 months ago.
36824-trackbacks.diff (1.4 KB) - added by boonebgorges 3 months ago.
36824-46175-47746-fix-php-74-compat.patch (956 bytes) - added by jrf 3 months ago.
Fix PHP 7.4 compatibility

Download all attachments as: .zip

Change History (83)

#1 @dshanske
4 years ago

  • Keywords needs-patch added

I believe it predates WP_Query. But a patch to change that for do_all_pings would likely be in order.

#2 @spacedmonkey
4 years ago

Interesting the you can not query using wp_query using the the to_ping param. I do believe this function should be converted to use WP_Query. There are many benefits of using wp_query, not only does make this whole process more filterable. Should I make a patch that includes the change wp query class?

#3 @dshanske
4 years ago

Please do. It looks like it just wasn't an option when the function was last updated. Looking forward to the patch.

@spacedmonkey
4 years ago

First pass at a patch

#4 @spacedmonkey
4 years ago

This is the first pass for a patch. It isn't finished / tested, but more of a talking point. I don't think it is far off a finished patch.

#5 @spacedmonkey
4 years ago

  • Keywords has-patch added; needs-patch removed

#6 @dshanske
4 years ago

  • Keywords needs-unit-tests added

The fields => ids should return only an array of post IDs, therefore it should be $ping, not $ping->ID.

Other than that, while it is not my area of expertise, it needs unit testing.

@dshanske
3 years ago

Attempt to Add Unit Tests

#7 @dshanske
3 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed

This ticket was mentioned in Slack in #core-comments by dshanske. View the logs.


3 years ago

This ticket was mentioned in Slack in #core by dshanske. View the logs.


3 years ago

#10 @dshanske
3 years ago

  • Keywords 2nd-opinion added

I found an issue with the patch. post_type = any will exclude any custom post type that has the exclude_from_search set to true. This would break expected ping behavior.

Suggest post_type be set to get_post_types( array( 'publicly_queryable' => true ) and I can resubmit the last version of the patch with that change.

@spacedmonkey
3 years ago

#11 @spacedmonkey
3 years ago

Updated the patch from feedback above. I also set the suppress_filters flag to false. This will help for those using a post caching plugins.

This ticket was mentioned in Slack in #core by dshanske. View the logs.


3 years ago

#13 @SergeyBiryukov
3 years ago

  • Milestone changed from Awaiting Review to 4.7

This ticket was mentioned in Slack in #core by stevenkword. View the logs.


3 years ago

#15 @boonebgorges
3 years ago

  • Owner set to boonebgorges
  • Status changed from new to reviewing

#16 @boonebgorges
3 years ago

  • Keywords needs-patch added; has-patch removed
  • Milestone changed from 4.7 to Future Release

Hi @spacedmonkey and @dshanske - Thanks for working on this ticket. A few comments about 36824.4.diff.

The 'meta' portion of the _pingme and _encloseme queries is incorrect. It should be 'meta_key' => '_pingme', not 'meta_value'.

Switching to get_posts() with posts_per_page=-1 feels dangerous to me. Currently, do_all_pings() requires many database queries, but each one will only load a small amount of content into memory. I can guarantee that loading post_content for hundreds or thousands of posts in a single query is going to cause out-of-memory fatal errors. We can still leverage the post cache when available by doing fields=ids and then, in the foreach loop, calling get_post(). (This is similar to what's already happening with the do_trackbacks() loop.)

Adding a parameter to WP_Query that checks 'to_ping' seems OK to me. But it's worth noting that changing to WP_Query doesn't fix the "bad query" issue reported by New Relic. If anything, it'll make it worse, since the SQL query generated by WP_Query is bound to be more complex. Also, to_ping acts like a boolean in your patch, but the to_ping column is not boolean - it's a list of URLs. A more accurate and robust parameter might be something like:

if ( null !== $q['has_to_ping'] ) {
    if ( $q['has_to_ping'] ) {
         $where .= " AND {$wpdb->posts}.to_ping <> ''";
    } else {
         $where .= " AND {$wpdb->posts}.to_ping = ''";
    }
}

One final thought. If we're really concerned about to_ping performance, we might consider moving/mirroring to_ping in postmeta. Meta queries like this are quite fast, because they only touch the indexed meta_key column: SELECT ... WHERE m.meta_key = 'to_ping'. It would be valuable to see performance benchmarks.

There are a couple improvements here, any of which could be considered for 4.7 as a standalone item if an updated patch were ready in time:

  • a param for WP_Query that references to_ping
  • use get_posts( 'fields=ids' ) + get_post() for _pingme and _encloseme queries
  • mirror to_ping in postmeta and use it instead of the to_ping column when querying for posts that need to send trackbacks

#17 @dshanske
3 years ago

This is odd. I looked at the patch and I see that we added 'fields=ids' for only one. But I tested another version of this in plugin form as part of my experiments in pingback improvement and I just checked that code and I did add it in in all places. I don't know how I missed this.

I will try to refresh the patch to address the _pingme and _encloseme issues. I would suggest we use this ticket to address those specific areas.

Specifically with the to_ping notation, I think we need to reconsider a few things re trackbacks before we start mirroring to post meta. So, I would suggest that that, and the WP_Query issue be set aside for future and this address the lower hanging fruit.

@dshanske
3 years ago

Refresh of the Pingme and Encloseme improvements

#18 @dshanske
3 years ago

  • Keywords has-patch added; has-unit-tests 2nd-opinion needs-patch removed

#19 @dshanske
3 years ago

@boonebgorges Can you review the abbreviated patch? The issue of to_ping I think that you raised is out of scope for this patch and I am going to open a separate ticket to discuss that as it goes to a key issue.

#20 @dshanske
3 years ago

I just had an idea. The _pingme meta key is automatically added to any post that is published/updated and we're already using that to generate a list of IDs...as that idea seems to be a good one.

All pingbacks add the URLs they send to the pinged field. So, there should be a way to get some resource savings by only looking at those posts for trackback sending. But that requires some more substantial changes.

#21 @dshanske
3 years ago

Related #38197 which proposes a new pingback() function that deprecates the need to pass through the post content and retrieves the post itself.

#22 @boonebgorges
3 years ago

Thanks for the new patch, @dshanske. A question about the publicly_queryable clause. Is there a situation in which a post that belongs to a post type which has publicly_queryable=false would need to send a pingback? Your patch would break this kind of use. But I'm not sure whether it's a valid or realistic use case. (Keep in mind that this is an old part of WP, and people do weird things.)

All pingbacks add the URLs they send to the pinged field. So, there should be a way to get some resource savings by only looking at those posts for trackback sending. But that requires some more substantial changes.

Can you explain this more? do_trackbacks() seems to skip sending trackbacks to URLs that have already gotten pingbacks. Is this correct?

#23 @dshanske
3 years ago

@boonebgorges The original 'all' would not cover custom post types that were listed as 'exclude_from_search'. There are a few tickets that have covered that issue. I discovered when I implemented the code in a test plugin that it was an issue. But publicly_queryable would indicate it would be accessible from the front end even if not displayed by default, which would be consistent with the idea that pingback is to notify that you've linked to someone.

The pingback function in WordPress adds its URLs to the pinged field in the post so that it won't send a trackback and a pingback. The reason why I wanted to break that portion away is that I wanted to change the do_trackbacks and the pingback functions, get_punged, get_pinged, etc to support passing either a post_ID or a post object, which is an easy change.

There are a few other minor issues(such as the items removed from the to_ping list wouldn't update the post object, but this would reduce the number of queries in this area to the list of known recently updated posts where the to_ping list has already been retrieved.

So, based on your original feedback, this is something worth taking away to look at in a different way.

Re trackbacks not being sent if pingbacks were...since one is a manual process and the other an automatic one, question is if this makes sense. But that is a bigger issue.

This ticket was mentioned in Slack in #core by dshanske. View the logs.


3 years ago

#25 @dshanske
3 years ago

  • Keywords needs-refresh added

I think we need to relook at this entirely in regard to the feedback. 4.7 will now include support for the ping functions to take in a WP_Post object, which can assist here a tiny bit.

I think the suggestion of moving to_ping to meta might make sense. We can do a search of how many plugins might be accessing to_ping directly...I doubt there are that many.

#26 @dshanske
3 years ago

@boonebgorge I've been giving this thought again in light of #39960. The best parameter under the current setup is supports trackbacks as this is used to determine whether pingbacks should be enabled.

If I get #39960 through, that would change to the union of supports pingbacks and supports trackbacks.

Switching to supports, and supporting the new pingback improvements made so far(supporting $post as well as $post_id, allowing an array of pinged to go in, allowing $content to be retrieved from $post as well as passed in...) we should be able to get some performance improvements, if not all.

@janw.oostendorp
2 years ago

Updated the query to include nopaging

#27 @janw.oostendorp
2 years ago

36824.4.diff looked like the most complete patch. But it used post_per_page => -1
I replaced it with nopaging => true

#28 @dshanske
2 years ago

I am glad this is getting attention again. Maybe we can get it into 4.9

#29 @janw.oostendorp
2 years ago

@dshanske I'm quite new to this and I havn't red the whole thread.
But even with the linkt to other tickets, the current patch should be a performance boost already.

Does this ticket need more work? If so what?
I'd like to know and help.

#30 @dshanske
2 years ago

I mentioned it higher up. The supports parameter is a better way to narrow down the query. And the new enhancements to add_ping and such need to be incorporated. Let me make a go at it.

@dshanske
2 years ago

#31 @dshanske
2 years ago

  • Keywords needs-refresh removed
  • Milestone changed from Future Release to 4.9

Just did my own shot at fixing a few of the problems. Want to see if we can get this in. The new version makes the same modifications to do_enclose that were previously done to pingback, moving the retrieval of content out of this function and into that one.

It also narrows the post_types to only ones that declared support for trackbacks, which at the moment is how both pingbacks and trackback support is declared, except for enclosures which has the broader definition.

@mrmadhat
2 years ago

#32 @mrmadhat
2 years ago

updated do_all_pings so that when do_enclose() is called it passes the content of the returned posts instead of null, formatted the arrays and moved delete_post_meta() calls after the work is done.

#33 @dshanske
2 years ago

@mrmadhat The changes I made have that done by the function itself, which is why it switched to null. That means that it pulls the IDs one by one, but then pulls the content subsequently.

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


2 years ago

#36 @jbpaul17
2 years ago

  • Milestone changed from 4.9 to Future Release

Punting this to Future Release per the 4.9 bug scrub earlier today.

@dshanske
18 months ago

#37 @dshanske
18 months ago

  • Milestone changed from Future Release to 4.9.7

I've refreshed this patch and would like to see if we can get this in finally. This is a performance optimization as opposed to a strict enhancement.

#38 @boonebgorges
18 months ago

  • Keywords needs-patch added; has-patch removed

36824.8.diff fixes a meta_key/meta_value mismatch with _encloseme, and does some cleanup for PHPCS.

The "Do trackbacks" section needs more work. get_posts() and WP_Query do not support a to_ping parameter. Above, there was some discussion about either adding it to WP_Query, or moving it to postmeta, but it looks like the current patch does neither of these things. The problem that originally prompted this ticket was that specific query, so we should be sure to address it one way or another.

Regarding the change to get_post_types_by_supports( 'trackbacks' ), it feels like this might cause compatibility problems for plugins that may not be declaring support properly, but are still expecting trackbacks to be sent. Can someone walk through that scenario? It could be that it's only a concern if the plugin is manually setting _pingme, which happens in core only for the 'post' post type; see wp_transition_post_status() -> 'publish_post' -> _publish_post_hook(). But it would be helpful to understand how likely it is that this'll be a breaking change for someone.

#39 @dshanske
18 months ago

@boonebgorges _pingme is only for pingbacks, and _encloseme does the same thing for enclosures. It adds a property that, if set, triggers that operation for that particular post.

Trackbacks use a different system, where there is a field in the post called to_ping that must not be empty, and that is a list of the URLs to send trackbacks to. The issue being that trackbacks don't have to have an established link to the post they are sending to, and pingbacks do.

Right now, regarding support for trackbacks, I have ticket #39960 which tries to separate trackback and pingback support, which needs debate as that would be a breaking change. But, the way WordPress is set up at this exact moment, the trackback UI will not be enabled on a post unless the trackbacks support function is there. Trackbacks, unlike pingbacks, are manually initiated. So, if anyone was expecting them, it wouldn't be delivering unless they did something else I can't predict.

As for the issue with the _pingme and the publish_post...I never noticed that till now. That would seem to be a bug as any post type can declare pingback and trackback support and it should work to send them. Maybe it does make sense to fix that at some point.

to_ping is set in _wp_translate_postdata .

Last edited 18 months ago by dshanske (previous) (diff)

@dshanske
18 months ago

#40 @dshanske
18 months ago

I gave using postmeta a shot, by setting the flag if there are any URLs to ping when published.

Does this address that issue?

#41 @birgire
18 months ago

The patch in 36824.10.diff:

  • Adds unit tests for do_enclose() to help with the ticket. First pass.
  • Adds @return for do_enclose().
  • Adjusts the ! empty( get_to_ping( $post_id ) ) from 36824.9.diff to avoid Fatal error: Can't use function return value in write context for PHP < 5.5.
  • Removes the seemingly unused global $wpdb in do_all_pings().
  • Removes doc block for $wpdb in do_all_pings().
  • Added end-of-line dots for inline comments in do_all_pings().
Last edited 18 months ago by birgire (previous) (diff)

@birgire
18 months ago

#42 @pbiron
18 months ago

  • Keywords has-patch has-unit-tests added; needs-patch removed

#43 @ocean90
17 months ago

  • Milestone changed from 4.9.7 to 4.9.8

4.9.7 has been released, moving to next milestone.

This ticket was mentioned in Slack in #core by jon_bossenger. View the logs.


17 months ago

This ticket was mentioned in Slack in #core by pbiron. View the logs.


17 months ago

#46 @pento
17 months ago

  • Milestone changed from 4.9.8 to 4.9.9

4.9.8 has hit RC, moving to 4.9.9.

#47 @pento
14 months ago

  • Milestone changed from 4.9.9 to Future Release

#48 @dshanske
7 months ago

  • Milestone changed from Future Release to 5.3

#49 @dshanske
7 months ago

I think this is an enhancement that can improve site performance and it is time to merge it in.

#50 @desrosj
3 months ago

@boonebgorges Are you able to review this for 5.3 beta next week (23 September)?

#51 @boonebgorges
3 months ago

The patches on this ticket do a lot of refactoring beyond the original issue, which was the greedy query for trackbacks. I'm going to start breaking them out into separate patches/commits, to simplify the review process.

#52 follow-up: @boonebgorges
3 months ago

36824-pings.diff includes only the change to the pingback query. It corrects the meta_key / meta_value error in the most recent patches. Looking at the SQL generated, there are two potentially breaking changes. I'd like feedback from others with an understanding of this issue before moving forward with it.

  1. Currently, the _pingme query doesn't look at post type at all. Now we're limiting it to those post types with post_type_supports( 'trackbacks' ). (Related concern.) This seems like it will break expected behavior. What problem is it meant to solve? In part, it addresses the problems with 'post_type' => 'any' (reference). But I wonder if we might instead pass a list of all registered post types, for better backward compatibility. This issue could then possibly be revisited after #39960.
  1. Currently, the pingback query is post_status-agnostic. Now, it's limited to publish. Could this have unintended consequences? In theory, only published posts should have _pingme https://core.trac.wordpress.org/browser/tags/5.2.3/src/wp-includes/post.php#L6551 but I'd like others to weigh in.

#53 follow-up: @boonebgorges
3 months ago

36824-enclosures.diff includes only the changes to enclosures. Questions for feedback:

  1. I've removed some of the miscellaneous changes to do_enclose(). These are generally OK but should be moved to different tickets.
  1. Similar to my concern 1 about pingbacks, I wonder if we should be querying all post types, rather than limiting to those that are publicly_queryable.

#54 @boonebgorges
3 months ago

36824-trackbacks.diff is the part of the patch that touches trackbacks.

  1. It implements the get_post_types() suggestion I made above for the other patches.
  2. I've changed the name of the new meta key to be more descriptive.

#55 in reply to: ↑ 52 ; follow-up: @dshanske
3 months ago

Replying to boonebgorges:

36824-pings.diff includes only the change to the pingback query. It corrects the meta_key / meta_value error in the most recent patches. Looking at the SQL generated, there are two potentially breaking changes. I'd like feedback from others with an understanding of this issue before moving forward with it.

  1. Currently, the _pingme query doesn't look at post type at all. Now we're limiting it to those post types with post_type_supports( 'trackbacks' ). (Related concern.) This seems like it will break expected behavior. What problem is it meant to solve? In part, it addresses the problems with 'post_type' => 'any' (reference). But I wonder if we might instead pass a list of all registered post types, for better backward compatibility. This issue could then possibly be revisited after #39960.

Specifically with trackbacks, it is currently all or nothing and they are a spam concern. That said, the other idea about improving the declared post type supports functionality would address that issue.

  1. Currently, the pingback query is post_status-agnostic. Now, it's limited to publish. Could this have unintended consequences? In theory, only published posts should have _pingme https://core.trac.wordpress.org/browser/tags/5.2.3/src/wp-includes/post.php#L6551 but I'd like others to weigh in.

It can't have unintended consequences because pingbacks require verification. So the remote site is required to check that it is being linked to. If the post isn't published, it won't be able to retrieve it.

#56 in reply to: ↑ 53 @dshanske
3 months ago

Replying to boonebgorges:

36824-enclosures.diff includes only the changes to enclosures. Questions for feedback:

  1. I've removed some of the miscellaneous changes to do_enclose(). These are generally OK but should be moved to different tickets.
  1. Similar to my concern 1 about pingbacks, I wonder if we should be querying all post types, rather than limiting to those that are publicly_queryable.

Here I would agree with you it should be all, as enclosures are a matter of identifying embedded media. Mostly used for RSS, you could in theory have a private RSS feed, for example.

#57 @boonebgorges
3 months ago

In 46175:

Improve do_enclose() logic on post publish.

Removing the direct SQL query in do_all_pings() improves filterability.

As part of this change, the signature of do_enclose() is changed so that
a null $content parameter can be passed, with the $content then inferred
from the $post passed in the second parameter. In addition, the second
parameter was modified so that a post ID or a WP_Post object can be
provided. These changes make it possible to trigger enclosure checks with
a post ID alone (as in do_all_pings()) and also brings the function
signature in line with do_trackbacks() and pingback().

Props dshanske, spacedmonkey, janw.oostendorp, mrmadhat, birgire.
See #36824.

#58 @boonebgorges
3 months ago

In 46176:

PHPCS: Fix coding standards violations in do_enclose().

  • Use strict checking when appropriate in in_array() checks.
  • Improved comment formatting.
  • Yoda and strict equality checks where appropriate.

See #36824.

#59 in reply to: ↑ 55 @boonebgorges
3 months ago

Replying to dshanske:

Specifically with trackbacks, it is currently all or nothing and they are a spam concern. That said, the other idea about improving the declared post type supports functionality would address that issue.

Thanks for this. This, and the discussion above, suggests to me that it's acceptable to err on the side of caution and look at all post types. post_type_supports() improvements can be considered technically.

#60 @boonebgorges
3 months ago

In 46177:

Use WP_Query when sending pingbacks.

Props dshanske, spacedmonkey, janw.oostendorp, mrmadhat, birgire.
See #36824.

#61 @boonebgorges
3 months ago

  • Resolution set to fixed
  • Status changed from reviewing to closed

In 46178:

Improve performance of trackback query in do_all_pings().

Previously, the direct SQL query used to identify trackbacks in
do_all_pings() performed poorly, due to an unindexed query against the
to_ping column. We improve performance in two ways. First, we switch
to using a postmeta flag for posts that require trackbacks to be sent;
queries joining against the postmeta table that check only the meta_key
are generally quite fast. Second, we switch to the use of WP_Query,
making the query cacheable and filterable using standard methods.

Props dshanske, spacedmonkey, janw.oostendorp, mrmadhat, birgire.
Fixes #36824.

#62 @jrf
3 months ago

Sweetie darlings, I love you all to bits, but could you please refrain from introducing new PHP 7.4 compatibility issues ?

New issue introduced in: [46175]

PHP 7.4 problem caused is the same as was previously fixed via #47746

@jrf
3 months ago

Fix PHP 7.4 compatibility

#63 @jrf
3 months ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

#64 @SergeyBiryukov
3 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 46182:

Tests: Replace join() with implode() in do_enclose() tests introduced in [46175], and ensure the arguments passed are in the correct order.

The implode() function accepts two parameters, $glue and $pieces. For historical reasons, these parameters have been accepted in any order, though it was recommended that the documented order of $glue, $pieces be used. It is also generally considered best practice to use the canonical function rather than an alias.

Starting in PHP 7.4, specifying the parameters in the reverse order will trigger a deprecation notice with the plan to remove this tolerance in PHP 8.0.

Props jrf.
Fixes #36824. See #47746.

#65 @SergeyBiryukov
3 months ago

In 46292:

Pings/Trackbacks: Use correct variable in a foreach() loop in do_all_pings().

Props itowhid06.
Fixes #48094. See #36824.

#66 @johnbillion
2 months ago

  • Keywords needs-patch added; has-patch has-unit-tests removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

Enclosure processing has broken since [46175]. Looks like an undefined property is used when iterating posts with pending enclosures.

#67 @johnbillion
2 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 46427:

Pings/Trackbacks: Fix processing of posts with pending enclosures.

Introduced in [46175].

Fixes #36824

Note: See TracTickets for help on using tickets.