WordPress.org

Make WordPress Core

Opened 2 years ago

Last modified 6 days ago

#36824 reviewing enhancement

do_all_pings function queries all posts

Reported by: spacedmonkey Owned by: boonebgorges
Milestone: 4.9.8 Priority: normal
Severity: normal Version: 2.1
Component: Pings/Trackbacks Keywords: has-patch has-unit-tests
Focuses: performance Cc:

Description

In the do_all_pings function, the following sql query runs.

SELECT ID FROM $wpdb->posts WHERE to_ping <> AND post_status = 'publish'

Performance monitor tool (new relic) has flagged this has a bad query. It said the following

This table was retrieved with a full table scan, which is often quite bad for performance, unless you only retrieve a few rows. The table was retrieved with this index: No index was used in this part of the query. You can speed up this query by querying only fields that are within the index. Or you can create an index that includes every field in your query, including the primary key. Approximately 114860 rows of this table were scanned.

I don't understand why this call isn't going through the wp query object. If it did you could then cache the result.

Attachments (12)

36824.patch (3.3 KB) - added by spacedmonkey 2 years ago.
First pass at a patch
36824.1.patch (3.5 KB) - added by spacedmonkey 2 years ago.
36824.3.diff (4.2 KB) - added by dshanske 2 years ago.
Attempt to Add Unit Tests
36824.4.diff (4.4 KB) - added by spacedmonkey 2 years ago.
36824.5.diff (1.8 KB) - added by dshanske 22 months ago.
Refresh of the Pingme and Encloseme improvements
36824.4.1.patch (2.3 KB) - added by janw.oostendorp 11 months ago.
Updated the query to include nopaging
36824.5.2.diff (5.0 KB) - added by dshanske 11 months ago.
36824.6.diff (5.0 KB) - added by mrmadhat 11 months ago.
36824.7.diff (5.4 KB) - added by dshanske 4 weeks ago.
36824.8.diff (5.7 KB) - added by boonebgorges 4 weeks ago.
36824.9.diff (6.1 KB) - added by dshanske 4 weeks ago.
36824.10.diff (14.9 KB) - added by birgire 4 weeks ago.

Download all attachments as: .zip

Change History (57)

#1 @dshanske
2 years ago

  • Keywords needs-patch added

I believe it predates WP_Query. But a patch to change that for do_all_pings would likely be in order.

#2 @spacedmonkey
2 years ago

Interesting the you can not query using wp_query using the the to_ping param. I do believe this function should be converted to use WP_Query. There are many benefits of using wp_query, not only does make this whole process more filterable. Should I make a patch that includes the change wp query class?

#3 @dshanske
2 years ago

Please do. It looks like it just wasn't an option when the function was last updated. Looking forward to the patch.

@spacedmonkey
2 years ago

First pass at a patch

#4 @spacedmonkey
2 years ago

This is the first pass for a patch. It isn't finished / tested, but more of a talking point. I don't think it is far off a finished patch.

#5 @spacedmonkey
2 years ago

  • Keywords has-patch added; needs-patch removed

#6 @dshanske
2 years ago

  • Keywords needs-unit-tests added

The fields => ids should return only an array of post IDs, therefore it should be $ping, not $ping->ID.

Other than that, while it is not my area of expertise, it needs unit testing.

@dshanske
2 years ago

Attempt to Add Unit Tests

#7 @dshanske
2 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed

This ticket was mentioned in Slack in #core-comments by dshanske. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by dshanske. View the logs.


2 years ago

#10 @dshanske
2 years ago

  • Keywords 2nd-opinion added

I found an issue with the patch. post_type = any will exclude any custom post type that has the exclude_from_search set to true. This would break expected ping behavior.

Suggest post_type be set to get_post_types( array( 'publicly_queryable' => true ) and I can resubmit the last version of the patch with that change.

@spacedmonkey
2 years ago

#11 @spacedmonkey
2 years ago

Updated the patch from feedback above. I also set the suppress_filters flag to false. This will help for those using a post caching plugins.

This ticket was mentioned in Slack in #core by dshanske. View the logs.


23 months ago

#13 @SergeyBiryukov
23 months ago

  • Milestone changed from Awaiting Review to 4.7

This ticket was mentioned in Slack in #core by stevenkword. View the logs.


22 months ago

#15 @boonebgorges
22 months ago

  • Owner set to boonebgorges
  • Status changed from new to reviewing

#16 @boonebgorges
22 months ago

  • Keywords needs-patch added; has-patch removed
  • Milestone changed from 4.7 to Future Release

Hi @spacedmonkey and @dshanske - Thanks for working on this ticket. A few comments about 36824.4.diff.

The 'meta' portion of the _pingme and _encloseme queries is incorrect. It should be 'meta_key' => '_pingme', not 'meta_value'.

Switching to get_posts() with posts_per_page=-1 feels dangerous to me. Currently, do_all_pings() requires many database queries, but each one will only load a small amount of content into memory. I can guarantee that loading post_content for hundreds or thousands of posts in a single query is going to cause out-of-memory fatal errors. We can still leverage the post cache when available by doing fields=ids and then, in the foreach loop, calling get_post(). (This is similar to what's already happening with the do_trackbacks() loop.)

Adding a parameter to WP_Query that checks 'to_ping' seems OK to me. But it's worth noting that changing to WP_Query doesn't fix the "bad query" issue reported by New Relic. If anything, it'll make it worse, since the SQL query generated by WP_Query is bound to be more complex. Also, to_ping acts like a boolean in your patch, but the to_ping column is not boolean - it's a list of URLs. A more accurate and robust parameter might be something like:

if ( null !== $q['has_to_ping'] ) {
    if ( $q['has_to_ping'] ) {
         $where .= " AND {$wpdb->posts}.to_ping <> ''";
    } else {
         $where .= " AND {$wpdb->posts}.to_ping = ''";
    }
}

One final thought. If we're really concerned about to_ping performance, we might consider moving/mirroring to_ping in postmeta. Meta queries like this are quite fast, because they only touch the indexed meta_key column: SELECT ... WHERE m.meta_key = 'to_ping'. It would be valuable to see performance benchmarks.

There are a couple improvements here, any of which could be considered for 4.7 as a standalone item if an updated patch were ready in time:

  • a param for WP_Query that references to_ping
  • use get_posts( 'fields=ids' ) + get_post() for _pingme and _encloseme queries
  • mirror to_ping in postmeta and use it instead of the to_ping column when querying for posts that need to send trackbacks

#17 @dshanske
22 months ago

This is odd. I looked at the patch and I see that we added 'fields=ids' for only one. But I tested another version of this in plugin form as part of my experiments in pingback improvement and I just checked that code and I did add it in in all places. I don't know how I missed this.

I will try to refresh the patch to address the _pingme and _encloseme issues. I would suggest we use this ticket to address those specific areas.

Specifically with the to_ping notation, I think we need to reconsider a few things re trackbacks before we start mirroring to post meta. So, I would suggest that that, and the WP_Query issue be set aside for future and this address the lower hanging fruit.

@dshanske
22 months ago

Refresh of the Pingme and Encloseme improvements

#18 @dshanske
22 months ago

  • Keywords has-patch added; has-unit-tests 2nd-opinion needs-patch removed

#19 @dshanske
22 months ago

@boonebgorges Can you review the abbreviated patch? The issue of to_ping I think that you raised is out of scope for this patch and I am going to open a separate ticket to discuss that as it goes to a key issue.

#20 @dshanske
22 months ago

I just had an idea. The _pingme meta key is automatically added to any post that is published/updated and we're already using that to generate a list of IDs...as that idea seems to be a good one.

All pingbacks add the URLs they send to the pinged field. So, there should be a way to get some resource savings by only looking at those posts for trackback sending. But that requires some more substantial changes.

#21 @dshanske
22 months ago

Related #38197 which proposes a new pingback() function that deprecates the need to pass through the post content and retrieves the post itself.

#22 @boonebgorges
22 months ago

Thanks for the new patch, @dshanske. A question about the publicly_queryable clause. Is there a situation in which a post that belongs to a post type which has publicly_queryable=false would need to send a pingback? Your patch would break this kind of use. But I'm not sure whether it's a valid or realistic use case. (Keep in mind that this is an old part of WP, and people do weird things.)

All pingbacks add the URLs they send to the pinged field. So, there should be a way to get some resource savings by only looking at those posts for trackback sending. But that requires some more substantial changes.

Can you explain this more? do_trackbacks() seems to skip sending trackbacks to URLs that have already gotten pingbacks. Is this correct?

#23 @dshanske
22 months ago

@boonebgorges The original 'all' would not cover custom post types that were listed as 'exclude_from_search'. There are a few tickets that have covered that issue. I discovered when I implemented the code in a test plugin that it was an issue. But publicly_queryable would indicate it would be accessible from the front end even if not displayed by default, which would be consistent with the idea that pingback is to notify that you've linked to someone.

The pingback function in WordPress adds its URLs to the pinged field in the post so that it won't send a trackback and a pingback. The reason why I wanted to break that portion away is that I wanted to change the do_trackbacks and the pingback functions, get_punged, get_pinged, etc to support passing either a post_ID or a post object, which is an easy change.

There are a few other minor issues(such as the items removed from the to_ping list wouldn't update the post object, but this would reduce the number of queries in this area to the list of known recently updated posts where the to_ping list has already been retrieved.

So, based on your original feedback, this is something worth taking away to look at in a different way.

Re trackbacks not being sent if pingbacks were...since one is a manual process and the other an automatic one, question is if this makes sense. But that is a bigger issue.

This ticket was mentioned in Slack in #core by dshanske. View the logs.


21 months ago

#25 @dshanske
21 months ago

  • Keywords needs-refresh added

I think we need to relook at this entirely in regard to the feedback. 4.7 will now include support for the ping functions to take in a WP_Post object, which can assist here a tiny bit.

I think the suggestion of moving to_ping to meta might make sense. We can do a search of how many plugins might be accessing to_ping directly...I doubt there are that many.

#26 @dshanske
15 months ago

@boonebgorge I've been giving this thought again in light of #39960. The best parameter under the current setup is supports trackbacks as this is used to determine whether pingbacks should be enabled.

If I get #39960 through, that would change to the union of supports pingbacks and supports trackbacks.

Switching to supports, and supporting the new pingback improvements made so far(supporting $post as well as $post_id, allowing an array of pinged to go in, allowing $content to be retrieved from $post as well as passed in...) we should be able to get some performance improvements, if not all.

@janw.oostendorp
11 months ago

Updated the query to include nopaging

#27 @janw.oostendorp
11 months ago

36824.4.diff looked like the most complete patch. But it used post_per_page => -1 I replaced it with nopaging => true

#28 @dshanske
11 months ago

I am glad this is getting attention again. Maybe we can get it into 4.9

#29 @janw.oostendorp
11 months ago

@dshanske I'm quite new to this and I havn't red the whole thread. But even with the linkt to other tickets, the current patch should be a performance boost already.

Does this ticket need more work? If so what? I'd like to know and help.

#30 @dshanske
11 months ago

I mentioned it higher up. The supports parameter is a better way to narrow down the query. And the new enhancements to add_ping and such need to be incorporated. Let me make a go at it.

@dshanske
11 months ago

#31 @dshanske
11 months ago

  • Keywords needs-refresh removed
  • Milestone changed from Future Release to 4.9

Just did my own shot at fixing a few of the problems. Want to see if we can get this in. The new version makes the same modifications to do_enclose that were previously done to pingback, moving the retrieval of content out of this function and into that one.

It also narrows the post_types to only ones that declared support for trackbacks, which at the moment is how both pingbacks and trackback support is declared, except for enclosures which has the broader definition.

@mrmadhat
11 months ago

#32 @mrmadhat
11 months ago

updated do_all_pings so that when do_enclose() is called it passes the content of the returned posts instead of null, formatted the arrays and moved delete_post_meta() calls after the work is done.

#33 @dshanske
10 months ago

@mrmadhat The changes I made have that done by the function itself, which is why it switched to null. That means that it pulls the IDs one by one, but then pulls the content subsequently.

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


10 months ago

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.


10 months ago

#36 @jbpaul17
10 months ago

  • Milestone changed from 4.9 to Future Release

Punting this to Future Release per the 4.9 bug scrub earlier today.

@dshanske
4 weeks ago

#37 @dshanske
4 weeks ago

  • Milestone changed from Future Release to 4.9.7

I've refreshed this patch and would like to see if we can get this in finally. This is a performance optimization as opposed to a strict enhancement.

@boonebgorges
4 weeks ago

#38 @boonebgorges
4 weeks ago

  • Keywords needs-patch added; has-patch removed

36824.8.diff fixes a meta_key/meta_value mismatch with _encloseme, and does some cleanup for PHPCS.

The "Do trackbacks" section needs more work. get_posts() and WP_Query do not support a to_ping parameter. Above, there was some discussion about either adding it to WP_Query, or moving it to postmeta, but it looks like the current patch does neither of these things. The problem that originally prompted this ticket was that specific query, so we should be sure to address it one way or another.

Regarding the change to get_post_types_by_supports( 'trackbacks' ), it feels like this might cause compatibility problems for plugins that may not be declaring support properly, but are still expecting trackbacks to be sent. Can someone walk through that scenario? It could be that it's only a concern if the plugin is manually setting _pingme, which happens in core only for the 'post' post type; see wp_transition_post_status() -> 'publish_post' -> _publish_post_hook(). But it would be helpful to understand how likely it is that this'll be a breaking change for someone.

#39 @dshanske
4 weeks ago

@boonebgorges _pingme is only for pingbacks, and _encloseme does the same thing for enclosures. It adds a property that, if set, triggers that operation for that particular post.

Trackbacks use a different system, where there is a field in the post called to_ping that must not be empty, and that is a list of the URLs to send trackbacks to. The issue being that trackbacks don't have to have an established link to the post they are sending to, and pingbacks do.

Right now, regarding support for trackbacks, I have ticket #39960 which tries to separate trackback and pingback support, which needs debate as that would be a breaking change. But, the way WordPress is set up at this exact moment, the trackback UI will not be enabled on a post unless the trackbacks support function is there. Trackbacks, unlike pingbacks, are manually initiated. So, if anyone was expecting them, it wouldn't be delivering unless they did something else I can't predict.

As for the issue with the _pingme and the publish_post...I never noticed that till now. That would seem to be a bug as any post type can declare pingback and trackback support and it should work to send them. Maybe it does make sense to fix that at some point.

to_ping is set in _wp_translate_postdata .

Last edited 4 weeks ago by dshanske (previous) (diff)

@dshanske
4 weeks ago

#40 @dshanske
4 weeks ago

I gave using postmeta a shot, by setting the flag if there are any URLs to ping when published.

Does this address that issue?

#41 @birgire
4 weeks ago

The patch in 36824.10.diff:

  • Adds unit tests for do_enclose() to help with the ticket. First pass.
  • Adds @return for do_enclose().
  • Adjusts the ! empty( get_to_ping( $post_id ) ) from 36824.9.diff to avoid Fatal error: Can't use function return value in write context for PHP < 5.5.
  • Removes the seemingly unused global $wpdb in do_all_pings().
  • Removes doc block for $wpdb in do_all_pings().
  • Added end-of-line dots for inline comments in do_all_pings().
Last edited 4 weeks ago by birgire (previous) (diff)

@birgire
4 weeks ago

#42 @pbiron
3 weeks ago

  • Keywords has-patch has-unit-tests added; needs-patch removed

#43 @ocean90
2 weeks ago

  • Milestone changed from 4.9.7 to 4.9.8

4.9.7 has been released, moving to next milestone.

This ticket was mentioned in Slack in #core by jon_bossenger. View the logs.


13 days ago

This ticket was mentioned in Slack in #core by pbiron. View the logs.


6 days ago

Note: See TracTickets for help on using tickets.