Opened 6 weeks ago
Last modified 11 days ago
#62953 assigned enhancement
Add post_ancestor as query var to WP_Query class
Reported by: |
|
Owned by: |
|
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | trunk |
Component: | Query | Keywords: | has-patch has-screenshots has-testing-info has-unit-tests |
Focuses: | rest-api | Cc: |
Description
Issue on Gutenberg:
More specifically, have a look at the last few paragraphs of this comment:
What
Add post_ancestor
as a query var to the WP_Query class. This var should accept a page id and will fetch all descendants of that page from the database.
Why
I'm working on adding a toggle to the Gutenberg editor that shows all descendants (not just the direct children) of the current page (see link to GitHub comment above). Currently the WP_Query class only allows you to fetch the direct children of a list of pages through the use of either post_parent
or post_parent__in
.
Recommend Approach
- In the WP_Query class, add
post_ancestor
as one of the query props. - If
post_ancestor
is present then thepost_parent
andpost_parent_in
props should be ignored. - The query should get 'all' pages from the database.
- Then filter out the descendant pages using
get_page_children
(docs: https://developer.wordpress.org/reference/functions/get_page_children/).
Other approaches
Instead of get_page_children
, we could loop over the page hierarchy and execute a different sql query for every depth we need to fetch children from. (So one sql query would fetch the direct children of the parent, a second sql query would fetch the children of those children, etc, etc).
- Each post/page has a
post_parent
property in the db. So it's very easy to get the direct children of a page but getting the grandchildren is more difficult and requires something similar to the algorithm defined above.
Discussion
Are people happy with my recommended approach? I'm happy to do this ticket myself.
Change History (9)
This ticket was mentioned in PR #8338 on WordPress/wordpress-develop by @bbpaule.
5 weeks ago
#1
- Keywords has-patch added
This ticket was mentioned in Slack in #core by bbpaule. View the logs.
5 weeks ago
#5
@
4 weeks ago
I've added a very long comment on the associated pull request.
The most important thing is that in order for this to work as expected, recursive database queries would need to be made to determine the complete (unbounded) post tree under the requested ancestor post, which would then be followed by a post__in
query to get the descendants and ensure the number of found rows and post count is correct.
This recursion would result in a significant performance impact on sites, especially those with deep trees of posts, such as some of the handbooks on WordPress.org.
I've included A way of doing it in my comment, I'm sure there are other approaches that would work but they'd all involve unbound queries of one form or another.
13 days ago
#6
Hi @peterwilsoncc! Thanks your insights! It is much appreciated :D
Due to the way get_page_children() works, this will return incorrect data when querying posts with different dates set. When the query also includes the number of found_rows, that number will also be incorrect as it's not reduced by filtering.
I did some digging and I don't think the problem is with dates. The query retrieves the expected number of posts from the database, but since there’s no filtering at that stage, it just grabs the first set of posts. But when the filtering is applied via get_page_children()
, we may end up getting no posts returned. To fix this, I have applied a commit to gets all posts from the db before filtering it through get_page_children()
.
With this change the test you wrote should now pass. However, getting all posts from the db might have performance concerns. And if the db is large, PHP might run out of memory unless I split the results up.
In order for this to have the desired effect, it would need to make a series of database queries to find the complete tree of descendants (regardless of the limit passed to WP_Query) and following that, make a postin query for the entire tree.
This recursion would result in a significant performance impact on sites, especially those with deep trees of posts, such as some of the handbooks on WordPress.org.
As an alternative, we could have a setting which limits the depth of which descendants to fetch. Hopefully. that would be enough to reduce the performance impact to suitable levels?
@peterwilsoncc commented on PR #8338:
11 days ago
#8
With this change the test you wrote should now pass. However, getting all posts from the db might have performance concerns. And if the db is large, PHP might run out of memory unless I split the results up.
Yeah, I am concerned about that.
As the code from the Gutenberg PR requires allowing the query to be added to the posts REST API endpoint, it also allows the general public to trigger an unlimited database query from an unauthenticated connection. Which, on large sites, could end up becoming a neat little way of DDOSing a server.
It would be possible to loop through pages and determine the tree with something along the lines of the following pseudo code but it would still generate a bunch of queries the maximum depth of the tree.
$descendants = array() $top_level_post = 4; $parent_ids = array( $top_level_post ); do { $parent_ids = SELECT ID from wp_posts WHERE post_parent IN $parent_ids $descendants = array_merge( $descendants, $parent_ids ); } while ( ! empty( $parent_ids ) );
I'll continue to think about it but probably won't get much thinking done during the 6.8 release cycle.
11 days ago
#9
As the code from the Gutenberg PR requires allowing the query to be added to the posts REST API endpoint, it also allows the general public to trigger an unlimited database query from an unauthenticated connection. Which, on large sites, could end up becoming a neat little way of DDOSing a server.
Wow, I didn't even think of the DDOS possibilities. We definitely don't want that to happen 😅
It would be possible to loop through pages and determine the tree with something along the lines of the following pseudo code but it would still generate a bunch of queries the maximum depth of the tree.
Right now I'm leaning towards using this approach but imposing a soft limit on the depth it will search for to reduce the number of queries. I'll let you decide on what approach is best. Good luck with the 6.8 release cycle!
## Summary
Adds
post_ancestor
property to the query class and exposes it in the REST API. This property is used for fetching all descendants of a page. To see why this is needed, see the corresponding issue on Gutenberg:Trac ticket: https://core.trac.wordpress.org/ticket/62953
## Changelog
post_ancestor
property to the query class.## Testing Steps
To test that the property works in the REST API:
http://localhost:8889/wp-json/wp/v2/pages?parent=0&ancestor=<your page's id>
To test that property will work using the query block:
## Image/Video
https://github.com/user-attachments/assets/57ef7a43-f0a8-46d0-a83b-b78cf546d0cd