#50244 closed feature request (fixed)
Add bulk operation support to the Rest API
Reported by: | andraganescu | Owned by: | TimothyBlynJacobs |
---|---|---|---|
Milestone: | 5.6 | Priority: | high |
Severity: | normal | Version: | 4.4 |
Component: | REST API | Keywords: | has-patch has-unit-tests |
Focuses: | performance | Cc: |
Description
This ticket is a continuation of the Slack discussions around implementing support for bulk operations on the REST endpoints.
Originally this was started as the response to a need found by the new navigation screen experiment. It then became clear that the better option is to implement systemic support, at the level of WP_REST_Server
.
The goal is to obtain the ability to send multiple entities in one call for create, save and update, for example to create multiple posts or multiple menus or menu items with a single API request.
Change History (44)
#1
@
4 years ago
- Component changed from General to REST API
- Focuses performance added
- Milestone changed from Awaiting Review to 5.5
- Owner set to TimothyBlynJacobs
- Priority changed from normal to high
- Status changed from new to assigned
- Type changed from defect (bug) to feature request
- Version set to 4.4
#3
@
4 years ago
@zieladam Yep, but to clarify the bulk of the validation happens by providing a schema for the properties which validates and sanitizes according to the provided JSON schema. When the validation isn't possible to do just with a schema, the goal is to provide a validate_callback
, see for instance check_template()
in the posts controller. But we don't do this consistently, for instance handle_status_param()
could be implemented as a validate_callback
and sanitize_callback
. The majority of those other checks could also be refactored to be callbacks as well.
The menus controller is separately doing validation against the schema which should be dropped, https://github.com/WordPress/gutenberg/blob/2881f84896cace4cc1828774c8e09252d9e67e39/lib/class-wp-rest-menu-items-controller.php#L368. And has some things that can be expressed in the schema, https://github.com/WordPress/gutenberg/blob/2881f84896cace4cc1828774c8e09252d9e67e39/lib/class-wp-rest-menu-items-controller.php#L516.
So I think starting with the menus controller, any validation checking that happens in prepare_item_for_database
that can be refactored to have a validate_callback
should be starting with the checks that happen on a single parameter. Something to note, is that applying a custom validate_callback skips the schema validating since that is implemented just as the default validate callback. So in that case, if the validation is supposed to be additive, then we should call rest_validate_request_arg()
in the top of the new validate_callback and bail out if that returns a WP_Error
instance.
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
4 years ago
This ticket was mentioned in Slack in #core-restapi by chrisl27. View the logs.
4 years ago
This ticket was mentioned in PR #311 on WordPress/wordpress-develop by TimothyBJacobs.
4 years ago
#6
- Keywords has-patch has-unit-tests added
Trac ticket: https://core.trac.wordpress.org/ticket/50244
#7
@
4 years ago
I've created a PR with a first pass at what this could look like. Things for thoughts:
- I don't really love this being in
wp/v2
, it isn't really part of thewp
data namespace. I kinda would like it do just bewp-json/batch
. - Should routes have to opt-in to supporting batching? We could have a
register_rest_route
flag likeallows_batch
. While theoretically any properly coded route wouldn't have an issue being used in this manner, not every route is properly coded ( #50239 ). - We could also support a
validate_callback
in theregister_rest_route
on the same level ascallback
andpermission_callback
that would take aWP_REST_Request
object and do validation for it. This would be helpful for cases where validation has to be done with all the fields together as context, not just individual field validation. We'd do the parameter validation first, and then callvalidate_callback
on the whole request. - I'm thinking about registering the batch controller as part of
WP_REST_Server
itself like the namespace routes. That way we don't have to expose thematch_request_to_handler
method. - In
pre
validation mode, validation will happen twice. Once when we check every request for passing validation, and then again in the individual dispatch. Adjusting this to only happen once makes me nervous, but there would be a performance benefit to it only happening once. - The
maxItems
number is fairly arbitrary, and isn't currently implemented #48821. - I've omitted support for
GET
requests entirely at the moment. The other requests will have their own permission checks, so it is more difficult for an unscrupulous person to create a lot of load. But if we allowGET
requests this could easily get out of control with no permission checks, and the permission checks are non-obvious. I think we should also encourage using linking/embedding over a batchGET
for the time being. So I'd like to puntGET
batching to a later release.
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
4 years ago
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
4 years ago
#10
@
4 years ago
@TimothyBlynJacobs I read through that PR and it looks pretty good to me. Some thoughts:
Should routes have to opt-in to supporting batching? We could have a register_rest_route flag like allows_batch. While theoretically any properly coded route wouldn't have an issue being used in this manner, not every route is properly coded ( #50239 ).
I think it should be opt-in - as we discussed earlier, there are important routes that do not support pre-validation at the moment and sending batch requests with validation=pre
will not work as expected. With opt-in approach, the batch processing endpoint could be merged first and then more and more routes would opt-in to batch processing as time passes.
We could also support a validate_callback in the register_rest_route on the same level as callback and permission_callback that would take a WP_REST_Request object and do validation for it. This would be helpful for cases where validation has to be done with all the fields together as context, not just individual field validation. We'd do the parameter validation first, and then call validate_callback on the whole request.
Great idea
In pre validation mode, validation will happen twice. Once when we check every request for passing validation, and then again in the individual dispatch. Adjusting this to only happen once makes me nervous, but there would be a performance benefit to it only happening once.
The only thing that could go wrong there is something unexpected happening between the batch validation and the handler invocation, e.g. requesting a batch of two requests, both trying to insert a row with the same value that's subject to an UNIQUE constraint - I think core should protect developers from doing that, but offer an option to opt out of this protection when more performance is required. Maybe by having two pre-validation modes like validation=pre-and-individual
and validation=pre-only
?
I've omitted support for GET requests entirely at the moment.
Sounds good and makes sense
This ticket was mentioned in Slack in #core by timothybjacobs. View the logs.
4 years ago
#12
@
4 years ago
Thanks for the review @zieladam! I've pushed some changes to the PR.
I think it should be opt-in - as we discussed earlier, there are important routes that do not support pre-validation at the moment and sending batch requests with validation=pre will not work as expected. With opt-in approach, the batch processing endpoint could be merged first and then more and more routes would opt-in to batch processing as time passes.
Passing an allow_batch
keyword is now required.
Great idea
Implemented as a validate_callback
option.
The only thing that could go wrong there is something unexpected happening between the batch validation and the handler invocation, e.g. requesting a batch of two requests, both trying to insert a row with the same value that's subject to an UNIQUE constraint - I think core should protect developers from doing that, but offer an option to opt out of this protection when more performance is required. Maybe by having two pre-validation modes like validation=pre-and-individual and validation=pre-only?
What I've done right now is made it so it only validates once. Right now, that isn't an issue that would get caught by validation anyway. It'd fail when the wp_insert_*
call happens.
Leaving the choice up to the caller feels weird to me. Maybe make it specific to the endpoint?
This ticket was mentioned in Slack in #core by peterwilsoncc. View the logs.
4 years ago
#14
@
4 years ago
- Milestone changed from 5.5 to 5.6
It looks like the navigation screen is not going to make it for 5.5. Punting this to 5.6 as its the main consumer of this feature.
#15
@
4 years ago
@TimothyBlynJacobs
Passing an allow_batch keyword is now required.
Awesome!
Implemented as a validate_callback option.
Thank you!
Leaving the choice up to the caller feels weird to me. Maybe make it specific to the endpoint?
Ah yes, it sounds like the best option indeed.
Also, I am working on a PR to make WP_REST_Menu_Items_Controller compatible with this change. I will post a link here as soon as it's ready to share.
#16
@
4 years ago
Also, I am working on a PR to make WP_REST_Menu_Items_Controller compatible with this change. I will post a link here as soon as it's ready to share.
Fantastic! What I think I'm going to do is commit the validate_callback
support for 5.5 so we can utilize that behavior in Gutenberg when doing that refactor.
#17
@
4 years ago
That sounds great @TimothyBlynJacobs ! In fact the PR I started ( https://github.com/WordPress/gutenberg/pull/23474 - not ready for a review yet) should really use a controller-level validate_callback
and I was planning to rebase it on top of your PR.
4 years ago
#18
Adding a support for sanitize_callback
would make things even easier on the controller side. It could be done by adding the following code to sanitize_params
method in class WP_REST_Request
:
`php
if ( isset( $attributessanitize_callback? ) ) {
$sanitized = call_user_func( $attributessanitize_callback?, $this );
if ( is_wp_error( $sanitized ) ) {
return $sanitized;
}
}
`
#19
@
4 years ago
@TimothyBlynJacobs class-wp-rest-menu-items-controller.php
PR is now ready for discussion and reviews: https://github.com/WordPress/gutenberg/pull/23474
TimothyBJacobs commented on PR #311:
4 years ago
#20
Why sanitize? The sanitize callbacks aren’t supposed to return an error instance.
4 years ago
#21
Why sanitize?
@TimothyBJacobs How else would you approach classifying e.g. this part of the class-wp-rest-menu-items-controller.php
?
It performs validation based on:
- Menu item stored in the database
- Results of sanitization of
menu-item-type
earlier in the same function
It doesn't neatly fit into the validation callback, but it fits the sanitize callback, with the only caveat being returning an error.
The sanitize callbacks aren’t supposed to return an error instance.
Returning an error instance is supported for per-field validated callback so my thinking was that a per-request callback should also support it.
My bigger problem here is that the notion of validation and sanitization in the current API relates to request data, whereas with endpoints like class-wp-rest-menu-items-controller.php
what we really want to validate/sanitize is a hydrated object.
TimothyBJacobs commented on PR #311:
4 years ago
#22
How else would you approach classifying e.g. this part of the class-wp-rest-menu-items-controller.php?
It doesn't neatly fit into the validation callback, but it fits the sanitize callback, with the only caveat being returning an error.
Why doesn't it fit in validate? The purpose of sanitize is to transform a value, AFAICT it isn't transforming a value, but validating that it is valid when combined with other request data.
Returning an error instance is supported for per-field validated callback so my thinking was that a per-request callback should also support it.
Right, but I think that would be in the per-request
validate_callback
no?
4 years ago
#23
@TimothyBJacobs here's my reasoning:
Consider the following block of code:
What it does is this: If menu-item-object
is missing from $prepared_nav_item
, then fill it based on menu-item-type
and menu-item-object-id
.
The logic depends on $prepared_nav_item
defined earlier, based on both request and database data, let's call it a hydrated $prepared_nav_item.
Let's try to split it assuming we only have the following buckets to choose from:
per-field validate_callback
per-field sanitize_callback
per-request validate_callback
- Route callback
Let's see how it fits each of these buckets:
per-field validate_callback
:menu-item-object
is not a request field so there is no related validate_callback. Let's consider reusingmenu-item-type
validate_callback. It would need to hydrate $prepared_nav_item first, and then validate conditionally onmenu-item-object
being falsy. That doesn't seem right.per-field sanitize_callback
: Same as above, plus there isn't a good way to account for this line:$prepared_nav_item['menu-item-object'] = get_term_field( 'taxonomy', $original );
.per-request validate_callback
: Hydrating $prepared_nav_item and returningnew WP_Error
in the error case would work just fine. There is no good way to account for this line though:$prepared_nav_item['menu-item-object'] = get_term_field( 'taxonomy', $original );
- Route callback: Provided returning an error was handled by the
validate_callback
, we could hydrate $prepared_nav_item again and duplicate most of the logic/ifs to assign$prepared_nav_item['menu-item-object'] = get_term_field( 'taxonomy', $original );
. This means we need to hydrate twice and duplicate some of the validation logic in the controller to only setmenu-item-object
under specific conditions. It's okay-ish but not ideal.
There is also one additional problem. See the following code:
It performs validation based on the value of resolved $prepared_nav_item['menu-item-object']
from the snippet above. That logic does not nicely follow a linear pattern like 1. Throw errors for invalid inputs 2. Transform the data to save-able form. It's more like an interdependent mixture of both. There are other parts ([like menu position logic](https://github.com/WordPress/gutenberg/blob/2e80a87f2bedd22a4b30f1aa10421d7230ff1e82/lib/class-wp-rest-menu-items-controller.php#L432)) that fit the above reasoning too.
Now let's see what would happen with per-request sanitize_callback
(or hydrate_callback
, transform_callback
, prepare_item_for_database
, or any other name really):
sanitize_callback
hydrates $prepared_nav_item
, performs a check and return an error if needed, assigns $prepared_nav_item[ "menu-item-object" ]
, and then makes the $prepared_nav_item
available for the action handler to process without any further transformations (at least ones that have request-dependent error conditions).
It seems cleaner overall, let me know what you think or if you have any alternative ideas - I don't insist on this specific approach, just trying to figure out what would work best here :)
#24
@
4 years ago
I think it should be opt-in - as we discussed earlier, there are important routes that do not support pre-validation at the moment and sending batch requests with validation=pre will not work as expected. With opt-in approach, the batch processing endpoint could be merged first and then more and more routes would opt-in to batch processing as time passes.
I'm not sure I agree with this rationale. Making this opt-in will make this feature be of quite limited value. I think in many cases people won't use Batch, so won't ever even add it to their route registration.
If a route doesn't support pre-validation (which I think is to say that validation is done directly in the get_items
(etc) handler, why would it matter? Route callbacks can always create errors, so avoiding that I'm assuming is not the issue.
#25
@
4 years ago
If a route doesn't support pre-validation (which I think is to say that validation is done directly in the get_items (etc) handler, why would it matter? Route callbacks can always create errors, so avoiding that I'm assuming is not the issue.
This ticket was born to address the following use case:
- A batch of n, let's say 100 requests is sent over in a single request.
- The entire dataset is validated and sanitized sufficiently to guarantee there won't be any error caused by the input.
- All the request handlers are being executed, only erroring out in rare cases such as lost database connection, but never due to something being wrong with the input.
The goal being either processing all requests or no request, but almost never some of them. This would be even more reliable with transactions, but they're not trivial to implement (think caches) so the first version focuses just on the processing itself.
What you're saying, as I understand, is that there is value in having two modes of operating: 1. Pre-validation as described above when the route supports it 2. Simple batch processing for all routes - e.g. sending over 100 requests and expecting that e.g. 60 of them would succeed while 40 would fail. Can you think of some specific use cases where that would be handy? Also, we'd need to distinguish between:
- Batch requests with pre-validation vs no pre-validation (different URL? request parameter as discussed earlier?)
- Endpoints supporting pre-validation vs endpoints not supporting pre-validation (route config?)
#26
@
4 years ago
I'm not sure I agree with this rationale. Making this opt-in will make this feature be of quite limited value.
My rationale for the opt-in flag is more along these lines:
While theoretically any properly coded route wouldn't have an issue being used in this manner, not every route is properly coded
The REST API is of course designed to be reentrant, but I've seen a surprising number of instances where due to how people have implemented controllers it wouldn't be. In particular for non-get requests and things with permission callbacks.
If one of those routes are included in the batch, the results could be bizarre or completely broken. If we're fine with that, then I think we could switch it to an opt-out mechanism.
I think in many cases people won't use Batch, so won't ever even add it to their route registration.
So this would be for when a developer wants to make batch requests for another plugin's routes?
This ticket was mentioned in Slack in #core by zieladam. View the logs.
4 years ago
#28
@
4 years ago
So my current thought is to commit the refactoring of WP_REST_Server::dispatch()
by the end of this week. Then we can implement and iterate on the actual design of the batch endpoint in the Gutenberg repository. This will require making the match_request_to_handler
and respond_to_request
methods temporarily public. We can mark them as private and then remove the public visibility when the batch endpoint is merged before Beta 1.
This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.
4 years ago
TimothyBJacobs commented on PR #311:
4 years ago
#32
I royally screwed up this rebase. Closing this out.
This ticket was mentioned in PR #518 on WordPress/wordpress-develop by TimothyBJacobs.
4 years ago
#33
Trac ticket: https://core.trac.wordpress.org/ticket/50244
#35
@
4 years ago
- Keywords early removed
Removing the early
keyword. The early portion of the ticket has been committed. The actual batch implementation will come with the Gutenberg code merge near Beta 1.
That PR is in progress here: https://github.com/WordPress/gutenberg/pull/25096
This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.
4 years ago
#37
@
4 years ago
Notes from conversation with Timothy earlier today in Slack:
- PR is merged https://github.com/WordPress/gutenberg/pull/25096
- Per Timothy, this one is ready and
it’ll end up being merge Monday most likely when @ noisysocks does the general 5.6 Gutenberg merge
#38
@
4 years ago
Something we need to decide is whether this should ship as __experimental
or as v1
. Generally we've decided to ship APIs as versioned, not __experimental
. If we need to, we can increment the version :) The benefit of __experimental
is that we can make changes without maintaining technical debt.
Cc'ing:
@spacedmonkey
@andraganescu
@zieladam
@helen
@youknowriad
This ticket was mentioned in PR #629 on WordPress/wordpress-develop by TimothyBJacobs.
4 years ago
#40
Trac ticket: https://core.trac.wordpress.org/ticket/50244
This ticket was mentioned in Slack in #core-restapi by hellofromtonya. View the logs.
4 years ago
TimothyBJacobs commented on PR #629:
4 years ago
#43
Fixed in 9defd1fabcfc6e17736ff2c285d5f0ad5772533b.
One of the conclusions from the Slack chat was that the REST API already uses separate functions to validate and store the data, and so it should be fairly straightforward to reuse these functions to validate the entire batch input before persisting anything.
Today I have been looking into flagging any place in the menu items controller that's doing the validation outside of
validate_callback
. I found the validation and sanitization are implemented all at once in prepare_item_for_database and thatvalidate_callback
method is not used for any property at all.This follows the pattern from WP_REST_Posts_Controller which does the exact same thing. The same is true for WP_REST_Comments_Controller, WP_REST_Attachments_Controller, and potentially a lot of open source classes inheriting from any of these.
It seems like the first step here would have to be refactoring these four classes to make them use
sanitize_callback
andvalidate_callback
for each schema property. What do y'all think?