WordPress.org

Make WordPress Core

Opened 8 weeks ago

Last modified 4 weeks ago

#43874 new defect (bug)

REST API: Only render fields specific to request when _fields= is used

Reported by: danielbachhuber Owned by:
Milestone: 4.9.8 Priority: normal
Severity: normal Version:
Component: REST API Keywords: has-patch has-unit-tests commit
Focuses: rest-api, performance Cc:

Description

Given a request like:

GET wp/v2/posts?_fields=id,title

It doesn't make sense for the REST API to generate content for each response object, only to later discard it.

Instead, if _fields= is used, we should only render the fields specified.

Previously #38131

From https://github.com/WordPress/gutenberg/issues/5376

Attachments (1)

43874.1.diff (46.1 KB) - added by danielbachhuber 7 weeks ago.

Download all attachments as: .zip

Change History (12)

#1 @danielbachhuber
8 weeks ago

@pento I've done a first pass applying this to the Users controller with https://github.com/WordPress/wordpress-develop/pull/12

There's a good amount of code churn. If you're amenable to the direction I'm headed, I'll continue applying to the other controllers.

#2 @pento
8 weeks ago

Yah, I like this approach. Let's go with it.

#3 follow-up: @schlessera
8 weeks ago

Some thoughts on the approach in https://github.com/WordPress/wordpress-develop/pull/12:

Possible minor scalability optimization for current approach:

If you array_flip() the array of fields at the start of the checks, you can turn the individual checks from a linear array search (O(n)) into a hash table lookup (O(1)). Probably not worth the hassle, as I doubt we'll have field arrays with thousands of fields.

Possible way to avoid code churn:

Turn the $fields = get_fields_for_response() into a $schema = get_filtered_item_schema(), so that all the checks remain unchanged.

Possible way to make the code cleaner - variant A (with even more code churn, though):

Iterate over the $fields and use a switch statement:

foreach( $this->get_fields_for_response( $request ) as $field ) {
   switch ( $field ) {
        case 'id':
            $data['id'] = $user->ID;
            break;
        case 'username':
            $data['username'] = $user->user_login;
            break;
        // [...]
   }
}

Possible way to make the code cleaner - variant B (with yet more code churn):

Iterate over the $fields and map them to methods:

public function prepare_item_for_response( $user, $request ) {
   // Could be deduced, but probably too much magic then.
   $mapping = array(
      'id'       => 'get_id',
      'username' => 'get_username'
   );

   foreach( $this->get_fields_for_response( $request ) as $field ) {
      if ( ! array_key_exists( $field, $mapping ) {
         // throw error
      }

      $method         = $mapping[ $field ];
      $data[ $field ] = $this->$method( $user, $request );
   }
}

private function get_id( $user, $request ) {
   return $user->ID;
}

private function get_username( $user, $request ) {
   return $user->user_login;
}

// [...]

#4 in reply to: ↑ 3 @danielbachhuber
8 weeks ago

Replying to schlessera:

Possible way to make the code cleaner - variant B (with yet more code churn):

Iterate over the $fields and map them to methods:

Way back in the day, I had a proposal somewhere for a similar abstraction: define the database_attribute as a part of the schema, and auto-load the data from that.

For now, I think I'll stick with some minor variation of what I already have.

Also, it's worth noting just so it's mentioned: limiting generated response data early means less data is passed through to add_additional_fields_to_object(). This is why I've persisted id, in the possibility that it's being used to pull other data into the response.

#5 @danielbachhuber
7 weeks ago

  • Keywords has-patch has-unit-tests commit added
  • Milestone changed from 4.9.6 to 4.9.7

43874.1.diff introduces a WP_REST_Controller::get_fields_for_response() method that's used in all sub-classed controllers except WP_REST_Settings_Controller. I've included tests for the base method and each impacted subclass.

When _fields=[] is used, this changeset will change the data passed into add_additional_fields_to_object(). Notably, this includes callbacks added via register_rest_field(). Because of this, id is always included in the original response preparation, so it can continue to be a lookup value.

I'd like to land this in trunk so we can consider including this in 4.9.7. I don't think it's so pressing that we need to get into 4.9.6.

#6 @pento
7 weeks ago

In 43087:

REST API: Filter responses based on the _fields parameter, before data is processed.

Historically, the REST API would generate the entire response object, including running expensive filters, then it would apply the _fields parameter, discarding the fields that weren't specificed.

This change causes _fields to be applied earlier, so that only requested fields are processed.

Props danielbachhuber.
See #43874.

#7 @pento
7 weeks ago

@danielbachhuber: Nice work! As this is a large patch, I've committed it to trunk now, to avoid letting it get stale.

I think we'll also need to consider how to filter meta fields before they're retrieved from the database, too: if you have a lot of meta fields registered, it can be a fairly substantial performance drag.

Also, I would be totally okay with get_metadata() (and the related get_*_meta() functions) accepting an array for $meta_key, so data can be retrieved in one query, instead of hundreds.

This ticket was mentioned in Slack in #core-restapi by kadamwhite. View the logs.


7 weeks ago

#9 follow-up: @TimothyBlynJacobs
7 weeks ago

If the _fields parameter is going to start being checked off of the request object especially in controllers, might it be worth adding it to WP_REST_Controller::get_collection_params()?

#10 in reply to: ↑ 9 @danielbachhuber
7 weeks ago

Replying to TimothyBlynJacobs:

If the _fields parameter is going to start being checked off of the request object especially in controllers, might it be worth adding it to WP_REST_Controller::get_collection_params()?

The _ underscore denotes that it's a "magic" parameter handled by the server, not the controller. Even though our new implementation includes performance enhancements handled by the controller, _fields is ultimately a server-level feature. So no, I don't think it's necessary to include it as a collection parameter.

#11 @desrosj
4 weeks ago

  • Milestone changed from 4.9.7 to 4.9.8

Moving all tickets in 4.9.7 to 4.9.8.

Note: See TracTickets for help on using tickets.