Make WordPress Core

Opened 4 months ago

Closed 5 weeks ago

Last modified 5 weeks ago

#46199 closed defect (bug) (fixed)

REST: Escape already decoded values when adding query args

Reported by: dmsnell Owned by: kadamwhite
Milestone: 5.2 Priority: normal
Severity: normal Version:
Component: REST API Keywords: has-patch needs-testing
Focuses: Cc:


add_query_args() assumes that when given an array of args as the first parameter that they are already urlencodeed but in various places we are passing $_GET (indirectly) as that first parameter; unfortunately (per the PHP docs) "The GET variables are passed through urldecode()."

On a site with over two hundred tags a search was made in the Gutenberg editor for tags matching # - the site used tags like #cool-thing and #important - as part of its autocompleting of tag names. This produced a URL with &search=%23 in the query.

Inside of WP_REST_Terms_Controller::get_items() we call add_query_args( $request->get_query_params(), … ) which by that time passes on $_GET which contains $_GET['search'] = '#' because PHP decoded it.

Although add_query_args() runs existing args from its URI through urlencode_deep() it doesn't modify the args passed into the function itself. This leaves us calling build_query() which calls _http_build_query() with the $urlencode parameter set to false, thus the decoded value gets directly implode()ed into the final query string and we're left with a raw # before any actual fragment part of the URL.

This resulted in the response for the site's tag getting a next link whose query args are like this…
…when they should be like this…

We can note that it looks like we've done something funky moving around the value for the search arg but actually we truncated at search=# and then treated everything after it as if it were the fragment, finally adding it back in at the end of add_query_args().

In Gutenberg this resulted in a crash after typing # into the tag box because the API call came back with invalid data as the Link for the next page of tag results (and that link was automatically followed by Gutenberg's attempt to get all of the results in a paged set). Comically the valid data was there but because WordPress stuck _envelope=1 after the # it was ignored by PHP as not being a $_GET arg.

To fix this I propose wrapping $request->get_query_args() in urldecode_deep() because we should be able to reliably know that they are decoded and thus need re-encoding. When I scanned around I found that a few other REST controllers were affected by this same bug and so am including corrections there in this patch as well.

We could approach this with other means and do more to detect if incoming args to add_query_args() were already encoded or decoded but I don't think the additional complexity is worth it.

We could have always run the array args to add_query_args() through urlencode_deep() but I don't think we can safely assume that it's not being called with already-encoded args; I was able to find places where we do already encode them when we're passing variables that aren't $_GET so I think it's inconceivable to paint broadly here and encode everything.

There may be more places affected by this.

I searched the codex to see if I could find any ban on including # in tag names but couldn't find any. Further, I hypothesize that this will affect many different API calls which allow user-input search terms.

Attachments (1)

46199.patch (5.0 KB) - added by dmsnell 4 months ago.
initial proposal to fix #46199

Download all attachments as: .zip

Change History (16)

4 months ago

initial proposal to fix #46199

#1 @pento
4 months ago

  • Version trunk deleted

#2 @dmsnell
3 months ago

@kadamwhite would you be willing to give this a review?

This ticket was mentioned in Slack in #core by dmsnell. View the logs.

2 months ago

#4 @pento
2 months ago

  • Keywords 2nd-opinion added
  • Milestone changed from Awaiting Review to 5.2

Milestoning for 5.2. I think 46199.patch is reasonable, but I'd like some input from REST API folks first.

#5 @rmccue
2 months ago

  • Keywords 2nd-opinion removed

Huh. I guess we just missed this case entirely.

Conceptually, 46199.patch looks good to me. Haven't tested or confirmed the fix though.

This ticket was mentioned in Slack in #core-restapi by kadamwhite. View the logs.

8 weeks ago

#7 @kadamwhite
8 weeks ago

  • Owner set to kadamwhite
  • Status changed from new to assigned

I'll review this before RC1 and land if it tests out. Thanks for the patch!

This ticket was mentioned in Slack in #core by jeffpaul. View the logs.

7 weeks ago

This ticket was mentioned in Slack in #core-restapi by kadamwhite. View the logs.

6 weeks ago

#10 @azaozz
5 weeks ago

@kadamwhite 5.2-RC1 is in a few hours and seems this is not ready yet. Are we punting it to the next release or perhaps the next minor?

#11 @pento
5 weeks ago

  • Milestone changed from 5.2 to 5.3

Aye, this patch needs more than a few hours of reviewing. Bumping to 5.3.

This ticket was mentioned in Slack in #core by kadamwhite. View the logs.

5 weeks ago

#13 @kadamwhite
5 weeks ago

We'd reviewed this and it looked good, I was waiting on a chance to test it in an environment with >100 tags starting with #. I have done so, and do not see any issues. (Gutenberg no longer appears broken without this patch, but that seems to be due to a change in the logic for the tag filter box and not because this issue was fixed in elsewhere). Pinging @pento for go-ahead on moving it back to 5.2

#14 @kadamwhite
5 weeks ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 45267:

REST API: Always urlencode_deep() query args in get_items methods.

Passing all received query arguments through urlencode_deep ensures that the full set of query arguments are encoded in the same way.

Props dmsnell.
Fixes #46199

#15 @pento
5 weeks ago

  • Milestone changed from 5.3 to 5.2
Note: See TracTickets for help on using tickets.