Make WordPress Core

Opened 7 years ago

Closed 5 years ago

Last modified 5 years ago

#44949 closed enhancement (fixed)

Add support for JSON Schema string pattern to REST API

Reported by: jason_the_adams's profile jason_the_adams Owned by: timothyblynjacobs's profile TimothyBlynJacobs
Milestone: 5.5 Priority: normal
Severity: normal Version:
Component: REST API Keywords: needs-refresh good-first-bug has-patch has-unit-tests
Focuses: Cc:

Description

According to the JSON Schema regular expression patterns are a part of the spec: https://json-schema.org/understanding-json-schema/reference/string.html#regular-expressions

The rest_validate_value_from_schema validates some of the schema, but omits this. It's certainly possible to provide a validate_callback to the endpoint, but since this is a valid part of the schema it would be great to be able to validate based on the pattern.

I believe this is simple yet useful enough to quickly become a helpful addition to the REST API. Furthermore, I believe it's safe to add. If someone does use the pattern they're probably using validate_callback to check the regex already, or I doubt they would mind the REST API flagging it as... well... that's the point of the spec. :)

Interested in thoughts and feedback!

Attachments (3)

44949.patch (762 bytes) - added by jason_the_adams 7 years ago.
44949.2.patch (762 bytes) - added by jason_the_adams 7 years ago.
Last diff was wrong direction, this adds
44949.3.patch (830 bytes) - added by jason_the_adams 7 years ago.

Download all attachments as: .zip

Change History (22)

@jason_the_adams
7 years ago

Last diff was wrong direction, this adds

#1 @jason_the_adams
7 years ago

  • Keywords has-patch added

#2 @TimothyBlynJacobs
7 years ago

  • Keywords needs-unit-tests added

I definitely think we should add this.

I believe the patch should be changed so that the pattern does not need the regex delimiters. We should then probably preg_quote() the pattern before attaching our own delimiters.

[Spec](https://json-schema.org/latest/json-schema-validation.html#regexInterop)
[Examples](https://json-schema.org/understanding-json-schema/reference/regular_expressions.html)

This will need unit tests.

#3 @jason_the_adams
7 years ago

Thanks, @TimothyBlynJacobs

I couldn't find anywhere in the spec that explicitly said the delimiters should be omitted from the pattern, but it does show the pattern without them in the json-schema site.

That said, I agree that it's the way to go. I don't think we'll need any flags added to the regex, do you? If necessary, the user can always add things via modifiers such as (?i).

Last edited 7 years ago by jason_the_adams (previous) (diff)

#4 follow-up: @TimothyBlynJacobs
7 years ago

My understanding is that the [a-zA-Z] part is the actual regex and for convenience purposes you can pass flags to the regex by using the delimiter syntax to instantiate the RegExp.

Reference Implementations:

That said, I agree that it's the way to go. I don't think we'll need any flags added to the regex, do you?

Interestingly, looking at the PHP validators, they all use the u flag. Presumably to reject non UTF-8 characters. I believe PHP requires JSON to be in UTF-8. To be consistent, we might also want to add this flag so it rejects non UTF-8 strings since request data might not come from JSON.

If necessary, the user can always add things via modifiers such as (?i).

I don't think we should recommend this. The spec recommends schema authors to limit themselves to regular expressions that have the highest change of being interoperable: http://json-schema.org/latest/json-schema-validation.html#regexInterop

#5 in reply to: ↑ 4 @jason_the_adams
7 years ago

My understanding is that the [a-zA-Z] part is the actual regex and for convenience purposes you can pass flags to the regex by using the delimiter syntax to instantiate the RegExp.

Sorry, I'm confused by this sentence. It sounds like you're saying the pattern would include the delimiter and flags, but I know that's not what you're saying because you said I should change the current patch to not do that. So... I'm not sure what you're saying, then. 😄

Interestingly, looking at the PHP validators, they all use the u flag. Presumably to reject non UTF-8 characters. I believe PHP requires JSON to be in UTF-8. To be consistent, we might also want to add this flag so it rejects non UTF-8 strings since request data might not come from JSON.

Can't say I've played around with that flag (just did to understand it). Makes sense and seems like it's good to have when one wants it.

Adding another patch to reflect what we've discussed so far.

#6 @birgire
7 years ago

Regarding the PCRE/u modifier:

There exists the _wp_can_use_pcre_u() core function, used in the compatible version of mb_substr().

In https://core.trac.wordpress.org/ticket/44296#comment:9 I noticed that core seems to make PCRE_UTF8 checks inconsistently.

#7 @pento
6 years ago

  • Version trunk deleted

#8 @desrosj
6 years ago

  • Milestone changed from Awaiting Review to Future Release

Going to move to Future Release as there is adequate support for this request.

#9 @TimothyBlynJacobs
6 years ago

#47906 was marked as a duplicate.

#10 @TimothyBlynJacobs
5 years ago

  • Keywords needs-refresh good-first-bug added; has-patch removed
  • Milestone changed from Future Release to 5.5

#11 @sorenbronsted
5 years ago

I will work on this

This ticket was mentioned in PR #261 on WordPress/wordpress-develop by sorenbronsted.


5 years ago
#12

Added string regex pattern

Trac ticket: https://core.trac.wordpress.org/ticket/44949

#13 @sorenbronsted
5 years ago

  • Keywords has-patch has-unit-tests added; needs-unit-tests removed

#14 follow-up: @birgire
5 years ago

@sorenbronsted thanks for the patch.

I wonder if preg_quote() should be used instead of str_replace().

ps: preg_quote() is already used in few places in core.

pss: scanning through the older thread above, I see it's been already mentioned in https://core.trac.wordpress.org/ticket/44949?cnum_edit=14#comment:2 :-)

Last edited 5 years ago by birgire (previous) (diff)

#15 in reply to: ↑ 14 ; follow-up: @sorenbronsted
5 years ago

Replying to birgire:

I wonder if preg_quote() should be used instead of str_replace().

preg_quote is used to escape special regex characters in your string, so they become ordinary characters. So if you put the regex a+b through preg_quote('a+b') it becomes a\+b, and thereby the special regex character + looses it's meaning.

#16 @TimothyBlynJacobs
5 years ago

  • Owner set to TimothyBlynJacobs
  • Resolution set to fixed
  • Status changed from new to closed

In 47810:

REST API: Support the JSON Schema pattern keyword.

Props jason_the_adams, birgire, sorenbronsted.
Fixes #44949.

#17 @TimothyBlynJacobs
5 years ago

In 47811:

REST API: Add @since entries for rest_validate_value_from_schema().

See #49572, #48818, #44949, #50053, #48820, #49720, #42961, #44975, #43392, #38583.

#18 @jason_the_adams
5 years ago

Awesome! Thanks for resurrecting this and bringing it across the finish line, @sorenbronsted and @TimothyBlynJacobs!

#19 in reply to: ↑ 15 @birgire
5 years ago

Replying to sorenbronsted:

Replying to birgire:

I wonder if preg_quote() should be used instead of str_replace().

preg_quote is used to escape special regex characters in your string, so they become ordinary characters. So if you put the regex a+b through preg_quote('a+b') it becomes a\+b, and thereby the special regex character + looses it's meaning.

thanks @sorenbronsted, sorry about my little confusion there :-)

Note: See TracTickets for help on using tickets.