#44949 closed enhancement (fixed)
Add support for JSON Schema string pattern to REST API
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | 5.5 | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | REST API | Keywords: | needs-refresh good-first-bug has-patch has-unit-tests |
| Focuses: | | Cc: | |
Description
According to JSON Schema, regular expression patterns are part of the spec: https://json-schema.org/understanding-json-schema/reference/string.html#regular-expressions
The `rest_validate_value_from_schema()` function validates some of the schema, but omits this. It's certainly possible to provide a `validate_callback` to the endpoint, but since this is a valid part of the schema it would be great to be able to validate based on the pattern.
I believe this is simple yet useful enough to quickly become a helpful addition to the REST API. Furthermore, I believe it's safe to add. If someone does use the pattern they're probably using `validate_callback` to check the regex already, or I doubt they would mind the REST API flagging it as... well... that's the point of the spec. :)
Interested in thoughts and feedback!
Attachments (3)
Change History (22)
#2
@
7 years ago
- Keywords needs-unit-tests added
I definitely think we should add this.
I believe the patch should be changed so that the pattern does not need the regex delimiters. We should then probably `preg_quote()` the pattern before attaching our own delimiters.
[Spec](https://json-schema.org/latest/json-schema-validation.html#regexInterop)
[Examples](https://json-schema.org/understanding-json-schema/reference/regular_expressions.html)
This will need unit tests.
#3
@
7 years ago
Thanks, @TimothyBlynJacobs.
I couldn't find anywhere in the spec that explicitly said the delimiters should be omitted from the pattern, but the json-schema site does show the pattern without them.
That said, I agree that it's the way to go. I don't think we'll need any flags added to the regex, do you? If necessary, the user can always add things via modifiers such as `(?i)`.
#4
follow-up:
↓ 5
@
7 years ago
My understanding is that the `[a-zA-Z]` part is the actual regex and for convenience purposes you can pass flags to the regex by using the delimiter syntax to instantiate the `RegExp`.
Reference Implementations:
- PHP: https://github.com/justinrainbow/json-schema/blob/master/src/JsonSchema/Constraints/StringConstraint.php#L43
- PHP: https://github.com/swaggest/php-json-schema/blob/aa5ce4073adb69f36f84255392b69b3f6588d086/src/Helper.php#L15
- PHP: https://github.com/opis/json-schema/blob/master/src/Validator.php#L1209
- JS: https://github.com/epoberezkin/ajv/blob/master/lib/dot/pattern.jst#L8
- JS: https://github.com/mafintosh/is-my-json-valid/blob/master/index.js#L128
- Java: https://github.com/everit-org/json-schema/blob/master/core/src/main/java/org/everit/json/schema/regexp/JavaUtilRegexpFactory.java#L12 ( You have to jump around a bunch, but it uses just the regex, adding flags is a separate method: https://docs.oracle.com/javase/7/docs/api/java/util/regex/Pattern.html#compile(java.lang.String) )
> That said, I agree that it's the way to go. I don't think we'll need any flags added to the regex, do you?

Interestingly, looking at the PHP validators, they all use the `u` flag, presumably to reject non-UTF-8 characters. I believe PHP requires JSON to be in UTF-8. To be consistent, we might also want to add this flag so it rejects non-UTF-8 strings, since request data might not come from JSON.

> If necessary, the user can always add things via modifiers such as `(?i)`.

I don't think we should recommend this. The spec recommends that schema authors limit themselves to regular expressions that have the highest chance of being interoperable: http://json-schema.org/latest/json-schema-validation.html#regexInterop
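To make the points about delimiters and flags concrete, here is a minimal sketch of how a delimiter-less schema pattern might be wrapped before matching. It is illustrative only, not the patch from this ticket: the `#` delimiter and the variable names are assumptions, and it presumes the pattern itself contains no `#`.

```php
<?php
// Delimiter-less pattern, as it would appear under a JSON Schema "pattern" keyword.
$pattern = '^[a-zA-Z]+$';

// Wrap the pattern in delimiters and add the u modifier.
// Assumption: the pattern contains no "#", so no delimiter escaping is needed here.
$regex = '#' . $pattern . '#u';

// A plain ASCII subject matches.
$valid = preg_match( $regex, 'Hello' );

// With u, a subject that is not valid UTF-8 makes preg_match() return false.
$invalid_utf8 = preg_match( $regex, "Hell\xF8" );

// Without u, the same bytes are just a string that does not match.
$no_flag = preg_match( '#' . $pattern . '#', "Hell\xF8" );

// An inline modifier such as (?i) works in PCRE, but may not be portable to other engines.
$inline = preg_match( '#(?i)^[a-z]+$#u', 'HELLO' );

var_dump( $valid, $invalid_utf8, $no_flag, $inline );
```

The difference between `false` (invalid UTF-8) and `0` (no match) is why a strict comparison against `1` is the safer check when the `u` modifier is in play.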
#5
in reply to:
↑ 4
@
7 years ago
> My understanding is that the `[a-zA-Z]` part is the actual regex and for convenience purposes you can pass flags to the regex by using the delimiter syntax to instantiate the `RegExp`.

Sorry, I'm confused by this sentence. It sounds like you're saying the pattern would include the delimiter and flags, but I know that's not what you're saying because you said I should change the current patch to not do that. So... I'm not sure what you're saying, then. 😄

> Interestingly, looking at the PHP validators, they all use the `u` flag. Presumably to reject non UTF-8 characters. I believe PHP requires JSON to be in UTF-8. To be consistent, we might also want to add this flag so it rejects non UTF-8 strings since request data might not come from JSON.

Can't say I've played around with that flag (just did to understand it). Makes sense and seems like it's good to have when one wants it.
Adding another patch to reflect what we've discussed so far.
#6
@
7 years ago
Regarding the PCRE `/u` modifier:
There is a `_wp_can_use_pcre_u()` core function, used in the compatibility version of `mb_substr()`.
In https://core.trac.wordpress.org/ticket/44296#comment:9 I noticed that core seems to make `PCRE_UTF8` checks inconsistently.
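For readers unfamiliar with this kind of check, the sketch below shows one way to probe whether the PCRE build accepts the `u` modifier before relying on it. The function name `example_pcre_supports_utf8()` is hypothetical; this is not the body of the core function, which is not shown in this thread.

```php
<?php
/**
 * Hypothetical helper (not a WordPress core function): reports whether
 * this PCRE build accepts the u modifier.
 */
function example_pcre_supports_utf8(): bool {
	static $supported = null;

	if ( null === $supported ) {
		// preg_match() returns false if the pattern cannot be compiled,
		// which is what happens when the u modifier is unsupported.
		// The @ suppresses the accompanying warning.
		$supported = ( false !== @preg_match( '/^./u', 'a' ) );
	}

	return $supported;
}

// Only append the modifier when it is available.
$modifiers = example_pcre_supports_utf8() ? 'u' : '';
$matched   = preg_match( '#^[a-z]+$#' . $modifiers, 'abc' );
```

Caching the result in a static variable avoids repeating the probe on every validation call.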
#8
@
6 years ago
- Milestone changed from Awaiting Review to Future Release
Going to move to Future Release as there is adequate support for this request.
#10
@
5 years ago
- Keywords needs-refresh good-first-bug added; has-patch removed
- Milestone changed from Future Release to 5.5
This ticket was mentioned in PR #261 on WordPress/wordpress-develop by sorenbronsted.
5 years ago
#12
Added string regex pattern
Trac ticket: https://core.trac.wordpress.org/ticket/44949
#14
follow-up:
↓ 15
@
5 years ago
@sorenbronsted thanks for the patch.
I wonder if `preg_quote()` should be used instead of `str_replace()`.
PS: `preg_quote()` is already used in a few places in core.
PPS: Scanning through the older thread above, I see it's already been mentioned in https://core.trac.wordpress.org/ticket/44949?cnum_edit=14#comment:2 :-)
#15
in reply to:
↑ 14
;
follow-up:
↓ 19
@
5 years ago
Replying to birgire:
> I wonder if `preg_quote()` should be used instead of `str_replace()`.

`preg_quote()` is used to escape special regex characters in your string, so they become ordinary characters. So if you put the regex `a+b` through `preg_quote('a+b')` it becomes `a\+b`, and thereby the special regex character `+` loses its meaning.
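The difference can be seen in a short sketch. It assumes `#` as the delimiter and uses illustrative variable names; it is not taken from the patch itself.

```php
<?php
$pattern = 'a+b';

// preg_quote() escapes every regex metacharacter, so "+" becomes a literal plus sign.
$quoted = preg_quote( $pattern, '#' );

// The quoted pattern now only matches the literal text "a+b".
$quoted_vs_aab     = preg_match( '#' . $quoted . '#', 'aab' );
$quoted_vs_literal = preg_match( '#' . $quoted . '#', 'a+b' );

// Escaping only the delimiter leaves the pattern's regex meaning intact.
$escaped        = str_replace( '#', '\\#', $pattern );
$escaped_vs_aab = preg_match( '#' . $escaped . '#', 'aab' );

// A pattern that itself contains the delimiter still works after escaping.
$hex_pattern = '^#[0-9a-f]{6}$';
$hex_match   = preg_match( '#' . str_replace( '#', '\\#', $hex_pattern ) . '#', '#ff0000' );

var_dump( $quoted, $quoted_vs_aab, $quoted_vs_literal, $escaped_vs_aab, $hex_match );
```

One caveat with the plain `str_replace()` approach: a pattern that already contains an escaped `\#` would end up with `\\#`, so this sketch is only safe for patterns where the delimiter is not already escaped.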
#16
@
5 years ago
- Owner set to TimothyBlynJacobs
- Resolution set to fixed
- Status changed from new to closed
In 47810:
#18
@
5 years ago
Awesome! Thanks for resurrecting this and bringing it across the finish line, @sorenbronsted and @TimothyBlynJacobs!
#19
in reply to:
↑ 15
@
5 years ago
Replying to sorenbronsted:
> Replying to birgire:
> > I wonder if `preg_quote()` should be used instead of `str_replace()`.
>
> `preg_quote()` is used to escape special regex characters in your string, so they become ordinary characters. So if you put the regex `a+b` through `preg_quote('a+b')` it becomes `a\+b`, and thereby the special regex character `+` loses its meaning.

Thanks @sorenbronsted, sorry about my little confusion there :-)
Last diff was wrong direction, this adds