Opened 2 years ago
Last modified 18 months ago
#56494 new defect (bug)
REST API: Fix and test rest_default_additional_properties_to_false
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | 6.0.2 |
Component: | REST API | Keywords: | has-patch has-unit-tests |
Focuses: | Cc: |
Description
Currently the rest_default_additional_properties_to_false
- does not support the "allOf", "anyOf", "oneOf" keywords from meta-schema v4, v6, v7
- has no tests
The patch provides both of these.
Attachments (8)
Change History (18)
This ticket was mentioned in PR #3170 on WordPress/wordpress-develop by annaghi.
2 years ago
#1
- Keywords has-unit-tests added
TimothyBJacobs commented on PR #3170:
2 years ago
#2
Thanks for the PR @annaghi! This is looking great, thank you for the comprehensive tests.
I think we need to add some more documentation to the $types
array discussing why it is sometimes passed and sometimes not.
Additionally, Core doesn't support the tuple
form of items
so we should remove it from this function.
This ticket was mentioned in PR #3374 on WordPress/wordpress-develop by annaghi.
2 years ago
#3
This PR is related to https://github.com/WordPress/wordpress-develop/pull/3170
It is a different version of the same function, still providing
- support for "allOf", "anyOf", "oneOf", ~and "items"~ as
schemaArray
typed keywords - unit tests
Trac ticket: https://core.trac.wordpress.org/ticket/56494
#4
@
2 years ago
Hi @TimothyBJacobs, thanks for the review, and raising the need of discussion of passing the $types
parameter.
I was thinking about it and experimenting with the function a lot, and finally I came up with two possible solutions.
It became a bit long write-up, but it helped me to organize the information around MR Trees for this function.
Insignificant changes
At the beginning I would like to mention the least important changes:
- moved the
array
section before theobject
- that moved the "Task" at the very end of the function
Now the code can be split into 3 parts from "bigger" to "smaller" data structure:
- recursive call on forest: "Forest part"
- recursive call on tree: "Tree part"
- performing some task on the current node: "Task"
Interesting changes
The schema traversal can be implemented in two ways (hence the two solutions), depending on how we define the schema being "object".
Object definition
The schema is an object if:
- has
type ⊃ 'object'
Or we can say more permissive that if:
- has
type ⊃ 'object'
, or - has
properties
keyword set, or - has
patternProperties
keyword set, or - has
additionalProperties
keyword set witharray
value
Similarly a schema is an array if:
- has
type ⊃ 'array'
, or - has
items
keyword set (not supporting thetuple
form)
"Task" to be performed
In this sense the "Task" can be implemented in two ways:
<?php if ( ! isset( $schema['additionalProperties'] ) && in_array( 'object', $types, true ) ) { $schema['additionalProperties'] = false; }
or
<?php if ( ! isset( $schema['additionalProperties'] ) && ( in_array( 'object', $types, true ) || isset( $schema['properties'] ) || isset( $schema['patternProperties'] ) ) ) { $schema['additionalProperties'] = false; }
Solution 1
It is more similar to the original function, uses the first object definition, and the first "Task" will be performed.
https://github.com/WordPress/wordpress-develop/pull/3170
The schema traversal might stop at some nodes, and leave sub-trees unvisited without performing the "Task" on them. This is because the nodes are checked against the type
, and are discarded from further traversal if they do not have, or have wrong type
.
Solution 2
It uses the more liberal second object definition, and the second "Task" will be performed.
https://github.com/WordPress/wordpress-develop/pull/3374
The schema will be fully traversed, the "Task" will be performed on much more nodes, because there are no guards in the "Forest part" nor in the "Tree part".
The test cases for these solutions have some minor differences, and they seem to have more sense in Solution 2.
Solution 1
This implementation is very similar to the original one, in which we assume that we work on valid schema.
Definition of valid schema
The original implementation ran on Trees, and the schema was valid if:
- has
type
- has correct
type
(if it isarray
then it hasitems
, and if it isobject
then it has some properties)
The proposed implementation runs on MR Trees, and the schema is valid if:
- has
type
- has correct
type
(if it isarray
then it hasitems
, and if it isobject
then it has some properties) - has correct base
type
- localtype
relation
Tree, MR Tree and the third rule will be discussed in the separate comment.
If the schema is invalid, then
1. Throw notice & discard schema from further traversal
In the original implementation if the schema had no type
, then the schema was discarded from further traversal because of the first line $type = (array) $schema['type'];
, and a run-time notice was thrown:
PHP Notice: Undefined index: type in /home/anna/html/wp60/wp-includes/rest-api.php on line 3041
In the proposed implementation the behavior remained the same, except the built-in notice was replaced with the _doing_it_wrong()
notice.
2. Discard schema from further traversal
In the original implementation if the schema had no correct type
, then the schema was discarded from further traversal because of these guards:
<?php if ( in_array( 'object', $type, true ) ) { ... }
<?php if ( in_array( 'array', $type, true ) ) { ... }
For example these wrong schemas were discarded:
<?php [ 'type' => 'array', 'properties' => [...] ]
or the reverse:
<?php [ 'type' => 'object', 'items' => [...] ]
This remained as is in the proposed implementation.
3. Skip performing "Task" on schema
In the original implementation if the schema had no type ⊃ 'object'
then the "Task" was skipped. For example the "Task" was not performed on this wrong schema:
<?php [ 'type' => 'string', 'properties' => [...] ]
The "Task" remained as is in the proposed implementation.
Solution 2
While in Solution 1 the schema might have sub-trees unvisited because of a missing or wrong type
, in this solution the traversal is full, and the "Task" will be performed on much more nodes of the schema.
I came up with this solution because I think
Validation
Both solutions are silent about wrong type
keyword, but they handle this situation differently (discarding vs. keeping nodes in the traversal in order to do not or to do perform the "Task" on them, respectively). This function in general should not be responsible for validation, it is done in another function where the value is validated against the schema, and notices will be thrown.
Sanitization
Looking at the rest-api.php
code it seems to me that we do not sanitize the schema (well, besides this very function where we set the additionalProperties
to false
), and discarding nodes from performing the "Task" on them seems like a (silent to the user) anti-sanitization.
Completeness
It feels wrong for me discarding silently some nodes from performing the "Task" on them, so first I was experimenting with full featured sanitization. It went out of hands, so I thought: let's expand the object definition instead, and there is no reason to think about validation nor sanitization anymore in this function.
But I am not aware of all aspects of these changes, and would like to understand if the first solution is desired.
If the first solution is the correct one, then it might be worth to continue reading about how supporting allOf
, anyOf
, oneOf
is related to the third rule in valid schema definition.
Now having base type
on allOf
, anyOf
, oneOf
schemas allows not having type
on children. But we need to perform the "Task" on children too, so they have to have correct type
. And that is when and why we pass the $types
in the "Forest part": the children will use the base type
incoming from their parent.
#5
@
2 years ago
Tree defined by mutual recursion (MR Tree)
This comment is related to the Solution 1, and because it discards invalid schemas from traversal, it is enough to analyze valid schemas.
Before supporting the keywords with schemaArray
definition (allOf
, anyOf
, oneOf
), the function ran on Trees.
After supporting the keywords with schemaArray
definition, the function runs on MR Trees.
This is how I imagine these data structures:
Tree | MR Tree |
---|---|
A Tree can have two kinds of nodes:
| An MR Tree can have three kinds of nodes:
With the short terms:
|
We can see on the second figure that
- we can capture the essence of the forest with the horizontal line, which tries to mimic the notation of the indexed array
[]
, - that horizontal line also can be imagined as a "fence" between the parent and the roots which has to be "jumped through", and
- the trees in the forest are independent from each other, there is a possibility to run the algorithm concurrently on them (if ever needed).
Schema defined by MR
I have picked 4 schemas from the test cases, and they would look like this:
different kind of properties
oneOf local type
oneOf local & base type
oneOf base object type
Three kinds of nodes and they are all valid
- filled circle: the "Task" has to be performed on this node, and the
type
keyword is set with correct value - empty circle: the "Task" has to be performed on this node, but the
type
keyword is missing (this case will be handled by passing the$types
parameter from parent to roots) - dotted circle: there is no need to perform the "Task" on this node, and the
type
keyword is missing (this case is handled by defaulting to$types = array()
)
Well defined where these nodes can appear in a valid MR schema
- filled: tree and leaf nodes have to be filled, with exceptions:
- if the node is a tree and also parent, then it can be dotted too
- if the node is a tree or leaf and also root, then it can be empty too
- empty: only the roots can be empty
- dotted: only the parent can be dotted
The parent of a forest can be
- dotted: meaning the
type
is missing, so the roots have to have their own localtype
, or - filled: meaning it has
type
which is the basetype
in respect of the roots.
A root inside a forest can be
- filled: in which case the root has its own local
type
, or - empty: in which case the parent has to have the base
type
.
Valid schema defined by MR
We can discover the pattern of which kinds of nodes exist together in the parent - root relation:
- filled parent - filled root (root's
type
overwrites the (base) parent'stype
) - dotted parent - filled root
- filled parent - empty root
Inserting two more test cases in order to illustrate the alternating situations:
- dotted parent - filled root:
oneOf additional properties, local type, :||
- filled parent - empty root:
oneOf additional properties, base type, :||
Wrong base type - local type relation
The schema is invalid if the parent has no type
, and a root also have no type
.
Meaning there is no base type
on parent, and there is no local type
on the root, so the "Task" cannot be performed on the root.
- Iteration 1 runs on the parent, and passes its base
type
in the$types
parameter to the next iteration. - Iteration 2 runs on each root, and they have no local
type
, so the incoming$types
should be used.
Definition of valid MR schema
All the previous analysis leads us to extract the meaning of the third rule:
- has
type
- has correct
type
(if it isarray
then it hasitems
, and if it isobject
then it has some properties) - has correct base
type
- localtype
relation (the situation on the red figure above does not exist)
Which part of the function runs on which node
Current node is filled
- if the node is the parent of a forest, then the "Forest part" will be executed passing the node's
type
to the roots - if the node is a tree (in- or outside a forest), then the "Tree part" will be executed
- in both cases and also if the node is a leaf (in- or outside a forest), the "Task" will be performed
Current node is dotted
- the node is the parent of a forest without
type
, so the "Forest part" will be executed passing thetype = array()
to the roots (and they must havetype
on their own)
Current node is empty
- if the node is a tree (inside a forest), then the "Tree part" will be executed using the incoming base
type
- in the previous case and also if the node is a leaf (inside a forest), the "Task" will be performed using the incoming base
type
Docs
The difference between standard and WordPress schema might lead to confusion for schema writers, and so might need to highlight this difference.
They have to be aware of that the WordPress schema mirrors the boolean additionalProperties
values, and sets those to false
by default. See the test case allOf base object type & properties
which is the WordPress version of a standard JSON schema (URL provided in the test).
If someone reads the standard JSON schema docs, or finds schema on SO for example, then they need to mirror the schema to fit in WordPress.
2 years ago
#6
Hi @TimothyBJacobs , the PR is ready for review again, and I left detailed information in the Trac ticket.
2 years ago
#7
Hi @TimothyBJacobs , the PR is the counterpart of the other one: https://github.com/WordPress/wordpress-develop/pull/3170, and ready for review.
I left detailed information in the Trac ticket about the two PRs.
This ticket was mentioned in PR #3386 on WordPress/wordpress-develop by annaghi.
2 years ago
#8
This PR is one of the series:
- Solution 1 closest to the original with notice + discard
https://github.com/WordPress/wordpress-develop/pull/3170
- Solution 1.5 A variant of the Solution 1 without notice + discard :point_left:
- Solution 2 extended object definition
https://github.com/WordPress/wordpress-develop/pull/3374
and offers:
- support for "allOf", "anyOf", "oneOf", ~and "items"~ as
schemaArray
typed keywords - unit tests
Trac ticket: https://core.trac.wordpress.org/ticket/56494
#9
@
2 years ago
Solution 1.5
I have just realized, that we have a third solution too :)
https://github.com/WordPress/wordpress-develop/pull/3386
It is that version which you have already seen @TimothyBlynJacobs, a variant of the Solution 1.
The difference is that Solution 1.5 does not throw notice if the type
is missing on the schema. Everything else is the same as for Solution 1.
#10
@
18 months ago
I landed here because of the Warning when type
is not defined. If type
is defined inside my anyOf
criteria, there's no point in having a type
in that level.
The solution here looks great, and kudos for all the explanation, but I wonder if this adds too much overhead and could have impact on performance. I was wondering if there's a better way to handle this, meaning expecting additionalProperties
to be false
if not declared.
Maybe this check could be done somewhere else, when we actually look for this property, instead of having to loop through all the schemas to add it.
For example here https://github.com/WordPress/wordpress-develop/blob/trunk/src/wp-includes/rest-api.php#L2353 we could just set it to false
on the fly if not set.
Same thing on https://github.com/WordPress/wordpress-develop/blob/trunk/src/wp-includes/rest-api.php#L2795 and https://github.com/WordPress/wordpress-develop/blob/trunk/src/wp-includes/rest-api.php#L3030 - and it seems that this is all that's needed.
This PR offers:
Trac ticket: https://core.trac.wordpress.org/ticket/56494#ticket