#43392 closed enhancement (fixed)
Support 'object' and 'array' types in register_meta()
Reported by: | diegoliv | Owned by: | TimothyBlynJacobs |
---|---|---|---|
Milestone: | 5.3 | Priority: | normal |
Severity: | normal | Version: | 4.9.4 |
Component: | Options, Meta APIs | Keywords: | has-unit-tests dev-feedback commit has-patch needs-docs has-dev-note |
Focuses: | rest-api | Cc: |
Description
Currently, register_meta()
is used to expose meta data by setting show_in_rest => true
. If you want to work with custom metadata on Gutenberg (by setting an attribute with source: 'meta'
), you need to use register_meta()
. Now, this function also has the parameter type
, where you set the metadata object type. Currently, allowed types are simple things like integer
or string
.
While I was working with Gutenberg, I felt the need to create nested attributes (that would translate into JSON objects). Here is an example:
attributes: { navigationStyles: { type: 'object', default: { fontSize: { type: 'number', default: 10, }, fontColor: { type: 'string', default: '#ccc', }, }, source: 'meta', meta: 'navigation_styles', }, },
Now, when Gutenberg tries to save this attribute on the database, the attribute will be a JSON object, which will translate into an associative array to be handled in PHP. But, currently, there's no support for registering a metadata with register_meta()
that will be an associative array:
<?php function blocks_meta_init(){ register_meta('post', 'navigation_styles', array( 'show_in_rest' => true, 'type' => 'array', // not possible at the moment )); } add_action('init', 'blocks_blocks_meta_init');
If you try to do that, the REST API will fail to save the metadata. My current workaround for this use case is to just register the meta as a string
, and handle the formatting of the metadata using sanitize_callback
:
<?php function blocks_meta_init(){ register_meta('post', 'navigation_styles', array( 'show_in_rest' => true, 'type' => 'string', 'sanitize_callback' => 'blocks_sanitize_attr' )); } add_action('init', 'blocks_blocks_meta_init'); function blocks_sanitize_attr( $meta_value, $meta_key, $meta_type ){ return serialize( json_decode( $meta_value ) ); }
This works, but has several limitations:
- You need to turn your javascript object into a string (
JSON.stringify
)if you want to store it on the database. - You need to handle sanitization yourself
- If you get the metadata using
get_post_meta()
, it won't applymaybe_unserialize()
before returning the metadata, you need to do it yourself.
I propose that we support the 'type' => 'array'
with register_meta()
. This way, we'll have a proper way to store arrays or JSON objects as associative arrays. This would make the REST API more flexible, and project Gutenberg would also benefit by making attributes more flexible than they currently are.
Attachments (9)
Change History (62)
#3
follow-up:
↓ 4
@
7 years ago
Hey @azaozz thank you for the response! I think I agree with you regarding maybe_unserialize()
and about handling sanitization yourself if you're returning a more complex object. But what about a simple JSON object? a JSON object is not the same thing as a javascript object. A JS object might have simple properties or functions as properties. This would not be a good object to store in the database, of course. But a plain JSON object is really equivalent as an associative array in PHP. It's just a collection of simple properties like 'key' : 'value'
. Returning this kind of object through Ajax, you end up with an associative array anyways (inside the $_POST
or $_REQUEST
globals).
We can easily save associative arrays already with update_post_meta()
. Just throw an array as the value and WP automatically saves it into a proper format (a serialized string) on the database. And using get_post_meta()
, your serialized array is automatically converted back into an array. Is there any strong reason not to allow this same behavior for meta data updated through the REST API?
#4
in reply to:
↑ 3
@
7 years ago
Replying to diegoliv:
We can easily save associative arrays already with
update_post_meta()
. Just throw an array as the value and WP automatically saves it into a proper format (a serialized string) on the database. And usingget_post_meta()
, your serialized array is automatically converted back into an array.
This is the exact bit that uses maybe_unserialize()
.
Is there any strong reason not to allow this same behavior for meta data updated through the REST API?
Well, if we want to avoid stuff like maybe_json_decode()
:)
I wish there was an easy way to specify the format of an encoded string in the meta table, but there isn't. The only other option is to "guess" the encoding, which is..... subpar.
#5
@
7 years ago
This post by Ryan McCue; The (Complex) State of Meta in the WordPress REST API explains why complex values & serialized data are not exposed by register_meta()
.
The key reason for this is that accepting serialized data is either lossy or unsafe.
This ticket was mentioned in Slack in #core-restapi by kadamwhite. View the logs.
6 years ago
#7
follow-up:
↓ 8
@
6 years ago
Today in slack @flixos90 said he thought this was a good addition. I also think this will be handy.
For our plugin's REST API endpoints, we have fields that are arrays and they support both reading and writing, and so far haven't had any trouble (see https://github.com/eventespresso/event-espresso-core/blob/master/docs/C--REST-API/ee4-rest-api-reading-data.md#serialized-php-objects-in-responses-are-removed).
Input from the API client needs to always be a JSON array, which is easily converted into a PHP array, and then can be stored easily via serialization. We never, ever accept a string that represents a serialized thing, which could be a serialized PHP class, which expands the attack vector like @rmccue was talking about in his blog post. So, this way we're never unserializing user input.
Those same fields only ever serve PHP array which map nicely into JSON objects. So this way it's not lossy either, which was @rmccue's other main hold-up.
So I don't see why WP core couldn't likewise support register_meta()
with 'type' => 'array'
, where the input is always JSON arrays (which is received as a PHP array and gets serialized, so it wouldn't actually need to run maybe_serialize()
), and the value is stored in the DB as a serialized array, and then when retrieving the data from the DB, we unserialize it to a PHP array, and then serve it as a JSON object again over the REST API. (Optionally, WP core could decide on a different method of serializing the PHP array, like json_encode
ing, but continuing with PHP's serialize
methods is more consistent).
#8
in reply to:
↑ 7
@
6 years ago
Replying to mnelson4:
Lets break this up a bit:
Input from the API client needs to always be a JSON array,
Good.
which is easily converted into a PHP array, and then can be stored easily via serialization.
Bad. :)
Why would you convert one way of storing an object into a string (JSON) to another way of storing the same object into a string (serialization)? This doesn't solve any of the shortcomings as it still adds an associative array to the DB that needs to be serialized before storing.
Note: the problem here is not with maybe_serialize()
when saving to the DB, it is with maybe_unserialize()
when retrieving the data from the DB.
As mentioned before (and in many other places) the ideal solution would be to add another field to the meta table(s) where we can store the data format. It can be a TINYINT
(0 to 255) that only takes 1 byte of space. I know, large sites with huge tables will still hate this, but...?
Another possible (but much worse) solution would be to prepend or append the encoding format to the actual string at the last moment, and remove it as soon as the data is retrieved. This has the disadvantage of introducing some back-compat problems (perhaps only with SQL exports and imports) but is not "totally impossible" to implement.
#9
@
6 years ago
Thanks @azaozz for the insight. Oh, and sorry I think I got a little side-tracked focusing solely on how this applies to the REST API.
Why would you convert one way of storing an object into a string (JSON) to another way of storing the same object into a string (serialization)?
That's not a requirement, I was just suggesting continuing old patterns- when storing a PHP array to a meta column, we usually serialize it with serialize()
, not json_encode()
.
As mentioned before (and in many other places) the ideal solution would be to add another field to the meta table(s) where we can store the data format. It can be a TINYINT (0 to 255) that only takes 1 byte of space. I know, large sites with huge tables will still hate this, but...?
I'm sorry I actually haven't been paying enough attention elsewhere to have heard that before, but I like where it's heading. Is there a separate ticket for that? Also, a question about it: couldn't we use the data from register_meta()
to decide whether a value should be unserialized or not? (ie, no need for another column). If the meta's type is an array
, we know it needs to be unserialized; if it's type is int
or string
, then it shouldn't need to be unserialized, and if the meta hasn't been registered, then I'm afraid we don't know, we'll need to maybe_unserialize()
. That would be backwards compatible (assuming nobody expects fields registered as string
will be unserialized). But it's also another incentive for developers to register their meta.
...But anyways, maybe that discussion should happen on another ticket focusing on improving how we store arrays of data in meta tables.
This ticket, I think, is mostly about using register_meta()
to indicate which fields can accept an array of data (and which fields already do).
Is it worth me making a patch as a proof of concept and make the discussion more concrete?
This ticket was mentioned in Slack in #core-restapi by flixos90. View the logs.
6 years ago
#11
@
6 years ago
- Keywords close removed
- Milestone changed from Awaiting Review to Future Release
I think we need to differentiate here a bit more: The way we use associative arrays is typically no different from using objects. They have properties and values. This becomes more obvious when looking at JavaScript where associative arrays do not exist.
The REST API already supports both objects and arrays in the settings controller. It doesn't actually make it easy to use those, but you can without problems specify a schema for an object or array, and even nest it. There is no harm in doing the same for metadata, at least for objects (as in equal to associative arrays). Arrays might be a little more complicated to figure out with meta, due to the $single
vs not-$single
caveat that settings do not have.
This is not unknown territory, the REST API already supports it elsewhere. Note that, when looking at JSON schema, the proper definition for an object would look like this:
{ type: 'object', properties: { my_prop1: { type: 'string', enum: [ 'Hello', 'Bye' ] }, my_prop2: { type: 'integer' } } }
For arrays, it would look like this:
{ type: 'array', items: { type: 'string', enum: [ 'Hello', 'Bye' ] } }
Again, we need to be careful about arrays. If you specified the above schema for an array while setting $single
to false, it would result in each meta value for the key being an array in the above format, so the full metadata returned would actually be an array of arrays.
Regarding serialization, I consider this problematic for the above workaround, passing serialized or JSON-encoded strings to the API. Once we actually support proper objects and arrays, this will happen internally. Developers have been saving arrays and objects as both metadata and settings for years without significant issues.
Let's think about this further, but I recommend drifting away from talking about associative arrays and rather talk about objects and "real" arrays, which have a clear differentiation and exist in that form in the majority of programming languages.
#12
@
6 years ago
- Summary changed from Support associative array type in register_meta() to Support 'object' and 'array' types in register_meta()
#13
@
6 years ago
@flixos90 you're absolutely right about moving the discussion away from associative arrays and split the types between object
and array
. I just changed the ticket summary to reflect this.
#14
@
6 years ago
+1 @flixos90.
Regarding serialization, I consider this problematic for the above workaround, passing serialized or JSON-encoded strings to the API. Once we actually support proper objects and arrays, this will happen internally
I don’t understand what you mean here though. Are you just saying we shouldn’t pass serialized data or json encoded data for meta values (over the REST API and the *_post_meta
functions). If so I think I agree.
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
6 years ago
This ticket was mentioned in Slack in #core-restapi by kadamwhite. View the logs.
6 years ago
This ticket was mentioned in Slack in #core-restapi by spacedmonkey. View the logs.
6 years ago
This ticket was mentioned in Slack in #core-restapi by desrosj. View the logs.
6 years ago
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
6 years ago
#21
@
6 years ago
- Milestone changed from Future Release to 5.3
- Owner set to TimothyBlynJacobs
- Status changed from new to assigned
#22
@
6 years ago
- Keywords has-patch has-unit-tests dev-feedback 2nd-opinion added; needs-patch removed
I've added a first go at a patch to support this.
One question/potential issue I'd like to highlight is when there is invalid data stored in the metadata.
In the settings controller, if the option value does not validate according to the schema, the value is set to null
in the response. On update, if a setting's value is null
and the stored data is invalid, the option is not deleted, instead a WP_Error
is returned.
In meta, we aren't currently that strict. The only time we set a value to null is if a PHP object that does not implement JsonSerializable
is stored. There is no protection on update to return an error instead of deleting the value. Additionally, we do not require the value to be valid according to it's schema, nor sanitize it when preparing the value in the default prepare_callback
.
This gives us a couple of questions.
- Do we want to enforce that the value is valid in addition to it not being a non-jsonserializable PHP object? I think we'd only be able to apply this treatment for
object
andarray
meta types for BC reasons. Importantly, I think this would only happen in the defaultprepare_callback
, so any custom preparation of values wouldn't be effected. - Do we want to sanitize the output value? Right now, the only sanitization applied to outgoing values is to cast it to the correct type. We could again call
rest_sanitize_value_from_schema
in the default prepare callback, and again probably only forobject
andarray
types. - When updating a
single
meta value withnull
when the stored value is invalid is easy to reject as a500
error ( matching the behavior in the settings controller ). However, when one value in a list of multiple isnull
, the request is rejected at the validation stage with a 400 error sincenull
is not supported according to the schema. Do we want to do a check before validation and return a500
error instead? What if the invalid value is omitted by the client? This would currently delete the value, do we instead want to reject any update multi request when there is an invalid value stored in any of the meta entries?
1 & 2 Prepare Callback Reference: https://github.com/WordPress/wordpress-develop/blob/df9caed81bd3189dd05ea49d044cae7a233cef41/src/wp-includes/rest-api/fields/class-wp-rest-meta-fields.php#L459
3 Request References:
add_post_meta( $post_id, 'object', [ 'project' => 'WordPress' ] );
add_post_meta( $post_id, 'object', (object) [ 'project' => 'bbPress' ] );
This would be rejected as a 400 error because meta.object[1]
does not match the schema type of object
.
{ "meta": { "object": [ {"project": "WordPress"}, null ] } }
This would delete the second value.
{ "meta": { "object": [ {"project": "WordPress"} ] } }
#23
@
6 years ago
I also tentatively added support for additionalProperties
other than false
. I think this is super important for object
meta types where you often have a list with arbitrary keys
but a required value
format.
#24
@
6 years ago
Updated patch to also update rest_sanitize_value_from_schema
. 43392.2.diff can be ignored, I missed the new file in the patch.
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
6 years ago
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
6 years ago
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
6 years ago
#28
@
6 years ago
Thanks for the patches @TimothyBlynJacobs !
- Do we want to enforce that the value is valid in addition to it not being a non-jsonserializable PHP object? I think we'd only be able to apply this treatment for object and array meta types for BC reasons. Importantly, I think this would only happen in the default prepare_callback, so any custom preparation of values wouldn't be effected.
We discussed a bit in chat today, and thought it's probably rare that developers are registering meta AND providing a schema AND providing values that don't match that schema. Plus, the current behaviour could be considered a bug ("I provided a schema, and you're not enforcing it. What's up with that WP??") so a BC-breaking change could be seen as a bugfix, and so more palatable.
With that in mind, I think we should enforce that values should match the schema on reading and writing, even for scalar values. BUT we could probably be merciful if the data is just of the wrong type (eg providing a numeric string instead of a number, an object instead of an array etc).
- Do we want to sanitize the output value?
If that's what settings controller is doing, sure. It'd probably be safer. Just be careful to not escape things twice.
- ...Do we want to do a check before validation and return a 500 error instead?
A 500 error would probably be more expected, but a 400 isn't the end of the world.
What if the invalid value is omitted by the client? This would currently delete the value, do we instead want to reject any update multi request when there is an invalid value stored in any of the meta entries?
Oh, so what if the API client automatically removes the null value before sending it? Surprise deletion. That would be a bit annoying. And if we rejected "any update multi request when there is an invalid value stored in any of the meta entries", clients could still fix those invalid meta entries by issuing a query to update single items. So they'd have a workaround (albeit possibly inconvenient), and no accidental deletions. So ya rejecting those would probably be good.
#29
@
5 years ago
Thank you for the feedback @mnelson4!
I implemented those changes except for the last part of handling if a client removed the null values before sending it. It adds some more complexity, and I'm personally comfortable with calling a client doing that broken behavior. But if people have other thoughts, I'd love to hear them.
#30
follow-up:
↓ 33
@
5 years ago
Hi, I noticed a comment here:
https://stackoverflow.com/a/11735933/2078474
about the SORT_REGULAR
option of array_unique()
instead of the default SORT_STRING
.
I've not tested it, but wonder if it's relevant here.
In any case I wonder if it would be beneficial to add an inline comment with a little bit of explanation for this part:
$to_remove = array_map( 'maybe_unserialize', array_unique( array_map( static function( $value ) { return is_scalar( $value ) ? $value : maybe_serialize( $value ); }, $to_remove ) ) );
... this just came to mind, as this part stopped my quick code skimming :-)
I also wonder if:
$to_remove = array_map( 'maybe_unserialize', array_unique( array_map( 'maybe_serialize', $to_remove ) ) );
would have a noticeable performance difference.
ps: Just some minor notes regarding the inline docs:
- There are missing
@since 43392
on two new test methods.
- Missing translation comments.
- I also think the
"This is need"
should be"This is needed"
in the description for thedefault_additional_properties_to_false()
method.
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
5 years ago
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
5 years ago
#33
in reply to:
↑ 30
@
5 years ago
Thanks for the review @birgire!
Replying to birgire:
Hi, I noticed a comment here:
https://stackoverflow.com/a/11735933/2078474
about the
SORT_REGULAR
option ofarray_unique()
instead of the defaultSORT_STRING
.
I've not tested it, but wonder if it's relevant here.
It seems like SORT_REGULAR
does a loose comparison which has some gotchas: https://stackoverflow.com/questions/14802680/array-unique-sort-regular-flag
In any case I wonder if it would be beneficial to add an inline comment with a little bit of explanation for this part:
$to_remove = array_map( 'maybe_unserialize', array_unique( array_map( static function( $value ) { return is_scalar( $value ) ? $value : maybe_serialize( $value ); }, $to_remove ) ) );... this just came to mind, as this part stopped my quick code skimming :-)
That makes sense, I've added a comment explaining.
I also wonder if:
$to_remove = array_map( 'maybe_unserialize', array_unique( array_map( 'maybe_serialize', $to_remove ) ) );would have a noticeable performance difference.
I'm not sure. But I think it is also the correct way of doing it so double serialization is handled properly. Nice catch.
ps: Just some minor notes regarding the inline docs:
- There are missing
@since 43392
on two new test methods.
- Missing translation comments.
- I also think the
"This is need"
should be"This is needed"
in the description for thedefault_additional_properties_to_false()
method.
Fixed!
While looking at the array_unique
issue, I looked a bit at property order as well. JSON objects have no order, however PHP associative arrays do. Which means that the array_keys( $to_remove, $value, true )
line doesn't catch the existing value because when PHP does strict associative array comparisons it cares about the order of the elements.
I added a test to demonstrate this test_update_multi_meta_value_object_property_order_is_ignored
. It only fails due to the last assertion that the meta id didn't change. WordPress is deleting the old value and then re-adding it with the new ( differently ordered ) value.
Should we account for this?
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
5 years ago
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
5 years ago
This ticket was mentioned in Slack in #core-restapi by kadamwhite. View the logs.
5 years ago
#37
@
5 years ago
- Keywords 2nd-opinion removed
Tagging @johnjamesjacoby for input as meta component maintainer.
#38
@
5 years ago
Personally I am against the defaulting of additionalProperties
to true
, but it looks like this ship has sailed elsewhere, I _think_ it looks like objects with arbitrary data will already pass validation and sanitization from what I can see. So, _ok_ workaround with default_additional_properties_to_false
, but I think this probably should never had been necessary.
The patch here is all using spaces I think, that should be changed to tabs.
+1 on the patch from me.
This ticket was mentioned in Slack in #core by kadamwhite. View the logs.
5 years ago
#40
@
5 years ago
Thanks for the review @joehoyle!
I've fixed the spacing issues and other CS issues.
#41
@
5 years ago
- Keywords commit added
Considering for commit
, pending any other feedback after we called this out in dev chat earlier in the week. Will be testing today and may aim to land it should everything check out.
#45
@
5 years ago
@kadamwhite Thank you very much for committing this. I have ended up backporting this commit for WP 5.2.x so that I can use this feature before 5.3 is released.
Though I got a PHP notice:
Undefined index: items in wp-includes/rest-api/fields/class-wp-rest-meta-fields.php on line 554
I think the reason for this PHP notice is that items
is not defined in $default_schema
in get_registered_fields()
.
My code that I used:
<?php \register_meta( 'post', 'swisscomkbw-authors', [ 'show_in_rest' => true, 'single' => true, 'type' => 'array', ] );
The inline documentation still needs to be updated for the support or arrays and objects in register_meta()
https://github.com/WordPress/WordPress/blob/master/wp-includes/meta.php#L1126-L1127
I also noticed with the help of PHPCS that in_array()
is not using strict comparison.
if ( ! in_array( $type, array( 'string', 'boolean', 'integer', 'number', 'array', 'object' ) ) ) {
#46
@
5 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
Re-opening due to comment:45 which I just ran into as well.
#47
@
5 years ago
Going to bring this up at today's REST API office hours.
So I think the intention is that items
is always required. Without it, you'd be saying that anything is allowed, which we really don't want to be the default behavior.
In #38531, it was decided to follow the spec for the low-level rest_(sanitize|validate)_value_from_schema
and allow a schema definition without items
. However, this wasn't applied to rest_validate_value_from_schema
. It has no check for empty items. Looking at #38583, I think the goal was to apply the stricter validation requirements in the controllers that need them.
Right now, this isn't the case, any value will be accepted, and PHP warnings are emitted from the processing in WP_REST_Meta_Fields
and rest_validate_value_from_schema
which assumes an array
type has an items
definition.
We could enforce this in register_meta()
if someone has an array
type without providing an items
in show_in_rest.schema
. The function allows for a false
return value, and we could issue a _doing_it_wrong
. Alternatively just change the signature to bool|WP_Error
.
Of note you'll also get a similar warning if you don't provide items
when registering an array
type setting. We should probably make any changes here consistent if possible.
Test case
<?php function test_not_providing_items_for_array_meta_type_does_not_allow_any_value_through() { register_post_meta( 'post', 'list', array( 'type' => 'array', 'single' => true, 'show_in_rest' => true, ) ); $this->grant_write_permission(); $request = new WP_REST_Request( 'PUT', sprintf( '/wp/v2/posts/%d', self::$post_id ) ); $request->set_body_params( array( 'meta' => array( 'list' => array( 'WordPress', 'bbPress' ), ), ) ); // Suppress the "items" warning temporarily $response = @rest_get_server()->dispatch( $request ); $this->assertEquals( 400, $response->get_status() ); $meta = get_post_meta( self::$post_id, 'list', true ); $this->assertEmpty( $meta ); }
This ticket was mentioned in Slack in #core-restapi by timothybjacobs. View the logs.
5 years ago
#49
@
5 years ago
- Keywords needs-docs added
Ok, I was able to prevent the PHP notice by changing the code to the following:
register_meta(
'post',
'swisscomkbw-authors',
[
'type' => 'array',
'single' => true,
'show_in_rest' => [
'schema' => [
'type' => 'array',
'items' => [
'type' => 'string',
],
],
],
]
);
This means the documentation for show_in_rest
in register_meta()
needs to be updated too to mention that both booleans and arrays are supported.
#50
@
5 years ago
Uploaded a patch that updates the documentation to mention object and array types are valid and that "show_in_rest" can be an array. register_meta()
now issues a _doing_it_wrong
when registering an array
meta key that shows in the REST API and omits "items" in the schema.
When registering an "array" meta type to show in the REST API, you must specify the schema for each array item in "show_in_rest.schema.items".
#52
@
5 years ago
- Keywords has-dev-note added; needs-dev-note removed
Adding has-dev-note
to this since it has already received one.
https://make.wordpress.org/core/2019/10/03/wp-5-3-supports-object-and-array-meta-types-in-the-rest-api/
Hi @diegoliv thanks for the suggestions.
Imho this works quite well as it is now.
Correct, or you can use another format like CSV if it makes more sense for your particular case.
Correct, that's the best way to do it. You know what data to expect and when it would be "out of bounds".
maybe_unserialize()
is a terrible way to do things. It only exists because of back compat. The less it's used -- the better.Storing the data in a string format of your choice is the proper way here simply because you know what format to expect and how to sanitize it. I wish we could do that for all of WP :)