Make WordPress Core

Opened 6 months ago

Last modified 5 months ago

#60950 new feature request

JSON serialization for meta values

Reported by: inf3rno's profile inf3rno Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: Options, Meta APIs Keywords: reporter-feedback
Focuses: performance Cc:

Description

It would be nice to have JSON serialization for complex meta values like arrays. MySQL and other databases support JSON specific functions like JSON_CONTAINS, JSON_OVERLAPS, JSON_SEARCH, etc. we could use for building new comparators. Currently all we can do with these complex meta values if we want to search them is unpacking them. Either we read all the database records and compare them with PHP (which requires a lot more resources) or we put them in separate meta values to make them searchable, comparable (which bloats the database). A really ugly hack some people do is searching the PHP serialized form of a partial value with LIKE comparator. If people are so desperate, then I think it is a sign that this is a really needed feature...

Change History (6)

#1 @aqsaj
6 months ago

  • Focuses performance added

By introducing JSON serialization for complex meta values, we can not only streamline queries and searches but also optimize database usage, leading to better performance and scalability for WordPress installations handling diverse data structures.

I'm in full support of this feature request and I'm looking forward to seeing further progress on this front.

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


6 months ago

This ticket was mentioned in Slack in #core-performance by mukeshpanchal27. View the logs.


5 months ago

#5 @johnbillion
5 months ago

  • Keywords reporter-feedback added

@inf3rno @aqsaj Thanks for the ticket and the comment.

  • Can you provide some specific examples of the queries you would like to perform against meta values containing JSON?
  • What sort of API do you think would help? Additional meta_query parameters? Or something else?

#6 @inf3rno
5 months ago

@johnbillion

I don't think I am the best person to decide about API changes or implementation, but ok.

Well it is a general concept, you can use any time you have multiple meta keys to group them and reduce database bloating. Currently I am not working on a project, but it is not a rare situation to have a single query with multiple meta keys. In all of those cases the keys could be grouped into a single JSON instead of storing them separately.

E.g. a delivery post type which contains an address and we query the city and the street, maybe we are in the delivery truck and we need addresses close to our location to plan our route.

Afaik. with the current API the meta query looks like this:

<?php
$arg['meta_query'] = [
    'relation' => 'AND',
    [
        'key' => 'city',
        'value' => 'New York',
        'compare' => '='
    ],
    [
        'key' => 'street',
        'value' => 'Broadway',
        'compare' => '='
    ]
];

Storage:

 meta_key       | meta_value
 address_city   | s:8:"New York";
 address_street | s:8:"Broadway";
 address_number | s:3:"128";,
 package_id     | s:3:"345";

With the JSON API we can use nested keys and reduce the number of meta keys:

<?php
$arg['meta_query'] = [
    'relation' => 'AND',
    [
        'key' => ['address','city'],
        'value' => 'New York',
        'compare' => '='
    ],
    [
        'key' => ['address','street'],
        'value' => 'Broadway',
        'compare' => '='
    ]
];

Storage:

 meta_key   | meta_format | meta_value
 address    | JSON        | {"city": "New York", "street": "Broadway", "number": 128}
 package_id | PHP         | s:3:"345";

I guess it could be solved this way.

#7 @inf3rno
5 months ago

Looks like it will require a different table structure, because it is a lot more optimized when the entire column has JSON type:
https://dev.mysql.com/doc/refman/8.0/en/json.html

We have several options here:

  • two meta tables with the current structure except the second table has JSON column type for meta_value
 meta_key   | meta_value
 package_id | s:3:"345";
 meta_key   | meta_value
 address    | {"city": "New York", "street": "Broadway", "number": 128}
  • one meta table, but having a separate column for JSON values:
 meta_key   | meta_value  | meta_json_value
 address    | NULL        | {"city": "New York", "street": "Broadway", "number": 128}
 package_id | s:3:"345";  | NULL
  • one meta table and storing everything in JSON
 meta_key   | meta_value
 address    | {"city": "New York", "street": "Broadway", "number": 128}
 package_id | "345"

It is hard to decide whether we need this without measuring. First we need to measure whether or how a bloated or very bloated database affects performance of meta queries. Second we need to measure how certain solutions and JSON serialization and unserialization affects performance. Maybe we don't need this at all, maybe the queries are faster with JSON, but the unserialization is a lot slower than PHP, I am not sure. I am not an expert of PHP + MySQL benchmarking, so we need somebody here, who can measure this.

Version 0, edited 5 months ago by inf3rno (next)
Note: See TracTickets for help on using tickets.