WordPress.org

Make WordPress Core

Opened 14 months ago

Closed 14 months ago

Last modified 4 months ago

#23753 closed feature request (wontfix)

Switch to JSON encoding instead of PHP serialisation

Reported by: sanchothefat
Owned by:
Milestone:
Priority: normal
Severity: minor
Version: 3.6
Component: Database
Keywords: dev-feedback
Focuses:
Cc:

Description

When storing PHP arrays or objects in the database, e.g. options or metadata, I think it would be better to use JSON encoding rather than PHP serialisation.

Pros:

  1. Better/simpler unicode support
  2. A straight search replace on the database wouldn't break anything
  3. Better portability
  4. Smaller DB footprint
  5. json_encode is approx 25% faster than serialize

Cons:

  1. json_decode is approx 55% slower than unserialize
  2. no __sleep() or __wakeup() magic methods (not used in core)
  3. no memory of what class an object was
  4. only public object properties encoded

The biggest drawback here is obviously the speed of JSON decoding; however, we're still only talking about thousandths of a second at most, or ten-thousandths. In my opinion the other big wins outweigh the cons.

There are of course a couple of instances where I've seen plugin authors store entire class instances in the options table, but I'd question whether that's good practice. For backwards compatibility, perhaps a set of alternative add|update|get_option() functions that use JSON encoding could be offered.
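A rough sketch of what such alternative functions might look like. The function names demo_update_option_json() and demo_get_option_json() are hypothetical, and an in-memory array stands in for the options table so the example runs outside WordPress; a real version would presumably hand the JSON string to the existing option functions.

```php
<?php
// Stand-in for the options table (assumption: real code would use the database).
$GLOBALS['demo_options'] = array();

// Hypothetical helper: store a value as a JSON string.
function demo_update_option_json( $name, $value ) {
    $json = json_encode( $value );
    if ( false === $json ) {
        return false; // value could not be encoded, e.g. invalid UTF-8
    }
    $GLOBALS['demo_options'][ $name ] = $json;
    return true;
}

// Hypothetical helper: read a JSON-encoded value back as arrays and scalars.
function demo_get_option_json( $name, $default = false ) {
    if ( ! isset( $GLOBALS['demo_options'][ $name ] ) ) {
        return $default;
    }
    $value = json_decode( $GLOBALS['demo_options'][ $name ], true );
    // json_decode() returns null both on failure and for a literal JSON null,
    // so check json_last_error() to tell the two apart.
    if ( null === $value && JSON_ERROR_NONE !== json_last_error() ) {
        return $default;
    }
    return $value;
}

demo_update_option_json( 'demo_settings', array( 'user' => 'sanchothefat', 'count' => 3 ) );
var_dump( demo_get_option_json( 'demo_settings' ) );
```

Anything stored this way comes back as arrays and scalars only, so code that expects an object of a particular class would have to rebuild it after reading.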

What sparked this suggestion was writing a Twitter API plugin that needs to cache the JSON results as transients. In cases where memcache or APC is unavailable, the default serialisation and unserialisation was breaking on unicode characters, causing the cache to be invalid almost every time.

I'd love to hear your thoughts on this issue and if you have any suggestions for how it could be implemented over time I'll be happy to work on a patch.

Change History (6)

comment:1 sanchothefat 14 months ago

Just to clarify, I was initially using json_decode on the API result before caching it, so that when retrieving the cache I could use it as is without having to decode it manually. That's not a problem, so the question is really more about the other benefits of using JSON. The Twitter use case is simply what highlighted the possibility of JSON as a general data storage standard for me.

comment:2 johnbillion 14 months ago

This doesn't make a lot of sense. Serialization serves to store data in its current state, allowing you to retrieve it later. JSON is not capable of doing this (as noted in your 'Cons' above).

If the current serialization method breaks with multibyte characters then we should fix the serialization, not change it to JSON. Could you provide an example which breaks the serialization? I'd like to see it and find out why it breaks.
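As an illustration of one way serialised multibyte strings can break: serialize() records each string's length in bytes, so any later change to the stored bytes invalidates it. The charset conversion below is simulated with str_replace() and is only an assumed cause, not the confirmed source of the reporter's problem.

```php
<?php
$original   = "caf\xC3\xA9";          // "café" in UTF-8: 5 bytes, 4 characters
$serialized = serialize( $original ); // s:5:"café"; (length counted in bytes)

// A round trip on untouched bytes works fine with multibyte characters.
var_dump( unserialize( $serialized ) === $original ); // bool(true)

// Simulate a storage layer converting the 2-byte UTF-8 "é" to the
// 1-byte Latin-1 "é". The s:5: prefix is now one byte too long.
$converted = str_replace( "\xC3\xA9", "\xE9", $serialized );
var_dump( @unserialize( $converted ) ); // bool(false)
```

If this is what happened, the fault lies in how the string was stored or transferred rather than in serialize() itself.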

The only real benefit I see in your list of 'Pros' is number 2, which has been debated extensively on wp-hackers, IRC, etc., and is more easily solved by developer education.

comment:3 bananastalktome 14 months ago

  • Cc bananastalktome@… added

comment:4 markoheijnen 14 months ago

I think we have discussed this a few times already. Also, storing data that you still have to process later is an odd thing to do. The Twitter response contains a lot of data you don't use at all. There was a ticket somewhere that discussed the JSON-to-transient issue, and it was closed as wontfix if I remember correctly.

I would move this discussion to the wp-hackers list. It seems more appropriate there.

Last edited 4 months ago by markoheijnen

comment:5 sanchothefat 14 months ago

  • Resolution set to wontfix
  • Status changed from new to closed

Replying to johnbillion:

Thanks John. Now that I think about it more I can see why, so as you suggest, fixing unicode serialisation would be a better proposal.

I'll post an example from the Twitter API, where I first noticed it, as soon as I find one. I can't seem to replicate it directly with serialize and unserialize using unicode emoticons and the like just yet. It must have been a very obscure character.

In the meantime this outlines one possible way to get around the issue:

http://davidwalsh.name/php-serialize-unserialize-issues
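A commonly suggested workaround for this class of problem, and possibly the one the linked post describes (an assumption, not confirmed by the ticket), is to base64-encode the serialised string so that the stored value is plain ASCII. The helper names demo_pack() and demo_unpack() below are hypothetical.

```php
<?php
// Hypothetical helper: serialise, then encode to ASCII-only text.
function demo_pack( $value ) {
    return base64_encode( serialize( $value ) );
}

// Hypothetical helper: reverse demo_pack(); returns false on invalid input.
function demo_unpack( $packed ) {
    $raw = base64_decode( $packed, true ); // strict mode rejects invalid characters
    return false === $raw ? false : unserialize( $raw );
}

$tweet  = array( 'text' => "caf\xC3\xA9" ); // "café" in UTF-8
$packed = demo_pack( $tweet );

var_dump( $packed );                         // ASCII-only string
var_dump( demo_unpack( $packed ) === $tweet ); // bool(true)
```

The trade-off is that the stored value is roughly a third larger and can no longer be read or searched directly in the database.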

I think my question has been answered anyway. Will close the ticket.

comment:6 helen 14 months ago

  • Milestone Awaiting Review deleted