#2591 closed defect (bug) (fixed)
users can enter dangerous serialized strings
Reported by: | random | Owned by: | markjaquith |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 2.0.2 |
Component: | Security | Keywords: | serialize has-patch |
Focuses: | Cc: |
Description
Users can enter serialized objects etc. as strings in (e.g.) the "first name" field on the profile page, and these strings aren't necessarily safe to unserialized.
For example, serialized objects run the magic _ _wakeup() function when they're unserialized. If the PDO extension is enabled -- and it is by default in PHP 5 -- you can cause a fatal error with this:
O:3:"PDO":0:{}
Much worse, you can enter something like:
a:100000000:{}
i.e., an array with 100,000,000 elements. PHP doesn't know they're empty, so it starts eating up memory. In my test it crashed Apache.
Either way, it's a problem. I think it's best to add something to maybe_unserialize(), since just sanitizing input will still leave plugins vulnerable down the road.
I'm not adding a patch since I can only think of hacks like checking for /O:/ or that the number of elements doesn't go over some arbitrary limit. Thoughts?
Maybe a type-hinting field for the setting, so something input as a string isn't unserialized?
Attachments (7)
Change History (36)
#4
@
19 years ago
random's suggestion is one I've also proposed. The counter-argument was that it would make it difficult for non-PHP scripts to make use of string data in our options tables.
I think it's pretty trivial to parse a PHP serialized string, so we shouldn't worry about non-PHP programs integrating with our options tables. Serializing every option is fine by me.
#5
@
19 years ago
Am I missing something? If we serialized everything, wouldn't we unserialize everything as well? Why would anyone have to worry about parsing strings, wouldn't a string be returned?
+1 for serializing everything.
#7
@
19 years ago
As long as plugins are using the API, it should be fine. I can think of one place in WP where the API is not used, (direct query in wp-settings.php:123) but it just checks that there is a value, it doesn't use the value itself.
As for upgrading of existing strings, couldn't we just let it happen naturally? We'd still be passing stuff through the "maybe unserialize" function, so it'd just get upgraded whenever it was updated.
#8
@
19 years ago
I'm testing out the uploaded patch on a local SVN test install. It serializes in update_option() and add_option() in every case (not just object and array).
Problem: using the secret "options.php" page will destroy your options table. The serialized strings are re-serialized, and serialized arrays are serialized as a string. We'll have to make a special case for this page, and skip serialization.
#9
@
19 years ago
- Owner changed from anonymous to markjaquith
- Status changed from new to assigned
Okay, I kludged it and made it skip serialization on options.php
Also, user options are now serialized on update, which is really the area where there is the most danger of a malicious attack (someone with a subscriber profile inserts a 1 million member array and crashes the server).
I thought of other ways of avoiding double serialization... there are GPL is_serialized() functions floating around, but they'd likely leave openings for abuse, as well as cause a lot of extra cycles on option updates... so a kludge for /wp-admin/options.php might actually be the best way. Thoughts?
#10
@
19 years ago
There's a serialize call in delete_usermeta() as well as update_usermeta(), and let's not forget postmeta.
Speaking of postmeta, changeset:3656 added querying by meta_value, so that'd need changing too. (Speaking of which, is querying going to be a problem elsewhere?)
For options.php, we could unserialize and gettype() the result, reserialize if necessary for display, but add a hidden
<input type="hidden" name="type[$option_name]" value="array" />
field (or similar) to keep track of what's already serialized when the data gets sent back.
#11
@
19 years ago
You know what? Serializing everything is really more trouble than it is worth. It will break dozens of plugins, and make troubleshooting a nightmare. Manually editing the options table will be a hellish experience.
The problem (i.e. the security risk) is this: serialized objects/arrays are inserted into the database as simple text, but are interpreted as objects/arrays when coming out. We need to recognize when someone is trying to do that, and block it.
So what we need is a function that can detect serialized data WITHOUT doing an acual unserialize test on it. I've seen several GPL functions that do that, using various techniques. So what I'm going to do is find all such GPL functions, merge them (if necessary) and use that is_serialized() check for the update/add/delete functions so that it rejects serialized data. Next, I'll need to make changes to options.php and post meta so that it doesn't display serialized data. For example, the feeds that are stored in the options table show up in options.php, and you can save them. Honestly, serialized data is a bear to edit, and I really don't think anyone is going to miss it if we just have the value field deactivated and maybe write "array" or "object" and ignore those fields on update.
How's that sound? I just really don't think that serializing plain text strings is a good move in terms of usability and backwards compatability.
#12
@
19 years ago
How about prepending a hash of the DB password or some such server secret to all serialized values, then unserializing only the values that matched that key?
#13
@
19 years ago
Hm, possible solution, but it'd have to be more permanent than that, as a DB password change would effectively kill the blog.
Still working on my patch.
#14
@
19 years ago
First stab at is_serialized()
and associated code. Noticed that a lot of the object/array serialization stuff was duplicated, so I made and used a new function prepare_data()
I also made maybe_unserialize()
first check is_serialized()
because it should be faster to run the small grep than do the unserialization... because most options are not serialized.
Post meta that is serialized is simply not shown to the user.
on options.php, serialized data is marked as such, and the value is not editable.
#15
@
19 years ago
This might be a bit simpler:
function is_serialized($arg) { return unserliaze($arg) !== false; }
See http://us2.php.net/unserialize for an explanation.
#16
@
19 years ago
Instead of throwing an error if is_serialized() is true, that string could be double-serialized -- I mean, if I self-identify as a serialized array, who are you to stop me putting that in my bio? ;)
Seriously though, I think the Right way is to ensure that something saved as a string isn't unserialized, not to put a fence up to keep certain types of strings out. Short of an ugly meta_value_type column I don't see a non-hackish way to do that, though, so +1 for Mark's latest patch.
David, running unserialize() on everything kind of defeats the purpose of having an is_serialized() function at all. :)
#17
@
19 years ago
Good idea, random. I was worried about false-positives. This way, data that looks to be serialized can still be stored, but won't be dangerous, because it's just a serialized string. Those instances should be very rare. And anyway, that data would cause problems now anyway... because what should be a string will come out looking like an array or an object... this way we preserve it as a string.
Writing it up now.
And David, the reason we don't want to try to unserialize it is that it could be a string masquerading as serialized data, like an array with 100 million members. the maybe_unserialize() function does just that, and that's what is causing this whole problem.
#19
@
19 years ago
Latest patch does the following:
- Strings that appear to be already serialized data are serialized again, when adding/updating the meta/option
- Serialized strings are unserialized for display in postmeta and option.php fields
- Serialized arrays/objects are displayed as %SERIALIZED_DATA% on options.php and the field is uneditable. These values are ignored on update.
- maybe_unserialize() runs an is_serialized() check first, instead of just blindly unserializing. This should make it faster (I think).
- A new function, prepare_data(), does the work of serializing objects/arrays as well as serializing strings that are already serialized data, and doing trim() on strings. This replaces about 4 or 5 instances of redundant code (yay!)
- Custom fields with non-string data are simply not displayed. They're not meant to be user-editable anyway.
I think that's it. Please test thoroughly. Try entering malicious data like the data "random" presented in a custom field or an options field. Try entering it in an options.php field. Try to find an instance where valid, serialized data is not serialized again (if it shows up in options.php and unserialize($data) doesn't return false, you've found a false negative).
#20
@
18 years ago
This is basically a data escaping problem. The requirement is to reliably return the same data which was put into the database, whether a scalar or a complex object.
The solution is to prefix all complex objects with a '*', and to prefix all scalar strings which begin with a '*' or a '$' with a '$'. That way, "$fred" becomes "$$fred" and "*fred" becomes "$*fred" but an object which is serialized looks something like "*a:5000:{}".
When processing data read from the database, if a string begins with "*" then deserialize it, and if it begins with "$" then strip the leading $ sign.
(You can choose less common characters than '*' and '$', I just used them to illustrate the concept).
#21
@
18 years ago
- Keywords has-patch added
+1 on the patch markjaquith, as this is still an open security hole.
#22
@
18 years ago
Take 4 is up. Things to try:
Enter this into postmeta, an option field, an options.php field, or a profile field:
s:4:"test";
It should stay the same. If you get test
back out, that's a problem.
Enter this into the same places:
a:100000{
}
your server shouldn't crash, and you should get that same string back.
In both cases, the serialized data you tried to sneak in should appear in the DB as that string re-serialized.
objects or arrays inserted by the system should look like like they did before... serialized representations. plain text strings should also stay the same. The only thing that gets treated specially is user-entered text masquerading as a pre-serialized object.
#25
@
18 years ago
- Milestone changed from 2.1 to 2.0.5
- Resolution fixed deleted
- Status changed from closed to reopened
Patch in the works for 2.0.5
#27
@
18 years ago
Would appreciate if people would test this out thoroughly, on test installs.
Make sure that a serialized string doesn't come out as the string it represents. Make sure that slashes are appropriately added or stripped. Make sure your options, usermeta, and postmeta are not corrupted, using the built-in functions.
One thing I've found, this makes saving your options.php page MUCH faster because it's not churning through those feed arrays.
We should probably get smart and maintain a list of what can be unserialised and what is just plain text. No untrusted data should ever be unserialised.