WordPress.org

Make WordPress Core

Opened 3 years ago

Last modified 5 weeks ago

#18322 new defect (bug)

The Road to Magic Quotes Sanity

Reported by: ryan Owned by:
Milestone: Future Release Priority: normal
Severity: major Version: 3.2.1
Component: Bootstrap/Load Keywords:
Focuses: Cc:

Description

For back compat reasons, wp_magic_quotes() performs addslashes() on GPCS data. This is a pain, especially given that some core API expects slashes and some doesn't. In hopes of someday losing the automatic GPCS slashing, let's introduce a flag to turn off the slashing as well as slash and unslash functions that consult the flag. If slashing is on, these functions add and strip slashes. If slashing is off, they return data unchanged. Plugin authors can start using these functions and testing their code with GPCS slashing turned off and on. Eventually, GPCS slashing would default to off and all calls to the slash and unslash functions could be removed from core.

Attachments (2)

18322.diff (45.2 KB) - added by ryan 3 years ago.
Incomplete patch showing the gist
WP_Request.php (3.2 KB) - added by dd32 3 years ago.

Download all attachments as: .zip

Change History (39)

ryan3 years ago

Incomplete patch showing the gist

comment:1 ryan3 years ago

Patch introduces a $wp_magic_quotes global flag, which defaults to on. wp_magic_quotes() slashes all GPCS if the flag is on. wp_slash() and wp_unslash() also consult this flag. If on then addslashes() and stripslashes() are performed by wp_slash() and wp_unslash(), respectively. If off then they simply return the data without adding or stripping.

Last edited 3 years ago by ryan (previous) (diff)

comment:2 nacin3 years ago

JJJ would want me to recommend that these functions are named maybe_unslash() and maybe_slash().

I like this - let's consider encapsulating the global inside a function that can be called to A) set and B) get the value. As in, one less global.

comment:3 scribu3 years ago

+1 on maybe_*()

I like this - let's consider encapsulating the global inside a function that can be called to A) set and B) get the value. As in, one less global.

Static variable?

comment:4 ryan3 years ago

I thought about maybe. I'm not so convinced that I want to propagate maybe everywhere.

comment:5 WraithKenny3 years ago

  • Cc Ken@… added

comment:6 CaptainN3 years ago

  • Cc CaptainN added

comment:7 CaptainN3 years ago

I wrote a small proof of concept for an idea I had to make runtime compatibility patching easier. The idea is you would use an ArrayObject, and overwrite the $_GET, etc. vars, to make it easy to switch between slashed and unslashed in different contexts.

The problem with this approach is that it changes those vars to an object type, so they'll fail is_array and from what I've read also won't work in some places where array is required (through php 5 type hinting), so I didn't flush it out any further. I thought I'd share, in case there is any merit to the idea, or a PHP wizard more magical than me could smooth the edges (or if PHP changes their core to accept ArrayObject in places where array is currently required).

<?php
// From formatting.php ln: 1233

/**
 * Navigates through an array and removes slashes from the values.
 *
 * If an array is passed, the array_map() function causes a callback to pass the
 * value back to the function. The slashes from this value will removed.
 *
 * @since 2.0.0
 *
 * @param array|string $value The array or string to be stripped.
 * @return array|string Stripped array (or string in the callback).
 */
function stripslashes_deep($value) {
	if ( is_array($value) ) {
		$value = array_map('stripslashes_deep', $value);
	} elseif ( is_object($value) ) {
		$vars = get_object_vars( $value );
		foreach ($vars as $key=>$data) {
			$value->{$key} = stripslashes_deep( $data );
		}
	} else {
		$value = stripslashes($value);
	}
	
	return $value;
}

function addslashes_deep( $value )
{
	if ( is_array( $value ) ) {
		$value = array_map( 'addslashes_deep', $value );
	}
	elseif ( is_object( $value ) )
	{
		$vars = get_object_vars( $value );
		foreach ( $vars as $key=>$data ) {
			$value->{$key} = addslashes_deep( $data );
		}
	}
	else {
		$value = addslashes( $value );
	}
	return $value;
}

class WP_GPC extends ArrayObject
{
	// set the default magicness here
	public $magic_quotes = true;
	
	private $raw;
	
	public function __construct( $data )
	{
		$raw = $data;
		
		// Check ini setting, get rid of built in slashes
		if ( get_magic_quotes_gpc() ) {
			$data = stripslashes_deep( $data );
		}
		// set the props
		parent::__construct( $data );
	}
	
	public function offsetGet($offset)
	{
		$value = null;
		if ( parent::offsetGet( $offset ) )
		{
			if ( $this->magic_quotes ) {
				$value = addslashes_deep( parent::offsetGet( $offset ) );
			}
			else {
				$value = parent::offsetGet( $offset );
			}
		}
		return $value;
	}
}

header('Content-Type: text/plain');
ini_set("html_errors", "0");

print_r($_GET);

$_GET = new WP_GPC( $_GET );
$_POST = new WP_GPC( $_POST );
$_COOKIE = new WP_GPC( $_COOKIE );
$_REQUEST = new WP_GPC( $_REQUEST );

echo $_GET['some_query_var'];

$_GET->magic_quotes = false;

echo $_GET['some_query_var'];

?>

comment:9 Mick P.3 years ago

Please remember. From an end user point of view, the main problem is having to constantly unslash some data that is wanted to be pristine. The easiest way to deal with it is to overwrite the data in the array at some point, but it has to be after whenever WP does its version of Magic Quotes (assuming WP uses the same algorithm as PHP) but to do that you need to be certain that WP is doing its MQ, which you cannot assume it always will.

http://wordpress.org/support/topic/wp-automatically-escaping-get-and-post-etc-globals?replies=3#post-2368782

Last edited 3 years ago by Mick P. (previous) (diff)

dd323 years ago

comment:10 dd323 years ago

attachment WP_Request.php added

The ArrayObject work above replacing $_GET inspired my thought, so I've whipped this class up.

It doesn't deal with the functions expecting slashed or otherwise, but reinvents the wheel by giving a wrapper for the superglobals - but in doing so, adds functionality which most plugins require (is the variable set? Else we've got a default for that), along with extracting only a selection of the fields (ie. ?key=1&key2=2&key3=3 we could use GET::get( array('key', 'key3', 'key99'), array('key99' => 'Agent 99') ); to only get those 3 fields, droping back to the $defaults for unset data)

Quick fun half hour project to see what was possible with PHP5 magic.

comment:11 markjaquith2 years ago

Ryan — I think it's futile to hope for a future where we can flip magic quotes off. There's a non-negligible chance that we'll introduce security issues for the plugins that don't change. And you know they won't. So what's the problem we're trying to solve? I think you outlined it here:

This is a pain, especially given that some core API expects slashes and some doesn't.

So why not give them a new way to access those superglobals that is non-slashed by default? Proposal (for how you'd use it... you can infer the implementation):

$foo = _GET( 'bar' ); // OLD: $foo = $_GET['bar'];
_GET( 'bar', 'newvalue' ); // OLD: $_GET['bar'] = 'newvalue';

And so forth for the other magic-quoted superglobals. Yes, we're doing double work in terms of pre-slashing and then unslashing on access... but we are already. And this is simpler than remembering to wrap everything with stripslashes(). Also, we can do fancy stuff, if we need to, and filter superglobal access through this API. Don't have a use case for that... but we'd have the option.

We could do complicated stuff with implementing the ArrayObject interface, like Dion's code did... which gives us iteration, etc. But I don't think we need to. Frankly, even setting superglobals is sketchy.

comment:12 ryan2 years ago

I think of this primarily as an audit where we mark all core slashing and unslashing mischief with new functions. Being able to set policy is bonus.

comment:13 ryan2 years ago

Auditing and using maybe_slash/unslash() would always make clear where esc_sql() and wpdb:escape() are being used incorrectly. Those are fine in a maybe slash/unslash context, but not when wanting a real escape.

comment:14 hakre2 years ago

About a new API, I'd like to give some pointers: Probably it's feasibly to create a request object which abstracts and encapsulates the request properly:

$request = WP_Request::createFromGlobals();

The key point here is that you don't have static GET::/POST::/WHATEVER:: functions (e.g. replacing $_GET with GET::get()). That's more or less only offering a new interface for the same thing. Instead encapsulate what varies and offer a real interface in form of an instance. If you need a global access to it, put it in a global variable as it's common within Worpdress.

That's then one point instead of (additional) global (static class) functions.

I suggest to take a look into the API of the Symfony2 Request, the sourcecode or the docs for some inspiration.

The current dependency of wordpress is $_GET, $_POST, $_SERVER, $_COOKIE and $_REQUEST afaik. So the request class must not be equally evolved as the Symfony2 Request.

The key point is to start moving away from the (super)global dependencies. Then you can change things easier over time including the slashes domain.

You want to change that, the root problem aren't the quotes technically, but to decide whether they are in or not globally. If we remove the many superglobals by encapsulating them, the problem of the superglobal depency is gone, it's easy to set policy. And you're even able to test the application as the request can be injected. That will become crucial if you want to change the processing of slashes for input data.

And take care: Every level of indirection comes with a price, take it lightly.


For the status quo, I think it's a good idea to go through the functions and find out if the data that is processed by these functions is native/raw, slash-encoded or undefined (often called maybe). One way could be to add a docblock tag that documents it so you can run a simple static code-analysis over all files afterwards to index functions used and to get a better impression.

The parts that are maybe are a smell then, that the usage of the function is not clear.

Render a dependency tree of all functions that are calling a maybe function and it would be more visible how maybe affects the codebase and how large the problem is.

Related: #17018, #12402, #10360, #5791

Last edited 2 years ago by hakre (previous) (diff)

comment:15 scribu2 years ago

Related: #20569

comment:16 johnbillion2 years ago

  • Cc johnbillion added

comment:18 rmccue18 months ago

Also related: #21767

comment:19 aaroncampbell15 months ago

  • Cc aaroncampbell added

comment:20 andyadams15 months ago

  • Cc andyadamscp@… added

comment:21 sc0ttkclark15 months ago

  • Cc lol@… added

comment:22 tomdxw15 months ago

  • Cc tom@… added

comment:23 dave101015 months ago

  • Cc dave1010@… added

comment:24 johnbillion12 months ago

#24130 was marked as a duplicate.

comment:25 thanatica212 months ago

So the biggest problem, realistically speaking, is that any sane standard test for magic quotes will always result in FALSE as of PHP 5.4, while magic quotes are still being applied. There seems to be no test for WP's magic quotes. So even in the future when WP fixes this, scripts that actually depend on it will fail. But I couldn't care less, because this is wrong behaviour. And wrong behaviour should not be maintained for the sake of old scripts and more bad code, not to mention performance hits. Just fix it and eat the consequences. You should have known it was coming to a point like this.

So here's a solution.

DO NOT apply magic quotes. Seriously. Remove them when PHP added them, but NEVER ADD THEM BACK. Quite simple really. I could even propose a piece of code that fixes everything. But it's so blazingly simple it shouldn't be neccesary.

comment:26 rlange8 months ago

addslashes()—called by add_magic_quotes(), which is itself called by wp_magic_quotes()—has the nasty side effect of silently converting integers to strings.

This is most obvious when working with $_SERVER['REQUEST_TIME'], which is an integer.

I had written code outside of WordPress that attempts to parse the given value using strtotime() if it's a string, or accept it as a timestamp if it's an integer. Unfortunately, strtotime() doesn't parse timestamps that are strings. This code failed when used within a WordPress site.

The workaround is simple enough, but I think this qualifies as unexpected behavior. A possible solution is for WordPress to check if the value is a string before applying addslashes().

comment:27 westonruter6 months ago

  • Cc weston@… added

comment:28 Denis-de-Bernardy5 months ago

  • Cc ddebernardy@… added

comment:29 jdwire5 months ago

  • Cc jdwire added

comment:30 nacin3 months ago

  • Component changed from General to Bootstrap/Load

comment:31 follow-up: arman.poghosyan2 months ago

  • Severity changed from normal to major

I was having issues with slashes in my plugin while sending JSON string in POST to manipulate object data in php. It's nonsense to have magic_quotes on in Wordpress and it's very hard to debug first time.

I'm using PHP > 5.4 both on local servers and production servers. Majority of shared hostings are already on PHP > 5.4, so magic_quotes are disabled and deprecated by default and nobody expects that it can cause problem.

Further checking magic_quotes in PHP with get_magic_quotes_gpc() always returns 0, so I couldn't even imagine that Wordpress could add quotes on its own.

I think it's time to change this behavior and or turn it off by default. Or intorduce some flag in config, that will make it obvious, that it's Wordpress issue.

comment:32 in reply to: ↑ 31 ; follow-up: Denis-de-Bernardy2 months ago

Replying to arman.poghosyan:

I'm using PHP > 5.4 both on local servers and production servers. Majority of shared hostings are already on PHP > 5.4, so magic_quotes are disabled and deprecated by default and nobody expects that it can cause problem.

It's closer to 10%-ish of hosts, actually… :-(

http://wordpress.org/about/stats/

comment:33 in reply to: ↑ 32 arman.poghosyan2 months ago

Replying to Denis-de-Bernardy:

Replying to arman.poghosyan:

I'm using PHP > 5.4 both on local servers and production servers. Majority of shared hostings are already on PHP > 5.4, so magic_quotes are disabled and deprecated by default and nobody expects that it can cause problem.

It's closer to 10%-ish of hosts, actually… :-(

http://wordpress.org/about/stats/

Are these charts up to date?
On all shared hostings that we use for our clients (I don't know if it is appropriate to name hosting companies here, but they are actually top (read most advertised and used) hostings)) PHP 5.4 is set up by default (on a few of them, PHP 5.2 is on, but you can easily change to 5.4 from CPanel) and on some of them you can even switch up to PHP 5.6.

On VPS and Dedis that we use we always update to latest stable version available (CentOS/RedHat default repository) and it's also PHP > 5.4.

Unfortunately this issue doesn't affect everybody even with lates PHP versions, as majority only uses GET and POST to update entries in MySQL database with $wpdb, in that case everything is always OK. But in case if you escape data on your own or if you use JSON data as me, you will for sure get acquanted with strange core behavior.

I think that it will be more appropriate to say that even if 100% use PHP > 5.4, only some 10% will be affected by the issue.

comment:34 follow-ups: thanatica22 months ago

Why are we still discussing this? Just remove the magic quotes. They are not neccesary.

comment:35 in reply to: ↑ 34 aaroncampbell2 months ago

Replying to arman.poghosyan:

Are these charts up to date?
On all shared hostings that we use for our clients (I don't know if it is appropriate to name hosting companies here, but they are actually top (read most advertised and used) hostings)) PHP 5.4 is set up by default (on a few of them, PHP 5.2 is on, but you can easily change to 5.4 from CPanel) and on some of them you can even switch up to PHP 5.6.

Yes, those charts are up to date. Even if 5.4 were the default on all hosts (and it definitely isn't), that would only affect new sites. All existing sites would still be on whatever the default was when they first signed up for hosting. Very few people ever change their PHP version unless a host forces them so, and hosts don't change it for you because it could break things. Unfortunately it will be quite a while until a majority of WordPress sites are on PHP 5.4+

Replying to thanatica2:

Why are we still discussing this? Just remove the magic quotes. They are not neccesary.

I know this is a pretty long ticket (especially if you account for all the related ones that are linked), but it's definitely worth taking the time to carefully read each comment. Currently magic quotes *are* necessary because removing them could easily open us to unexpected security vulnerabilities. And even if we fix all those in core, there would likely be hundreds (conservative estimate) of plugins that would be suddenly vulnerable because they were assuming slashed data and it wasn't.

I think we'd all like to get rid of the forced slashing, but we need to come up with a way to do it that doesn't result in thousands of vulnerable sites. We haven't yet found an elegant way to do that.

comment:36 in reply to: ↑ 34 knutsp2 months ago

Replying to thanatica2:

Why are we still discussing this? Just remove the magic quotes. They are not neccesary.

If you mean magic_quotes_gpc = Off in PHP (<5.4) that's the default, and the whole thing removed in PHP 5.4. That INI-setting doesn't affect WordPress. WordPress has it's own magic quotes function, if I have got it right. And then that usage is what's being discussed.

comment:37 nacin5 weeks ago

#27421 was marked as a duplicate.

Note: See TracTickets for help on using tickets.