Make WordPress Core

Opened 12 years ago

Last modified 2 months ago

#22325 new enhancement

Abstract GPCS away from the superglobals

Reported by: rmccue's profile rmccue Owned by:
Milestone: Priority: normal
Severity: normal Version:
Component: Bootstrap/Load Keywords: has-patch 2nd-opinion needs-unit-tests
Focuses: Cc:

Description

As discussed at #wpcs, it looks like we want to move away from directly using the GPCS superglobals. This gives us a way to handle slashing backwards compatibility moving forward.

This is still a heap of versions away, but this is a way to keep any notes central.

Attachments (5)

GPCS.php (5.2 KB) - added by MikeSchinkel 12 years ago.
Proposed implementations of _GET(), _POST() and _REQUEST()
GPCS-2.php (7.5 KB) - added by MikeSchinkel 12 years ago.
2nd Proposed implementation of _GET(), _POST() and _REQUEST(), etc.
22325.diff (2.0 KB) - added by rmccue 12 years ago.
Initial pass at GPCS abstraction
22325.2.diff (2.4 KB) - added by rmccue 12 years ago.
Use static variables with accessor functions
22325.3.diff (4.3 KB) - added by MikeSchinkel 9 years ago.
Updated patch for superglobals

Download all attachments as: .zip

Change History (56)

#1 @scribu
12 years ago

Yes, and the idea was to look at how various PHP frameworks, like Symfony, Zend etc. handle it.

#2 @SergeyBiryukov
12 years ago

Related/duplicate: #17856

#3 follow-up: @CaptainN
12 years ago

  • Cc Kevin@… added

I was bouncing that around my head while working on OnceForm - if you encourage users to use an alternative method for retrieving the CPCS data (OnceForm does this for example), there's hope the auto-slashing behavior can be retired at some point.

This should be a publicly advocated API (for plugin developers), not just something used within core.

Once it's in there, you may even be able to do some fancy plugin repo searches for the existing super globals, and at least get an audit of how many plugins are actually affected by this.

#4 follow-up: @rmccue
12 years ago

Symfony

In Symfony, the main class for this stuff is Symfony\Component\HttpFoundation\Request. This class is generic for all types of HTTP requests, including sending and receiving. When receiving, Request::createFromGlobals() is used to make a Request object from the superglobals. It also has a bunch of accessors for things like the host, host + scheme, whether it's a secure request, the request body, etc; basically, it abstracts all the request-related stuff.

It uses abstraction classes called "bags" for the various arrays, such as HeaderBag and ParameterBag. The HeaderBag ensures that keys are lowercased before looking up, since headers are case-insensitive, and it also does some fancy parsing with cache-control headers. The ParameterBag class handles most other ones, and does some fancy stuff: doing $bag->get('a[b][c]') accesses $bag->parameters['a']['b']['c']. They don't use ArrayAccess though.

Laravel

Laravel appears to use Symfony's HttpFoundation, but with an extra abstraction on top of it. Overall, pretty useless.

CakePHP/CodeIgniter

CakePHP and CodeIgniter doesn't abstract any of the GPCS data, but Cake has some accessors for some of the data (e.g. Accept headers, useragent stuff), and responding to the browser is also handled by responding to that.

Zend Framework 2

ZF2 has Zend\Http\Request which does abstract everything. This has accessors named getQuery(), getPost(), getCookie(), getFiles(), getHeaders() and getHeader(). Digging into this any further had me banging my head against the wall, so someone else is welcome to look at this.


Basically, my thoughts are that Symfony is the only one that we should consider looking at.

#5 in reply to: ↑ 3 @rmccue
12 years ago

Replying to CaptainN:

I was bouncing that around my head while working on OnceForm - if you encourage users to use an alternative method for retrieving the CPCS data (OnceForm does this for example), there's hope the auto-slashing behavior can be retired at some point.

This was actually born during the slashing discussion, and was our favourite way to move forward on the issue.

#6 @johnbillion
12 years ago

  • Cc johnbillion added

#7 @toscho
12 years ago

  • Cc info@… added

#8 follow-up: @mbijon
12 years ago

  • Cc mike@… added

Symfony is definitely the only future-proofed candidate here. But, if we add Symfony we should be prepared for a bump to requiring 5.3.

The Request component in Symfony will, I think, work on PHP 5.2.4, but we will likely need to grab a bigger part of Symfony than that. For example:

My day job lately is 90% Drupal (I know, give me a minute though...) and they, along with several other CMSes/apps, are integrating Symfony to enable REST-forward or "first-class citizen" REST. Here's Drupal's commit log from making that happen, http://drupalcode.org/project/drupal.git/commit/c85d62c60902086e39fdea3ac5daa7f29f92edeb. Note that all of Symfony's HttpFoundation and ClassLoader are included.

I think the above link can also serve as a template for WordPress if Symfony is selected.

Edit: Just found drupal/org's research on Zend vs. Symfony from their decision-making on this: http://groups.drupal.org/node/167299

Last edited 12 years ago by mbijon (previous) (diff)

#9 in reply to: ↑ 8 @rmccue
12 years ago

Replying to mbijon:

Symfony is definitely the only future-proofed candidate here. But, if we add Symfony we should be prepared for a bump to requiring 5.3.

At the moment, that's really out of the question. The stats page shows that we still have a huge 5.2 userbase, and I think it's somewhat unlikely that this will change soon. Personally, I'd love to drop 5.2, but practically, it's not possible.

I think what would most likely happen is that we would have to fork the Request component, convert to 5.2-compatible, and then attempt to keep that up-to-date. It sucks, but it's most likely the only way. The code it seems relatively stable (Kernel activity and line changes; Foundation activity and line changes) but those are still a fair number of changes to backport constantly. Granted, most probably don't need any changes for 5.2 compatibility.

The Request component in Symfony will, I think, work on PHP 5.2.4, but we will likely need to grab a bigger part of Symfony than that.

I'm not sure what we'd need apart from Kernel and Foundation. The ClassLoader is just a standard SPL class loader really, and the rest is self-contained.

Edit: Just found drupal/org's research on Zend vs. Symfony from their decision-making on this: http://groups.drupal.org/node/167299

That's a great link; thanks for that! I think that further confirms that Zend is dead-in-the-water as far as this is concerned; it's also very interlinked with their other stuff, while - as noted - Symfony components are very much standalone.

#10 @nacin
12 years ago

I'd like to adopt a GPCS wrapper pattern that makes sense. To do that, I was figuring we could study what other PHP frameworks and systems were doing. Adopting or actually forking them is not necessary (and may not always be okay license-wise).

The only thing we really care about for now is raw access to GPCS globals. We already do things like repair SERVER in wp_fix_server_vars(). Some of the helper functions in Symfony's Request are kind of nifty but they don't seem at all necessary at this point.

This should probably have been a P2 post to better facilitate conversation. I will likely move the conversation there in the coming days.

#11 follow-up: @CaptainN
12 years ago

I was thinking something a bit more simple, a set of functions that mimic the GPCS globals.

// instead of
$something = $_GET['something'];

// use the new
$something = _get('something');

We could simulate other things like calling those methods with no arguments would return the entire array, multie args for multidimensional arrays etc.

Last edited 12 years ago by CaptainN (previous) (diff)

@MikeSchinkel
12 years ago

Proposed implementations of _GET(), _POST() and _REQUEST()

#12 in reply to: ↑ 11 @MikeSchinkel
12 years ago

  • Cc mike@… added

Replying to CaptainN:

I was thinking something a bit more simple, a set of functions that mimic the GPCS globals.

// instead of
$something = $_GET['something'];

// use the new
$something = _get('something');

+1.

Suggestions:

1.) It would be great if we could finally get away from this pattern:

$foo = isset( $_GET['foo'] ) ? $_GET['foo'] : false

and instead be able to do (something like) this (which defaults to 100):

$foo = _GET( 'foo', 100 ); 

And maybe this defaults to null:

$foo = _GET( 'foo' ); 

2.) If it supports defaults we need a way to test for existence, and we could do this:

if ( ! has_GET( 'bananas' ) )
  echo "We have no bananas today.";

3.) We might need to test count of $_POST to see if a form was submitted, so passing no parameters to has_POST() could do that:

if ( ! has_POST() )
  echo "No form submitted.";

4.) For accessing multi dimensional arrays we could pass in an array of keys instead of a string or numeric key:

// For <input type="text" name="test[abc][123][xyz]" />
var_dump( _REQUEST( array( 'test', 'abc','123','xyz' ) ) );

5.) The examples above use uppercase for _GET() and _POST(); those would be more consistent with the existing vars than using _get() and _post(), be easier to see in code, and probably be less likely to conflict with existing code.

I've implemented and (mostly) commented _GET(), _POST() and _REQUEST() in a self-contained attached example file GPCS.php. You can drop in to the root of a WordPress site to test and play with. I didn't have time to implement the rest of the superglobals but might work on them if people like this approach.

#13 follow-up: @CaptainN
12 years ago

Another way to reduce validation boilerplate is to use a system like WP_OnceForm, which polyfills the missing values. OnceForm starts from the HTML form, so it knows up front which name/value pairs will need to be filled in.

The php GPCS system is different, it just deals with whatever is sent to the server without referencing anything else, so if the scope will be limited in the same way as PHP's GPCS system, some query methods to check for missing items as MikeSchinkel described makes sense (they even make sense for dealing with the standard PHP superglobals).

A comment on that implementation though - I'd prefer lower case function names (_get, _post, _request) to uppercase, because I think if these are lower case, it'll be less likely to be confused with the real superglobals at a glance.

Maybe this is less of an issue, but I'd also like to see little else other than a wrapper around the actual superglobals. In other words, it should behave pretty much exactly as the PHP standard on which it's based. The headache with the current system (especially when you are first getting started) is that WordPress's model differs from PHP in a couple of substantial (magic-quotes), and some less substantial (the way the GPCS vars are mixed) ways, which causes a lot of problems before you find out what WordPress is doing in there, and here's a change to fix that.

#14 in reply to: ↑ 13 ; follow-up: @MikeSchinkel
12 years ago

Replying to CaptainN:

A comment on that implementation though - I'd prefer lower case function names (_get, _post, _request) to uppercase, because I think if these are lower case, it'll be less likely to be confused with the real superglobals at a glance.

The reason I like uppercase is because I think _get(), _post(), _request() would too easily get lost in other code and it would be harder for developers (like me) to see them while scanning. But it all depends on what the decision makers think I guess, and I'm not one. :)

Maybe this is less of an issue, but I'd also like to see little else other than a wrapper around the actual superglobals. In other words, it should behave pretty much exactly as the PHP standard on which it's based.

What do you propose for accessing not existent elements; throwing an error? Personally I would hate to loose the opportunity to finally have a standard way to cleanly get a default value.

#15 @WraithKenny
12 years ago

  • Cc Ken@… added

#16 in reply to: ↑ 14 ; follow-up: @CaptainN
12 years ago

Here's a quick version I through together (without testing):

<?php
$wp_get = $_GET;
$wp_post = $_POST;
$wp_request = $_REQUEST;
$wp_cookie = $_COOKIE;
$wp_server = $_SERVER;

function wp_gpcs( $gprcs, $key )
{
	global $wp_get, $wp_post, $wp_request, $wp_cookie, $wp_server;

	// return the whole array if no $key specified
	if ( is_null( $key ) )
		return $$gprcs;

	// return the value of the key
	if ( isset( $$gprcs[ $key ] ) )
		return $$gprcs[ $Key ];

	// if not set, return NULL - this differs from standard PHP
	// behavior, which would throw a notice.
	else
		return NULL;

	// :TODO: Expand with support for multi-dimensional arrays.
}

function _get( $key = NULL ) {
	return wp_gpcs( 'wp_get', $key );
}
function _post( $key = NULL ) {
	return wp_gpcs( 'wp_post', $key );
}
function _request( $key = NULL ) {
	return wp_gpcs( 'wp_request', $key );
}
function _cookie( $key = NULL ) {
	return wp_gpcs( 'wp_cookie', $key );
}
function _server( $key = NULL ) {
	return wp_gpcs( 'wp_server', $key );
}

That whole thing could be wrapped in a class to make it much cleaner, but this is just a quick and dirty example.

This would have be be inserted before the WordPress magic quotes operations. I changed the part where it would normally throw a notice (or error, I don't remember which it does), because I don't actually know how to do that while retaining the call stack, etc. This may be better anyway. I also didn't implement multidimensional support, but it wouldn't be too hard (using func_get_args and family).

MikeSchinkel: You have a point, it would be nice (and simulating the PHP error system may be too problematic anyway). So how do you mix the default value specification in the method signature that retains the multidimensional capability of PHP forms (name="level1[level2]"). I'd use the second (third, etc.) arguments as a way to access the indexes in the array (_get('level1','level2');), but you've used them to specify the default value.

Also, how would you know that you've set the default value? You'd still have to check that. Maybe more methods are called for?

// default value and set in the GPCS method
if ( _get( 'myVar', 'default value' ) != 'default value' ) {
    // validation passes
}

// If we return NULL:
// to check if it's set at all (for checkboxes and radios)
if ( !is_null( _get( 'myVar' ) ) ) {
    // It's set!
}

// a function to check for a default value with NULL checks
if ( _check( _get( 'myVar' ), 'default value' )  {
    // the value is validated
}

// and if you need to output the value with a default
echo _default( _get( 'myVar' ), 'default value' );




// returns value of $default if $value is null, $value otherwise
function _default( $value, $default ) {
    return ( is_null( $value ) )? $default: $value;
}

// returns false if the value is either null, or matches $default
function _check( $value, $default ) {
    return ( is_null( $value ) || $value == $default )? false: true;
}

// :NOTE: The above is psuedo code. I didn't run or test any of it.

That way we can have the multidimensional support, and a reduction of boilerplate with no errors or notices being thrown, and the GPCS methods work almost the same as PHP does (except without annoying notices).

#17 in reply to: ↑ 16 ; follow-ups: @MikeSchinkel
12 years ago

Replying to CaptainN:

Here's a quick version I through together (without testing):
$wp_get = $_GET;
$wp_post = $_POST;
$wp_request = $_REQUEST;
$wp_cookie = $_COOKIE;
$wp_server = $_SERVER;

Ignoring everything else for the moment, I think WordPress is moving away from global vars and not adding any new ones. Can someone else confirm this?

So how do you mix the default value specification in the method signature that retains the multidimensional capability of PHP forms (name="level1[level2]"). I'd use the second (third, etc.) arguments as a way to access the indexes in the array (_get('level1','level2');), but you've used them to specify the default value.

The code I uploaded used an array for that, i.e.:

$print_r(  _GET( array( 'level1', 'level2' ) ) );

Also, how would you know that you've set the default value? You'd still have to check that. Maybe more methods are called for?

// default value and set in the GPCS method
if ( _get( 'myVar', 'default value' ) != 'default value' ) {
    // validation passes
}

What's the use case where that is needed that has_GET()and not passing a default _GET( 'myVar' ) doesn't address?

a function to check for a default value with NULL checks
if ( _check( _get( 'myVar' ), 'default value' ) {

the value is validated

}

and if you need to output the value with a default
echo _default( _get( 'myVar' ), 'default value' );

_default() and _check() seems like more work than doing it the old way. IMO, of course.

#18 in reply to: ↑ 17 @CaptainN
12 years ago

Replying to MikeSchinkel:

The code I uploaded used an array for that, i.e.:

$print_r(  _GET( array( 'level1', 'level2' ) ) );

Ah! I missed that (I wrote mine while you were posting yours, and didn't check yours thoroughly - and mine was a proof of concept, not meant to be production code - more of a design mockup).

The only weird thing with that, is I tend to think of array values as "next to" rather than "in to". That they are used essentially as pivot tables is a bit weird to me (but it's not unprecedented, and it's not bad). I suppose using multiple function arguments could be said to employ a similar hack, and multiple args does limit the ability to conveniently set a default in there.

Also, how would you know that you've set the default value? You'd still have to check that. Maybe more methods are called for?

// default value and set in the GPCS method
if ( _get( 'myVar', 'default value' ) != 'default value' ) {
    // validation passes
}

What's the use case where that is needed that has_GET()and not passing a default _GET( 'myVar' ) doesn't address?

If you set a default value in a form (like "enter first name", or some invalid value in a select box, or even just an empty string), and you want the user to change the default, you'd have to check it on post back. has_GET only checks if the key is in the array.

a function to check for a default value with NULL checks
if ( _check( _get( 'myVar' ), 'default value' ) {

the value is validated

}

and if you need to output the value with a default
echo _default( _get( 'myVar' ), 'default value' );

_default() and _check() seems like more work than doing it the old way. IMO, of course.

To just check if the key is set.

// old way:
if ( isset( $_GET['myVar'] ) ) 

// vs:
if ( is_null( _get('myVar') )

// vs:
if ( has_GET( 'myVar' ) )

Or if you need to check for a default value:

// old way:
if ( isset( $_GET['myVar'] ) && 'default value' != $_GET['myVar'] )

// vs:
if ( _check( _get( 'myVar' ), 'default value' ) )

// vs:
if ( _GET( 'myVar', 'default value' ) != 'default value' )
// or:
if ( has_GET( 'myVar' ) && _GET( 'myVar' ) != 'default value' )

And finally to output the value:

// old way:
if ( isset( $_GET['myVar'] ) ) echo $_GET['myVar']; else echo 'default value';

// vs:
echo _default( _get( 'myVar'), 'default value' );

// vs:
echo _get( 'myVar', 'default value' );

The second two in each of the three case (except maybe the first) are way better than the "old way". Of course, has_GET (et al) and _check could easily be written on top of either API solution for even more brevity - for example check_GET( 'myVar', 'default value' ) could be nice in the "check for default value example" above.

I'm approaching this from an API design perspective. My primary goal is to make the API not suck. :-)

#19 in reply to: ↑ 17 ; follow-up: @CaptainN
12 years ago

Replying to MikeSchinkel:

Replying to CaptainN:

Here's a quick version I through together (without testing):
$wp_get = $_GET;
$wp_post = $_POST;
$wp_request = $_REQUEST;
$wp_cookie = $_COOKIE;
$wp_server = $_SERVER;

Ignoring everything else for the moment, I think WordPress is moving away from global vars and not adding any new ones. Can someone else confirm this?

Right, ideally that'd all be wrapped in a class and store that data in properties instead of just dumping them into the global space.

Also note, your implementation would need to do something similar. Right now you wrap the superglobals directly, which carries with it whatever baggage WordPress is dumping into those (as well as any damage done to them by plugin authors etc. - I once manually unslashed the GPCS values, before I understood why they were being magic quoted even though the php.ini setting was off - I've seen others do that, and I actually wonder how many plugins still do it).

#20 in reply to: ↑ 19 ; follow-up: @MikeSchinkel
12 years ago

Replying to CaptainN:

I'm approaching this from an API design perspective. My primary goal is to make the API not suck. :-)

That's exactly my reason too. But as is typical with two programmers, we have three opinions and they are all different. :)

Replying to MikeSchinkel:
The only weird thing with that, is I tend to think of array values as "next to" rather than "in to". That they are used essentially as pivot tables is a bit weird to me (but it's not unprecedented, and it's not bad). I suppose using multiple function arguments could be said to employ a similar hack, and multiple args does limit the ability to conveniently set a default in there.

Another benefit is the $key is a single value which can be stored and passed around as a single item. You could still do that with your approach but accessing it would require the decidedly obtuse approach of:

$value = call_user_func_array( '_get', $keys_as_array );

I personally would prefer to just know the _GET()'s first parameter is always the key in whatever form the key takes.

If you set a default value in a form (like "enter first name", or some invalid value in a select box, or even just an empty string), and you want the user to change the default, you'd have to check it on post back. has_GET only checks if the key is in the array.

Seems like this would work fine:

$first_name = _GET( 'first_name' );
if ( is_null( $first_name ) ||  "enter first name" == $first_name )
   echo "Sorry, you must enter a first name please.";

That use-case is also mutually exclusive from the need for a default value so I don't see the default value conflicting with a placeholder. If it did however, here is a contrived example:

$first_name = _GET( 'first_name' );
if ( "enter first name" == $first_name )
   echo "Sorry, you must enter a first name please.";
else
   $first_name = 'John Doe';

This could somehow be turned into an expression by adding your _default() function but that's adding another helper method and I would prefer to keep the number of new WordPress' specific functions we have to learn to a minimum. If we want to add a _default() and _check() I'd propose creating a new ticket proposing that WordPress add them.

But if you really want a single expression:

$first_name = _GET( 'first_name', 'John Doe', 'enter first name' )
if ( is_wp_error( $first_name ) )
   echo "Sorry, you must enter a first name please.";

Or even:

$first_name = _GET( 'first_name', 'John Doe', array( 
   'stop_value' => 'enter first name',
   'not_empty' => true,
));
if ( is_wp_error( $first_name ) )
   switch ( $first_name->get_error_code() ) {
      case 'empty':  
      case 'invalid_value':  
         echo "Sorry, you must enter a first name please.";
         break
   }

But I wouldn't advocate for that level of complexity.

Or if you need to check for a default value:

// old way:
if ( isset( $_GET['myVar'] ) && 'default value' != $_GET['myVar'] )

// vs:
if ( _check( _get( 'myVar' ), 'default value' ) )

This adds even more surface area to WordPress. There's a cost in terms of runtime overhead, and learning; everyone will need to learn what _check() does. People who are familiar with PHP don't have to learn this:

if ( isset( $_GET['myVar'] ) && 'default value' != $_GET['myVar'] )

...
The ... are way better than the "old way".

Syntactical sugar in PHP is good IMO but adding helper functions in libraries and frameworks usually just adds unnecessary complexity. In my past where I went crazy with helper functions but learned that too much is not a good thing; best to just stick with what exists.

In the case of the proposed _GET(), we'd already be are adding it so we can kill three (3) birds with one stone; #2: Be graceful when $_GETfoo? doesn't exist, and #3: Allow us to add a default value when we need one.

// vs:
if ( _GET( 'myVar', 'default value' ) != 'default value' )
// or:
if ( has_GET( 'myVar' ) && _GET( 'myVar' ) != 'default value' )

You got that last one wrong. It should just be:

if ( _GET( 'myVar' ) != 'default value' )

And finally to output the value:

// vs:
echo _default( _get( 'myVar'), 'default value' );

// vs:
echo _GET( 'myVar', 'default value' );

It's clear which of these I prefer. :)

Replying to CaptainN:

Also note, your implementation would need to do something similar.

Agreed. I didn't address that part because I don't understand the requirements 100%. I was focused on suggesting an API that I think doesn't suck, much as you were. :)

But given that it's WordPress I would guess they might simply want to use the object cache?

So...we've now both made our cases. But neither of us makes the decision. :) I'm just going to leave it to let others give their opinions, and eventually @nacin or someone else from the core team will decide.

Last edited 12 years ago by MikeSchinkel (previous) (diff)

#21 in reply to: ↑ 20 ; follow-up: @CaptainN
12 years ago

Replying to MikeSchinkel:

The differences here are so minor, and there's no word on whether the approach would be preferred or not in broad terms over the more robust OOP based frameworks mentioned above, but let's keep going a bit more. :-)

In my past where I went crazy with helper functions but learned that too much is not a good thing; best to just stick with what exists.

Well in that case, why not just copy the 5 GPCS(R) array values, and call it a day. It'd be easy enough:

$WP_GET = $_GET;
$WP_POST = $_POST;
// etc.

These wouldn't be super global, but they could be globaled, and then they would work literally and exactly like the plain old superglobals. It'd be the easiest for any PHP developer to pick up. But I think there's room to improve things (I started with opposite position heh).

// vs:
if ( _GET( 'myVar', 'default value' ) != 'default value' )
// or:
if ( has_GET( 'myVar' ) && _GET( 'myVar' ) != 'default value' )

You got that last one wrong. It should just be:

if ( _GET( 'myVar' ) != 'default value' )

Don't think I did. If the value hasn't been set (eg !has_GET('myVar')) then it'll pass the != 'default value' test because the value is NULL, which is not what you want.

I did realize while playing with this stuff, you could use my _check and _default methods even with the normal superglobals if I just suppress the lookup error - what I did works exactly as an array lookup (returns NULL if the key doesn't exist in the array). I think all we've really accomplished vs the standard globalvars, is to not kick up a warning.

echo _default( @$_GET[ 'myVar' ], 'default value' );

And finally to output the value:

// vs:
echo _default( _get( 'myVar'), 'default value' );

// vs:
echo _GET( 'myVar', 'default value' );

It's clear which of these I prefer. :)

Having the ability to set a default there does have some appeal.

Related, there is a hole in the model we've been discussing, which is sometimes it's convenient to change the POST or GET data. Neither attempt has a solution for allowing that. Adding it would seem to complicate things.

Kevin N.

@MikeSchinkel
12 years ago

2nd Proposed implementation of _GET(), _POST() and _REQUEST(), etc.

#22 in reply to: ↑ 21 @MikeSchinkel
12 years ago

Replying to CaptainN:

Related, there is a hole in the model we've been discussing, which is sometimes it's convenient to change the POST or GET data. Neither attempt has a solution for allowing that. Adding it would seem to complicate things.

Could easily be like the following with no real added complexity:

set_GET( 'test', 'abc123' );
set_POST( array( 'foo', 'bar'), array( 'baz' => 'bazoom') );

Beyond that I'll wait to hear for the "powers that be" weigh in. :)

P.S. Also updated the code to add set_GET(), set_POST() and set_REQUEST() as well as created a WP_GPCS class that wraps copies of $_GET, $_POST and $_REQUEST.

#23 @rmccue
12 years ago

Guys, this is much too heavy on the implementation details. My framework overview was meant to be a fairly high-level overview of what the various frameworks abstract and what we might want in our own class.

Let's avoid implementation discussion until we hash this out further. +1 on moving to a P2.

#24 in reply to: ↑ 4 @TobiasBg
12 years ago

Replying to rmccue:

CakePHP and CodeIgniter doesn't abstract any of the GPCS data, [...]

CodeIgniter uses the "Input" class (see http://codeigniter.com/user_guide/libraries/input.html) as an abstraction layer that also does a few things that people have suggested in this discussion (like the isset() checks).

#25 @kovshenin
12 years ago

  • Cc kovshenin added

#26 @aaroncampbell
12 years ago

  • Cc aaroncampbell added

#27 @tomdxw
12 years ago

  • Cc tom@… added

#28 follow-up: @dave1010
12 years ago

  • Cc dave1010@… added

Is it not worth waiting until we require PHP 5.3+, so we can go with the more future-proof Symfony2 Request component (or Zend or whatever we go with)?

#29 in reply to: ↑ 28 ; follow-up: @rmccue
12 years ago

Replying to dave1010:

Is it not worth waiting until we require PHP 5.3+, so we can go with the more future-proof Symfony2 Request component (or Zend or whatever we go with)?

I'd say no; a 5.3 requirement isn't on the table for the near future, since we still have a huge percentage on 5.2. Waiting will mean it won't get done.

This ticket is also related to #18322 and how we end up handling it there, so it will want to be in sync with that.

#30 in reply to: ↑ 29 @dave1010
12 years ago

Replying to rmccue:

Replying to dave1010:

Is it not worth waiting until we require PHP 5.3+, so we can go with the more future-proof Symfony2 Request component (or Zend or whatever we go with)?

I'd say no; a 5.3 requirement isn't on the table for the near future, since we still have a huge percentage on 5.2. Waiting will mean it won't get done.

It would be nice if we had stats on how many PHP 5.2 users have upgraded to WordPress 3.4/3.5, as I guess many people who don't/can't upgrade PHP also don't/can't upgrade WordPress.

This ticket is also related to #18322 and how we end up handling it there, so it will want to be in sync with that.

I agree that this and the magic quotes issue need to get fixed ASAP. Perhaps an existing framework component (Symfony2/zf2) isn't the best way right now but I'd really like us to consider the API we use and, if possible, design it so it could easily be swapped out in the future.

#31 @dave1010
12 years ago

Do we need to consider the other superglobals: $_FILES, $_SESSION and $_ENV? Symfony2's Request::createFromGlobals() uses $_FILES but I'm not sure what the implications are of this.

#32 follow-up: @ryan
12 years ago

Perhaps something in the vein of Symfony with only the bits we need (we already have a request abstraction API) and the proper PHP requirements. WP:GET, WP:POST, WP:REQUEST would be straightforward and familiar.

#33 in reply to: ↑ 32 @rmccue
12 years ago

Replying to ryan:

Perhaps something in the vein of Symfony with only the bits we need (we already have a request abstraction API) and the proper PHP requirements. WP:GET, WP:POST, WP:REQUEST would be straightforward and familiar.

I'll work on a preliminary patch for this.

@rmccue
12 years ago

Initial pass at GPCS abstraction

#34 @rmccue
12 years ago

The attached patch is the simplest thing that can possibly work. All it does is set $wp->GET, $wp->POST and $wp->REQUEST after deslashing and before reslashing.

This does involve one change that may break a bit of compatibility. In order to set those variables without unslashing them again, wp_magic_quotes() needs to be called after the request object is instantiated. I chose to move this to after new WP() but this may affect things hooked into sanitize_comment_cookies or via WP_Rewrite::__construct(). I'll test this further, but we're going to have to move one of them to use $wp->...

Alternatively:

  1. We could set these as static variables on WP. This kind of defeats the purpose of having a request object though.
  2. We could unslash the superglobals in WP::init(). This means between wp_magic_quotes() and WP::init(), we need to manually unslash any superglobals, which may be a bit weird for plugins hooked in at those points. Additionally, this is extra work for the server (unslash magicked vars, slash all vars, unslash for $wp copy)
  3. We could instantiate WP earlier. This may affect plugins in subtle ways, so I haven't fully investigated this yet.

I'm happy to revise the patch for these choices. I'm in favour of 3, but it may have subtle effects.

#35 follow-up: @CaptainN
12 years ago

I always noticed WordPress makes it's own $_REQUEST variable by mixing $_GET and $_POST. This has been done again in the patch. Why?

PHP already has a $_REQUEST array, which is actually $_GET, $_POST, and $_COOKIE combined. Why build a different request array, instead of just using the (well understood) one PHP already has?

#36 in reply to: ↑ 35 @aaroncampbell
12 years ago

Replying to CaptainN:

PHP already has a $_REQUEST array, which is actually $_GET, $_POST, and $_COOKIE combined. Why build a different request array, instead of just using the (well understood) one PHP already has?

The PHP created one isn't guaranteed to contain $_GET and $_POST, or to load them in any predictable order because it's controlled with settings in the PHP config file (older: variables-order, newer: request-order). Also, if you read that paragraph on request-order it says that it doesn't include $_COOKIE in $_REQUEST due to security concerns. Most pre-5.3 PHP setups still do that, but WordPress fixes the issue by overriding $_REQUEST with a safe and predictable version.

@rmccue
12 years ago

Use static variables with accessor functions

#37 follow-up: @CaptainN
12 years ago

Ah, so it's less well understood, and more well assumed. ;-) (In practice it's probably more consistent than not, but that's irrelevant.)

Whatever the case, I would suggest WP doesn't override the PHP version (or remove the current override behavior if possible), but instead create a WordPress specific version. I guess having the property on $wp is good enough, but it could also have a different name to be even clearer - maybe only marginally.

I think changing default PHP behavior in the past is the reason this ticket exists now.

#38 in reply to: ↑ 37 @nacin
12 years ago

Replying to CaptainN:

I think changing default PHP behavior in the past is the reason this ticket exists now.

It's actually because we followed PHP behavior (magic quotes) rather than go against PHP behavior, because going against it would have meant opening up scores of vulnerabilities in plugins. Yes, PHP's behavior was stupid. And now, we get to be punished for their transgressions.

#39 follow-up: @CaptainN
12 years ago

I don't think it was a dumb decision. I think magic quotes is confusing, not very secure in a broader sense, and teaches bad practice.

I wasn't around WordPress at the time, but I'd guess the enforcement of magic quotes on the PHP superglobals was added to WordPress in response to the change in upstream default behavior by PHP - so it was probably more of an active change to default WP behavior than a change that happened through attrition. The result is the same either way, WordPress is left in a state where its behavior changes default PHP behavior to match legacy PHP behavior.

Maybe it would have been better for PHP to create a different GPCS variable set or framework and deprecate the old one, instead of just changing the default behavior the way they did - that was pretty silly.

It doesn't really matter now, it's great that WordPress is finally making an effort to match the default PHP behavior. :-)

#40 in reply to: ↑ 39 @nacin
12 years ago

Replying to CaptainN:

I wasn't around WordPress at the time, but I'd guess the enforcement of magic quotes on the PHP superglobals was added to WordPress in response to the change in upstream default behavior by PHP - so it was probably more of an active change to default WP behavior than a change that happened through attrition. The result is the same either way, WordPress is left in a state where its behavior changes default PHP behavior to match legacy PHP behavior.

Nah, we're talking 9, 10 years ago. It wasn't legacy PHP behavior at the time.

#41 @goto10
12 years ago

  • Cc dromsey@… added

#42 @jdgrimes
11 years ago

  • Cc jdg@… added

#43 @nacin
11 years ago

  • Component changed from General to Bootstrap/Load

#44 @chriscct7
9 years ago

  • Keywords has-patch needs-refresh added
  • Severity changed from minor to normal

@MikeSchinkel
9 years ago

Updated patch for superglobals

#45 @MikeSchinkel
9 years ago

  • Keywords 2nd-opinion added; needs-refresh removed

Added an updated version of @mccue's last patch that made the static vars private. Also added in $_COOKIE and $_SERVER vars and methods.

However, before uploading the patch I started to worry that we can possibly be creating two sources of truth for the superglobals and would like to get other people's take on it. This could become a concern if a plugin changes one of the superglobals in a manner similar to how WordPress does during boot/load.

We may want to add a filter to each of the access methods in my path so that a plugin can ensure the values are in sync, if needed. Or maybe we want to avoid the private variables and have the methods do a stripslashes_deep() on the superglobals? Or...maybe something else?

Or maybe this patch is sufficient?

#46 @dd32
9 years ago

I started to worry that we can possibly be creating two sources of truth for the superglobals and would like to get other people's take on it.

I worry about that too actually, and because of that my initial thought here would be to have these as aliases of the superglobals. How can it be an alias and also be unslashed? My initial thought would be an object implementing ArrayAccess wrapping each super global.

Just a thought.

#47 @johnbillion
9 years ago

#17856 was marked as a duplicate.

#48 @MikeSchinkel
9 years ago

Given how we have been bit hard by this issue, isn't it time to address if finally?

https://make.wordpress.org/core/2016/04/06/rest-api-slashed-data-in-wordpress-4-4-and-4-5/#respond

See #36438 for yet another solution.

#50 @calin
6 years ago

Just opened #45832 which might relate to this.

#51 @jorbin
3 months ago

  • Keywords needs-unit-tests added

Coming back to this, I think we are going to need some tests that will show that this returns the data in the format that is expected so adding the needs-unit-tests tag.

Also, cc @dmsnell based on some conversations during WordCamp US contributor day. Would be interested in hearing your thoughts on this approach.

#52 @dmsnell
2 months ago

Magic quotes were removed in March 2012 with PHP 5.4. Thought this might be relevant here; no supported version of WordPress supports running on PHP with magic quotes.

Perhaps some of the original motivation for this discussion is gone, but I think there's still value in handling $_GET and $_POST more explicitly. I've grown skeptical of what happens automatically and what doesn't happen automatically.

I'm in favor of encapsulating access to these variables, but in a way that runs in parallel to existing code (so as not to break it), but which presents a more explicit and defined interface. Some things I find surprising:

  • dots and spaces in the query args are transformed into underscores
  • duplicate params not using PHP's array syntax overwrite previous copies of the same name
  • array-named args explicitly create array structure, but…
  • duplicates of some subset of array names overwrites previous nested parameters
  • names and values are accepted as byte streams, not as encoded text
  • GET values are submitted as percent-encoded bytes (which are not guaranteed to be UTF-8), but…
  • POST values may be transformed to HTML character references, e.g. when a browser is set to latin1

these are also the kind of "gotchas" that I tend to see people struggle with, because the basic mental model they form while getting started paints an inaccurate picture - the reality is much more complicated. and this is only coming from an examination of legitimate uses, ignoring malicious attacks.

Although a bit less magical, I find a number of other standard approaches to query args and POST values simpler to reason about and teach about.

<?php
// ?q=one&q=two
'one'                 === wp_get( 'q' );
array( 'one', 'two' ) === wp_get_all( 'q' );
null                  === wp_get( 'r' );

// ?q=😄&r=&#x1f604&s=%F0%9F%98%84;
'😄' === wp_get( 'q' );
'😄' === wp_get( 'r' );
'😄' === wp_get( 's' );
  • Providing a default value in the absence of the query arg seems very reasonable. Of course we can use ?? now so it's less of a big deal, and there's no way a query param can bet set to null - only "null", which is different.
  • I'm having trouble understanding the stop values, or the values to compare against for detecting the presence of a query arg. the get() function can serve the purpose of the has() because it can return null when the arg is missing.
  • might want to consider rejecting values that are invalid UTF-8. it's quite possible that non-UTF-8 data comes in anyway, but many non-UTF-8 encodings also produce valid UTF-8 byte streams. so we can't ensure the right decoding, but we can reject some invalid ones.

It's late and I'm tired so I'm stopping for now, but I'd like to add some more illustrating examples. for one, names can be weird and wild. for two, PHP's native system for these values is wild. it makes so many params unavailable, and makes it really hard on developers to get the right values when they want to.

as a reminder, always send accept-charset=utf8 on your <form> elements! even if a page has <meta charset=utf8>, the browser will still send other encodings in the POST body, including HTML character references, if the browser is set to an encoding override.

Last edited 2 months ago by dmsnell (previous) (diff)
Note: See TracTickets for help on using tickets.