WordPress.org

Make WordPress Core

Opened 4 years ago

Last modified 7 months ago

#17047 reviewing defect (bug)

Not following spec for REQUEST_URI

Reported by: sterlo Owned by: sterlo
Milestone: Future Release Priority: normal
Severity: normal Version: 3.1
Component: Rewrite Rules Keywords: has-patch
Focuses: Cc:

Description

Possibly related: #16932

Spec: http://www.w3.org/Protocols/rfc2616/rfc2616-sec5.html#sec5.1.2

Structure: http://en.wikipedia.org/wiki/URI_scheme#Examples

The spec for REQUEST_URI:

Request-URI    = "*" | absoluteURI | abs_path | authority

The specs for REQUEST_URI in Apache are such that it allows for absolute paths to a given resource.

Given that throughout WordPress there are concatenations like:

$_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI']

These are intended to generate "mysite.com/resources"

But in certain cases will generated "mysite.com/mysite.com/resources"

Case Study:

GET http://subdomain.mydomain.com/ HTTP/1.1

This should be allowed.

Apache in this case sets the URI to "http://subdomain.mydomain.com/myfile.php"

Solution: Do not assume that URI is not an absolute path.

A quick fix is something like the patch attached.

Attachments (3)

uri_no_host.2.patch (744 bytes) - added by sterlo 4 years ago.
Patch against latest wp-includes/load.php (replaced previous patch)
uri_no_host.patch (745 bytes) - added by sterlo 4 years ago.
New patch: the regex should be case insensitive.
17047.patch (799 bytes) - added by SergeyBiryukov 4 years ago.

Download all attachments as: .zip

Change History (18)

comment:1 @sterlo4 years ago

  • Owner set to sterlo
  • Status changed from new to reviewing

Here's a test prior to the patch:

[grok@jutsu:~] $ curl -sv -x 209.20.77.244:80 "http://subdomain.mysite.com/"
* About to connect() to proxy 209.20.77.244 port 80 (#0)
*   Trying 209.20.77.244... connected
* Connected to 209.20.77.244 (209.20.77.244) port 80 (#0)
> GET http://subdomain.mysite.com/ HTTP/1.1
> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
> Host: slice37.dealertrend.com
> Accept: */*
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 301 Moved Permanently
< Date: Tue, 05 Apr 2011 02:59:30 GMT
< Server: Apache/2.2.14 (Ubuntu)
< X-Powered-By: PHP/5.3.2-1ubuntu4.7
< X-Pingback: http://subdomain.mysite.com/xmlrpc.php
< Location: http://subdomain.mysite.comhttp/subdomain.mysite.com/
< Vary: Accept-Encoding
< Content-Length: 0
< Content-Type: text/html; charset=UTF-8
< 
* Connection #0 to host 209.20.77.244 left intact
* Closing connection #0

After the patch:

[grok@jutsu:~] $ curl -sv -x 209.20.77.244:80 "http://subdomain.mysite.com/" -o /dev/null
* About to connect() to proxy 209.20.77.244 port 80 (#0)
*   Trying 209.20.77.244... connected
* Connected to 209.20.77.244 (209.20.77.244) port 80 (#0)
> GET http://subdomain.mysite.com/ HTTP/1.1
> User-Agent: curl/7.19.7 (universal-apple-darwin10.0) libcurl/7.19.7 OpenSSL/0.9.8l zlib/1.2.3
> Host: subdomain.mysite.com
> Accept: */*
> Proxy-Connection: Keep-Alive
> 
< HTTP/1.1 200 OK
< Date: Tue, 05 Apr 2011 03:01:49 GMT
< Server: Apache/2.2.14 (Ubuntu)
< X-Powered-By: PHP/5.3.2-1ubuntu4.7
< X-Pingback: http://subdomain.mysite.com/xmlrpc.php
< Vary: Accept-Encoding
< Content-Length: 6407
< Content-Type: text/html; charset=UTF-8
< 
{ [data not shown]
* Connection #0 to host 209.20.77.244 left intact
* Closing connection #0

Prior to the patch - it caused issues with proxies.

After the patch - there is no problem. The assumptions previously in place are just enforced.

@sterlo4 years ago

Patch against latest wp-includes/load.php (replaced previous patch)

comment:3 follow-up: @nacin4 years ago

I figured we already had this fix, but I've also never seen a server do this. Is this really seen outside of the spec?

comment:4 in reply to: ↑ 3 @sterlo4 years ago

Replying to nacin:

I figured we already had this fix, but I've also never seen a server do this. Is this really seen outside of the spec?

I popped into the Apache IRC channel, and said "What causes the REQUEST_URI to have a full path", I was then bombarded with responses like: "Someone is attempting to do a proxy."

So I would say that it's a common thing.

I work with another developer who loves to RTFM and he's the one who pointed out it was actually against the spec, and not just a local problem.

We tested this using the latest from the subversion trunk - the problem still existed.

It's very easy to test against, just get the IP of your server and do:

curl -sv -x YOURIP:80 "http://YOURWEBSITE.TLD/" -o /dev/null

You'll see the problem manifest.

comment:5 follow-up: @nacin4 years ago

  • Milestone changed from Awaiting Review to 3.2

Sounds like this might fix a number of problems that we've attributed to issues with server configuration.

It'd be nice to do this without a regex. Perhaps an str_replace using HTTP_HOST.

comment:6 in reply to: ↑ 5 @sterlo4 years ago

Replying to nacin:

Sounds like this might fix a number of problems that we've attributed to issues with server configuration.

It'd be nice to do this without a regex. Perhaps an str_replace using HTTP_HOST.

I wanted to avoid the assumption that the domain would not be used anywhere else in the URI.

For instance: "http://www.mysite.com/wp-content/uploads/www.mysite.com-screenshot.jpg" - this could cause problems.

With the regular expression, it says "See if you can find the HTTP_HOST value with http OR https and remove those - but only if that are at the beginning of the URI".

comment:7 follow-up: @dd324 years ago

It'd be nice to do this without a regex. Perhaps an str_replace using HTTP_HOST.

Probably best just wraping it in an if, if strpos(HTTP_HOST) !== false, then preg_replace.

Could also just use 2 static string replacements https://HOST and http://HOST

comment:8 in reply to: ↑ 7 ; follow-up: @duck_4 years ago

Replying to dd32:

It'd be nice to do this without a regex. Perhaps an str_replace using HTTP_HOST.

Probably best just wraping it in an if, if strpos(HTTP_HOST) !== false, then preg_replace.

if ( $_SERVER['REQUEST_URI'][0] != '/' )

I assume we're expecting it to begin with '/'.

comment:9 @sterlo4 years ago

Props to another dev who sent me the spec docs: @duonoid -> https://github.com/duonoid

@sterlo4 years ago

New patch: the regex should be case insensitive.

comment:10 @ryan4 years ago

  • Milestone changed from 3.2 to Future Release

Punted per bug scrub.

@SergeyBiryukov4 years ago

comment:11 in reply to: ↑ 8 @SergeyBiryukov4 years ago

Replying to duck_:

Probably best just wraping it in an if, if strpos(HTTP_HOST) !== false, then preg_replace.

if ( $_SERVER['REQUEST_URI'][0] != '/' )

Added a patch with that condition.

comment:12 @tobias3 years ago

Even though this is a side problem to this ticket: In my opinion wordpress should not use http_host inside the core methods at all. It causes problems when using wordpress behind a apache rewrite rule. For more about this see http://core.trac.wordpress.org/ticket/17168#comment:15

comment:13 @tobias3 years ago

  • Cc t@… added

comment:14 @jeremyfelt7 months ago

#18685 was marked as a duplicate.

comment:15 @jeremyfelt7 months ago

We do a similar check in wp-login.php and auth_redirect() when determining if SSL should be used:

if ( 0 === strpos( $_SERVER['REQUEST_URI'], 'http' ) ) {
	wp_redirect( set_url_scheme( $_SERVER['REQUEST_URI'], 'https' ) );
	exit();
} else {
	wp_redirect( 'https://' . $_SERVER['HTTP_HOST'] . $_SERVER['REQUEST_URI'] );
	exit();
}

I think the approach in 17047.patch makes sense, though I'm not sure that we should use HTTP_HOST as the replacement in this situation. If the server (likely Apache) is passing on REQUEST-URI untouched, it's possible that HTTP_HOST may also be unreliable.

I haven't actually tested any of this. :) I think an Apache VM and proxy are needed to really go all in. It seems that Nginx corrects these values before hitting PHP.

Related #29709, which may technically be a duplicate due to the root cause.

Note: See TracTickets for help on using tickets.