Make WordPress Core

Opened 12 years ago

Last modified 18 months ago

#21602 assigned defect (bug)

redirect_canonical can lead to infinite loop on index navigation if site url is not all lower case

Reported by: sreedoap's profile sreedoap Owned by:
Milestone: Future Release Priority: normal
Severity: blocker Version:
Component: Canonical Keywords: needs-unit-tests has-patch
Focuses: Cc:

Description

The function redirect_canonical in wp-includes/canonical.php (WordPress 3.4.1) on line 406 and 422 makes the following check:

if ( !$redirect_url || $redirect_url == $requested_url )
		return false;

This ensures that it does not attempt to redirect you to the page you requested in the first place. However this function is not case sensitive so if the redirect URL is in a different case than the requested URL then the user can enter an infinite redirect loop. (For example if the Site Address (URL) of the site is set to be in all upper case.)

This function should do a case-insensitive string comparison since domain names are case-insensitive.

The issue only appears to happen with certain plugins installed (ShareThis and PilotPress both led to this issue,) I haven't figured out yet why it's only an issue with certain plugins but it should still be fixed in WordPress to make the proper string comparison.

Attachments (6)

canonical.php.patch (959 bytes) - added by sreedoap 12 years ago.
21602.patch (1.5 KB) - added by SergeyBiryukov 12 years ago.
21602.diff (7.3 KB) - added by dd32 9 years ago.
21602.2.diff (8.8 KB) - added by dd32 9 years ago.
another attempt at supporting mixed-case hostnames in canonical - always rebuild the URL
21602.3.diff (7.8 KB) - added by dd32 9 years ago.
a variant which forces home_url + site_url to have a lowercase hostname - same tests as 21602.2.diff
21602.4.diff (8.2 KB) - added by dd32 9 years ago.
another option, when we override the hostname, override it in both $redirect and $redirect_url

Download all attachments as: .zip

Change History (42)

#1 @sreedoap
12 years ago

Correction: The "Theme My Login" plugin is one that will lead to this issue.

#2 @sreedoap
12 years ago

  • Keywords has-patch added

#3 @SergeyBiryukov
12 years ago

  • Component changed from General to Canonical

#4 @sreedoap
12 years ago

Narrowed this down a bit.. this will not happen with the default permalink setting but will happen with all the other ones (Post name, Numeric, etc.)

#5 @sreedoap
12 years ago

Alright.. I've compiled below a complete list of steps to reproduce. Turns out this only will happen if there is a plugin activated that uses the is_front_page() function.

1) Change the URL of the site address/wordpress address setting to have uppercase characters (for example testsite.example.com to TESTSITE.example.com)

2) Change the permalinks setting to an option besides the default (for example post name)

3) Install the following plugin: https://gist.github.com/9649e00c6acec8805f39

4) Attempt to access the sites homepage, a infinite redirect loop will occur.

Version 1, edited 12 years ago by sreedoap (previous) (next) (diff)

#6 @sreedoap
12 years ago

Sorry, forgot to include one step to reproduce.. in settings > reading change the "front page displays" setting to a static page for the front page.

Last edited 12 years ago by sreedoap (previous) (diff)

#7 follow-up: @nacin
12 years ago

Confirmed.

All that is necessary is this:

add_action( 'wp', function() {
   is_front_page();
});

This is a tricky one.

I haven't tracked down what exactly fails to trigger an infinite redirect, but I do know why is_front_page() is the trigger. In canonical, there is this conditional:

} elseif ( is_page() && !is_feed() && isset($wp_query->queried_object) && 'page' == get_option('show_on_front') && $wp_query->queried_object->ID == get_option('page_on_front') && ! $redirect_url ) {
			$redirect_url = home_url('/');

This is designed to redirect a /front-page/ page that is assigned to the page_on_front to home_url(). But, since it checks if $wp_query->queried_object is set, it doesn't actually work. is_front_page() calls is_page() with the right arguments to trigger a proper get_queried_object(). This then causes the conditional to run.

As to why it spirals into a redirect loop, home_url() returns something different than what was submitted, and all that jazz.

I see the need for multiple unit tests here, both to cover front-page -> / redirect failures, and the domain issue.

#8 in reply to: ↑ 7 @SergeyBiryukov
12 years ago

Replying to nacin:

I haven't tracked down what exactly fails to trigger an infinite redirect

home_url('/') returns http://trunk.WordPress/, wp_redirect() fires, browser converts that to http://trunk.wordpress/, and the loop starts again.

There's a block in line 375 (introduced in [6097] for #4773) to prevent exactly this situation:
http://core.trac.wordpress.org/browser/tags/3.4.1/wp-includes/canonical.php#L375

However, it doesn't seem to work as intended. As far as I can see, the purpose of the block is to discard the differences in host capitalization, so there shouldn't be a redirect in this case.

What happens instead is $compare_original and $compare_redirect end up being the same, $redirect_url is not rewritten, and the redirect proceeds.

21602.patch moves the block lower, after $compare_original and $compare_redirect are filled.

#9 @SergeyBiryukov
12 years ago

  • Keywords needs-unit-tests added

#10 @meloniq
11 years ago

  • Cc meloniq@… added

#11 @chriscct7
9 years ago

  • Keywords needs-refresh added; has-patch removed

This ticket was mentioned in Slack in #core by clorith. View the logs.


9 years ago

#13 @dd32
9 years ago

  • Milestone changed from Awaiting Review to 4.4.2

However, it doesn't seem to work as intended. As far as I can see, the purpose of the block is to discard the differences in host capitalization, so there shouldn't be a redirect in this case.

The existing check actually works as intended, The host capitalisation differences are ignored when a redirect will happen.
Unfortunately in the case where the only difference is the capitalization, it's not caught as $redirect_url doesn't get modified, which is later used for the comparison (Skipping the $redirect_url rebuilding which occurs and effectively strtolower()'s it).

I think the only thing that's needed here is adding the else branch, like so:

	if ( $compare_original !== $compare_redirect ) {
		$redirect_url =....rebuild
	} else {
		$redirect_url = false;
	}

I'm pulling this into 4.4.2, as we've triggered this issue through [36129]

@dd32
9 years ago

#14 follow-up: @dd32
9 years ago

21602.diff is an attempt at fixing this, unfortunately our Canonical unit tests are not designed to test the hostname (It specifically excludes it..) so the unit tests are a little messy.

Hopefully someone else can see a way to fix that.

Another option which we have, is to force the hostname in home and site_url being lower-case, which makes a lot of sense to me. Capitalizing the domain adds no benefit, as hostnames are inherently lower-case and most browsers treat them as such.

#15 in reply to: ↑ 14 @rmens
9 years ago

Replying to dd32:

Another option which we have, is to force the hostname in home and site_url being lower-case, which makes a lot of sense to me. Capitalizing the domain adds no benefit, as hostnames are inherently lower-case and most browsers treat them as such.

+1 for this.

@dd32
9 years ago

another attempt at supporting mixed-case hostnames in canonical - always rebuild the URL

@dd32
9 years ago

a variant which forces home_url + site_url to have a lowercase hostname - same tests as 21602.2.diff

This ticket was mentioned in Slack in #core by dd32. View the logs.


9 years ago

#17 @Clorith
9 years ago

I'm in favor of the forced lowercase URLs, but i think it's important we add a note in the admin if we actually do have to enforce them.

Users add emphasis in their URLs thinking it matters, when they might be hard to read or understand otherwise. If we keep forcing lowercase without any user feedback I suspect there will be some frustration out and about as they kee ptrying to update it a few times in a row and then take to Google and the forums.

Your Site and Home URLs have been changed to lowercase to ensure compatibility with other systems.

I don't think explaining to users in broad terms what it means will help them, but this would be beneficial I suspect in other terms as well, suddenly their site needs to send an email to someone, and the receiving email system is case sensitive etc. (as most other services I've seen also convert the addresses to lowercase in my experience).

@dd32
9 years ago

another option, when we override the hostname, override it in both $redirect and $redirect_url

#18 @dd32
9 years ago

  • Keywords has-patch added; needs-refresh removed

Here's the options and my thoughts on each

  • Revert to 4.3 behaviour, which requires reverts of #35031 & #33920 - the result would be that If you embed a post on a page and then they rename the slug, your embed will break.
  • 21602.3.diff Lowercase hostname - Not 4.4.x material IMHO, possibly something we could consider, probably not worth it really though for a UX perspective, it should just work.
  • 21602.4.diff Simply replaces the hostname in $redirect_url just incase it's not rebuilt a few lines later, seems like a waste of resources when the next two options exist:
  • 21602.2.diff simpler than 21602.4.diff and effectively the same. I don't understand why we don't just rebuild the URL each time if we're modifying the contents of the array above often. This would need testing with a scenario where there's multiple hostnames though
  • 21602.diff Also makes sense, triggers the $redirect_url rebuilding with any URL change, but at that point we might as well simplify it with 21602.2.diff
  • 21602.patch also works, but I think it only works because it doesn't trigger a redirect in some cases where we do want one or vice-versa.

#19 @dd32
9 years ago

  • Owner set to dd32
  • Resolution set to fixed
  • Status changed from new to closed

In 36280:

Canonical / Query: Restore the is_404() check in wp_old_slug_redirect() which was removed in [34659].
This reverts part of [34659] due to excessive canonical problems it's caused in 4.4.x.

Fixes #35344, #21602 for the 4.4 branch.

#20 @dd32
9 years ago

  • Milestone changed from 4.4.2 to 4.5
  • Resolution fixed deleted
  • Status changed from closed to reopened

Shifting to 4.5 instead. This shouldn't be a problem for 4.4 after [36280] / #35344, but the problem still exists with trunk/4.5

This ticket was mentioned in Slack in #core by mike. View the logs.


9 years ago

#22 @chriscct7
9 years ago

@dd32 should this be punted?

#23 @dd32
9 years ago

  • Owner dd32 deleted
  • Status changed from reopened to assigned

This cannot be punted. At a minimum [36280] needs to be applied to trunk, bringing forth the breakages it causes and fixing those it fixes.

I'm removing myself as the owner in the hope that someone else will step in and sort this mess out.

This ticket was mentioned in Slack in #core by chriscct7. View the logs.


9 years ago

#25 @chriscct7
9 years ago

  • Severity changed from normal to blocker

This ticket was mentioned in Slack in #core by mike. View the logs.


9 years ago

#27 @aaroncampbell
9 years ago

Okay, so I spent a good chunk of time going over this one. It seems to me that we either need to flesh out the $compare_redirect and $compare_original variables quite a bit and use them for all comparisons, or we need to make $redirect_url and $requested_url into something that we can compare (in this case with lowercase host names).

I'm with Dion on this. It doesn't seem like it makes sense to flesh out the compare variables, since what we really care about are the actual urls. I ran a bunch of tests, and 21602.2.diff works really well for me. I'm in favor of moving forward with that one.

#28 @dd32
9 years ago

I'm thinking [36280] should probably be applied to trunk, and something (possibly 21602.2.diff) done first thing in 4.6 to allow for soak time.

The problem is that these edgecase environments aren't often using non-stable releases, so it's hard to catch early on.

#29 @kirasong
9 years ago

@dd32 What are the downsides of applying [36280]? If less than leaving things the way they are, I'm in favor of moving forward there.

This ticket was mentioned in Slack in #core by mike. View the logs.


9 years ago

#31 @SergeyBiryukov
9 years ago

In 37075:

Canonical / Query: Restore the is_404() check in wp_old_slug_redirect() which was removed in [34659].
This reverts part of [34659] due to excessive canonical problems it's caused in 4.4.x.

Remove the unit tests which are no longer supported.
This also removes the is_feed() code to avoid confusion - only pages & embeds will be redirected.

Merges [36280] and [36281] to trunk.

Props dd32.
See #21602, #35344.

#32 @SergeyBiryukov
9 years ago

  • Keywords 4.6-early added
  • Milestone changed from 4.5 to Future Release

This ticket was mentioned in Slack in #core by billybigpotatoes. View the logs.


7 years ago

#34 @niceguyit
6 years ago

I ran into this today and narrowed it down to how redirect_canonical() handles mixed-case domains. The first time through, $requested_url is NULL and the URL is built from $_SERVER['HTTP_HOST'] which will always be lowercase.

Line 63: wp-includes/canonical.php

    if ( ! $requested_url && isset( $_SERVER['HTTP_HOST'] ) ) {
        // build the URL in the address bar
        $requested_url  = is_ssl() ? 'https://' : 'http://';
        $requested_url .= $_SERVER['HTTP_HOST'];
        $requested_url .= $_SERVER['REQUEST_URI'];
    }

As it goes through the checks, this section is called and the $redirect_url is pulled from home_url('/'). @nacin mentioned this above. What isn't made clear is that it uses the site address setting which could be mixed case.

Line 177: wp-includes/canonical.php

        } elseif ( is_page() && !is_feed() && 'page' == get_option('show_on_front') && get_queried_object_id() == get_option('page_on_front')  && ! $redirect_url ) {
            $redirect_url = home_url('/');

For me, the solution was to make the home_url('/') lowercase to match $_SERVER['HTTP_HOST'].

        } elseif ( is_page() && !is_feed() && 'page' == get_option('show_on_front') && get_queried_object_id() == get_option('page_on_front')  && ! $redirect_url ) {
            $redirect_url = strtolower(home_url('/'));

Before the actual redirect is called, a recursive call to redirect_canonical() is made except this time $redirect_url is not NULL and could be mixed case. The second time though all the checks seem to be accurate/work.

Line 678: wp-includes/canonical.php

    if ( $do_redirect ) {
        // protect against chained redirects
        if ( !redirect_canonical($redirect_url, false) ) {
            wp_redirect($redirect_url, 301);
            exit();

The line numbers correspond to WordPress 4.9.8.

I'm using the enigma theme, version 4.6, which makes use of is_front_page().

#35 @johnbillion
3 years ago

  • Keywords 4.6-early removed

#36 @SergeyBiryukov
18 months ago

#58222 was marked as a duplicate.

Note: See TracTickets for help on using tickets.