Opened 7 years ago
Last modified 4 years ago
#41712 new defect (bug)
canonical infinite redirect bug
Reported by: | yprince | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | |
Component: | Canonical | Keywords: | has-patch |
Focuses: | Cc: |
Description
I had an infinite redirection problem caused by "redirect_canonical" (wp-includes\canonical.php)
I have a query string parameter ex: "http://mysite.com/mypage/?q=test%2C+test" and it was redirected infinitly to itself because of:
<?php // yes, again -- in case the filter aborted the request if ( ! $redirect_url || strip_fragment_from_url( $redirect_url ) == strip_fragment_from_url( $requested_url ) ) { return; }
my $requested_url was in lowercase "http://mysite.com/mypage/?q=test%2c+test" because of:
<?php $requested_url = preg_replace_callback('|%[a-fA-F0-9][a-fA-F0-9]|', 'lowercase_octets', $requested_url);
but my $redirect_url was "http://mysite.com/mypage/?q=test%2C+test"
so the both url was different because of the "%2c" vs "%2C"
I solved it by adding:
<?php add_filter( 'redirect_canonical', 'my_redirect_canonical', 10, 1 ); function my_redirect_canonical($redirect_url){ $redirect_url = preg_replace_callback('|%[a-fA-F0-9][a-fA-F0-9]|', 'lowercase_octets', $redirect_url); return $redirect_url; }
Change History (3)
This ticket was mentioned in PR #865 on WordPress/wordpress-develop by roborourke.
4 years ago
#3
- Keywords has-patch added
When processing the redirect URL query string and rebuilding it a false positive can be reported because the new query string is URL encoded and will have trailing =
operators removed. This means the new redirect URL does not match the requested URL and an unnecessary redirect will occur.
This can cause caching plugins like Batcache to cache the redirect and create and endless redirect loop as the two query strings will yield the same cache key.
Trac ticket: https://core.trac.wordpress.org/ticket/41712
Hi, I've hit essentially the same problem while using Batcache for page caching.
In situations where
redirect_canonical()
finds a$redirect_url
(such as when you have a static page set for the home page) and you have a query string then the query string will be removed, processed and rebuilt.When the query string contains values that are not URL encoded, or have a trailing
=
such as?foo=&bar=abc[xyz]
a redirect will occur to?foo&bar=abc%5Bxyz%5D
.This is mostly benign as the end result for the application will be the same however the redirect is unnecessary. It becomes problematic in the case of Batcache because it generates cache keys based on the query parameters passed to it. Because it can cache redirects it does not differentiate between the 2 versions of the query string and will keep returning the cached redirect resulting in and endless loop.
For a real world example of query strings being used outside of the applications' control marketing and adtech tools often append a bunch of query string parameters used for tracking and personalisation, for example:
example.org/?utm_campaign=aaa&hsa_content=&gclid=bbbb
.It would be great if
redirect_canonical()
could compare the query string before and after preprocessing to determine if the URL truly is different.