Opened 6 years ago
Last modified 6 years ago
#45459 new defect (bug)
Infinite Loop By redirect_canonical (canonical.php) due to mishandling of Port Information in URL creation/rebuilt
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | major | Version: | 4.9.8 |
Component: | Canonical | Keywords: | |
Focuses: | Cc: |
Description
Background
Fresh install version 4.9.8 for a simple website (no SSL, using standard theme twentyseventeen). Standard installation went well without any hiccup.
However immediately after installation "Too many redirect errors" occurred using any browser.
Infinite loop Symptoms
- Happen on URL that required index.php e.g: http://[domain.com] , http://[domain.com]/ (with or without trailing slash)
- Full permalink URL such as:
- http://[domain.com]/2018/12/01/hello-world/
- http://[domain.com]/category/uncategorised/ will render without any problem
This symptom actually is actually has happened before in this particular server and all my other website on this server requiring the removal of redirect_canonical function.
Applied in functions.php of the themes file as follow:
remove_filter('template_redirect', 'redirect_canonical');
With the removal of redirect_canonical above, the infinite loop is mitigated. But I decided this time to try to pin out the real problem because removing canonicalization is not a very good thing for SEO.
Investigation
- This fresh install infinite loop happen in my server that uses "reverse proxy" setting using Nginx, i.e the requested url is passed around between reverse proxy and upstream server using http://[domain.com]:80 and http://[domain.com]:8080 (or any other arbitrary port)
- Trace the problem to canonical.php and catch the bug as follows
- Original snippet of canonical.php (in wp-includes)
530 if ( $do_redirect ) { 531 // protect against chained redirects 532 if ( !redirect_canonical($redirect_url, false) ) { 533 wp_redirect($redirect_url, 301); 534 exit(); 535 } else { 536 // Debug 537 // die("1: $redirect_url<br />2: " . redirect_canonical( $redirect_url, false ) ); 538 return; 539 } 540 } else { 541 return $redirect_url; 542 }
- Minor modification to catch the bug (reverse the logic in line 532 so it goes to the debug section and add $requested_url to be printed as well):
530 if ( $do_redirect ) { 531 // protect against chained redirects 532 if ( redirect_canonical($redirect_url, false) ) { 533 wp_redirect($redirect_url, 301); 534 exit(); 535 } else { 536 // Debug 537 die("1: $redirect_url<br />2: $requested_url<br />3: " . redirect_canonical( $redirect_url, false ) ); 538 return; 539 } 540 } else { 541 return $redirect_url; 542 }
- From the print out of the debugging, it is confirmed that the differences between the $requested_url and $redirect_url is causing the infinite loop: (i.e: the port ":80" is missing in the $redirect_url, causing it to recursively redirected to the same http://[domain.com]/)
- $redirect_url: http://[domain.com]/
- $requested_url: http://[domain.com]:80/
If the url were http://[domain.com]/2018/12/01/hello-world/ it will not produce the infinite loop as the result of the debugging is:
- $redirect_url: http://[domain.com]/2018/12/01/hello-world/
- $requested_url: http://[domain.com]/2018/12/01/hello-world/
(no port number produced by the function)
- Trace further how the function handle port, and found this "lazy" snippet (original):
386 // Handle ports 387 if ( !empty($user_home['port']) ) 388 $redirect['port'] = $user_home['port']; 389 else 390 unset($redirect['port']); 391
Apparently from the code above, if the $WP_HOME doesn't have port component it just removes the component for redirection. Seems a bit lazy to me.
True enough, I add below line in wp-config.php and fix the problem
define('WP_HOME','http://[domain.com]:80');
But of course, it is never that simple. While the infinite loop for the home page is handled, the other link is affected. In this case, going back to the debuging mode, the result os requesting url http://[domain.com]/2018/12/01/hello-world/ produced:
- $redirect_url: http://[domain.com]:80/2018/12/01/hello-world/
- $requested_url: http://[domain.com]/2018/12/01/hello-world/
(i.e: the port ":80" is now added on the $redirect_url)
- Trace further up as the source of the port, it seems that it come from $original[] variable within the function
70 $original = @parse_url($requested_url);
which sometimes have port information parsed, but sometimes is not.
TEMPORARY SOLUTION (so far okay)
The idea is how the code could handle "port" better:
- if there is no port information in $original variable, why bother adding port info to the redirection
- but if there is one, the redirection needs to also include the port info to prevent mismatch (hence infinite loop) i.e http://[domain.com]:80/ redirect to http://[domain.com]/ which then redirect to http://[domain.com]:80/ and so on.
Here is the code that I come up that solve this issue so far
386 // Handle ports 387 if ( isset($original['port']) && !empty($original['port'])) { 388 if ( !empty($user_home['port']) ) 389 $redirect['port'] = $user_home['port']; 390 else { 391 $redirect['port']='443'; 392 if (isset($original['scheme']) && !strcmp($original['scheme'],'http')) $redirect['port']='80'; 393 } 394 }
i.e: if user is using exoctic/unusual port they need to put it on the $WP_HOME anyway - otherwise assuming 80 and 443 would be quite common.
What causing the port number to come up? I assume it is due to the reverse proxy configuration in 1 server where parameter passing is using the port number that is not common. (e.g between https:443 to http:8080), but nevertheless, the more proper handling of port info as described above should be logical and necessary to prevent the unnecessary infinite loop.
I categorize this bug initially as "major" because it happens without any variety of user data, it is from a fresh install and reverse-proxy caching is a common technique used even in a shared hosting environment.
The expert at WordPress can determine whether the solution is robust enough to be implemented in more permanently.
So, with modification of that, people that use reverse proxy setting can still use canonicalization feature from WordPress without infinite loop problem. Also they can remove:
remove_filter('template_redirect', 'redirect_canonical');
from their functions.php
Thanks, hope this helps.
Related: #33821