WordPress.org

Make WordPress Core

Opened 3 months ago

Last modified 2 months ago

#43539 reviewing defect (bug)

Custom feed types breaks redirect_canonical behavior

Reported by: satantime Owned by: SergeyBiryukov
Milestone: 5.0 Priority: normal
Severity: normal Version: 4.9.4
Component: Canonical Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

Hi.

There's plugin "fb-instant-articles" which adds "instant-articles" feed type for facebook. final url looks like http://localhost/feed/instant-articles. But because "redirect_canonical" doesn't respect custom feed types it triggers 301 redirect to http://localhost/feed/instant-articles/feed/instant-articles/.

"instant-articles" is registered with $wp_rewrite->feeds but this property isn't respected in "redirect_canonical" method: http://take.ms/kq0zJ http://take.ms/cbRPm We need there to support custom types too like that for example: http://take.ms/ljija

Attachments (4)

canonical.php (27.1 KB) - added by satantime 3 months ago.
Possible fix
43539.diff (2.4 KB) - added by soulseekah 3 months ago.
Tests + patch
43539.1.diff (3.4 KB) - added by soulseekah 3 months ago.
Fix global scope infection
43539.2.diff (2.5 KB) - added by soulseekah 3 months ago.

Download all attachments as: .zip

Change History (22)

@satantime
3 months ago

Possible fix

#1 @mukesh27
3 months ago

  • Keywords good-first-bug added

#2 @soulseekah
3 months ago

  • Keywords good-first-bug removed

Thanks for the report. This behaviour is in line with all the feed types.

http://localhost/feed/atom will 301 to http://localhost/feed/atom/ this is expected behavior, is it not?

Also sounds awfully specific to https://github.com/Automattic/facebook-instant-articles-wp/issues

Last edited 3 months ago by soulseekah (previous) (diff)

#3 @satantime
3 months ago

Hi,

thank you for your comment.

Redirect http://localhost/feed/atom to 301 http://localhost/feed/atom/ is default WP behavior out of the box without any plugins activated.

The source of this ticket is to show problem of how WP handles custom feed types, and facebook-instant-articles-wp plugin is just an example to demonstrate the issue.

#4 @soulseekah
3 months ago

What is the problem with this redirect? Is it wrong? Why should it not redirect? Should it redirect from /atom/ to /atom?

#5 @satantime
3 months ago

Please pay attention to what I wrote in the description of this ticket.

For custom feeds url is doubled because variable wasn't reduced properly, so if atom would be custom feed then /feed/atom/ or /feed/atom redirect to /feed/atom/feed/atom/.

#6 @soulseekah
3 months ago

Cannot reproduce in 4.9.4 clean install with that plugin enabled. It doesn't even redirect to the slashed version.

Both localhost/feed/instant-articles and localhost/feed/instant-articles/ just work without any redirection in either.

So it looks like it SHOULD redirect to the slashed, canonical version, but it's not. Either way, I'm not seeing double URLs in the form of /feed/atom/feed/atom/.

Any thoughts?

#7 @satantime
3 months ago

Might you pay more attention to details which I gave in the description? As I told this plugin is just an example when redirect_canonical doesn't work properly. You don't get redirect because of infinity redirect which always adds one more /feed/test to your: http://take.ms/YLhV6

to simulate custom feed please add to functions.php of your current theme something like that:

add_action('init', 'test_feed');
function test_feed()
{
    add_feed('test', 'void');
}

add_filter('request', 'test_request');
function test_request($value)
{
    $value = array(
        'feed' => 'test',
    );
    return $value;
}

now you have custom feed type "test". then add debug output there: http://take.ms/Ajkjb

header('Content-Type: text/plain');
var_dump($redirect_url);
exit;

now try to access localhost/feed/test or localhost/feed/test/ and you'll see https://localhost/feed/test/feed/test/

why it's https://localhost/feed/test/feed/test/ you can read in description with proposed fix.

#8 @soulseekah
3 months ago

Oh, I'm so sorry for annoying you!

There is no redirect happening, though.

now try to access localhost/feed/test or localhost/feed/test/
and you'll see https://localhost/feed/test/feed/test/

This does not happen on a fresh install of WordPress:

~: curl -v http://stream.localhost/feed/test
*   Trying 127.0.0.1...
* Connected to stream.localhost (127.0.0.1) port 80 (#0)
> GET /feed/test HTTP/1.1
> Host: stream.localhost
> User-Agent: curl/7.47.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: nginx/1.10.3 (Ubuntu)
< Date: Sat, 17 Mar 2018 18:54:34 GMT
< Content-Type: application/octet-stream; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-Powered-By: PHP/7.2.0
< Last-Modified: Sat, 17 Mar 2018 18:00:35 GMT
< ETag: "b93afcef48181c119af6e2dc30f9e48a"
< Link: <http://stream.localhost/wp-json/>; rel="https://api.w.org/"
< 
* Connection #0 to host stream.localhost left intact
~: curl -v http://stream.localhost/feed/test/
*   Trying 127.0.0.1...
* Connected to stream.localhost (127.0.0.1) port 80 (#0)
> GET /feed/test/ HTTP/1.1
> Host: stream.localhost
> User-Agent: curl/7.47.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Server: nginx/1.10.3 (Ubuntu)
< Date: Sat, 17 Mar 2018 18:55:28 GMT
< Content-Type: application/octet-stream; charset=UTF-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< X-Powered-By: PHP/7.2.0
< Last-Modified: Sat, 17 Mar 2018 18:00:35 GMT
< ETag: "b93afcef48181c119af6e2dc30f9e48a"
< Link: <http://stream.localhost/wp-json/>; rel="https://api.w.org/"
< 
* Connection #0 to host stream.localhost left intact

There is no redirect. My feed outputs void, just like I told it to. No redirects. Period.

What am I missing? What's the actual use-case here? What's the setup?

#9 @satantime
3 months ago

No problem, try to do next steps:

1 fresh wp installation. 2 add to functions.php of current theme next code:

add_action('init', 'test_feed');
function test_feed()
{
    add_feed('test', 'void');
}

add_filter('request', 'test_request');
function test_request($value)
{
    $value = array(
        'feed' => 'test',
    );
    return $value;
}
  1. in file /wp-includes/canonical.php after line 531 and before 532 add
    header('Content-Type: text/plain');
    var_dump($redirect_url);
    exit;
    
  2. execute curl -v http://stream.localhost/feed/test/

and you'll see status 200 but output will be http://stream.localhost/feed/test/feed/test/ - this url is potential url to trigger redirect.

But there's no redirect because check on original line 532 if ( !redirect_canonical($redirect_url, false) ) { returns http://stream.localhost/feed/test/feed/test/feed/test/ instead of empty string and that's caused because core WP doesn't handle properly custom feed types in redirect_canonical function and always adds /feed/test/ on top of current url. More info you can find on screenshots from the description of this ticket.

Last edited 3 months ago by satantime (previous) (diff)

#10 @soulseekah
3 months ago

Does an actual Location: ... redirect header happen? Or are you basing your assumptions based on a theoretical canonical semi-state that you see inside an internal function? I'd like to see this theoretical bug being applied, if possible.

I doubt that you were intentionally just dumping variables inside the canonical functions one day and went "Oh, this is weird, why is this $redirect_url wrong?" No. Something happened and you had to go digging into that function. So what exactly happened? How does this state manifest itself to a point that one would need to go look at irrelevant canonical states? How do you make that redirect actually happen?

And thank you for your patience. I'm trying to get the full picture here, the implications. So far, that redirect doesn't seem to be manifesting itself. So in what practical (or even theoretical) scenario would it manifest itself?

Thanks!

#11 @soulseekah
3 months ago

  • Keywords needs-unit-tests needs-patch added

I do see that the $redirect_url value in there is set to http://stream.localhost/feed/test/feed/test/ and canonical intends to do a redirect. But redirect_canonical() returns false, and end of story.

So question is:

  1. Why does redirect_canonical() think it needs to do a redirect?
  2. And then why does the nested call to redirect_canonical() think that it no longer needs to do a redirect?

I do agree that redirect_canonical() should not even reach down there, $do_redirect should not be set, and the function has to bail very early.

Moreover shouldn't the canonical URL be set to feed/test/ with a trailing slash. Might be what it's trying to do, but with some luck it seems that it prevents a redirect loop.

This is a good issue. Thanks for the explanation, although the story of how you came about to learn about it, seeing how it's all hidden and everything is still going to bug me for a long time :)

Seems to be a valid bug, without an apparent manifestation (some sort of redirect loop protection is kicking in). I'll look into this more tomorrow and you and I will figure out a good fix, and more importantly tests for this. But do explain how you found out about this bug, please! :)

Cheers.

#12 follow-up: @satantime
3 months ago

If you would check code above then you can see there's no location: because // protect against chained redirects on line 532 is triggered. It is triggered because wp doesn't respect custom feed types properly and instead of to resolve /feed/type/ as /feed/type/ and does nothing, or just redirects /feed/type to /feed/type/, and doesn't trigger extra redirects as /feed/type/feed/type/ to /feed/type/feed/type/feed/type/ and to so on. So // protect against chained redirects is hotfix for broken implementation about custom feed types.

Your doubt are wrong. if you can see this bug isn't theoretical, it's real and requires real fix. I would suggest you to show that task to someone from engineering team and they'll understand the problem instantly. I know how it's hard to work with code when there's not enough experience with it.

About patch and unit test - might you point me to source code repo, I can implement everything without an issue, so right guy just need to click merge button.

#13 @satantime
3 months ago

Also for your requirements we can change description.

custom feed types don't follow canonical redirect standard. when http://localhost/feed redirects to http://localhost/feed/ but http://localhost/feed/test doesn't redirect to http://localhost/feed/test/

source of problem and solution are the same as in was mentioned in the description.

#14 in reply to: ↑ 12 @soulseekah
3 months ago

Replying to satantime:

Your doubt are wrong. if you can see this bug isn't theoretical, it's real and requires real fix. I would suggest you to show that task to someone from engineering team and they'll understand the problem instantly. I know how it's hard to work with code when there's not enough experience with it.

We are all the engineering team on this blessed day. Right, guys? :)

#15 @satantime
3 months ago

Sure, while we're going to fix this issue - I'm fine.

Let's change the issue of this ticket to

custom feed types don't follow canonical redirect standard. when http://localhost/feed redirects to http://localhost/feed/, when http://localhost/feed/atom redirects to http://localhost/feed/atom/, but http://localhost/feed/test doesn't redirect to http://localhost/feed/test/

should I just update title and description of this ticket?

About your questions:

redirect_canonical() should issue redirect in case when current url isn't canonical url.

So for http://localhost/feed/test canonical url is http://localhost/feed/test/ and when we access http://localhost/feed/test we should get redirected to http://localhost/feed/test/ as we have with default feed types like atom, rss etc.

Moreover shouldn't the canonical URL be set to feed/test/ with a trailing slash. Might be what it's trying to do, but with some luck it seems that it prevents a redirect loop. right - it should redirect to feed/test/ but because every call just adds one more feed/test/ and by the end // protect against chained redirects doesn't allow to trigger redirect.

About how I found this - I'll tell you story after released fix :)

@soulseekah
3 months ago

Tests + patch

#16 @soulseekah
3 months ago

  • Keywords has-patch has-unit-tests added; needs-unit-tests needs-patch removed

Tests and patch. The use of preg_quote here is contingent on #43571 otherwise right now added feeds are not quoted.

@soulseekah
3 months ago

Fix global scope infection

@soulseekah
3 months ago

#17 @soulseekah
2 months ago

  • Milestone changed from Awaiting Review to Future Release

#18 @SergeyBiryukov
2 months ago

  • Milestone changed from Future Release to 5.0
  • Owner set to SergeyBiryukov
  • Status changed from new to reviewing
Note: See TracTickets for help on using tickets.