Make WordPress Core

Opened 5 years ago

Last modified 4 years ago

#49369 new defect (bug)

redirect_canonical() should strip trailing protocols

Reported by: onlyonemj Owned by:
Milestone: Awaiting Review Priority: normal
Severity: major Version: 5.3.2
Component: Canonical Keywords: needs-patch needs-unit-tests
Focuses: Cc:

Description

I was crawling some of my sites as I was setting up SEO strategies and found really odd 301 redirects on all tested sites.

  1. https://domain.com/http:// redirects to https://domain.com/http:/ (one forward slash fewer). So a nonexistent page is redirecting to another nonexistent page.
  2. To figure out what could be causing this, I checked whether it was happening on just this site or on others as well. All sites showed the same issue, regardless of host or web server stack.
  3. Wondering whether Cloudflare, a common plugin, was causing it, I unproxied all the sites, put them in dev mode, and disabled all plugins. The problem still persisted.

Spiders/crawlers pick this up on their own - there is no need to manually append http:// to the end of root URLs.

This issue persists - likely a bug in WordPress redirect API. Anybody care to replicate and confirm? Is this a bug?

Change History (3)

#1 in reply to: ↑ description @SergeyBiryukov
5 years ago

  • Component changed from HTTP API to Canonical
  • Focuses rest-api removed
  • Keywords needs-patch needs-unit-tests added; 2nd-opinion dev-feedback needs-dev-note removed
  • Summary changed from Phantom redirects to redirect_canonical() should strip trailing protocols

Hi there, welcome to WordPress Trac! Thanks for the report.

Replying to onlyonemj:

This issue persists - likely a bug in WordPress redirect API. Anybody care to replicate and confirm? Is this a bug?

Yes, this appears to be an issue with the redirect_canonical() function, which strips multiple slashes from the URL but doesn't handle a case like this. It seems stripping http:// completely would be the expected behavior here.
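
To illustrate the behavior described above, here is a minimal sketch in Python (not WordPress's PHP). The helper name collapse_slashes is hypothetical, and the code only mimics a slash-collapsing step on the URL path; it is an assumption about the general technique, not the actual redirect_canonical() source.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def collapse_slashes(url):
    """Collapse runs of '/' in the path only (hypothetical helper)."""
    parts = urlsplit(url)
    # Assumption: every run of slashes in the path is reduced to a single slash.
    path = re.sub(r"/+", "/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# The trailing "http://" loses one slash, matching the redirect in the report.
print(collapse_slashes("https://domain.com/http://"))  # https://domain.com/http:/
```

Under that assumption, the trailing http:// is treated as an ordinary path segment followed by a doubled slash, which would explain the redirect to https://domain.com/http:/.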

#2 @onlyonemj
5 years ago

Thanks for replying, @SergeyBiryukov. I'm not a hardcore coder, but the multiple slashes link you gave looks like it strips extraneous forward slashes at the end of URLs, which is expected behaviour for folder paths. It doesn't address whether WP is responsible for putting the http: in front of the forward slashes. Is WordPress creating those, or are the crawlers wrong? One crawler detects it, but Screaming Frog doesn't. The http:// page appended right after the root does not exist anywhere and was never created.

#3 @donmhico
4 years ago

@onlyonemj - Thanks for the ticket. I'm having trouble finding where in WP Core http:// is being appended to the URL.

While it's possible to add logic to redirect_canonical() that strips out http:// at the end of the URL, I feel like that wouldn't tackle the source of the problem.
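
For reference, the kind of logic mentioned above might look roughly like the following sketch. It is written in Python rather than PHP, strip_trailing_protocol is a hypothetical name, and the rule (remove a bare http:// or https:// only when it is the last path segment) is an assumption, not an agreed fix.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Assumption: only a bare "http:" or "https:" segment followed by one or more
# slashes at the very end of the path counts as a trailing protocol.
TRAILING_PROTOCOL = re.compile(r"/https?:/+$", re.IGNORECASE)


def strip_trailing_protocol(url):
    """Drop a trailing protocol segment from the path (hypothetical helper)."""
    parts = urlsplit(url)
    path = TRAILING_PROTOCOL.sub("/", parts.path)
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


print(strip_trailing_protocol("https://domain.com/http://"))  # https://domain.com/
print(strip_trailing_protocol("https://domain.com/about/"))   # https://domain.com/about/
```

A sketch like this only cleans up the symptom; it does not explain where the malformed URL comes from.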

Did you try disabling the Cloudflare plugin, since you mentioned it's a common plugin? Also, try checking your sitemaps to see whether https://domain.com/http:// is there.
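
One way to run the sitemap check is sketched below in Python. It assumes the sitemap follows the standard sitemaps.org XML format and has already been downloaded into a string; suspect_locs and the sample data are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Namespace used by the standard sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
# Assumption: a suspect URL ends with a bare "http:" or "https:" path segment.
SUSPECT = re.compile(r"/https?:/*$", re.IGNORECASE)


def suspect_locs(sitemap_xml):
    """Return <loc> values that end with a trailing protocol (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    locs = [(el.text or "").strip() for el in root.findall(".//sm:loc", SITEMAP_NS)]
    return [url for url in locs if SUSPECT.search(url)]


# Made-up sample data for illustration only.
sample = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://domain.com/</loc></url>
  <url><loc>https://domain.com/about/</loc></url>
  <url><loc>https://domain.com/http://</loc></url>
</urlset>"""

print(suspect_locs(sample))  # ['https://domain.com/http://']
```

An empty result for the real sitemaps would suggest the crawler is finding the URL somewhere else, such as in page markup.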
