Canonical redirect kicks in when the category/tag base contains characters other than a-z, 0-9, _ and -
|Reported by:||nbachiyski||Owned by:|
|Component:||Canonical||Keywords:||needs-patch close needs-testing|
Whatever the category base is, if we go to a properly formed category pretty permalink, canonical redirects shouldn't kick in.
If the category base is non-ASCII (for example: баба), the canonical redirect tries to redirect to the same URL. The redirect sanitizer removes the category base from the URL because it is non-ASCII, so the redirect goes to <root>//category-name/. This prevents endless redirects, but usually results in a 404.
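The stripping step described above can be illustrated with a small Python sketch. The allow-list and the function name `sanitize_redirect` are assumptions made for illustration and are not the actual WordPress sanitizer; the point is only that removing non-ASCII characters leaves an empty path segment.

```python
import re

# Hypothetical, simplified ASCII allow-list; the real sanitizer's
# character set may differ.
_DISALLOWED = re.compile(r"[^A-Za-z0-9\-~+_.?#=&;,/:%!*\[\]()@]")


def sanitize_redirect(location: str) -> str:
    """Drop every character that is not in the ASCII allow-list."""
    return _DISALLOWED.sub("", location)


# The non-ASCII base disappears, leaving a double slash in the path.
target = "http://example.com/баба/category-name/"
print(sanitize_redirect(target))
```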
Why does it happen?
The category base is always used verbatim. It can't be URL-encoded, because the percent signs would be interpreted as permalink variables. Because of that, the generated URLs will always be in the form: <root>/баба/<url-encoded-category-name>/.
The contents of $_SERVER['REQUEST_URI'] are always URL-encoded, so the requested URI is: <root>/%D0%B1%D0%B0%D0%B1%D0%B0/<url-encoded-category-name>/.
The canonical redirect functionality assumes the requested URL will be the same as the generated term URL and, since they differ, tries to redirect.
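A minimal Python sketch of the mismatch, assuming the generated link keeps the base verbatim while the request arrives percent-encoded. The variable names are hypothetical, and a plain string comparison stands in for the canonical check.

```python
from urllib.parse import quote, unquote

category_base = "баба"
slug = "category-name"

# Generated link: base used verbatim, only the slug is URL-encoded.
generated = f"/{category_base}/{quote(slug)}/"

# Requested URI: everything is URL-encoded, as in $_SERVER['REQUEST_URI'].
requested = f"/{quote(category_base)}/{quote(slug)}/"

# A naive comparison sees two different URLs and would trigger a redirect.
urls_differ = generated != requested

# After decoding, both strings name the same resource.
same_after_decoding = unquote(requested) == generated

print(requested)
print(urls_differ, same_after_decoding)
```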
The easiest fix is to assume that if we have arrived at the right category page without any GET variables, we don't need the logic for redirecting to the canonical category page. This is a valid assumption, because that logic relies only on removing GET arguments.
The only disadvantage of that solution is that it doesn't solve the more general problem of discrepancies between generated and requested URLs. But for now it will do a good job.
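The proposed shortcut could look roughly like the Python sketch below. The helper name and its parameters are hypothetical and do not come from the WordPress code base; the sketch only captures the idea that a category request without a query string has nothing to canonicalize.

```python
from urllib.parse import urlsplit


def needs_canonical_category_redirect(request_uri: str, is_category_page: bool) -> bool:
    """Hypothetical helper: consider a redirect only if GET arguments are present."""
    has_query = bool(urlsplit(request_uri).query)
    return is_category_page and has_query


# Pretty permalink with an encoded non-ASCII base: no query string, no redirect.
print(needs_canonical_category_redirect(
    "/%D0%B1%D0%B0%D0%B1%D0%B0/category-name/", True))

# Request with GET arguments: still a candidate for the canonical redirect.
print(needs_canonical_category_redirect("/?cat=3", True))
```

Because the check never compares the requested path with the generated link, the encoding difference between the two no longer matters for this case.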
Change History (27)
- Summary changed from "Canonical redirect kicks in in case of non-ASCII category/tag base" to "Canonical redirect kicks in when the category/tag base contains characters other than a-z, 0-9, _ and -"
- Component changed from General to Canonical
- Keywords needs-patch added; has-patch early 3.2-early removed