Opened 14 years ago
Last modified 5 years ago
#14201 new defect (bug)
Canonical redirect kicks in in case of category/tag base containing other chars than a-z, 0-9, _ and -
| Reported by: | nbachiyski | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | normal |
| Severity: | normal | Version: | |
| Component: | Canonical | Keywords: | needs-patch close needs-testing |
| Focuses: | | Cc: | |
Description
Expected behaviour
Whatever the category base is, if we go to a properly formed category pretty permalink, canonical redirects shouldn't kick in.
Actual behaviour
If the category base is non-ASCII (for example: баба), the canonical redirect tries to redirect to the same URL. The redirect sanitizer removes the category base from the URL, because it is non-ASCII, and redirects to <root>//category-name/. This prevents endless redirects and usually results in a 404.
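To illustrate the stripping step, here is a minimal Python sketch (not WordPress code). The character whitelist and the function name `sanitize_redirect` are assumptions made for the example; the real sanitizer may allow a different set of characters.

```python
import re

# Assumed whitelist: ASCII letters, digits and common URL punctuation.
# Anything outside this set is dropped from the redirect target.
_DISALLOWED = re.compile(r"[^a-z0-9\-~+_.?#=&;,/:%!]", re.IGNORECASE | re.ASCII)


def sanitize_redirect(location):
    """Hypothetical sanitizer: remove every character not in the whitelist."""
    return _DISALLOWED.sub("", location)


# The raw (unencoded) category base disappears, leaving a double slash.
print(sanitize_redirect("http://host/баба/category-name/"))
# http://host//category-name/
```

The sanitized target no longer equals the requested URL, which is why the redirect does not loop and ends in a 404 instead.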
Why does it happen?
Category base is always used verbatim. It can't be URL-encoded, because the percent signs would be interpreted as permalink variables. Because of that, the generated URLs are always of the form: <root>/баба/<url-encoded-category-name>/.
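A small Python sketch of why a percent-encoded base collides with permalink variables. It assumes structure tags have the form `%name%` and are located with a lazy regular expression; `TAG_PATTERN` and `find_structure_tags` are names invented for this illustration.

```python
import re
from urllib.parse import quote

# Assumption: permalink variables look like %name% and are found lazily.
TAG_PATTERN = re.compile(r"%.+?%")


def find_structure_tags(structure):
    """Return every substring that would be treated as a permalink variable."""
    return TAG_PATTERN.findall(structure)


encoded_base = quote("баба")
print(encoded_base)                                  # %D0%B1%D0%B0%D0%B1%D0%B0
print(find_structure_tags("/%year%/%postname%/"))    # ['%year%', '%postname%']
print(find_structure_tags(encoded_base))             # ['%D0%', '%D0%', '%D0%', '%D0%']
```

Under this assumption, parts of the encoded base would be read as (unknown) variables rather than as literal text.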
The contents of $_SERVER['REQUEST_URI'] are always URL-encoded, so the requested URI is: <root>/%D0%B1%D0%B0%D0%B1%D0%B0/<url-encoded-category-name>/.
The canonical redirect functionality assumes the requested URL is the same as the generated term URL and, since they differ, tries to redirect.
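The mismatch can be reproduced outside WordPress with a short Python sketch. The helper names and the `http://host` root are placeholders for illustration; the point is only that the two strings differ in encoding, not in meaning.

```python
from urllib.parse import quote, unquote


def generated_term_url(root, base, slug):
    # The category base is used verbatim; only the slug is URL-encoded.
    return f"{root}/{base}/{quote(slug)}/"


def requested_url(root, base, slug):
    # What arrives in REQUEST_URI: every path segment is percent-encoded.
    return f"{root}/{quote(base)}/{quote(slug)}/"


root = "http://host"  # placeholder
generated = generated_term_url(root, "баба", "uncategorized")
requested = requested_url(root, "баба", "uncategorized")

print(generated == requested)                    # False
print(unquote(generated) == unquote(requested))  # True
```

A plain string comparison therefore reports a difference even though both URLs address the same category page.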
Solutions
The easiest one is to assume that if we have arrived at the right category page without any GET variables, we don't need the logic for redirecting to the canonical category page. This is a valid assumption, because that logic relies only on removing GET arguments.
The only disadvantage of that solution is that it doesn't solve the more general problem of discrepancies between generated and requested URLs. But for now it will do a good job.
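The proposed check could look roughly like the following Python sketch. It is not the attached patch: `needs_category_redirect` and its `resolved_to_category` flag are hypothetical names standing in for whatever the canonical code uses to know that the request already resolved to a category.

```python
from urllib.parse import urlsplit


def needs_category_redirect(requested_url, resolved_to_category):
    """Hypothetical guard: only consider a canonical category redirect
    when the request carries GET variables that could be stripped."""
    if not resolved_to_category:
        return False
    return urlsplit(requested_url).query != ""


# Pretty permalink without GET variables: leave it alone.
print(needs_category_redirect(
    "http://host/%D0%B1%D0%B0%D0%B1%D0%B0/uncategorized/", True))  # False

# Query-string form: still a candidate for the canonical redirect.
print(needs_category_redirect("http://host/?cat=1", True))         # True
```

The guard sidesteps the encoding comparison entirely, which is why it does not address the general discrepancy between generated and requested URLs.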
Attachments (3)
Change History (27)
#3
@
14 years ago
The taxonomy redirect logic was added after 2.9 was released. Nobody had tested it with a non-ASCII base.
#4
@
14 years ago
I fixed my wp-includes/canonical.php using the attachment. I will wait and see whether tag permalinks with Chinese or Japanese (non-ASCII) characters still show redirect errors in Google Webmaster Tools.
#5
@
14 years ago
It seems that spiders and visitors still get a 301:
"/tag/%E6%B8%B8%E6%88%8F%E9%9F%B3%E6%95%88 Http Code: 301"
If nobody tested the logic, or something else was added in 3.0, I wonder whether anybody has tested the patch with a non-ASCII base?
#7
@
14 years ago
I was able to reproduce the issue:
- Set the category base to "баба" in the "Category base" textbox in Admin -> Permalinks Settings.
- Visit a blog post with the "Uncategorized" category.
- Click on that category link (http://host/wordpress-trunk/%D0%B1%D0%B0%D0%B1%D0%B0/uncategorized/).
- The HTTP client navigates to http://host/wordpress-trunk/uncategorized/, displaying a 404 / Not Found page (HTTP status code is 404).
#8
@
14 years ago
Request URI is 'REQUEST_URI' => string '/wordpress-trunk/%D0%B1%D0%B0%D0%B1%D0%B0/uncategorized/'
and there is a match:
object(WP) ...
  public 'request' => string 'баба/uncategorized' (length=22)
  public 'matched_rule' => string 'баба/(.+?)/?$' (length=17)
  public 'matched_query' => string 'category_name=uncategorized' (length=27)
  public 'did_permalink' => boolean true
Maybe this is not an input / option store issue.
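The successful match in the dump can be mimicked with a short Python sketch. It assumes the request is URL-decoded and the install path is stripped before the rewrite rule is applied; `match_request` is a name made up for this example and is not a WordPress function.

```python
import re
from urllib.parse import unquote


def match_request(request_uri, home_path, rule):
    """Decode the URI, strip the install path, then apply the rewrite rule."""
    decoded = unquote(request_uri)
    if decoded.startswith(home_path):
        decoded = decoded[len(home_path):]
    request = decoded.strip("/")
    match = re.match(rule, request)
    return request, (match.group(1) if match else None)


request, category = match_request(
    "/wordpress-trunk/%D0%B1%D0%B0%D0%B1%D0%B0/uncategorized/",
    "/wordpress-trunk/",
    "баба/(.+?)/?$",
)
print(request)                       # баба/uncategorized
print(category)                      # uncategorized
print(len(request.encode("utf-8")))  # 22, the byte length shown in the dump
```

So the rewrite rule itself handles the non-ASCII base once the request is decoded; the failure has to come from a later step.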
#9
@
14 years ago
The template_redirect hook in template-loader.php:7 calls redirect_canonical() (wp-includes\canonical.php:160) with no parameters, which results in a different parsing of $_SERVER['REQUEST_URI'] than in WP.
This difference is the cause of the error.
#11
@
14 years ago
Found the place where the invalid chars were inserted into the URL. Fixed in patch, please test.
#12
@
14 years ago
- Summary changed from Canonical redirect kicks in in case of non-ASCII category/tag base to Canonical redirect kicks in in case of category/tag base containing other chars than a-z, 0-9, _ and -
The following ASCII category base does not work either: %D0%B1%D0%B0%D0%B1%D0%B0
So it is not even possible for users to fix the problem by providing an encoded value via the backend.
#13
@
14 years ago
My first patch interfered with rewrite generation, so its benefit was not preserved after rebuilding the permalink structure. The second patch now sits one level above that layer. It fixes both the category and the tag base.
#14
@
14 years ago
In the end this is basically the same route as the reporter nbachiyski already wrote [reproducible, reviewed], but differs in the solution.
In addition to the reported problem that categories/tags cannot be requested, the following functions are broken:
- wp_xmlrpc_server::wp_getTags() - invalid URLs
- wp_xmlrpc_server::mw_getCategories() - invalid URLs
- category-template.php() - invalid URLs
- get_the_category_list() - invalid URLs
- get_term_link() - invalid URLs
- get_the_term_list() - invalid URLs
- wp_tag_cloud() - invalid URLs
- get_term_feed_link() - invalid URLs
- wp_setup_nav_menu_item() - invalid URLs
- get_the_taxonomies() - invalid URLs
- Walker::start_el() - invalid URLs
WP_Rewrite::$extra_permastructs is something else probably worth dealing with (in output).
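As an illustration of what "dealing with it in output" could mean, here is a Python sketch that percent-encodes a generated link path just before it is emitted. This is not the attached patch and says nothing about how it is implemented; `encode_path_for_output` is a hypothetical helper.

```python
from urllib.parse import quote


def encode_path_for_output(url_path):
    """Hypothetical output filter: percent-encode non-ASCII characters.

    "/" is kept as the separator, and "%" is kept so segments that are
    already encoded (such as the term slug) are not encoded twice. A
    literal percent sign in a segment is therefore left as-is, which is
    a limitation of this sketch.
    """
    return quote(url_path, safe="/%")


print(encode_path_for_output("/баба/uncategorized/"))
# /%D0%B1%D0%B0%D0%B1%D0%B0/uncategorized/
```

With links emitted in this form, the generated URL and the requested URI would have the same encoding.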
#15
@
14 years ago
Use Nikolay's patch for 3.0.1 and pursue fixing all of the places noted by hakre for 3.1?
#16
@
14 years ago
I think hakre's is the right approach, but we need to get the code checked in very early in the cycle, so that people can test. I have a couple of people who reported it, and they can test in an almost-production environment.
Out of curiosity, is this a regression?