Opened 8 years ago
Closed 3 years ago
#41211 closed defect (bug) (duplicate)
When the /category/category-name portion is repeated in the URL, it serves content instead of 404
Reported by:
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 4.8
Component: Rewrite Rules
Keywords: has-patch
Focuses:
Cc:
Description
Scenario
We have two urls:
- http://example.com/category/category-name/
- http://example.com/category/category-name/category/category-name/
The second URL above has the '/category/category-name/' portion of the URL repeated.
Expected outcome
- When visiting http://example.com/category/category-name/category/category-name/, the site displays a 404 page, as it's not a valid URL.
Actual outcome
- The page with the same content as http://example.com/category/category-name/ is served when visiting http://example.com/category/category-name/category/category-name/
Change History (4)
#2
@
3 years ago
This issue also exists when you use a custom permalink structure such as /%category%/%postname%/. I'm using category as an example here, but this would happen for all hierarchical taxonomies.
Due to how the rewrites are added, when %category% appears first in the permalink structure, the rule category/(.+?)/?$ is added. Because the regex used for hierarchical taxonomies is so open, it will match anything between /category and the end of the URL.
For example, set up a category with the slug testing. It should only be accessible at:
/category/testing/
It's actually accessible at /category/(.+?)/testing. In practice, this could look like any of the following:
/category/asdf/testing
/category/asdf/1234/testing
/category/asdf/1234/xy-z/testing
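To make the matching behaviour concrete, here is a minimal sketch in Python (not WordPress code) that applies the quoted rule to those paths. The leading ^ anchor and the helper name captured_path are assumptions made for the illustration, and so is the final step of keeping only the last segment, which is inferred from the behaviour described in this ticket rather than taken from core.

```python
import re

# The rule quoted above; the leading ^ anchor is an assumption for this sketch.
RULE = re.compile(r"^category/(.+?)/?$")


def captured_path(url_path):
    """Return what the rule captures for a request path, or None if it does not match."""
    match = RULE.match(url_path.lstrip("/"))
    return match.group(1) if match else None


if __name__ == "__main__":
    for path in (
        "/category/testing/",
        "/category/asdf/testing",
        "/category/asdf/1234/xy-z/testing",
    ):
        captured = captured_path(path)
        # Assumption: only the last segment is used to look up the term,
        # so every path here would resolve to the slug "testing".
        last_segment = captured.rsplit("/", 1)[-1]
        print(path, "captures", captured, "last segment", last_segment)
```

The rule itself cannot tell a real parent/child path from arbitrary segments, so any validation has to happen after the match.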
I agree with @bradleyt here; taxonomies should have a function like get_page_by_path() that can double-check that the path in use correctly matches that of a term.
There's a patch incoming that should handle this.
This ticket was mentioned in PR #2344 on WordPress/wordpress-develop by darylldoyle.
3 years ago
#3
- Keywords has-patch added; needs-patch removed
This PR adds a new get_term_by_path() function that works similarly to get_page_by_path(). It's then used to stop hierarchical taxonomies from returning terms on paths that should instead 404.
Trac ticket: https://core.trac.wordpress.org/ticket/41211
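As a rough illustration of the idea only (this is not the PR's implementation and does not use WordPress APIs), a path check for hierarchical terms could walk the URL segments and require each one to be a child of the previous one. The TERMS table, its slugs, and the function name find_term_by_path below are all hypothetical.

```python
from typing import Optional

# Hypothetical term table: term_id -> (slug, parent_id); parent_id 0 means top level.
TERMS = {
    1: ("news", 0),
    2: ("testing", 0),
    3: ("sports", 1),
}


def find_term_by_path(path: str, terms=TERMS) -> Optional[int]:
    """Return the id of the term whose full ancestor chain matches path, else None."""
    segments = [segment for segment in path.strip("/").split("/") if segment]
    if not segments:
        return None

    parent_id = 0
    term_id = None
    for slug in segments:
        # Each segment must be a term whose parent is the previously matched term.
        term_id = next(
            (tid for tid, (s, p) in terms.items() if s == slug and p == parent_id),
            None,
        )
        if term_id is None:
            return None  # Unknown segment or wrong parent: treat the request as a 404.
        parent_id = term_id
    return term_id
```

With this table, find_term_by_path("testing") and find_term_by_path("news/sports") resolve to a term, while "asdf/testing" and "testing/testing" return None. The sketch is deliberately strict: a child term requested without its parent segments is also rejected, which may or may not be the behaviour wanted in core.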
#4
@
3 years ago
- Milestone Awaiting Review deleted
- Resolution set to duplicate
- Status changed from new to closed
Hi there, welcome to WordPress Trac!
Thanks for the report, we're already tracking this issue in #18734.
@enshrined Thanks for the PR! Could you move it to the other ticket, to keep the discussion in one place? Thanks again!
I can reproduce this. It's worth noting that this has SEO implications, as WordPress does not output canonical meta tags on categories by default (rel_canonical only runs on singular posts and pages).
Digging deeper, it seems that any hierarchical taxonomy is affected (including custom taxonomies registered with register_taxonomy()). Non-hierarchical taxonomies, such as tags, do not seem to be affected.
In class-wp-query.php I've found the following code:
I'm not 100% sure, but it looks like WordPress may be extracting the taxonomy type from the URL, extracting the last URL section, and then ignoring everything in between.
For pages, this situation is avoided by checking get_page_by_path in parse_request in class-wp.php. Hierarchical taxonomies could be checked in the same way. Alternatively, the handle_404 method in class-wp.php could be extended to check that the URL sections match the found taxonomy's parent.