#8948 closed defect (bug) (fixed)
permalink redirection is too liberal
Reported by: | mrmist | Owned by: | markjaquith |
---|---|---|---|
Milestone: | 3.0 | Priority: | normal |
Severity: | normal | Version: | 2.7 |
Component: | Canonical | Keywords: | has-patch featured |
Focuses: | Cc: |
Description
As described here -
http://wordpress.org/support/topic/237126?replies=2
If I visit
www.example.com/t
It'll happily redirect me to (something like)
www.example.com/2002/06/11/taadaa
And so on.
(The above assumes date & name permalinks, but similar stuff happens with custom permalinks like category/article/ when it'll even pick a random category for you.)
This should really be returning 404s.
Attachments (1)
Change History (56)
#1
@
16 years ago
- Component changed from General to Canonical
- Owner changed from anonymous to markjaquith
#2
@
15 years ago
- Milestone changed from 2.8 to Future Release
Punting to be evaluated in next development cycle due to time constraints.
#5
in reply to:
↑ 3
@
15 years ago
- Priority changed from normal to low
Replying to Denis-de-Bernardy:
wontfix?
It's hardly the biggest deal in the world, but to explain the rational behind it, I was thinking that in a piece of software that really tries to give one URL to one thing, this sticks out as doing just the opposite - it allows many things to redirect to one place.
Viper's cited case is perhaps understandable, but in no way can I understand the idea behind redirecting
/foo/baa/
To
/news/barry-white/
Which does happen, and is just random.
#6
@
15 years ago
I think it would be okay to require like 3 characters in the path for canonical to kick in.
#7
@
15 years ago
- Cc miqrogroove@… added
Sorry to pile on here guys, but you need to think bigger. In taxonomy handling the exact opposite problem exists:
/category/subcat2/ is treated as identical to /category/cat1/subcat2/ with no forwarding.
Whatever strategy is adopted, it should be applied globally instead of piecemeal.
#10
@
15 years ago
if the issue truly is about emails, how about this approach?
a) if it doesn't look like a singular, then bail
b) if it's a singular page, then check the length of the url. if it's not 70-80 chars long then it was probably wasn't cut in an email, so bail.
c) we're sure we've a singular whose url was cut. we try to find the correct one using chosen bits from our current algorithm
#13
@
15 years ago
I agree with mrmist about the random redirects. Make a better 404 page that says, "maybe you were looking for .. <link> " That's what Google recommends to do anyway. And then stop treating categories as non-hierarchical!
#14
@
15 years ago
+1 to most of that. I particularly like the idea of a "Maybe you were looking for" type page.
Really I just don't want a completely random / short selection of characters to redirect to a post URL. I can see the point about cut-off email URLs, but in that case I think that DDBs logic should apply.
#15
follow-up:
↓ 23
@
15 years ago
+1 for taking care. If we led the SEO-Kids know about this feature, that would start a hell of a fire. Redirect poisioning of WP based sites, outch.
#16
@
15 years ago
Just tested if there is a problem with trashed posts ( #11236 ) which is not the case.
#17
@
15 years ago
Found another forwarding problem in attachments.
/yyyy/mm/slug1/attach1/ is treated as identical to /yyyy/mm/ablabllskjfsakd/attach1/
#18
@
15 years ago
/2010/01testing-123/ forwarded to /2009/12/testing/ for no apparent reason. The only reasonable replacement in this situation would be /2010/01/testing-123/
#20
@
15 years ago
/2010/01/slug1/&ct=ga&cd=XmAHzoevp-g&usg=AFQjCNGzZI7HVL_W48417QI_E5dOtXDWJQ/ is treated as identical to /2010/01/
This one brought to you by some random URL spammer.
#21
@
15 years ago
... or perhaps not so random. The previous request appears to be an invalid representation of the Google Image Search imageref query param, canonical slashed, and then re-mis-interpreted by WordPress.
If the canonical redirection can't handle that type of request correctly, then it needs to be met with a 404 response.
#22
@
15 years ago
/yyyy/mm/slug1/slug2/ redirects to /yyyy/mm/slug2/ I think this is the opposite of what is supposed to happen in canonicalization, since the canonical URL for /yyyy/mm/slug1/garbage* should always be /yyyy/mm/slug1/
#23
in reply to:
↑ 15
@
15 years ago
Replying to hakre:
+1 for taking care. If we led the SEO-Kids know about this feature, that would start a hell of a fire. Redirect poisioning of WP based sites, outch.
It's worth noting that these redirects that are created by canonical redirection are 301s, so the redirection is permanent.
Fixing SEO problems was the whole point of this sort of thing. A 301 redirect from a URL is treated as the redirected, final, URL by search engines. Not as the original one.
So somebody having a cut-off URL posted not only goes to the right place, but also makes Google take notice and not count that cut-off as a separate URL.
Whether the code is redirecting unnecessarily or not is up for debate, but SEO-problems are not a valid reason to change this, because SEO-problems are the reason it was added in the first place.
#24
@
15 years ago
/categoryslug1/ is treated as identical to /category/slug1/
This one is not just theoretical. These are getting Googled in the wild. :(
#25
@
15 years ago
Same problem at /tagslug1/
I realize the [/]* regexp is impossible for hierarchical URLs in rewrite, so these taxonomy issues probably have to be fixed in the canonical system.
#26
@
15 years ago
- Keywords featured added; 2nd-opinion removed
I don't think there is any need for 2nd-opinion on this ticket. There are obvious and severe problems with URL duplication. These problems are not handled correctly by any existing feature, including rel=canonical, canonical redirect, and .htaccess.
#27
follow-up:
↓ 28
@
15 years ago
/category/uncategorized/&oogabooga/ is treated as identical to /category/uncategorized/
Again, rel=canonical is not showing up and there is no redirection.
#28
in reply to:
↑ 27
;
follow-ups:
↓ 29
↓ 30
@
15 years ago
Replying to miqrogroove:
/category/uncategorized/&oogabooga/ is treated as identical to /category/uncategorized/
Again, rel=canonical is not showing up and there is no redirection.
This is actually correct because you are viewing /category/uncategorized/
but with $_GET['oogabooga/']
set I believe.
#29
in reply to:
↑ 28
@
15 years ago
Replying to Viper007Bond:
Replying to miqrogroove:
/category/uncategorized/&oogabooga/ is treated as identical to /category/uncategorized/
Again, rel=canonical is not showing up and there is no redirection.
This is actually correct because you are viewing
/category/uncategorized/
but with$_GET['oogabooga/']
set I believe.
It's no different from visiting /category/uncategorized/?foo=bar
which shouldn't redirect either. Who knows what a plugin is making that page do -- nothing or maybe display totally different content.
#30
in reply to:
↑ 28
;
follow-up:
↓ 36
@
15 years ago
Replying to Viper007Bond:
This is actually correct because you are viewing
/category/uncategorized/
but with$_GET['oogabooga/']
set I believe.
Heh, that would be incredibly unfortunate if it's true. I know PHP would never do that. You're saying WordPress adds extra values to super global variables?
#31
@
15 years ago
I added var_dump($_GET);
to the default single.php template to test Viper007Bond's claim that ampersands are arbitrarily parsed into query variables.
First I tried hitting /2010/01/testing-123/&oogabooga/ and that path was treated as identical to the front page o_O The single.php template didn't even load, so that was a total failure.
Next, I tried hitting /2010/01/testing-123/&oogabooga/page/1/ and that path was forwarded to /2009/12/testing/ for no apparent reason. Another canonical failure.
Thirdly, I added var_dump($_GET);
to the index.php template and tried hitting /&oogabooga/ which was also treated as indentical to the front page. The output was as expected array(0) { }
#32
@
15 years ago
Thirdly, I added var_dump($_GET); to the index.php template and tried hitting /&oogabooga/ which was also treated as indentical to the front page. The output was as expected array(0) { }
I've not seen many servers which are setup to treat ?& as the query seperator. Most browsers will encode the & behind the scenes.
/category/<cat>/ExtraUselessDataThatsNotASubCat/ should be returning a 404, As it currently is.
/category/<cat>/&ExtraUselessDataThatsNotASubCat/ seems to handle the & incorrectly.
/category/<cat>/ seems to handle the slashes incorrectly, That should be redirecting too i believe.
#35
@
15 years ago
/category/<cat>/&ExtraUselessDataThatsNotASubCat/ seems to handle the & incorrectly.
Fixed, The new eval() replacement query parser was creating category_name=cat-name-here/&foo=bar&post_name=blah
, and parsing that, Whereas, It should've been treating &foo=bar as part of the category name. Fixed in [13071]
/category/<cat>/ seems to handle the slashes incorrectly, That should be redirecting too i believe.
That doesnt happen for posts, as we generate the posts permalink and redirect if its different, [13072] strips multiple slashes out of the URL path if they're present, that should solve that problem.
/yyyy/mm/slug1/slug2/ redirects to /yyyy/mm/slug2/ I think this is the opposite of what is supposed to happen in canonicalization, since the canonical URL for /yyyy/mm/slug1/garbage* should always be /yyyy/mm/slug1/
Remains.
#36
in reply to:
↑ 30
@
15 years ago
Replying to miqrogroove:
Replying to Viper007Bond:
This is actually correct because you are viewing
/category/uncategorized/
but with$_GET['oogabooga/']
set I believe.
Heh, that would be incredibly unfortunate if it's true. I know PHP would never do that. You're saying WordPress adds extra values to super global variables?
Mmm, you're right -- I was incorrect. I swore that worked, but I guess I was incorrect.
#38
@
15 years ago
Wow, dd32 is officially my fav WP dev :D Keep up the awesome effort here! :)
btw you've got if ( strpos($original['path']
preceding the $redirect stuff. Needs consistency?
#39
@
15 years ago
btw you've got if ( strpos($originalpath? preceding the $redirect stuff. Needs consistency?
Ah.. Yeah forgot that one! Thanks
#40
follow-up:
↓ 41
@
15 years ago
I swore that worked, but I guess I was incorrect.
I think what dd32 was getting at was the /&var=val syntax was being consumed by the WP_Query object, so it was internally equivalent to /?var=val but totally bogus at the HTTP layer.
#41
in reply to:
↑ 40
@
15 years ago
Replying to miqrogroove:
I swore that worked, but I guess I was incorrect.
I think what dd32 was getting at was the /&var=val syntax was being consumed by the WP_Query object, so it was internally equivalent to /?var=val but totally bogus at the HTTP layer.
I thought file.php&foo=bar
worked, but that may only be with odd server configs. :)
#43
@
15 years ago
When WP_DEBUG is off, /?cat=99999 is forwarding to /category/ Shouldn't that be a 404?
When WP_DEBUG is on, Canonical crashes. See #12246
#44
@
15 years ago
btw, I have a bad feeling that /?cat=99999 falls under the jurisdiction of #11348 which is currently at -1. I would appreciate SOME agreement that status 200 and 302 are inappropriate for infinite namespaces.
#45
@
15 years ago
I found a different issue:
going to ?tag=foo,bar
redirects to /tags/foo
See http://ryan.boren.me/2007/10/01/taxonomy-intersections-and-unions/
#46
@
15 years ago
- Keywords has-patch added; needs-patch removed
- Milestone changed from Future Release to 3.0
multiple-terms.diff checks for tagin, tagand etc. before attempting a redirect.
#48
@
15 years ago
Hey I've got some more for ya :D
/page1/ is treated as identical to / on 2.9 and 3.0.
/page2/ redirects to /page2/page/2/ on 3.0 only. I get a 404 in 2.9.
This is screwing up valid slugs.
#49
@
15 years ago
/page1/ is treated as identical to / on 2.9 and 3.0.
The WordPress paging system actually accepts /page/1/ and /page1 by design. The same goes for /page2 and /page/2
The canonical system doesnt reconise /page2 as a paging link however. 2 options are available for this situation:
- Remove Support for /page(\d+)/? whilst retaining /page/(\d+)/?
- Fix canonical to support the /page(\d+)/? links (An additional 2
?
's)
'page(\d+)' are not valid slugs in WordPress when using pretty permalinks, its not possible to separate the requests.
#50
@
15 years ago
The canonical system doesnt reconise /page2 as a paging link however
Which is why its redirecting to /page2/page/2/
The paging rules will always have a higher priority than page rules, so it'll always be impossible to separate the requests correctly.
#51
@
15 years ago
Okay uh... We should probably re-open the numeric slugs ticket then, because the post editor is generating those pagen slugs.
Perhaps, but the thinking was if a URL got cut off, it should guess it.
We'll take your example:
http://www.example.com/2002/06/11/taadaa/
If that got cut off in an e-mail, IM, whatever and was
http://www.example.com/2002/06/11/taa
, WordPress would be smart enough to redirect to the proper post.