#14619 closed defect (bug) (wontfix)
404 Errors from RTL characters appended to URL
| Reported by: | | Owned by: | |
|---|---|---|---|
| Priority: | normal | Milestone: | |
| Component: | Canonical | Version: | 3.0.1 |
| Severity: | minor | Keywords: | rtl |
| Cc: | list.andy@… | | |
Description
This is an interesting bug I found via Webmaster Tools.
Example URL
http://andybeard.eu/2210/google-stopbadware.html%20-%20%D8%A8%D8%B1%DB%8C%D8%AA%D8%A7%D9%86%DB%8C
The URL has some Arabic characters appended to it, I assume by mistake. The permalink structure puts the post ID at the start, though, so in theory it should be able to handle quite a lot of errors in URL formation.
Instead, this results in a 404 error, so it can't even be handled by canonicalization tags.
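To see what was actually appended, the path of the example URL can be percent-decoded. This is only a small illustration in Python using the standard library; the variable names are made up for the example and nothing here is WordPress code.

```python
from urllib.parse import urlsplit, unquote

# The example URL from this ticket, split across two string literals for readability.
url = (
    "http://andybeard.eu/2210/google-stopbadware.html"
    "%20-%20%D8%A8%D8%B1%DB%8C%D8%AA%D8%A7%D9%86%DB%8C"
)

# urlsplit() leaves the percent-escapes in place; unquote() decodes them as UTF-8.
raw_path = urlsplit(url).path
decoded_path = unquote(raw_path)

# Everything after ".html" is the text that was appended by mistake.
_, _, trailing = decoded_path.partition(".html")

print(decoded_path)
print([f"U+{ord(ch):04X}" for ch in trailing])
```

The trailing part decodes to a space, a hyphen, a space and then seven characters in the Arabic script, so the server is being asked for a slug that never existed.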
Change History (5)
comment:2
markjaquith — 2 years ago
- Resolution set to wontfix
- Status changed from new to closed
The slug is wrong... we can't really determine where the legit slug stops and the extra characters begin. I think this is wontfix for now.
The slug shouldn't be needed in the first example because you have the page ID.
If you chop off the extra characters, it works.
comment:5
SergeyBiryukov — 2 years ago
If a permalink structure contains the post ID, we can probably use it as the essential part and cut everything else. However, this seems to be an edge case that can be solved by a custom rewrite rule.
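The idea of treating the post ID as the essential part can be sketched outside WordPress. The Python below is a hypothetical illustration, not a rewrite rule and not anything from core: the function name and the regular expression are assumptions, and the pattern assumes a permalink structure of the form `/<post ID>/<slug>.html`, as in the first example URL.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Assumed permalink shape: /<numeric post ID>/<slug>.html followed by anything.
# The slug may not contain "/", so date-based paths such as /2007/02/<slug>.html
# are not mistaken for ID-based ones.
ID_PERMALINK = re.compile(r"^/(\d+)/[^/]+\.html")


def post_id_from_url(url: str) -> Optional[int]:
    """Return the leading post ID, or None if the path does not fit the assumed shape."""
    match = ID_PERMALINK.match(urlsplit(url).path)
    return int(match.group(1)) if match else None


broken = (
    "http://andybeard.eu/2210/google-stopbadware.html"
    "%20-%20%D8%A8%D8%B1%DB%8C%D8%AA%D8%A7%D9%86%DB%8C"
)
print(post_id_from_url(broken))
```

Once the ID is known, the trailing characters no longer matter, because the post can be looked up by ID and the request redirected to its real permalink. A date-based URL has no ID to fall back on, so this approach does nothing for it.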

Another example, showing that it isn't just RTL characters:
http://andybeard.eu/2007/02/ultimate-list-of-dofollow-plugins-banish-nofollow-from-comments-and-trackbacks.html%3Cbr%20/%3E
Without the trailing characters it should (and does) redirect to:
http://andybeard.eu/434/ultimate-list-of-dofollow-plugins-banish-nofollow-from-comments-and-trackbacks.html
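For permalinks that end in a fixed suffix, as both example URLs do with `.html`, one possible way to chop off the trailing characters is to cut the path at that suffix. This is a rough Python sketch under that assumption only; `strip_after_suffix` is a hypothetical helper, and it does not answer the general objection that a slug without a fixed suffix gives no reliable place to cut.

```python
from urllib.parse import unquote, urlsplit, urlunsplit


def strip_after_suffix(url: str, suffix: str = ".html") -> str:
    """Drop whatever follows the first occurrence of `suffix` in the URL path.

    Assumes the site's permalinks always end in `suffix`; if the suffix is
    not present, the URL is returned unchanged.
    """
    parts = urlsplit(url)
    head, found, _junk = parts.path.partition(suffix)
    if not found:
        return url
    return urlunsplit(parts._replace(path=head + suffix))


broken = (
    "http://andybeard.eu/2007/02/ultimate-list-of-dofollow-plugins-"
    "banish-nofollow-from-comments-and-trackbacks.html%3Cbr%20/%3E"
)

print(unquote("%3Cbr%20/%3E"))     # the appended junk, decoded
print(strip_after_suffix(broken))  # the URL with the junk removed
```

In this second example the appended text is an HTML line break tag that leaked into the link, which is why the problem is not specific to RTL characters.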