Opened 3 years ago
Last modified 3 years ago
#52865 new enhancement
Strip 'enclosed' trailing spaces in URLs
Reported by: | jonoaldersonwp | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | low |
Severity: | normal | Version: | |
Component: | Canonical | Keywords: | seo |
Focuses: | performance | Cc: | |
Description
#20383 made improvements that strip trailing punctuation from URLs. E.g., https://ma.tt/2012/03/productivity-per-square-inch%20 redirects correctly to the canonical URL.
However, URLs like https://ma.tt/2012/03/productivity-per-square-inch%20/ (which 'enclose' the trailing space with a trailing slash) are not redirected. These URLs typically return a 404 error.
This kind of 'broken link' pattern is extremely common on the web, particularly as a trailing slash is often appended to a malformed URL before WP runs (e.g., via a server/htaccess/nginx configuration).
We should refine the canonical redirect logic (in `redirect_canonical`) to also consider and redirect these types of requests.
Considerations
- The "Remove trailing spaces and end punctuation from the path" section of `redirect_canonical` doesn't consider the presence of trailing slashes in the URL. This could/should be adapted to catch those.
- There might be cases where a user 'legitimately' has a permalink structure (or slug) that ends in `%20` or `%20/`. That might(?) make a fix more complicated than just sniffing for whether the permalink structure ends with `/`.
- WP appears to be inconsistent about where `%20` (and/or `%20/`) can be added to slugs or structures. It's stripped in some places, but not in others.
- Should a permalink or slug be 'allowed' to contain, or end in, a space character? If this is being stripped in some parts of WP, maybe that's a good argument to prevent it elsewhere/everywhere. In which case, fixing this becomes a lot simpler.
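
To make the considerations above concrete, here is a minimal, language-neutral sketch in Python. It is not WordPress code and does not reproduce the logic in `redirect_canonical`. The function names `strip_enclosed_trailing_spaces` and `redirect_target`, and the idea of checking against a set of known paths, are assumptions made purely for illustration. It shows one way to strip trailing spaces that are 'enclosed' by a trailing slash while leaving a slug that legitimately ends in a space alone.

```python
import re
from typing import Iterable, Optional

# One or more literal or percent-encoded spaces at the end of the path,
# optionally 'enclosed' by one or more trailing slashes.
_TRAILING_SPACES = re.compile(r"(?:%20| )+(/*)\Z")


def strip_enclosed_trailing_spaces(path: str) -> str:
    """Drop trailing spaces from a path, keeping a single trailing slash if one was present."""
    return _TRAILING_SPACES.sub(lambda m: "/" if m.group(1) else "", path)


def redirect_target(path: str, existing_paths: Iterable[str]) -> Optional[str]:
    """Return the path to redirect to, or None if no redirect should happen.

    `existing_paths` is a hypothetical stand-in for whatever lookup decides
    that a path resolves to real content.
    """
    existing = set(existing_paths)
    if path in existing:
        # A slug that legitimately ends in a space is served as-is.
        return None
    cleaned = strip_enclosed_trailing_spaces(path)
    if cleaned != path and cleaned in existing:
        # Only redirect when the cleaned path actually resolves.
        return cleaned
    return None


print(strip_enclosed_trailing_spaces("/2012/03/productivity-per-square-inch%20/"))
print(redirect_target("/post%20/", ["/post/"]))
```

Under these assumptions, a space in the middle of a path (for example `/a%20b/`) is left untouched, because the pattern only matches at the very end. If spaces were disallowed in slugs everywhere, as the last consideration suggests, the lookup in `redirect_target` would no longer be needed and the stripping step alone would be enough.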