Opened 11 days ago
#63063 new defect (bug)
IDN domains are erroneously URL-encoded in the wp_sanitize_redirect() function
Reported by: | | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | minor | Version: | 6.7.2 |
Component: | Charset | Keywords: | needs-patch |
Focuses: | | Cc: | |
Description
Overview
There is an issue with how Internationalised Domain Names (IDNs) are handled by WordPress redirects when an IDN is used in the "WordPress Address (URL)" or "Site Address (URL)" setting. The problem shows up when WordPress redirects the user back to the post after a comment is submitted: the domain part of the redirect URL, which should remain in its IDN form, is percent-encoded (URL-encoded) instead.
WordPress and IDNs
WordPress fully supports IDNs in the General site settings (under "WordPress Address (URL)" and "Site Address (URL)"). These fields accept an IDN (such as simon.schönbeck.dk) as the site's domain without any issues.
Because the domain is already valid as entered in the site's settings, it should not be transformed again when it appears in URLs that WordPress generates.
Redirection Process
When a comment is posted, WordPress triggers a redirect to the comment's location on the post. This is done using the $location variable, which contains the full URL (including the post's IDN domain).
The Problem
During this process, the function wp_sanitize_redirect() is called. This function is responsible for sanitising and cleaning up the redirect URL.
Unexpected Behaviour
The wp_sanitize_redirect() function calls _wp_sanitize_utf8_in_redirect(). This function percent-encodes any non-ASCII (multibyte UTF-8) characters in the URL, including those in the domain name (e.g., the ö in simon.schönbeck.dk becomes %C3%B6).
This transformation should not be applied to the domain part. Percent-encoding is meant for the path, query and fragment of a URL; the ASCII-compatible form of an IDN is Punycode (labels prefixed with xn--), so a percent-encoded hostname is neither the original IDN nor its Punycode equivalent.
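The behaviour described above can be illustrated with a small sketch. It is written in Python purely for illustration and is not the WordPress code: `percent_encode_non_ascii` is a hypothetical stand-in for the effect of _wp_sanitize_utf8_in_redirect(), and the post path and comment anchor are made up.

```python
from urllib.parse import quote


def percent_encode_non_ascii(url: str) -> str:
    """Hypothetical stand-in: percent-encode every non-ASCII character,
    regardless of which part of the URL it appears in."""
    return "".join(ch if ord(ch) < 128 else quote(ch, safe="") for ch in url)


# Assumed example redirect target; only the domain comes from the ticket.
location = "https://simon.schönbeck.dk/post/#comment-1"
encoded = percent_encode_non_ascii(location)

# The host is rewritten along with the rest of the URL.
print(encoded)  # https://simon.sch%C3%B6nbeck.dk/post/#comment-1
```

The point of the sketch is that a blanket replacement has no notion of URL components, so the host is treated the same way as the path.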
Effect
Because the sanitisation step percent-encodes the domain part of the URL, the host that reaches wp_validate_redirect() is something like simon.sch%C3%B6nbeck.dk. That function expects the domain in a valid, non-encoded form, so the encoded host is rejected.
Since the redirect URL fails validation, the fallback URL is used instead. By default, the fallback is the WordPress admin URL (admin_url(), i.e. /wp-admin/), so the user is sent to the admin dashboard rather than back to the post they came from.
This issue mainly affects guest commenters, who do not need to log in before commenting. The comment itself is submitted successfully, but because the URL validation fails, WordPress redirects them to the admin panel as the fallback. Since they are not logged in, they are then redirected to the login page, even though no login is required to post a comment.
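The chain of events above can be sketched as a simple host check. This is a Python illustration under stated assumptions, not the logic of wp_validate_redirect() itself: `validate_redirect`, `ALLOWED_HOSTS` and `FALLBACK` are hypothetical names, and the check is reduced to "is the redirect's host one of the site's hosts?".

```python
from urllib.parse import urlsplit

# Assumption: the site's own host, as entered in the General settings.
ALLOWED_HOSTS = {"simon.schönbeck.dk"}
# Assumption: the admin URL used as the default fallback.
FALLBACK = "https://simon.schönbeck.dk/wp-admin/"


def validate_redirect(location: str) -> str:
    """Return location if its host is allowed, otherwise the fallback."""
    host = urlsplit(location).hostname
    return location if host in ALLOWED_HOSTS else FALLBACK


original = "https://simon.schönbeck.dk/post/#comment-1"
encoded = "https://simon.sch%C3%B6nbeck.dk/post/#comment-1"

print(validate_redirect(original))  # unchanged: host matches
print(validate_redirect(encoded))   # fallback: encoded host does not match
```

A guest commenter who lands on the fallback is not logged in, which is why the next thing they see is the login page.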
Solution
The IDN domain part of the URL should not be sanitised or URL-encoded for UTF-8 characters, as it is already in a valid format. The sanitisation process should respect the IDN format, preventing unnecessary transformations that break the validation.
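One possible shape for such a fix is to split the URL first and percent-encode only the components after the host. The following is a minimal Python sketch of that idea, not a patch for wp_sanitize_redirect(); `sanitize_keeping_host` and its helper are hypothetical names, and the sketch ignores details such as userinfo in the authority part.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def _encode_non_ascii(text: str) -> str:
    """Percent-encode non-ASCII characters, leaving ASCII untouched."""
    return "".join(ch if ord(ch) < 128 else quote(ch, safe="") for ch in text)


def sanitize_keeping_host(url: str) -> str:
    """Encode path, query and fragment, but pass the host through as-is."""
    parts = urlsplit(url)
    return urlunsplit((
        parts.scheme,
        parts.netloc,  # left untouched so the IDN stays valid
        _encode_non_ascii(parts.path),
        _encode_non_ascii(parts.query),
        _encode_non_ascii(parts.fragment),
    ))


print(sanitize_keeping_host("https://simon.schönbeck.dk/blåbær/?q=ö#comment-1"))
# https://simon.schönbeck.dk/bl%C3%A5b%C3%A6r/?q=%C3%B6#comment-1
```

An alternative with the same effect would be to convert the host to Punycode before sanitising; either way the host never passes through percent-encoding.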
Workaround
Until a fix is implemented, a workaround is to manually encode your site/blog URL as Punycode in the "WordPress Address (URL)" and "Site Address (URL)" settings. This ensures that the domain part is in the correct format and avoids the encoding issues caused by the sanitisation process.
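If no online converter is at hand, the Punycode form of a domain can be produced locally. The snippet below is a small Python sketch using the standard library's `idna` codec (which implements the older IDNA 2003 rules); `to_punycode_host` is a hypothetical helper name, and the result should be checked against your registrar or DNS records before it is entered in the settings.

```python
def to_punycode_host(host: str) -> str:
    """Convert an IDN hostname to its ASCII-compatible (xn--) form."""
    return host.encode("idna").decode("ascii")


ascii_host = to_punycode_host("simon.schönbeck.dk")

# Only the label containing non-ASCII characters is rewritten with an
# xn-- prefix; "simon" and "dk" stay as they are.
print(ascii_host)
```

Because the resulting hostname is pure ASCII, the sanitisation step has nothing left to percent-encode in the domain part.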