Make WordPress Core

Opened 11 days ago

#63063 new defect (bug)

IDN domains are erroneously URL-encoded in the wp_sanitize_redirect() function

Reported by: calpeconsulting
Owned by:
Milestone: Awaiting Review
Priority: normal
Severity: minor
Version: 6.7.2
Component: Charset
Keywords: needs-patch
Focuses:
Cc:

Description

Overview

There is an issue with how Internationalised Domain Names (IDNs) are handled in the WordPress redirect system, specifically when an IDN is used in the "WordPress Address" or "Site Address" settings. The problem occurs when WordPress tries to redirect the user back to the post after submitting a comment. The domain part of the URL, which should remain in its IDN format, is incorrectly processed and URL-encoded by WordPress.

WordPress and IDNs

WordPress accepts IDNs in the General site settings (under "WordPress Address (URL)" and "Site Address (URL)"). A user can enter an IDN (such as simon.schönbeck.dk) in these fields without any issues.

Once stored there, the domain is already in a valid form, so WordPress should not transform it again when it builds or processes URLs.

Redirection Process

When a comment is posted, WordPress triggers a redirect to the comment's location on the post. This is done using the $location variable, which contains the full URL (including the site's IDN).

The Problem

During this process, the function wp_sanitize_redirect() is called. This function is responsible for sanitising and cleaning up the redirect URL.

Unexpected Behaviour

The wp_sanitize_redirect() function calls _wp_sanitize_utf8_in_redirect(). This function percent-encodes (URL-encodes) every non-ASCII UTF-8 character in the URL, including characters in the domain name (e.g., the ö in simon.schönbeck.dk is encoded as %C3%B6).

This transformation should not occur in the domain part. Percent-encoding is meant for paths, queries and fragments; an IDN is written either in its Unicode form or as Punycode (the ASCII form beginning with xn--), never as percent-encoded UTF-8 bytes.
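
The effect described above can be reproduced outside WordPress. The following is a minimal Python sketch, not the WordPress source: the function name encode_utf8_everywhere and the example path are made up for illustration, and the sketch assumes only what the ticket states, namely that every non-ASCII character is percent-encoded regardless of which part of the URL it sits in.

```python
import re
from urllib.parse import quote


def encode_utf8_everywhere(location):
    """Percent-encode each run of non-ASCII characters, wherever it appears.

    Hypothetical stand-in for the behaviour the ticket attributes to
    _wp_sanitize_utf8_in_redirect(); it makes no distinction between the
    host and the rest of the URL.
    """
    return re.sub(
        r"[^\x00-\x7f]+",
        lambda match: quote(match.group(0), safe=""),
        location,
    )


# Example redirect target; the path is invented for this sketch.
location = "https://simon.schönbeck.dk/2025/03/hello-world/#comment-7"
print(encode_utf8_everywhere(location))
# https://simon.sch%C3%B6nbeck.dk/2025/03/hello-world/#comment-7
```

The host comes out as simon.sch%C3%B6nbeck.dk, which is neither the Unicode form nor the Punycode form of the domain.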

Effect

Once the ö in the domain has been turned into %C3%B6, the host in the redirect URL no longer matches the site's own host. wp_validate_redirect() expects the domain in its original, non-percent-encoded form, so the sanitised URL fails validation.

Because the percent-encoded domain does not pass the validation checks, the fallback URL is used instead. By default, this fallback is the WordPress admin URL (wp-admin/), so the user is redirected to the admin dashboard rather than back to the post they came from.
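
The validation failure can be illustrated with a short Python sketch. It assumes that validation boils down to comparing the parsed host against a list of allowed hosts; that is a simplification of wp_validate_redirect(), and the names validate_redirect, ALLOWED_HOSTS and FALLBACK, as well as the example path, are hypothetical.

```python
from urllib.parse import urlsplit

# Assumption: the only allowed host is the site's own domain, as configured.
ALLOWED_HOSTS = {"simon.schönbeck.dk"}
# Placeholder for the default admin fallback URL.
FALLBACK = "https://simon.schönbeck.dk/wp-admin/"


def validate_redirect(location, allowed_hosts=ALLOWED_HOSTS, fallback=FALLBACK):
    """Return location if its host is allowed, otherwise the fallback URL."""
    host = urlsplit(location).hostname or ""
    return location if host in allowed_hosts else fallback


original = "https://simon.schönbeck.dk/post/#comment-7"
sanitised = "https://simon.sch%C3%B6nbeck.dk/post/#comment-7"

print(validate_redirect(original))   # unchanged: the host matches
print(validate_redirect(sanitised))  # fallback: the encoded host is not recognised
```

Under this assumption the percent-encoded host is simply an unknown host, which is why the redirect ends up at the fallback.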

This issue mainly affects guest commenters, who do not need to log in before commenting. The comment itself is submitted successfully, but because the URL validation fails, WordPress redirects them to the admin panel as the fallback. Since they are not logged in, they are then sent on to the login page, even though no login is required to post a comment.

Solution

The IDN domain part of the URL should not be sanitised or URL-encoded for UTF-8 characters, as it is already in a valid format. The sanitisation process should respect the IDN format, preventing unnecessary transformations that break the validation.
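
One possible shape for such a fix, sketched in Python rather than as a WordPress patch: split the URL, leave the host untouched, and percent-encode non-ASCII characters only in the path, query and fragment. The function names are hypothetical, and the example URL is invented for illustration.

```python
import re
from urllib.parse import quote, urlsplit, urlunsplit


def _encode_non_ascii(text):
    """Percent-encode runs of non-ASCII characters; leave ASCII as-is."""
    return re.sub(
        r"[^\x00-\x7f]+",
        lambda match: quote(match.group(0), safe=""),
        text,
    )


def sanitise_keep_host(location):
    """Encode non-ASCII characters everywhere except in the host part."""
    parts = urlsplit(location)
    return urlunsplit((
        parts.scheme,
        parts.netloc,  # the IDN is passed through unchanged
        _encode_non_ascii(parts.path),
        _encode_non_ascii(parts.query),
        _encode_non_ascii(parts.fragment),
    ))


print(sanitise_keep_host("https://simon.schönbeck.dk/blog/naïve/#comment-5"))
# https://simon.schönbeck.dk/blog/na%C3%AFve/#comment-5
```

The ï in the path is still encoded, while the ö in the domain survives, so a host comparison against the configured site address would still succeed.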

Workaround

Until a fix is implemented, a workaround is to enter the domain of your site/blog URL as Punycode in the "WordPress Address (URL)" and "Site Address (URL)" settings. The domain is then ASCII-only, so the sanitisation process finds no non-ASCII characters in it to encode and the redirect URL passes validation.
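
If no converter is at hand, the Punycode form can be obtained with a few lines of Python. This is only a convenience sketch: it relies on Python's built-in idna codec, which implements the older IDNA 2003 rules, so the result should be double-checked for domains containing characters that newer IDNA rules treat differently. The helper name host_to_punycode is made up.

```python
def host_to_punycode(host):
    """Convert a Unicode host name to its ASCII (Punycode) form."""
    return host.encode("idna").decode("ascii")


ascii_host = host_to_punycode("simon.schönbeck.dk")
print(ascii_host)  # the schönbeck label is replaced by an xn-- label

# Converting back should give the original domain.
print(ascii_host.encode("ascii").decode("idna"))
```

Only the host is converted; the scheme and any path in the settings fields stay as they are.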

Change History (0)
