Make WordPress Core

Opened 15 years ago

Closed 15 years ago

Last modified 15 years ago

#10298 closed defect (bug) (worksforme)

Error in /wp-admin/ when using Danish characters (æøå) in domain names

Reported by: kjeldsen
Owned by: markjaquith
Milestone:
Priority: normal
Severity: normal
Version: 2.8
Component: Canonical
Keywords: has-patch
Focuses:
Cc:

Description

I have set up a WordPress install at a domain name that contains the Danish character "æ" on the .dk TLD, for example "domæne.dk".

It works seamlessly on the front end of the site, but it fails in /wp-admin/.

When I log in, post, or update pages/posts, the script forwards me to "domne.dk": the "æ" has been removed from the domain name.

Logins, saves, and changes are all registered properly, but I have to manually insert the missing character in the URL to continue. It is as if the error occurs on the final redirect.

Kind regards,
Michael

Attachments (1)

pluggable.php.patch (487 bytes) - added by dwright 15 years ago.
patch assumes the idna library idna_convert.class.php is available in the path


Change History (16)

#1 @Denis-de-Bernardy
15 years ago

  • Component changed from Administration to Security
  • Milestone changed from Unassigned to 2.8.1
  • Owner set to ryan

likely related to esc_url_raw()

#2 @markjaquith
15 years ago

  • Milestone changed from 2.9 to Future Release

This needs someone to spearhead and narrow down.

#3 follow-up: @dwright
15 years ago

  • Cc david_v_wright@… added

Honestly, I feel this is attributable to a bug in PHP itself:
http://bugs.php.net/bug.php?id=45657 (they, of course, disagree).

So the (ugly) workaround (acceptable?) is to use an HTML meta redirect.

If we could count on an idn_to_ascii function being available in a typical WordPress environment,
then 'header' would work (but I doubt we can).

(PHP 5 >= 5.3.0, PECL intl >= 1.0.2)
$location = idn_to_ascii($location);

The included diff is one approach that would resolve the issue.

This fix assumes that the domain name they wish to use
is set in General Settings (wp-admin/options-general.php) as the WordPress address (URL).

Another possible approach would be to add an optional setting to the wp-config.php file,
something like 'IDN name', where one could put the Punycode version of the URL.

Then, if that constant is defined, the header redirect would use it.

# support for IDN names
define('IDN_AS_PUNYCODE', 'xn--domne-ura.dk');
# domæne.dk -> (as Punycode) -> xn--domne-ura.dk
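
To illustrate the two ideas in this comment, here is a minimal sketch; it is not the attached patch. The helper name idn_safe_location() and its $punycode_host parameter are hypothetical, and the parameter stands in for a value such as the proposed IDN_AS_PUNYCODE constant. Note that idn_to_ascii() expects a bare domain name rather than a full URL, so the sketch isolates the host first, and it only calls idn_to_ascii() when the intl extension provides it. Userinfo and unusual URL forms are ignored for brevity.

```php
<?php
/**
 * Hypothetical helper: return $location with an ASCII-only host.
 *
 * $punycode_host is an assumed, pre-configured Punycode host name
 * (for example the value of the proposed IDN_AS_PUNYCODE constant).
 */
function idn_safe_location($location, $punycode_host = null) {
    // Split into scheme, host, and the rest; leave anything else untouched.
    if (!preg_match('~^([a-z][a-z0-9+.-]*://)([^/?#:]+)(.*)$~i', $location, $m)) {
        return $location;
    }
    list(, $scheme, $host, $rest) = $m;

    // Nothing to do when the host is already 7-bit ASCII.
    if (!preg_match('/[^\x00-\x7F]/', $host)) {
        return $location;
    }

    if ($punycode_host !== null) {
        $ascii = $punycode_host;        // configured value wins
    } elseif (function_exists('idn_to_ascii')) {
        $ascii = idn_to_ascii($host);   // requires the intl extension
    } else {
        $ascii = false;                 // no conversion available
    }

    // Fall back to the original location if the host could not be converted.
    return $ascii ? $scheme . $ascii . $rest : $location;
}
```

With the constant defined, a redirect could then be sent as header('Location: ' . idn_safe_location($location, IDN_AS_PUNYCODE)); whether that belongs in core or in a plugin is exactly the open question in this ticket.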

#4 @dwright
15 years ago

  • Keywords has-patch added

#5 @scribu
15 years ago

Related: #10690

#6 @dwright
15 years ago

  • Keywords has-patch removed

Never mind, cancel the patch. It's too simplistic: it just bypasses the ASCII regex applied to the location.

It would be nice if we could assume that everything is OK when the user-specified siteurl is the same as the actual location, but we probably can't.

I'll leave the patch here, so it's easy to see what the issue is.

It boils down to two issues:

  1. The native PHP header function does not support IDN names.
  2. WordPress needs actual support for IDN names, something like idn_to_ascii. (If we had that, we could still use the existing ASCII preg_replace and just run idn_to_ascii before it.)

There are several IDN libraries around, and most are GPL'd. I don't mind adding one, but I'm
probably not the right person to make that call.
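
The symptom from the ticket description (the "æ" disappearing from the redirect target) can be reproduced with a small sketch. The character class below is only modelled on the kind of ASCII whitelist this comment refers to; it is an assumption, not a copy of the code in pluggable.php, and strip_non_ascii_url_chars() is a hypothetical name.

```php
<?php
/**
 * Hypothetical stand-in for an ASCII-only redirect filter.
 * Every byte outside the whitelist is removed, including both
 * bytes of the UTF-8 encoded "æ".
 */
function strip_non_ascii_url_chars($location) {
    return preg_replace('|[^a-z0-9~+_.?#=&;,/:%!-]|i', '', $location);
}

// The IDN host loses its non-ASCII character ...
echo strip_non_ascii_url_chars('http://domæne.dk/wp-admin/'), "\n";
// ... while the Punycode form passes through unchanged.
echo strip_non_ascii_url_chars('http://xn--domne-ura.dk/wp-admin/'), "\n";
```

This is why running an IDN-to-ASCII conversion before such a filter would let the existing filter stay as it is.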

#7 in reply to: ↑ 3 ; follow-up: @hakre
15 years ago

Replying to dwright:

Honestly, I feel this is attributable to a bug in php itself
http://bugs.php.net/bug.php?id=45657 (they of course disagree)

Because it's not a bug in PHP. Your comment is pretty much misleading.


This is not a bug. I suggest simply configuring such blogs with the Punycode notation of the domain instead of the (technically non-existent) UTF-8 representation.

I suggest closing as invalid. The patch is, IMHO, a wreck. It might do the job for the reporter, but it actually breaks parts of the software that are not broken.

#8 in reply to: ↑ 7 ; follow-up: @dwright
15 years ago

Replying to hakre:

I suggest closing as invalid. The patch is, IMHO, a wreck. It might do the job for the reporter, but it actually breaks parts of the software that are not broken.

As I mentioned in bug #10690, I believe WordPress should be an IDNA-enabled application.

In order to achieve that, a PHP IDN library needs to be employed. The most promising one appears to be http://freshmeat.net/projects/php_net_idna, and it is GPL'd.

Using that library, this issue would be resolved by the included patch. The current patch should work for all cases and not break any existing functionality.

@dwright
15 years ago

patch assumes the idna library idna_convert.class.php is available in the path

#9 @scribu
15 years ago

  • Component changed from Security to Canonical
  • Keywords has-patch added
  • Owner changed from ryan to markjaquith

#10 @dwright
15 years ago

On rethinking this, it seems that IDN support may well be best served by a plugin.

I wrote the plugin and just need to finish up the docs. I'll submit it to WP Plugins in the next few days, so if it's decided a plugin is the way to go, it will be there (otherwise, I can always remove it).

#11 follow-up: @scribu
15 years ago

It's good to have a plugin first, to see what challenges appear.

It could be integrated into core in the future, when IDNs are more prevalent.

Please close as 'worksforme' when you have the plugin ready.

#12 in reply to: ↑ 11 @dwright
15 years ago

  • Resolution set to worksforme
  • Status changed from new to closed

Replying to scribu:

It's good to have a plugin first, to see what challenges appear.

Ok, that makes a lot of sense.

Please close as 'worksforme' when you have the plugin ready.

I submitted it to WP Plugins proper; until I hear from them, the plugin is available at http://www.dwright.us/misc/idna/index.html

#13 in reply to: ↑ 8 ; follow-up: @hakre
15 years ago

Replying to dwright:

As I mentioned in bug #10690, I believe WordPress should be an IDNA-enabled application.

Sure, why not? I never spoke against IDNs, only against using UTF-8 in HTTP headers. And having HTTP working is much more important, if you ask me. I see you've found IDN / Punycode; that's the direction to go.

In order to achieve that, a PHP IDN library needs to be employed. The most promising one appears to be http://freshmeat.net/projects/php_net_idna, and it is GPL'd.

Sorry to be the one to say it: that library is written in PHP 5. Net_IDNA ships with PHP 4 support. WordPress is PHP 4 code, so PHP 5 is not an option here (but PHP 5 is very much OK for a plugin).

Using that library, this issue would be resolved by the included patch. The current patch should work for all cases and not break any existing functionality.

To load library files, please see this line of code:

require_once dirname(dirname(__FILE__)) . '/Renderer.php';

(Taken from wp-includes/Text/Diff/Renderer/inline.php line 19)

This might give you an idea of how you can include the library, for example in your plugin:

require_once dirname(dirname(__FILE__)) . '/idna_convert.class.php';

(Using the full path prevents the system from searching for your include file in the various include paths.)

Keep in mind that HTTP headers must be valid ASCII, that's why

header("Location: http://www.uddannelsesstøtte.dk");

is not a bug in PHP. That value is not 7-bit ASCII, so do not be surprised if it does not work properly here and there. You can learn all the details about HTTP in RFC 1945 and RFC 2616.
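
A quick way to see the point about 7-bit ASCII is to check a Location value before sending it. This is a sketch only: is_ascii_header_value() is a hypothetical helper, not a WordPress or PHP function, and it checks for printable US-ASCII rather than the full header syntax from the RFCs.

```php
<?php
/**
 * Hypothetical check: true only if $value consists solely of
 * printable 7-bit ASCII characters (0x20 to 0x7E).
 */
function is_ascii_header_value($value) {
    // \z anchors at the very end, so a trailing newline is rejected too.
    return preg_match('/^[\x20-\x7E]*\z/', $value) === 1;
}

var_dump(is_ascii_header_value('http://www.uddannelsesstøtte.dk')); // bool(false)
var_dump(is_ascii_header_value('http://xn--domne-ura.dk/'));        // bool(true)
```

A value that fails this check would need its host converted to Punycode before being passed to header().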

I hope this information helps you improve your plugin and think about how this can be properly implemented in core. Libraries for core are located in the wp-includes directory.

#14 @nacin
15 years ago

  • Milestone Future Release deleted

#15 in reply to: ↑ 13 @dwright
15 years ago

Replying to hakre:

Sorry to be the one to say it: that library is written in PHP 5. Net_IDNA ships with PHP 4 support. WordPress is PHP 4 code, so PHP 5 is not an option here (but PHP 5 is very much OK for a plugin).

Version 6.0 was PHP 4 (current is 6.3), so it wouldn't really be much work to get a PHP 4-only version. If it's decided in the future that IDNA support is indeed desired in WordPress core, I'm sure a code review of any potential libraries would happen then. I ended up using php_net_idna in the IDNA plugin: http://wordpress.org/extend/plugins/idna/

Keep in mind that HTTP headers must be valid ASCII, that's why

header("Location: http://www.uddannelsesstøtte.dk");

is not a bug in PHP. That value is not 7-bit ASCII, so do not be surprised if it does not work properly here and there. You can learn all the details about HTTP in RFC 1945 and RFC 2616.

Yes, I'm aware that HTTP headers are US-ASCII. Until I read http://www.rfc-editor.org/rfc/rfc3490.txt (the IDNA RFC), I expected that the PHP header function would do the conversion. Now I see that it's an enhancement to be implemented by developers at the application level.
