#209 closed defect (bug) (fixed)
Subject line of sent e-mails is not UTF-8
Reported by: | crculver | Owned by: | rob1n |
---|---|---|---|
Milestone: | 2.2 | Priority: | normal |
Severity: | normal | Version: | |
Component: | Administration | Keywords: | |
Focuses: | Cc: |
Description
Although WordPress suggests using UTF-8 internally, this is not respected in the subject line of the e-mails sent out by the admin scripts. For example, my blog's name is a series of Greek letters in UTF-8, but when I get an e-mail after reporting a lost password, the name of the blog in the subject line appears as gibberish.
(I'm using Emacs-mew which normally shows UTF-8 subject lines fine, so I assume this is a WP thing).
Attachments (2)
Change History (29)
#3
@
20 years ago
Would optionally using the mb_send_mail function (if available) have any effect on this?
#4
@
20 years ago
The attached patch adds the function mail_encoded(), with a prototype similar to mail().
It adds the relevant e-mail headers (MIME-Version, Content-Type and optionally From) and escapes the subject using quoted-printable.
#5
@
20 years ago
Note that bug 263 contains a workaround for a PHP bug which will need applying to the patch for this bug.
#6
@
20 years ago
I wrote a similar patch :-)
Some problems I see with your patch:
There might be a problem with using "_" for a space in charsets where %20 is not the space character. Also, your function doesn't take into account that subject lines may be at most 76 characters long ...
#7
@
20 years ago
WARNING! You MUST encode ALL the headers, not only the subject. Currently, WP isn't following the RFCs. With cal's patch, it won't be following them either.
BTW, it would be a lot easier to use Base64 encoding for headers:
// Encode a given header text to base64
function encode_header($header) {
    $ret = '=?' . get_settings('blog_charset') . '?B?' . base64_encode($header) . '?=';
    return $ret;
}
I would suggest using this function to encode MIME headers, and adding the MIME-Version and Content-Type headers to every mail that is sent. That would allow easy processing of EVERY header or a part of it (for example, the first part of the "From" header).
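A minimal sketch of the Base64 approach suggested in this comment, written so that it runs outside WordPress. The names sketch_encode_header_b64() and sketch_mime_headers() are hypothetical, and the charset is passed in as a parameter instead of being read from get_settings('blog_charset'). It assumes the value is short enough to fit in a single encoded-word; RFC 2047 limits each encoded-word to 75 characters, so long values would need splitting.

```php
<?php
// Hypothetical helper (not part of WordPress): wrap a header value in an
// RFC 2047 Base64 encoded-word, but only when it contains non-ASCII bytes.
function sketch_encode_header_b64(string $text, string $charset = 'UTF-8'): string
{
    // Printable ASCII needs no encoding, so leave it readable as-is.
    if (!preg_match('/[^\x20-\x7E]/', $text)) {
        return $text;
    }
    return '=?' . $charset . '?B?' . base64_encode($text) . '?=';
}

// Hypothetical helper: the MIME headers the comment recommends adding to
// every outgoing message so that the body charset is declared as well.
function sketch_mime_headers(string $charset = 'UTF-8'): string
{
    return "MIME-Version: 1.0\r\n"
         . 'Content-Type: text/plain; charset="' . $charset . "\"\r\n";
}

// Example: a blog name in Greek followed by an ASCII suffix.
echo sketch_encode_header_b64('[Ελληνικά] Lost password'), "\n";
echo sketch_encode_header_b64('[My Blog] Lost password'), "\n";
```

The second call prints its input unchanged, since an all-ASCII subject does not need to be wrapped.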
#8
@
20 years ago
- Owner changed from anonymous to rboren
- Patch set to No
- Status changed from new to assigned
#11
@
19 years ago
- Keywords bg|needs-patch added
Just tried putting ™ in the blog name (as a random UTF-8 character) and this is still an issue.
From experience, I think base64 encoding the subject line is probably a good plan here.
#12
@
19 years ago
- Cc sjmurdoch added
This also affects my blog, "A ſecurity diſcourſe". Emails are sent with an invalid subject which is displayed as "[A Å¿ecurity diÅ¿courÅ¿e] ..."
Switching to base64 would make subjects unreadable for clients that do not support MIME encoding in headers (e.g. exmh). What about using quoted-printable for non-ASCII characters? This would allow UTF-8 but would still keep the subject legible for non-MIME-aware clients.
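To illustrate the legibility argument, here is a rough sketch of RFC 2047 "Q" encoding. The function name sketch_encode_header_q() is hypothetical and the charset is a plain parameter; this is not the code from any patch attached to this ticket. Letters and digits pass through untouched, spaces become "_", and every other byte is written as =XX.

```php
<?php
// Hypothetical helper (not part of WordPress): encode a header value as an
// RFC 2047 "Q" encoded-word, keeping ASCII letters and digits readable.
function sketch_encode_header_q(string $text, string $charset = 'UTF-8'): string
{
    // Nothing to do for plain printable ASCII.
    if (!preg_match('/[^\x20-\x7E]/', $text)) {
        return $text;
    }

    $encoded = '';
    $length = strlen($text);
    for ($i = 0; $i < $length; $i++) {
        $byte = $text[$i];
        if ($byte === ' ') {
            // Q encoding represents a space as an underscore.
            $encoded .= '_';
        } elseif (preg_match('/^[A-Za-z0-9]$/', $byte)) {
            $encoded .= $byte;
        } else {
            // Everything else (punctuation, non-ASCII bytes) becomes =XX.
            $encoded .= sprintf('=%02X', ord($byte));
        }
    }
    return '=?' . $charset . '?Q?' . $encoded . '?=';
}

echo sketch_encode_header_q('A ſecurity diſcourſe'), "\n";
```

Even in a client that shows the raw header, most of the words in the result remain recognisable, which is the point being made above. Like any single encoded-word, the output is only valid while it stays within 75 characters.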
#13
@
19 years ago
- Cc kpumuk added
I have the same problem and solved it in the following way (version 2.0.2):
if ( !function_exists('wp_mail') ) :
function wp_mail($to, $subject, $message, $headers = '') {
    if( $headers == '' ) {
        $headers = "MIME-Version: 1.0\n" .
            "From: wordpress@" . preg_replace('#^www\.#', '', strtolower($_SERVER['SERVER_NAME'])) . "\n" .
            "Content-Type: text/plain; charset=\"" . get_settings('blog_charset') . "\"\n";
    }
    return @mail($to, wp_encodeMimeSubject($subject), $message, $headers);
}
// Copy leading ASCII words as-is, then Q-encode from the first word that
// contains a non-ASCII byte through to the end of the subject.
function wp_encodeMimeSubject($s) {
    $lastspace=-1;
    $r="";
    $buff="";
    $mode=1;
    for ($i=0; $i<strlen($s); $i++) {
        $c=substr($s,$i,1);
        if ($mode==1) {
            $n=ord($c);
            if ($n & 128) {
                // Non-ASCII byte: rewind to the start of the current word.
                $r.="=?" . get_settings('blog_charset') . "?Q?";
                $i=$lastspace;
                $mode=2;
            } else {
                $buff.=$c;
                if ($c==" ") {
                    $r.=$buff;
                    $buff="";
                    $lastspace=$i;
                }
            }
        } else if ($mode==2) {
            $r.=wp_qpchar($c);
        }
    }
    if ($mode==2) $r.="?=";
    else $r.=$buff; // pure-ASCII subject: keep the last buffered word
    return $r;
}
// Q-encode a single byte: space becomes "_", alphanumerics pass through.
function wp_qpchar($c) {
    $n=ord($c);
    if ($c==" ") return "_";
    if ($n>=48 && $n<=57) return $c;
    if ($n>=65 && $n<=90) return $c;
    if ($n>=97 && $n<=122) return $c;
    return "=".($n<16 ? "0" : "").strtoupper(dechex($n));
}
endif;
#14
@
18 years ago
- Severity changed from trivial to normal
- Version changed from 1.2 to 2.0.4
Is this ever going to be fixed? I'm using kpumuk's change and it works fine. This is not a 'trivial' bug for foreign language blogs.
#16
@
18 years ago
I've taken kpumuk's code and turned it into a plugin so that it's easily installable.
http://jameslick.com/wp-rfc2047/
This is tested on WordPress 2.0.4.
#18
@
18 years ago
- Version 2.0.4 deleted
I think the developers should think about fixing this issue. It can get very annoying. No, sorry. It does get very annoying when you receive notifications of new comments (from your own or some other blog) and you are not able to tell by glancing at the subject (in some cases at the sender, too) what the e-mail is about. For example:
From: XXXXXXXXXXXXXXXXXXXX XXXXXXXX XXXXXXXXXXXX XXXXXXXXXXXXXXXXXX <wordpress@somesite.lv>
Subject: [XXXXXXXXXX XXXXXXXXXX] Comment: "PXXrXXtis frXXXXu"
Also, we need to take into account that, according to the RFC, one not only needs to base64_encode the contents of headers, but is also required to split them into multiple lines if the resulting encoded string is longer than 84 chars.
This bug/feature has been open since Aug 2004. I hope that it is time to fix this.
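The line-splitting requirement mentioned in this comment could be handled along these lines. This is only a sketch under stated assumptions: the function name sketch_fold_header_b64() is hypothetical, the input is assumed to be valid UTF-8, and the limits used are the ones from RFC 2047 (an encoded-word may be at most 75 characters, and a header line containing one at most 76). The text is cut on character boundaries so that no multi-byte character is split between two encoded-words.

```php
<?php
// Hypothetical helper (not part of WordPress): split a UTF-8 header value
// into several Base64 encoded-words and fold them onto continuation lines.
function sketch_fold_header_b64(string $text, string $headerName = 'Subject'): string
{
    $prefix = '=?UTF-8?B?';
    $suffix = '?=';

    // Leave room for "Subject: " on the first line. For simplicity the same
    // budget is applied to every line, which is slightly conservative.
    $budget = 75 - strlen($headerName . ': ') - strlen($prefix . $suffix);
    // Base64 turns every 3 input bytes into 4 output characters.
    $maxBytes = intdiv($budget, 4) * 3;

    // Split into whole UTF-8 characters; fails on invalid UTF-8.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Header value is not valid UTF-8');
    }

    $words = [];
    $chunk = '';
    foreach ($chars as $char) {
        // Start a new encoded-word before the current one would overflow.
        if ($chunk !== '' && strlen($chunk) + strlen($char) > $maxBytes) {
            $words[] = $prefix . base64_encode($chunk) . $suffix;
            $chunk = '';
        }
        $chunk .= $char;
    }
    if ($chunk !== '') {
        $words[] = $prefix . base64_encode($chunk) . $suffix;
    }

    // CRLF followed by a space marks a folded continuation line.
    return implode("\r\n ", $words);
}

echo sketch_fold_header_b64(str_repeat('Ελληνικά ', 8)), "\n";
```

A receiving client decodes each encoded-word separately and ignores the folding whitespace between adjacent encoded-words, so the pieces join back into the original subject.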
#19
follow-up:
↓ 20
@
18 years ago
- Cc Sebbi cal Citizen K sjmurdoch kpumuk jimlick removed
- Keywords needs-testing dev-feedback added; bg|2nd-opinion bg|dev-feedback bg|has-patch removed
- Milestone set to 2.1.1
IMO this should go into 2.1.1. I'm not sure (it's not likely, considering the age of the patch) that the patch still applies to the current trunk. Probably needs a new patch.
#20
in reply to:
↑ 19
@
18 years ago
- Milestone changed from 2.1.1 to 2.2
Replying to rob1n:
IMO this should go into 2.1.1. I'm not sure (it's not likely, considering the age of the patch) that the patch still applies to the current trunk. Probably needs a new patch.
Only high-severity bugs should be targeted for 2.1.1. This patch also adds new functionality to fix the problem, making it higher risk. WP 2.2 is targeted for release April 23rd.
Your position would be much more compelling if you confirmed that the patch still applies, and tested it.
#21
@
18 years ago
- Owner changed from ryan to rob1n
- Status changed from assigned to new
Okay, patch doesn't apply to the current trunk.
I'll work on a new patch.
#23
@
18 years ago
- Owner changed from ryan to rob1n
- Status changed from new to assigned
Okay, new patch added that adds crculver's code to wp_mail, since WordPress no longer uses PHP's mail() function directly.
details for sending utf-8 email subjects:
http://code.iamcal.com/php/utf8_mail/readme.txt
i'll work on a patch