Make WordPress Core

Opened 20 years ago

Closed 18 years ago

Last modified 18 years ago

#209 closed defect (bug) (fixed)

Subject line of sent e-mails is not UTF-8

Reported by: crculver
Owned by: rob1n
Milestone: 2.2
Priority: normal
Severity: normal
Version:
Component: Administration
Keywords:
Focuses:
Cc:

Description

Although WordPress suggests using UTF-8 internally, this is not respected in the subject line of the e-mails sent out by the admin scripts. For example, my blog's name is a series of Greek letters in UTF-8, but when I get an e-mail after reporting a lost password, the name of the blog in the subject line appears as gibberish.

(I'm using Emacs-mew which normally shows UTF-8 subject lines fine, so I assume this is a WP thing).

Attachments (2)

0000209-email_mime.diff (4.3 KB) - added by crculver 20 years ago.
209.diff (708 bytes) - added by rob1n 18 years ago.

Change History (29)

#2 @cal
20 years ago

details for sending utf-8 email subjects:
http://code.iamcal.com/php/utf8_mail/readme.txt

i'll work on a patch

#3 @matt
20 years ago

Would optionally using the mb_send_mail function (if available) have any effect on this?

#4 @cal
20 years ago

the attached patch adds the function mail_encoded() with a similar prototype to mail().

it adds the relevant email headers (MIME type, MIME version and optionally From) and escapes the subject using quoted-printable.
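
A minimal sketch of the quoted-printable ("Q") subject escaping described above. This is not the code from the attached patch: the function name `sketch_q_encode_header` is hypothetical, the input is assumed to be UTF-8 bytes, and line-length limits are deliberately ignored.

```php
<?php
// Hypothetical helper for illustration; not the mail_encoded() from the attached patch.
// Wraps a header value in a single RFC 2047 "Q" encoded-word.
function sketch_q_encode_header(string $text, string $charset = 'UTF-8'): string
{
    $encoded = '';
    for ($i = 0, $n = strlen($text); $i < $n; $i++) {
        $byte = $text[$i];
        if ($byte === ' ') {
            $encoded .= '_';                          // Q-encoding writes a space as "_"
        } elseif (preg_match('/^[A-Za-z0-9]$/', $byte)) {
            $encoded .= $byte;                        // ASCII letters and digits pass through
        } else {
            $encoded .= sprintf('=%02X', ord($byte)); // any other byte becomes =XX
        }
    }
    return '=?' . $charset . '?Q?' . $encoded . '?=';
}

// Example: a subject containing a Greek blog name (assumed to be UTF-8 bytes).
echo sketch_q_encode_header('[Ελληνικά] Password reset'), "\n";
```

The sketch encodes more characters than strictly necessary (all punctuation becomes =XX), which keeps the rules short at the cost of a longer header.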

#5 @mikelittle
20 years ago

Note that bug 263 contains a workaround for a PHP bug which will need applying to the patch for this bug.

#6 @Sebbi
20 years ago

I wrote a similar patch :-)

Some problems I see with your patch:
Using "_" for a space might be a problem in charsets where byte 0x20 is not the space character. Also, your function doesn't account for the fact that subject lines may only be 76 characters long ...

#7 @rq
20 years ago

WARNING! You MUST encode ALL the headers, not only the subject. Currently, WP isn't following the RFCs. With cal's patch, it won't be following them either.

BTW, it would be a lot easier to use Base64 encoding for headers:

// Encode a given header text as a Base64 MIME encoded-word
function encode_header($header) {
	return '=?' . get_settings('blog_charset') . '?B?' . base64_encode($header) . '?=';
}

I would suggest to use this function for encoding MIME headers, and add those MIME-version and Content-type headers on each send mail request. That would allow easy processing of EVERY header or a part of it (i.e., the first part of the "From" header).
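
As a rough illustration of that suggestion, the sketch below builds the MIME-Version, Content-Type and From headers and Base64-encodes only the display name of the sender. The function names and the example address are hypothetical, and the `$charset` parameter stands in for the `blog_charset` setting so the block runs outside WordPress. Long display names would additionally need splitting, which this sketch does not attempt.

```php
<?php
// Hypothetical helpers; $charset stands in for the blog_charset setting.
function sketch_b_encode_word(string $text, string $charset = 'UTF-8'): string
{
    return '=?' . $charset . '?B?' . base64_encode($text) . '?=';
}

function sketch_mime_headers(string $fromName, string $fromAddress, string $charset = 'UTF-8'): string
{
    // Drop CR/LF from the address so a crafted value cannot inject extra headers.
    $fromAddress = str_replace(["\r", "\n"], '', $fromAddress);

    $lines = [
        'MIME-Version: 1.0',
        'Content-Type: text/plain; charset="' . $charset . '"',
        // Only the display name is encoded; the address itself stays as-is.
        'From: ' . sketch_b_encode_word($fromName, $charset) . ' <' . $fromAddress . '>',
    ];
    return implode("\r\n", $lines);
}

// Example with an assumed sender name and a placeholder address.
echo sketch_mime_headers('Ελληνικά', 'wordpress@example.org'), "\n";
```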

#8 @matt
20 years ago

  • Owner changed from anonymous to rboren
  • Patch set to No
  • Status changed from new to assigned

#9 @markjaquith
19 years ago

Prodding this one. Still an issue? Realistic to fix?

#10 @markjaquith
19 years ago

  • Keywords bg|2nd-opinion bg|dev-feedback added

#11 @westi
19 years ago

  • Keywords bg|needs-patch added

Just tried putting ™ in the blog name (as in a random UTF-8 char) and this is still an issue.

From experience, I think base64 encoding the subject line is probably a good plan here.

#12 @sjmurdoch
19 years ago

  • Cc sjmurdoch added

This also affects my blog, "A ſecurity diſcourſe". Emails are sent with an invalid subject which is displayed as "[A Å¿ecurity diÅ¿courÅ¿e] ..."

Switching to base64 would make subjects unreadable for clients that do not support MIME types in headers (e.g. exmh). What about using quoted-printable for non-ASCII characters? This would allow UTF-8 but would still keep the subject legible for non-MIME-aware clients.
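
One way to read this proposal, sketched under assumptions: leave pure-ASCII words untouched and Q-encode only the runs of words that contain non-ASCII bytes. The function names are hypothetical, the input is assumed to be UTF-8, and neither line-length limits nor ASCII text that already looks like an encoded-word are handled.

```php
<?php
// Hypothetical helpers for illustration only.
function sketch_q_bytes(string $text): string
{
    $encoded = '';
    for ($i = 0, $n = strlen($text); $i < $n; $i++) {
        $byte = $text[$i];
        if ($byte === ' ') {
            $encoded .= '_';                          // space inside an encoded-word
        } elseif (preg_match('/^[A-Za-z0-9]$/', $byte)) {
            $encoded .= $byte;                        // stays readable in plain clients
        } else {
            $encoded .= sprintf('=%02X', ord($byte)); // any other byte becomes =XX
        }
    }
    return $encoded;
}

function sketch_q_encode_runs(string $subject, string $charset = 'UTF-8'): string
{
    $out = [];  // finished tokens, joined with single spaces at the end
    $run = [];  // consecutive words that will share one encoded-word

    $flush = function () use (&$out, &$run, $charset): void {
        if ($run) {
            $out[] = '=?' . $charset . '?Q?' . sketch_q_bytes(implode(' ', $run)) . '?=';
            $run = [];
        }
    };

    foreach (explode(' ', $subject) as $word) {
        $hasHighByte = preg_match('/[\x80-\xFF]/', $word) === 1;
        // Empty tokens (from repeated spaces) stay inside an open run, because
        // decoders discard whitespace that separates two encoded-words.
        if ($hasHighByte || ($word === '' && $run)) {
            $run[] = $word;
        } else {
            $flush();
            $out[] = $word;
        }
    }
    $flush();
    return implode(' ', $out);
}

// Example using the blog name from this comment.
echo sketch_q_encode_runs('[A ſecurity diſcourſe] New comment'), "\n";
```

Grouping adjacent non-ASCII words into one encoded-word is a design choice here: two encoded-words separated only by a space would lose that space when decoded.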

#13 @kpumuk
19 years ago

  • Cc kpumuk added

I have the same problem and solved it the following way (version 2.0.2):

if ( !function_exists('wp_mail') ) :
function wp_mail($to, $subject, $message, $headers = '') {
	if( $headers == '' ) {
		$headers = "MIME-Version: 1.0\n" .
			"From: wordpress@" . preg_replace('#^www\.#', '', strtolower($_SERVER['SERVER_NAME'])) . "\n" . 
			"Content-Type: text/plain; charset=\"" . get_settings('blog_charset') . "\"\n";
	}

	return @mail($to, wp_encodeMimeSubject($subject), $message, $headers);
}
function wp_encodeMimeSubject($s) {
   
   $lastspace=-1;
   $r="";
   $buff="";
   
   $mode=1;
   
   for ($i=0; $i<strlen($s); $i++) {
       $c=substr($s,$i,1);
       if ($mode==1) {
           $n=ord($c);
           if ($n & 128) {
               $r.="=?" . get_settings('blog_charset') . "?Q?";
               $i=$lastspace;
               $mode=2;
           } else {
               $buff.=$c;
               if ($c==" ") {
                   $r.=$buff;
                   $buff="";
                   $lastspace=$i;
               }
           }
       } else if ($mode==2) {
           $r.=wp_qpchar($c);
       }
   }
   if ($mode==2) $r.="?=";
   else $r.=$buff; // flush the last ASCII word when nothing needed encoding
   
   return $r;
   
}

function wp_qpchar($c) {
   $n=ord($c);
   if ($c==" ") return "_";
   if ($n>=48 && $n<=57) return $c;
   if ($n>=65 && $n<=90) return $c;
   if ($n>=97 && $n<=122) return $c;
   return "=".($n<16 ? "0" : "").strtoupper(dechex($n));
   
}
endif;

#14 @jimlick
18 years ago

  • Severity changed from trivial to normal
  • Version changed from 1.2 to 2.0.4

Is this ever going to be fixed? I'm using kpumuk's change and it works fine. This is not a 'trivial' bug for foreign language blogs.

#15 @jimlick
18 years ago

  • Cc jimlick added

#16 @jimlick
18 years ago

I've taken kpumuk's code and turned it into a plugin so that it's easily installable.

http://jameslick.com/wp-rfc2047/

This is tested on WordPress 2.0.4.

#17 @shorty114
18 years ago

  • Keywords bg|has-patch added; bg|needs-patch removed

#18 @laacz
18 years ago

  • Version 2.0.4 deleted

I think the developers should consider fixing this issue. It can get very annoying. No, sorry: it does get very annoying when you receive notifications about new comments (from your own or some other blog) and cannot tell from a glance at the subject (in some cases the sender, too) what the email is about. For example:

From: XXXXXXXXXXXXXXXXXXXX XXXXXXXX XXXXXXXXXXXX XXXXXXXXXXXXXXXXXX <wordpress@somesite.lv> 
Subject: [XXXXXXXXXX XXXXXXXXXX] Comment: "PXXrXXtis frXXXXu"

Also, we need to take into account that, according to RFC 2047, it is not enough to base64_encode the contents of headers: the result must also be split across multiple lines, because each encoded-word may be at most 75 characters long.

This bug/feature has been open since Aug 2004. I hope it is time to fix this.
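
The splitting requirement can be sketched as follows. RFC 2047 limits each encoded-word to 75 characters and each line containing one to 76, so the text is cut into chunks on UTF-8 character boundaries and each chunk becomes its own Base64 encoded-word. The function name is hypothetical, the input is assumed to be valid UTF-8, and the field name is assumed to be followed by ": " on the first line.

```php
<?php
// Hypothetical helper: Base64-encode a UTF-8 header value as folded encoded-words.
function sketch_fold_b_encode(string $text, string $fieldName = 'Subject'): string
{
    $prefix = '=?UTF-8?B?';
    $suffix = '?=';
    $overhead = strlen($prefix) + strlen($suffix);

    // Raw bytes that fit in one encoded-word of at most $wordLength characters
    // (Base64 turns every 3 bytes into 4 characters).
    $capacity = function (int $wordLength) use ($overhead): int {
        return intdiv($wordLength - $overhead, 4) * 3;
    };
    // The first line also has to hold the field name plus ": ".
    $firstMax = $capacity(min(75, 76 - strlen($fieldName) - 2));
    $restMax  = $capacity(75);

    // Split into whole UTF-8 characters so no multi-byte sequence is cut in half.
    $chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    $words = [];
    $chunk = '';
    $limit = $firstMax;
    foreach ($chars as $char) {
        if ($chunk !== '' && strlen($chunk) + strlen($char) > $limit) {
            $words[] = $prefix . base64_encode($chunk) . $suffix;
            $chunk = '';
            $limit = $restMax;
        }
        $chunk .= $char;
    }
    if ($chunk !== '') {
        $words[] = $prefix . base64_encode($chunk) . $suffix;
    }
    // CRLF plus a space folds the header; decoders drop whitespace between encoded-words.
    return implode("\r\n ", $words);
}

// Example: a long subject made of repeated Greek text.
echo 'Subject: ', sketch_fold_b_encode(str_repeat('Ελληνικά ', 12)), "\n";
```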

#19 follow-up: @rob1n
18 years ago

  • Cc Sebbi cal Citizen K sjmurdoch kpumuk jimlick removed
  • Keywords needs-testing dev-feedback added; bg|2nd-opinion bg|dev-feedback bg|has-patch removed
  • Milestone set to 2.1.1

IMO this should go into 2.1.1. I'm not sure (it's not likely, considering the age of the patch) that the patch still applies to the current trunk. Probably needs a new patch.

#20 in reply to: ↑ 19 @foolswisdom
18 years ago

  • Milestone changed from 2.1.1 to 2.2

Replying to rob1n:

IMO this should go into 2.1.1. I'm not sure (it's not likely, considering the age of the patch) that the patch still applies to the current trunk. Probably needs a new patch.

Only high severity bugs should be targeted for 2.1.1. This patch also adds new functionality to fix the problem, making it higher risk. WP 2.2 is targeted for release April 23rd.

Your position would be much more compelling if you confirmed that the patch still applies, and tested it.

#21 @rob1n
18 years ago

  • Owner changed from ryan to rob1n
  • Status changed from assigned to new

Okay, patch doesn't apply to the current trunk.

I'll work on a new patch.

#22 @rob1n
18 years ago

  • Owner changed from rob1n to ryan

@rob1n
18 years ago

  • Attachment 209.diff added

#23 @rob1n
18 years ago

  • Owner changed from ryan to rob1n
  • Status changed from new to assigned

Okay, new patch added that adds crculver's code to wp_mail, since WordPress no longer uses PHP's mail() function directly.

#24 @rob1n
18 years ago

  • Keywords has-patch added; dev-feedback removed

#26 @rob1n
18 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

Should be fixed as we now use phpmailer (see #3862).

#27 @rob1n
18 years ago

  • Keywords needs-testing has-patch removed