Ticket #209 (closed defect (bug): fixed)

Opened 8 years ago

Last modified 5 years ago

Subject line of sent e-mails is not UTF-8

Reported by: crculver Owned by: rob1n
Priority: normal Milestone: 2.2
Component: Administration Version:
Severity: normal Keywords:
Cc:

Description

Although WordPress suggests using UTF-8 internally, this is not respected in the subject line of the e-mails sent out by the admin scripts. For example, my blog's name is a series of Greek letters in UTF-8, but when I get an e-mail after reporting a lost password, the name of the blog in the subject line appears as gibberish.

(I'm using Emacs-mew which normally shows UTF-8 subject lines fine, so I assume this is a WP thing).

Attachments

0000209-email_mime.diff Download (4.3 KB) - added by crculver 7 years ago.
209.diff Download (708 bytes) - added by rob1n 5 years ago.

Change History

comment:2   cal 7 years ago

details for sending utf-8 email subjects:  http://code.iamcal.com/php/utf8_mail/readme.txt

i'll work on a patch

comment:3   matt 7 years ago

Would optionally using the mb_send_mail function (if available) have any effect on this?

comment:4   cal 7 years ago

The attached patch adds the function mail_encoded() with a prototype similar to mail().

It adds the relevant email headers (MIME type, version and optionally From) and escapes the subject using quoted-printable.

Note that bug 263 contains a workaround for a PHP bug which will need applying to the patch for this bug.

I wrote a similar patch :-)

Some problems I see with your patch: there might be a problem with using "_" for a space in charsets where byte 0x20 is not the space character. Also, your function doesn't account for the fact that subject lines may only be 76 characters long ...

comment:7   rq 7 years ago

WARNING! You MUST encode ALL the headers, not only the subject. Currently, WP isn't following the RFCs. With cal's patch, it won't be following them either.

BTW, it would be a lot easier to use Base64 encoding for headers:

// encode a given header text to base64
function encode_header($header) {
	$ret = '=?' . get_settings('blog_charset') . '?B?' . base64_encode($header) . '?=';
	return $ret;
}

I would suggest using this function for encoding MIME headers, and adding the MIME-Version and Content-Type headers on each send-mail request. That would allow easy processing of EVERY header or a part of it (e.g., the first part of the "From" header).
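A minimal sketch of the "first part of the From header" idea above: only the display name is wrapped in a base64 encoded-word, while the address stays plain ASCII. The function name encode_from_header() and its parameters are hypothetical, and the charset is passed in explicitly instead of being read from get_settings('blog_charset') so that the snippet runs outside WordPress. It does not handle ASCII names that need quoting (commas, quotes) or long names that need folding.

```php
<?php
// Hypothetical helper, not part of WordPress: builds a From header whose
// display name is a base64 encoded-word and whose address is left untouched.
function encode_from_header($name, $email, $charset = 'UTF-8') {
	// Names made only of printable ASCII are sent as they are.
	if (!preg_match('/[^\x20-\x7E]/', $name)) {
		return 'From: ' . $name . ' <' . $email . '>';
	}
	// Encode the display name only; the address must stay plain ASCII.
	return 'From: =?' . $charset . '?B?' . base64_encode($name) . '?= <' . $email . '>';
}

// Example address is a placeholder.
echo encode_from_header('Ελληνικά', 'wordpress@example.com'), "\n";
```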

comment:8   matt 7 years ago

  • Owner changed from anonymous to rboren
  • Status changed from new to assigned
  • Patch set to No

Prodding this one. Still an issue? Realistic to fix?

  • Keywords bg|2nd-opinion bg|dev-feedback added
  • Keywords bg|needs-patch added

Just tried putting ™ in the blog name (as a random UTF-8 character) and this is still an issue.

From experience, I think base64-encoding the subject line is probably a good plan here.

  • Cc sjmurdoch added

This also affects my blog, "A ſecurity diſcourſe". Emails are sent with an invalid subject which is displayed as "[A Å¿ecurity diÅ¿courÅ¿e] ..."

Switching to base64 would make subjects unreadable for clients that do not support MIME encoding in headers (e.g. exmh). What about using quoted-printable for non-ASCII characters? This would allow UTF-8 but would still keep the subject legible for non-MIME-aware clients.
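To make the trade-off concrete, here is a rough sketch comparing the two encodings for the blog name above. The helper q_encode_subject() is hypothetical and simplified: it encodes the whole subject as a single "Q" encoded-word and ignores the length limit on encoded-words, so it only illustrates readability.

```php
<?php
// Hypothetical helper illustrating RFC 2047 "Q" encoding: ASCII letters and
// digits stay readable, a space becomes "_", every other byte becomes =XX.
function q_encode_subject($subject, $charset = 'UTF-8') {
	// Subjects made only of printable ASCII are sent unencoded.
	if (!preg_match('/[^\x20-\x7E]/', $subject)) {
		return $subject;
	}
	$out = '';
	for ($i = 0; $i < strlen($subject); $i++) {
		$c = $subject[$i];
		if ($c === ' ') {
			$out .= '_';
		} elseif (preg_match('/^[A-Za-z0-9]$/', $c)) {
			$out .= $c;
		} else {
			$out .= sprintf('=%02X', ord($c));
		}
	}
	return '=?' . $charset . '?Q?' . $out . '?=';
}

$title = "A \xC5\xBFecurity di\xC5\xBFcour\xC5\xBFe"; // "A ſecurity diſcourſe" as UTF-8 bytes
echo q_encode_subject($title), "\n";                     // Q form: ASCII parts stay legible
echo '=?UTF-8?B?' . base64_encode($title) . '?=', "\n";  // B form: fully opaque
```

In the Q form a client without MIME header support would still show "ecurity" and "cour" in the raw subject, whereas the base64 form is unreadable without decoding.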

  • Cc kpumuk added

I have the same problem and solved it in the following way (version 2.0.2):

if ( !function_exists('wp_mail') ) :
function wp_mail($to, $subject, $message, $headers = '') {
	if( $headers == '' ) {
		$headers = "MIME-Version: 1.0\n" .
			"From: wordpress@" . preg_replace('#^www\.#', '', strtolower($_SERVER['SERVER_NAME'])) . "\n" . 
			"Content-Type: text/plain; charset=\"" . get_settings('blog_charset') . "\"\n";
	}

	return @mail($to, wp_encodeMimeSubject($subject), $message, $headers);
}
function wp_encodeMimeSubject($s) {
   
   $lastspace=-1;
   $r="";
   $buff="";
   
   $mode=1;
   
   for ($i=0; $i<strlen($s); $i++) {
       $c=substr($s,$i,1);
       if ($mode==1) {
           $n=ord($c);
           if ($n & 128) {
               $r.="=?" . get_settings('blog_charset') . "?Q?";
               $i=$lastspace;
               $mode=2;
           } else {
               $buff.=$c;
               if ($c==" ") {
                   $r.=$buff;
                   $buff="";
                   $lastspace=$i;
               }
           }
       } else if ($mode==2) {
           $r.=wp_qpchar($c);
       }
   }
   if ($mode==1) $r.=$buff; // flush the last ASCII word, which no trailing space has flushed
   if ($mode==2) $r.="?=";
   
   return $r;
   
}

function wp_qpchar($c) {
   $n=ord($c);
   if ($c==" ") return "_";
   if ($n>=48 && $n<=57) return $c;
   if ($n>=65 && $n<=90) return $c;
   if ($n>=97 && $n<=122) return $c;
   return "=".($n<16 ? "0" : "").strtoupper(dechex($n));
   
}
endif;
  • Version changed from 1.2 to 2.0.4
  • Severity changed from trivial to normal

Is this ever going to be fixed? I'm using kpumuk's change and it works fine. This is not a 'trivial' bug for foreign language blogs.

  • Cc jimlick added

I've taken kpumuk's code and turned it into a plugin so that it's easily installable.

 http://jameslick.com/wp-rfc2047/

This is tested on WordPress 2.0.4.

  • Keywords bg|has-patch added; bg|needs-patch removed
  • Version 2.0.4 deleted

I think developers should think about fixing this issue. It can get very annoying. No, sorry. It does get very annoying when you receive notifications of new comments (from your own or some other blog) and you are not able to tell by glancing at the subject (in some cases at the sender, too) what the email is about. For example:

From: XXXXXXXXXXXXXXXXXXXX XXXXXXXX XXXXXXXXXXXX XXXXXXXXXXXXXXXXXX <wordpress@somesite.lv> 
Subject: [XXXXXXXXXX XXXXXXXXXX] Comment: "PXXrXXtis frXXXXu"

Also, we need to take into account that, according to the RFC, one not only needs to base64_encode the contents of headers, but is also required to split them into multiple lines if the resulting encoded string is longer than 84 chars.
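The splitting requirement can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch attached to this ticket: the function name encode_header_folded() is hypothetical, the input is assumed to be valid UTF-8, and the limit used is the 75 characters per encoded-word that RFC 2047 specifies. Splitting happens on whole characters so that no multi-byte character is cut in half.

```php
<?php
// Hypothetical helper: base64-encodes a UTF-8 header value as one or more
// RFC 2047 encoded-words, none longer than 75 characters, joined by folding.
function encode_header_folded($text, $charset = 'UTF-8') {
	$prefix = '=?' . $charset . '?B?';
	$suffix = '?=';
	// Raw bytes that fit in one word: base64 turns 3 bytes into 4 characters.
	$room = (int) floor((75 - strlen($prefix) - strlen($suffix)) / 4) * 3;
	// Split into whole UTF-8 characters so no character straddles two words.
	$chars = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY);
	$words = array();
	$chunk = '';
	foreach ($chars as $ch) {
		if ($chunk !== '' && strlen($chunk) + strlen($ch) > $room) {
			$words[] = $prefix . base64_encode($chunk) . $suffix;
			$chunk = '';
		}
		$chunk .= $ch;
	}
	if ($chunk !== '') {
		$words[] = $prefix . base64_encode($chunk) . $suffix;
	}
	// Continuation lines start with a space, as header folding requires.
	return implode("\r\n ", $words);
}

echo 'Subject: ', encode_header_folded(str_repeat('Ελληνικά ', 10)), "\n";
```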

This bug/feature has been open since Aug 2004. I hope that it is time to fix this.

comment:19 follow-up: ↓ 20   rob1n 5 years ago

  • Cc Sebbi, cal, Citizen K, sjmurdoch, kpumuk, jimlick removed
  • Keywords needs-testing dev-feedback added; bg|2nd-opinion bg|dev-feedback bg|has-patch removed
  • Milestone set to 2.1.1

IMO this should go into 2.1.1. I'm not sure (it's not likely, considering the age of the patch) that the patch still applies to the current trunk. Probably needs a new patch.

comment:20 in reply to: ↑ 19   foolswisdom 5 years ago

  • Milestone changed from 2.1.1 to 2.2

Replying to rob1n:

IMO this should go into 2.1.1. I'm not sure (it's not likely, considering the age of the patch) that the patch still applies to the current trunk. Probably needs a new patch.

Only high-severity bugs should be targeted for 2.1.1. This patch also adds new functionality to fix the problem, making it higher risk. WP 2.2 is targeted for release April 23rd.

Your position would be much more compelling if you confirmed that the patch still applies, and tested it.

  • Owner changed from ryan to rob1n
  • Status changed from assigned to new

Okay, patch doesn't apply to the current trunk.

I'll work on a new patch.

  • Owner changed from rob1n to ryan

rob1n 5 years ago

  • Owner changed from ryan to rob1n
  • Status changed from new to assigned

Okay, new patch added that adds crculver's code to wp_mail, since WordPress no longer uses PHP's mail() function directly.

  • Keywords has-patch added; dev-feedback removed

See #3862

  • Status changed from assigned to closed
  • Resolution set to fixed

Should be fixed as we now use phpmailer (see #3862).

  • Keywords needs-testing has-patch removed