Make WordPress Core

Opened 4 years ago

Closed 20 months ago

#18521 closed defect (bug) (invalid)

Wp_mail function: email subject with multibyte chars is not encoded properly

Reported by: jetpackpony
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 3.2.1
Component: External Libraries
Keywords:
Focuses:
Cc:


When an email with multibyte characters (like Swedish é, å, etc.) in the subject is sent from WordPress (using wp_mail), the subject of the email at the receiving end looks like this: =?UTF-8?Q?New_site_name_Site:_J=C3=A9t_inqsfzxb_p=C3=A5_?= =?UTF-8?Q?=C3=A9?=

Steps to reproduce: create a new blog with a multibyte character in the blog title and then activate it.
For example: go to /wp-signup.php and create a new user. After that you will be asked to create a new blog. Create one and put some é and å characters in the blog title. You will then receive the first email (which should be OK), asking you to activate the account. Activate it, and you will get the second email with the garbled subject.

The cause is the PHPMailer class, which double-encodes the email subject (though only when sending mail using the PHP mail() function). The problem is in these lines (wp-includes/class-phpmailer.php, taken from WP 3.2.1):
(lines 657, 663, 671, 677)
$rt = @mail($val, $this->EncodeHeader($this->SecureHeader($this->Subject)), $body, $header, $params);
$rt = @mail($to, $this->EncodeHeader($this->SecureHeader($this->Subject)), $body, $header, $params);
$rt = @mail($val, $this->EncodeHeader($this->SecureHeader($this->Subject)), $body, $header, $params);
$rt = @mail($to, $this->EncodeHeader($this->SecureHeader($this->Subject)), $body, $header);
The subject is first encoded by the EncodeHeader function and then encoded again by the mail() function itself.

Fix suggestion:
Since the mail() function encodes the subject properly, we don't need to encode it ourselves. So those lines should be replaced with:
$rt = @mail($val, $this->Subject, $body, $header, $params);
$rt = @mail($to, $this->Subject, $body, $header, $params);
$rt = @mail($val, $this->Subject, $body, $header, $params);
$rt = @mail($to, $this->Subject, $body, $header);
I've tried this on my server, and it works well.

I hope this will be fixed and included in a release soon. I've found a good enough temporary solution by overriding wp_mail() with a plugin, but it would still be nice to see the fix in WordPress core.

Attachments (1)

wp_18521.patch (2.0 KB) - added by jetpackpony 4 years ago.
The patch based on suggested fix

Change History (3)

4 years ago

The patch based on suggested fix

#1 @kawauso
4 years ago

  • Component changed from General to External Libraries

#2 @bpetty
20 months ago

  • Keywords needs-patch removed
  • Milestone Awaiting Review deleted
  • Resolution set to invalid
  • Severity changed from major to normal
  • Status changed from new to closed

I tested this with just the blog name changed to "some é and å characters" in single-site mode, and those emails were correct:

Site Name: some é and å characters

Subject: =?UTF-8?Q?[some_=C3=A9_and_=C3=A5_characters]_Password_Reset?=

Since the instructions here seem to imply that this is a multisite installation, I also performed the steps to reproduce by enabling user account registration (under "Allow new registrations": "Both sites and user accounts can be registered."), signing up a new account, and creating a new blog whose name also contained UTF-8 characters:

Network Name: some é and å chars
Site Name: some é and å characters

Subject: =?UTF-8?Q?New_some_=C3=A9_and_=C3=A5_chars_Site:_some_=C3=A9_and_=C3=A5_c?=  =?UTF-8?Q?haracters?=

Not only does my email client (Gmail in this case) correctly decode and display this subject, but according to RFC 2047 it is also perfectly acceptable to have multiple encoded-word sections in the subject, as long as they are separated by whitespace (they are, and you can see by looking at PHPMailer's source for EncodeHeader that the space is added intentionally). PHPMailer actively uses an algorithm to choose the encoding that results in the shortest string, so it will break parts of the subject up and encode them separately. I also believe that in this case the first encoded-word may have reached the maximum encoded-word length, so it was still broken up.
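
For anyone who wants to check this without relying on a mail client, here is a minimal sketch in Python (not part of WordPress or PHPMailer). It decodes the multisite subject quoted above with the standard library's RFC 2047 decoder and measures each encoded-word. The helper names decode_subject and encoded_word_lengths are made up for this illustration. RFC 2047 limits an encoded-word to 75 characters, which is the limit the length check is meant to compare against.

```python
import re
from email.header import decode_header

# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def decode_subject(raw):
    """Decode a raw Subject header value into readable text (hypothetical helper)."""
    parts = []
    for chunk, charset in decode_header(raw):
        # decode_header yields bytes for encoded headers and str for plain ones.
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "ascii"))
        else:
            parts.append(chunk)
    return "".join(parts)


def encoded_word_lengths(raw):
    """Return the length of every encoded-word found in the raw header."""
    return [len(word) for word in ENCODED_WORD.findall(raw)]


# Subject copied from the multisite test above.
subject = (
    "=?UTF-8?Q?New_some_=C3=A9_and_=C3=A5_chars_Site:_some_=C3=A9_and_=C3=A5_c?="
    "  =?UTF-8?Q?haracters?="
)

print(decode_subject(subject))
print(encoded_word_lengths(subject))
```

If the whitespace between the two encoded-words is handled as RFC 2047 describes, the decoded text reads as one continuous subject, and the first encoded-word sits exactly at the 75-character limit, which would explain why it was split mid-word.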

So in short, I don't see any double-encoding happening, and everything looks correct. Given the subject line you also listed:

Subject: =?UTF-8?Q?New_site_name_Site:_J=C3=A9t_inqsfzxb_p=C3=A5_?= =?UTF-8?Q?=C3=A9?=

This is also perfectly valid, so maybe your email client lacked proper support? Feel free to reopen this ticket if you believe I'm wrong.
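
As a rough illustration of what double-encoding would look like, the Python sketch below (again, not WordPress or PHPMailer code) Q-encodes a sample subject twice and compares it with the subject from the original report. The helpers decode_once and looks_double_encoded are hypothetical names, and the sample text "Jét på é" is only an assumption for the demo. The idea is that a header encoded twice still looks like an encoded-word after one decoding pass, and its raw form contains escapes such as =3D and =3F for the inner "=" and "?".

```python
import re
from email import quoprimime
from email.header import decode_header

# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
ENCODED_WORD = re.compile(r"=\?[^?\s]+\?[bBqQ]\?[^?\s]*\?=")


def decode_once(raw):
    """Apply a single RFC 2047 decoding pass (hypothetical helper)."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            parts.append(chunk.decode(charset or "ascii"))
        else:
            parts.append(chunk)
    return "".join(parts)


def looks_double_encoded(raw):
    """True if the header still contains an encoded-word after one decode."""
    return ENCODED_WORD.search(decode_once(raw)) is not None


# Subject exactly as quoted in the ticket description.
reported = (
    "=?UTF-8?Q?New_site_name_Site:_J=C3=A9t_inqsfzxb_p=C3=A5_?="
    " =?UTF-8?Q?=C3=A9?="
)

# Build a deliberately double-encoded header for comparison.
once = quoprimime.header_encode("Jét på é".encode("utf-8"), charset="utf-8")
twice = quoprimime.header_encode(once.encode("ascii"), charset="utf-8")

print(looks_double_encoded(reported))
print(twice)
print(looks_double_encoded(twice))
```

The reported subject decodes to readable text in a single pass, which is consistent with it having been encoded only once.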