WordPress.org

Make WordPress Core

Opened 5 years ago

Closed 4 years ago

#9913 closed defect (bug) (fixed)

Plain text emails sent with encoded quotes

Reported by: Speedboxer Owned by:
Milestone: 2.9 Priority: normal
Severity: normal Version: 2.8
Component: Mail Keywords: has-patch dev-feedback
Focuses: Cc:

Description

Now that the default is to encode quotes in esc_html() and related, emails from WordPress (comment notifications) and plugins get sent with encoded quotes. Because these emails are sent as text/plain, clients won't decode the quotes. This results in non-user-friendly emails from WordPress.

Attachments (5)

9913.diff (1.4 KB) - added by Speedboxer 5 years ago.
Ensure plain text emails aren't sent with encoded quotes
9913.2.diff (707 bytes) - added by Speedboxer 5 years ago.
Remove added parameter
9913.3.diff (587 bytes) - added by Speedboxer 5 years ago.
Use html_entity_decode()
9913.4.diff (605 bytes) - added by Speedboxer 5 years ago.
Use UTF-8 as charset in html_entity_decode() to support non-English quotes
html_entity_decode_blogname.diff (2.7 KB) - added by tenpura 4 years ago.
html_entity_decode when mailing.

Download all attachments as: .zip

Change History (29)

comment:1 Denis-de-Bernardy5 years ago

  • Milestone changed from Unassigned to 2.8

comment:2 Denis-de-Bernardy5 years ago

  • Keywords needs-patch added

Speedboxer5 years ago

Ensure plain text emails aren't sent with encoded quotes

comment:3 Speedboxer5 years ago

  • Keywords has-patch added; needs-patch removed

comment:4 Denis-de-Bernardy5 years ago

  • Milestone changed from 2.8 to 2.9

punting per IRC discussion.

is the new parameter really needed? $content_type seems like a good enough check.

comment:5 dd325 years ago

IMO, The extra param isnt required, You dont send HTML emails with a plain content-type.

Isnt there a better way than simply replacing it like that though (ie. all the other entities)? Or does i t only happen to quotes?

Speedboxer5 years ago

Remove added parameter

comment:6 Speedboxer5 years ago

New patch removes the parameter.

It only happens to quotes.

comment:7 follow-up: Denis-de-Bernardy5 years ago

quick question, instead of the str_replace, why not use entity decode or whatever the function is called?

Speedboxer5 years ago

Use html_entity_decode()

comment:8 in reply to: ↑ 7 Speedboxer5 years ago

Replying to Denis-de-Bernardy:

quick question, instead of the str_replace, why not use entity decode or whatever the function is called?

Only quotes get encoded, but I've added a patch that uses html_entity_decode().

comment:9 Denis-de-Bernardy5 years ago

  • Cc hakre added
  • Keywords 2nd-opinion added

adding hakre to the cc in case he's an idea of what would be best

comment:10 hakre5 years ago

  • Keywords dev-feedback added; 2nd-opinion removed

Well best would be that it is clear that wp_mail() would expect data that is plain/text encoded in the default charset. the default charset should be named as well. stick with utf-8 here?

the all needed decisions can be taken care of. mostly I assume the problem is not located inside wp_mail even thought the provided patch is a workaround on that (wrong) place.

I could not take a very decent look into it right now, but it looks that contet_type comes out of nowhere and then a (reasonable) decision is bound to it.

what about the idea to have an esc_mail() compagnon to esc_html() that is used in combination with wp_mail at the places where wp_mail() currently is used with esc_html()? This would leave us enough space for changes in the future and the place to fix walks in the right (imho) direction: not inside wp_mail(). wp_mail needs a clear statement in the docblocks of how string parameters are to be expected: charset/encoding.

what about some feedback on that basic point from core devs? as far as I read the docblocks, it's only string: can be plain text, can be html can be utf8 but can be latin1 as well. my suggestion is to have it plain here (this is an email and those are normally plain, no need for fancy html upfront) and utf-8 encoded because utf-8 is used more and more and I would like to have it as "default charset" within all routines.

comment:11 hakre5 years ago

can we get a clarification of what wp_mail() is for? With that it should be possbile to provide the needed patch.

comment:12 follow-up: dd325 years ago

Closed #10557 as duplicate of this issue.

Take note that Internationalised quotation marks (ie ’) are also affected by this bug, Therefor, a simple str_replace() of the english-quotes isn't going to be enough.

Speedboxer5 years ago

Use UTF-8 as charset in html_entity_decode() to support non-English quotes

comment:13 in reply to: ↑ 12 Speedboxer5 years ago

Replying to dd32:

Take note that Internationalised quotation marks (ie ’) are also affected by this bug, Therefor, a simple str_replace() of the english-quotes isn't going to be enough.

Latest patch fixes that by using the UTF-8 character set in html_entity_decode(). Tested it using ", ', ‘, ’, “, ”, “, ”, ‘ and ’. All were converted as expected.

comment:14 Speedboxer5 years ago

Side note: This will also decode & in URLs, making the URL actually usable in the email.

comment:15 hakre5 years ago

Could the patch be extended that the function docblock comments conains a definition of expected input and output values including it's encodings?

comment:16 dd325 years ago

closed #10942 as duplicate of this

When used in an e-mail, apostrophes should not be encoded

WordPress sends e-mail as plain text (as far as I know), and therefore special characters should not be encoded as HTML entities, but displayed as-is.

A French user has recently reported that WP does encode at least one character: the straight apostrophe ('). My guess is that it is to prevent any breakage further down the road, but in this case it makes the string hard to read.

Actual blog title: Le Mag'zine

Title as used in e-mail: Le Mag'Zine

comment:17 tenpura4 years ago

'blogname' is esc_htmled in sanitize_option().
We may need to decode this back when mailing.

case 'blogname':
	$value = addslashes($value);
	$value = wp_filter_post_kses( $value ); // calls stripslashes then addslashes
	$value = stripslashes($value);
	$value = esc_html( $value );
	break;

tenpura4 years ago

html_entity_decode when mailing.

comment:18 westi4 years ago

Ok looking back in time I am sure this used to work fine.

It turns out that I think this was broken by [11380] where Mark switched out wp_specialchars for esc_html - this changed the default behaviour to encode quotes when wp_specialchars defaulted to not encoding quotes.

Unfortunately this functionality runs when the data is store in the db so even if we were to revert that a lot of people would still see the issue.

So it looks like we do need to undo this work in the relevant places where we want in non-html - we should however use wp_specialchars_decode rather than html_entity_decode.

comment:19 westi4 years ago

I am going to fix the email subject issue for WordPress generated emails.

As far as I know the body of emails generated by WordPress is usually the raw data and wp_mail expects to be sending plain text emails.

I don't want to be changing that for 2.9 - If there are still issues with email bodies could someone open a new ticket for a Future version to fix.

Many Thanks.

comment:20 westi4 years ago

  • Resolution set to fixed
  • Status changed from new to closed

(In [12388]) Don't html encode quotes in the blogname in email subjects as this is a plain text output. Fixes #9913 props tenpura.

comment:21 follow-up: tenpura4 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

html_entity_decode() is used on line 1139.
Is this supposed to be wp_specialchars_decode(), too?

comment:22 in reply to: ↑ 21 westi4 years ago

Replying to tenpura:

html_entity_decode() is used on line 1139.
Is this supposed to be wp_specialchars_decode(), too?

Yes. Good spot!

comment:23 westi4 years ago

Also found a couple of other places we need this.

Password reset emails (both the key one and the one with the password in).

Patching up.

comment:24 westi4 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

(In [12398]) Fix some more html encoding in email subject issues. Fixes #9913.

Note: See TracTickets for help on using tickets.