Make WordPress Core

Opened 3 years ago

Last modified 3 months ago

#21537 new defect (bug)

Email address sanitisation mangles valid email addresses

Reported by: westi
Owned by:
Milestone: Awaiting Review
Priority: normal
Severity: normal
Version: 3.4.1
Component: Formatting
Keywords: 2nd-opinion has-patch is-email
Focuses:
Cc:


If you change your email address to one that includes an ampersand, we mangle the address with HTML entities.

For example:

  • This - peter&paul@…
  • Becomes - peter&amp;paul@…

This is due to the call to wp_filter_kses on 'pre_user_email' in default-filters.php.

This was added in [5906] for #4546.

I'm not sure if we need kses filtering for emails. If we do, we should probably revert this conversion of & => &amp; afterwards.
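The mangling and the proposed revert can be sketched in plain PHP. This is only an illustration under stated assumptions: htmlspecialchars() stands in for the entity encoding that the kses filter applies (it is not the real wp_filter_kses), both helper names are hypothetical, and the example.com address is a placeholder.

```php
<?php
// Stand-in for the entity-encoding step; NOT the real wp_filter_kses().
function sketch_encode_entities( $value ) {
    return htmlspecialchars( $value, ENT_QUOTES, 'UTF-8' );
}

// Hypothetical helper: undo only the ampersand encoding afterwards.
function sketch_restore_ampersand( $value ) {
    return str_replace( '&amp;', '&', $value );
}

$email    = 'foo&bar@example.com';
$mangled  = sketch_encode_entities( $email );
$restored = sketch_restore_ampersand( $mangled );

echo $mangled, "\n";  // foo&amp;bar@example.com
echo $restored, "\n"; // foo&bar@example.com
```

Reverting only &amp; keeps the rest of the filtering in place, at the cost of special-casing one entity.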

Attachments (1)

21537.diff (2.5 KB) - added by valendesigns 9 months ago.


Change History (14)

comment:1 @beaulebens 3 years ago

  • Cc beau@… added

While we're in there, there are some other rules that might need to be considered:

  • Uppercase and lowercase English letters (a–z, A–Z) (ASCII: 65–90, 97–122)
  • Digits 0 to 9 (ASCII: 48–57)
  • Characters !#$%&'*+-/=?^_`{|}~ (ASCII: 33, 35–39, 42, 43, 45, 47, 61, 63, 94–96, 123–126)
  • Character . (dot, period, full stop) (ASCII: 46) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively (e.g. John..Doe@… is not allowed.).
  • Special characters are allowed with restrictions. They are:
    • Space and "(),:;<>@[\] (ASCII: 32, 34, 40, 41, 44, 58, 59, 60, 62, 64, 91–93)
    • The restrictions for special characters are that they must only be used when contained between quotation marks, and that two of them (the backslash \ and quotation mark " (ASCII: 92, 34)) must also be preceded by a backslash \ (e.g. "
  • Comments are allowed with parentheses at either end of the local part; e.g. "john.smith(comment)@example.com" and "(comment)john.smith@…" are both equivalent to "john.smith@…".
  • International characters above U+007F are permitted by RFC 6531, though mail systems may restrict which characters to use when assigning local parts.

From http://en.wikipedia.org/wiki/Email_address which summarizes http://tools.ietf.org/html/rfc3696#section-3
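As a rough illustration of the simpler rules above, here is a sketch of a check for an unquoted local part only. The function name is hypothetical, and it deliberately ignores quoted strings, comments and international characters, so it is not a full RFC validator.

```php
<?php
// Hypothetical helper: checks an UNQUOTED local part against the
// character and dot rules listed above. Quoted strings, comments and
// non-ASCII characters are intentionally out of scope.
function sketch_is_valid_unquoted_local_part( $local ) {
    if ( '' === $local ) {
        return false;
    }

    // Letters, digits, the printable specials, and the dot.
    if ( ! preg_match( '/^[A-Za-z0-9!#$%&\'*+\/=?^_`{|}~.-]+$/', $local ) ) {
        return false;
    }

    // A dot may not be the first or last character.
    if ( '.' === $local[0] || '.' === substr( $local, -1 ) ) {
        return false;
    }

    // A dot may not appear two or more times consecutively.
    return false === strpos( $local, '..' );
}

var_dump( sketch_is_valid_unquoted_local_part( 'peter&paul' ) ); // bool(true)
var_dump( sketch_is_valid_unquoted_local_part( 'John..Doe' ) );  // bool(false)
```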

Last edited 3 years ago by SergeyBiryukov

comment:2 @yoavf 3 years ago

  • Cc yoavf added

comment:4 @jkudish 3 years ago

  • Cc joachim.kudish@… added

comment:5 @iandunn 3 years ago

  • Cc ian_dunn@… added

comment:6 follow-up: @iandunn 3 years ago

What about instead of applying wp_filter_kses, we pass the new address through PHP's FILTER_SANITIZE_EMAIL? That would strip out all characters except letters, digits and !#$%&'*+-/=?^_`{|}~@.[]
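A minimal sketch of that suggestion, using PHP's built-in filter_var() with FILTER_SANITIZE_EMAIL. The wrapper name is hypothetical and the example.com addresses are placeholders. Note that this filter only strips characters; it does not validate the result.

```php
<?php
// Hypothetical wrapper around PHP's built-in email sanitising filter.
function sketch_sanitize_email( $email ) {
    return filter_var( $email, FILTER_SANITIZE_EMAIL );
}

// The ampersand is in the allowed set, so the address is unchanged.
echo sketch_sanitize_email( 'foo&bar@example.com' ), "\n";
// foo&bar@example.com

// Spaces and angle brackets are not in the allowed set and are removed.
echo sketch_sanitize_email( 'foo <bar>@example.com' ), "\n";
// foobar@example.com
```

One trade-off to keep in mind: characters are dropped silently, so the stored address can differ from what the user typed without any error being raised.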

comment:7 @cfinke 2 years ago

  • Cc cfinke@… added

comment:8 @feedmeastraycat 2 years ago

This also happens when you register a new user with & in the email address. A user registered with "foo&bar@…" is stored in the database as "foo&amp;bar@…", so email_exists( 'foo&bar@example.com' ) and get_user_by( 'email', 'foo&bar@example.com' ) both return false.
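The lookup failure can be reproduced outside WordPress with a small sketch. The array below is a hypothetical in-memory stand-in for the users table, and sketch_email_exists() is an illustrative function, not the real email_exists().

```php
<?php
// Hypothetical stand-in for the users table: the address was stored
// after entity encoding, so it contains &amp; instead of &.
$stored_emails = array( 'foo&amp;bar@example.com' );

// Illustrative exact-match lookup; NOT the real email_exists().
function sketch_email_exists( $email, $stored_emails ) {
    return in_array( $email, $stored_emails, true );
}

// The address the user actually typed is not found.
var_dump( sketch_email_exists( 'foo&bar@example.com', $stored_emails ) );     // bool(false)

// Only the mangled form matches.
var_dump( sketch_email_exists( 'foo&amp;bar@example.com', $stored_emails ) ); // bool(true)
```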

comment:9 @feedmeastraycat 2 years ago

  • Cc david.martensson@… added

comment:10 @nacin 21 months ago

  • Component changed from General to Formatting

@valendesigns 9 months ago

comment:11 @valendesigns 9 months ago

  • Keywords has-patch added; needs-patch removed

The 21537.diff patch solves the issue as simply as possible and includes unit tests. This lets us move forward by closing this ticket and handling any other entities that need to be reverted to their pre-encoded state in separate tickets. We also keep the benefits of wp_filter_kses and don't have to rewrite the email validation.

Happy New Year!

comment:12 @SergeyBiryukov 9 months ago

#28848 was marked as a duplicate.

comment:13 in reply to: ↑ 6 @miqrogroove 3 months ago

  • Keywords is-email added

Replying to iandunn:

What about instead of applying wp_filter_kses, we pass the new address through PHP's FILTER_SANITIZE_EMAIL? That would strip out all characters except letters, digits and !#$%&'*+-/=?^_`{|}~@.[]

I'm curious about this myself, and how it relates to our other is_email tickets. I'm going to tag them all as related for now.
