Make WordPress Core

Opened 12 years ago

Last modified 2 years ago

#21537 new defect (bug)

Email address sanitisation mangles valid email addresses

Reported by: westi's profile westi Owned by:
Milestone: Priority: normal
Severity: normal Version: 3.4.1
Component: Formatting Keywords: 2nd-opinion has-patch is-email
Focuses: Cc:

Description

If you change your email address to one including an ampersand then we mangle the address with html entities.

For example:

  • This - peter&paul@…
  • Becomes - peter&paul@…

This is due to the call to wp_filter_kses on pre_user_email' in default-filters.php.

The was added in [5906] for #4546.

I'm not sure if we need kses filtering for emails - if we do which should probably revert this conversion of the & => & afterwards.

Attachments (1)

21537.diff (2.5 KB) - added by valendesigns 10 years ago.

Download all attachments as: .zip

Change History (16)

#1 @beaulebens
12 years ago

  • Cc beau@… added

While we're in there, there are some other rules that might need to be considered:

  • Uppercase and lowercase English letters (a–z, A–Z) (ASCII: 65–90, 97–122)
  • Digits 0 to 9 (ASCII: 48–57)
  • Characters !#$%&'*+-/=?^_`{|}~ (ASCII: 33, 35–39, 42, 43, 45, 47, 61, 63, 94–96, 123–126)
  • Character . (dot, period, full stop) (ASCII: 46) provided that it is not the first or last character, and provided also that it does not appear two or more times consecutively (e.g. John..Doe@… is not allowed.).
  • Special characters are allowed with restrictions. They are:
    • Space and "(),:;<>@[\] (ASCII: 32, 34, 40, 41, 44, 58, 59, 60, 62, 64, 91–93)
    • The restrictions for special characters are that they must only be used when contained between quotation marks, and that 2 of them (the backslash \ and quotation mark " (ASCII: 32, 92, 34)) must also be preceded by a backslash \ (e.g. "
      \"").
  • Comments are allowed with parentheses at either end of the local part; e.g. "john.smith(comment)@example.com" and "(comment)john.smith@…" are both equivalent to "john.smith@…".
  • International characters above U+007F are permitted by RFC 6531, though mail systems may restrict which characters to use when assigning local parts.

From http://en.wikipedia.org/wiki/Email_address which summarizes http://tools.ietf.org/html/rfc3696#section-3

Last edited 12 years ago by SergeyBiryukov (previous) (diff)

#2 @yoavf
12 years ago

  • Cc yoavf added

#4 @jkudish
12 years ago

  • Cc joachim.kudish@… added

#5 @iandunn
12 years ago

  • Cc ian_dunn@… added

#6 follow-up: @iandunn
12 years ago

What about instead of applying wp_filter_kses, we pass the new address through PHP's FILTER_SANITIZE_EMAIL? That would strip out all characters except letters, digits and !#$%&'*+-/=?^_`{|}~@.[]

#7 @cfinke
11 years ago

  • Cc cfinke@… added

#8 @feedmeastraycat
11 years ago

This is also affected when you register a new user with & in the e-mail. Registering a user with "foo&bar@…" is registered in the database as "foo&amp;bar@…" thus failing a test on email_exists( 'foo&bar@example.com' ) (which returns false) and get_user_by( 'email', 'foo&bar@example.com' ) (which also returns false).

#9 @feedmeastraycat
11 years ago

  • Cc david.martensson@… added

#10 @nacin
11 years ago

  • Component changed from General to Formatting

@valendesigns
10 years ago

#11 @valendesigns
10 years ago

  • Keywords has-patch added; needs-patch removed

The 21537.diff patch includes unit tests, while solving the issue as simply as possible. This solution allows us to move forward by closing this ticket and then adding any other entities that need to be reverted back to a pre-encoded state in other tickets. As well, we continue having the benefits of using wp_filter_kses and don't have to rewrite the email validation.

Happy New Year!

#12 @SergeyBiryukov
10 years ago

#28848 was marked as a duplicate.

#13 in reply to: ↑ 6 @miqrogroove
9 years ago

  • Keywords is-email added

Replying to iandunn:

What about instead of applying wp_filter_kses, we pass the new address through PHP's FILTER_SANITIZE_EMAIL? That would strip out all characters except letters, digits and !#$%&'*+-/=?^_`{|}~@.[]

I'm curious about this myself, and how it relates to our other is_email tickets. I'm going to tag them all as related for now.

#14 @jrfoell
9 years ago

Just adding a note here so this doesn't get forgotten, since it appears to be 4 years old :-/

I was able to get around the problem during a bulk user import (where the dataset is known) by doing:

<?php

if ( strpos( $user_email, '&' ) !== false ) {
        //Turn off wp_filter_kses for this email
        remove_filter( 'pre_user_email', 'wp_filter_kses' );
}

wp_insert_user(...);

add_filter( 'pre_user_email', 'wp_filter_kses' );

I wouldn't advise doing this to open user registrations... posting here to help this issue bubble up :)

#15 @jrfoell
8 years ago

Would be awesome if this could get fixed as part of the 4.6 mega-bugfix release :)

Note: See TracTickets for help on using tickets.