Opened 5 weeks ago
#62940 new defect (bug)
wp_mail(): Address header parsing is not RFC-5322 complient and fails on quoted-string when including a "<", ">" or ","
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | trunk |
Component: | Keywords: | ||
Focuses: | Cc: |
Description
Problem
The following From
field values result in an error, although both comply with RFC 2822:
Case 1: quoted printable:
- Input:
From: "test<stage" <test@example.com>
- Result:
- $from_name:
"test
- $from_email:
stage" <test@example.com
- $from_name:
Case 2: multiple from addresses:
- Input:
From: test <test@example.com>, User <user@example.com>
- Result:
- $from_name:
test
- $from_email:
test@example.com, User <user@example.com
- $from_name:
Case 3: similar problem for to
, cc
, bcc
, and reply_to
fields!
- Input:
To: test <test@example.com>, "Joe, Doe" <joe.doe@example.com>
- Result:
- address 1:
test <test@example.com>
- address 2:
"Joe
- address 3:
Doe" <joe.doe@example.com>
- address 1:
Reason
The php mail() documentation does not explicitly state that the From
header needs to comply with RFC 2822. But it does say so for the To
header. Furthermore, the current implementation does already support the name-addr
-syntax (i.e. [display-name] angle-addr
), which was not allowed in the preceding RFC 822.
However, wp_mail
does ignore the facts
- that
display-name
may include<
(case #1 above), and - that the From field may actually include multiple addresses.
RFC 2822 Specification
From
field value ismailbox-list
section 3.6.2mailbox-list
is a singlemailbox
, or two or more comma-separatedmailbox, mailbox
section 3.4mailbox
is either just anaddr-spec
(e-mail address) or aname-addr
section 3.4display-name
is aphrase
section 3.4phrase
is aword
and hence aquoted-string
section 3.2.6quoted-string
(i.e. a string in double quotes) may include any US-ASCII prinatable character except for the double quote ("
) and the backslash (\
). And as such also the<
.
Current Implementation
The parsing of the field simply splits the two parts of the name-addr
at the position of the first occurrence of <
, see line 2 of the code in question (pluggable.php:301-319):
<?php case 'from': $bracket_pos = strpos( $content, '<' ); if ( false !== $bracket_pos ) { // Text before the bracketed email is the "From" name. if ( $bracket_pos > 0 ) { $from_name = substr( $content, 0, $bracket_pos ); $from_name = str_replace( '"', '', $from_name ); $from_name = trim( $from_name ); } $from_email = substr( $content, $bracket_pos + 1 ); $from_email = str_replace( '>', '', $from_email ); $from_email = trim( $from_email ); // Avoid setting an empty $from_email. } elseif ( '' !== trim( $content ) ) { $from_email = trim( $content ); } break;
Similarly, for the to
, cc
, bcc
, and reply_to
fields:
- Splitting the field value at every comma (which may be quoted)
- Ignoring possible quotes when extracting name and address from the different values
Discussion
Unfortunately there is no simple solution:
- Correct parsing - according to the standard outlined in RFC 2822 - would probably require the use of a real parser (as opposed to searching the entire string for certain tokens).
PHPMailer::parseAddresses()
method cannot be used, as it lacks the same shortcoming, if the IMAP extension is not installed.- mail-mime-parser could work as an alternative. But it's a whole other level of complexity and only supports php v8+
- Even if correct parsing would be supported and
$from_name
as well as$from_email
correctly assigned, this would only work for single-mailbox
entries. Multiple from addresses are not supported by
Weak implementation
It must be noted, that the current implementation has repeatedly had bugs and regressions (although it has been quiet around this in the last years):
Pragmatic proposal
- Introduce a new filter
wp_mail_from_raw
at the beginning of theFrom
field parsing:This would allow a user (or plugin author) to mitigate the issue, if it happens on a given site. But it would save them the effort of parsing the headers entirely (as would be required with<?php /** * Filters the raw "From" header field before parsing it. * * @since 7.6.2 * * @param string $content Form header content. * @param string $header Full raw header. */ $content = apply_filters( 'wp_mail_from_raw', $content, $header );
wp_mail
). - Alternatively, the filter could be called for all headers in line 300
<?php $origName = $name; // used later for the default case in line 347 $name = strtolower( $name ); /** * Filters the raw header fields before parsing them. * * @since 7.6.2 * * @param string $content Header field's content. * @param string $name Header field's name. * @param string $header Full raw header. */ $content = apply_filters( 'wp_mail_raw_header', $content, $name, $header ); /** * Filters the raw header fields before parsing them. * * @since 7.6.2 * * @param string $content Header field's content. * @param string $name Header field's name. * @param string $header Full raw header. */ $content = apply_filters( "wp_mail_header_$name", $content, $header ); // ... // Line 347 needs to be updated: // $headers[ trim( $name ) ] = trim( $content ); $headers[ $origName ] = $content; // both values are already trimmed in lines 296 & 297
- Potentially use a RFC-compliant parser to avoid future errors with quoted address names at least in the other address fields
- Use an instance of an object implementing
ArrayAccess
andStringable
. This can be used to store one or a set of addresses/names in the$from_email
/wp_mail_from
and$from_name
/wp_mail_from_name
values and is sent through the respective filters. By this it is basically backwards compatible. - Parse the
Sender
header, add a filter and default it to the first address of a multi-address From field of the default wordpress from address. - Enhance the
PHPMailer::setFrom()
method to accept$address
and$name
to be arrays. And therefore also the corresponting class fields$From
and$FromName
(maybe through a PR to PHPMailer)
Happy to help with a PR. I would just love to know what would be accepted in there.
Thanks,
Bhujagendra