Make WordPress Core

Opened 5 years ago

Last modified 3 months ago

#47557 new enhancement

Sanitize Email Suggestion

Reported by: dandersoncm Owned by:
Milestone: Awaiting Review Priority: normal
Severity: minor Version: 5.2.1
Component: Formatting Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

I am using WooCommerce and I've noticed several customer emails come through like...

example@example.com1234
example@example.com1234567812345678

It's mostly due to the email input being the last one before the credit card step, but these emails are passing the validation and sanitization that exists: is_email and sanitize_email.

I am doing something like the following to fix...

<?php

function clean_billing_email_address( $value ) {
    // Strip any digits at the very end of the value, then surrounding whitespace.
    return trim( preg_replace( '/\d+$/', '', $value ) );
}
add_filter( 'woocommerce_process_checkout_field_billing_email', 'clean_billing_email_address' );

You may consider adding something like this to the sanitize_email function since no TLD ends with numbers anyways, at least at this point in time.
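As a rough sketch of how this could be tried outside WooCommerce, the callback below is attached to the `sanitize_email` filter that core applies to the return value of `sanitize_email()`. The function name `strip_trailing_digits_from_email` is hypothetical, and the rule it applies (only strip digits that directly follow an alphabetic final label) is an assumption made so that IP-style domains are left alone; it is not code from core or from any patch.

```php
<?php
/**
 * Hypothetical helper: remove digits typed directly after an alphabetic
 * final domain label, e.g. "example@example.com1234" -> "example@example.com".
 */
function strip_trailing_digits_from_email( $email ) {
    $email = trim( (string) $email );

    // Match a dot, a run of letters, then digits at the very end of the string.
    // Values such as user@192.0.2.1 or user@[192.0.2.1] do not match and are
    // returned as they came in.
    return preg_replace( '/(\.[a-z]+)\d+$/i', '$1', $email );
}

// Only register the filter when WordPress is loaded, so the file also runs standalone.
if ( function_exists( 'add_filter' ) ) {
    add_filter( 'sanitize_email', 'strip_trailing_digits_from_email' );
}
```

Because the pattern is anchored to the end of the string, digits in the local part or inside earlier domain labels are not affected.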

Change History (5)

#1 @mostafa.s1990
5 years ago

  • Keywords needs-patch added

#2 @SergeyBiryukov
5 years ago

  • Keywords needs-unit-tests added

#3 @bhubbard
21 months ago

Here is a test function:

<?php
// Method excerpt: this belongs inside a PHPUnit test case class.
public function test_clean_billing_email_address()
{
    $value = 'user123@example.com123';
    $expected = 'user123@example.com';
    $result = clean_billing_email_address($value);
    $this->assertEquals($expected, $result);

    $value = ' user456@example.com   ';
    $expected = 'user456@example.com';
    $result = clean_billing_email_address($value);
    $this->assertEquals($expected, $result);

    $value = ' user789@example.com 123';
    $expected = 'user789@example.com';
    $result = clean_billing_email_address($value);
    $this->assertEquals($expected, $result);
}

#4
3 months ago

This ticket was mentioned in PR #7334 on WordPress/wordpress-develop by @debarghyabanerjee.

  • Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed

Trac Ticket: Core-47557

## Problem Statement

  • Certain email addresses pass validation and sanitization with trailing numbers appended to them, such as:
    • example@example.com1234
    • example@example.com1234567812345678
  • These emails are accepted by the current validation functions is_email and sanitize_email, because neither function accounts for trailing numbers or IP address formats in the domain.

## Fixes Implemented

  • The fixes focus on adhering to standard email validation and sanitization criteria, including:

### is_email() Validation Changes:

  • Bracketed IP Address Handling:
    • Validates bracketed IP addresses (e.g., user@[192.0.2.1]) as valid email addresses according to RFC 5321.
    • Rejects non-bracketed IP addresses (e.g., user@192.0.2.1) as invalid.
  • Trailing Numbers in Domain:
    • Updated validation logic to reject email addresses if the domain part contains trailing numbers, unless the domain also includes at least one alphabetic character.

### sanitize_email() Sanitization Changes:

  • Bracketed IP Address:
    • Ensures that emails with bracketed IP addresses are returned unchanged.
  • Trailing Numbers in Domain:
    • If the domain contains trailing numbers and includes alphabetic characters, the function will remove the trailing numbers as part of the sanitization process.


## Detailed Changes

### is_email()

  • Added logic to validate IP addresses enclosed in square brackets as valid.
  • Introduced a condition to reject email addresses with trailing numbers in the domain if the domain does not contain alphabetic characters.

### sanitize_email()

  • Implemented logic to accept bracketed IP addresses without modification.
  • Modified the sanitization process to remove trailing numbers from the domain if the domain contains alphabetic characters.
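
The two sanitization rules above might be sketched as follows. This is a standalone illustration under stated assumptions, not the code from the PR: `sketch_sanitize_email_domain` is a hypothetical name, the rest of what `sanitize_email()` does is ignored, and "bracketed IP address" is approximated as any domain wrapped in square brackets.

```php
<?php
/**
 * Hypothetical sketch of the described domain handling:
 * - bracketed literals are returned unchanged;
 * - trailing digits are removed only when the domain contains a letter.
 */
function sketch_sanitize_email_domain( $email ) {
    $at = strrpos( $email, '@' );
    if ( false === $at ) {
        // No domain part to work on.
        return $email;
    }

    $local  = substr( $email, 0, $at );
    $domain = (string) substr( $email, $at + 1 );

    // Rule 1: leave bracketed literals such as [192.0.2.1] alone.
    if ( preg_match( '/^\[.*\]$/', $domain ) ) {
        return $email;
    }

    // Rule 2: strip trailing digits only if the domain has an alphabetic character.
    if ( preg_match( '/[a-z]/i', $domain ) ) {
        $domain = preg_replace( '/\d+$/', '', $domain );
    }

    return $local . '@' . $domain;
}
```

A domain made up only of digits and dots contains no letters, so under this reading it is returned as it came in and is left for validation to reject.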

## Example Updates

### Validation

  • user@[192.0.2.1] is considered a valid email address.
  • user@192.0.2.1 is considered an invalid email address.
  • example@example.com1234 is considered invalid due to trailing numbers in the domain.
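
The three cases above can be reproduced with a small standalone check. It is a sketch only, assuming the simplest rule consistent with these examples: a bracketed domain must contain a valid IPv4 address, and any other domain is rejected when it ends in a digit. The function name `sketch_email_domain_ok` is hypothetical, only the domain part is examined, and IPv6 literals are not handled; the PR's actual logic may differ.

```php
<?php
/**
 * Hypothetical domain-only check matching the three listed examples.
 * Returns true when the domain part looks acceptable, false otherwise.
 */
function sketch_email_domain_ok( $email ) {
    $at = strrpos( $email, '@' );
    if ( false === $at ) {
        return false;
    }

    $domain = (string) substr( $email, $at + 1 );

    // Bracketed literal: accept only if the content is a valid IPv4 address.
    if ( preg_match( '/^\[(.*)\]$/', $domain, $matches ) ) {
        return false !== filter_var( $matches[1], FILTER_VALIDATE_IP, FILTER_FLAG_IPV4 );
    }

    // Everything else: reject empty domains and domains ending in a digit.
    // A bare IPv4 address ends in a digit, so it is rejected here as well.
    return '' !== $domain && ! preg_match( '/\d$/', $domain );
}
```

Folding the bare-IP case into the "ends in a digit" rule keeps the sketch short; a real implementation would likely test for IP addresses explicitly.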

### Sanitization:

  • Bracketed IP Address: user@[192.0.2.1] remains unchanged.
  • Trailing Numbers: example@… is sanitized to example@… if the domain part includes alphabetic characters.

## Testing

  • Bracketed IP Addresses: Verified that bracketed IP addresses are correctly validated and sanitized.
  • Non-Bracketed IP Addresses: Ensured non-bracketed IP addresses are rejected.
  • Trailing Numbers: Confirmed that trailing numbers are handled correctly based on the presence of alphabetic characters in the domain.
  • General Email Validity: Ensured that standard email addresses continue to pass without issues.

## Considerations

  • Backward Compatibility: Ensured that the updated validation and sanitization rules do not negatively impact existing valid email addresses.
  • RFC Compliance: The changes are aligned with email validation standards set by RFC 5321.

#5 @debarghyabanerjee
3 months ago

Hi @SergeyBiryukov, can you please take a look at this proposed solution and PR? Thanks.
