Make WordPress Core

Opened 10 years ago

Last modified 3 months ago

#32917 new enhancement

Tests_DB_Charset tests don't fully cover wpdb::strip_invalid_text_for_column()

Reported by: johnbillion Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version: 4.2
Component: Charset Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

Related / previously:

Tests_DB_Charset includes a data provider that feeds various charsets into its wpdb::strip_invalid_text() test, but not into its wpdb::strip_invalid_text_for_column() test, so that method is not being tested fully. It should be.

Change History (3)

#1 @dd32
10 years ago

#32279 is an example where test coverage here would have prevented issues.

This ticket was mentioned in PR #7652 on WordPress/wordpress-develop by @debarghyabanerjee.


4 months ago
#2

  • Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed

Trac Ticket: Core-32917

## Summary

  • This PR enhances the existing unit tests for the strip_invalid_text_for_column() method of the wpdb class. The changes aim to ensure that the method behaves correctly across different character sets (utf8, utf8mb4, latin1, and ascii) and handles various combinations of valid and invalid characters.

## Changes Made

### Extended Test Coverage

  • Added a data provider strip_invalid_text_provider to supply multiple input scenarios for the test cases.
  • The data provider now covers:
      • Valid and invalid UTF-8 sequences: Tests to ensure invalid byte sequences are stripped, while valid sequences are retained.
      • Support for 4-byte characters: Valid 4-byte characters (e.g., emojis) are supported when the charset is utf8mb4. This is tested explicitly.
      • Different character sets: Added tests for utf8mb4, latin1, and ascii to verify behavior across multiple charsets.
      • Special cases: Tests include empty strings, special characters, and various invalid sequences.
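
As a rough illustration of the data-provider shape described above, here is a minimal, self-contained sketch. It is not the code from the PR and does not call wpdb: `example_strip_invalid_text()` and `example_charset_cases()` are hypothetical names, and the stripping rules (ascii drops bytes above 0x7F, latin1 accepts every byte, utf8 drops 4-byte sequences, utf8mb4 keeps them) are simplified assumptions made for the example.

```php
<?php
/**
 * Hypothetical stand-in for a charset-aware stripper. NOT wpdb's implementation;
 * the per-charset rules below are simplified assumptions for illustration only.
 */
function example_strip_invalid_text( string $value, string $charset ): string {
	if ( 'latin1' === $charset ) {
		// Assumption: every byte is a valid latin1 character.
		return $value;
	}

	if ( 'ascii' === $charset ) {
		// Drop every byte outside the 7-bit range.
		return (string) preg_replace( '/[^\x00-\x7F]/', '', $value );
	}

	// Well-formed UTF-8 sequences of 1 to 3 bytes.
	$valid = '[\x00-\x7F]'
		. '|[\xC2-\xDF][\x80-\xBF]'
		. '|\xE0[\xA0-\xBF][\x80-\xBF]'
		. '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
		. '|\xED[\x80-\x9F][\x80-\xBF]';

	if ( 'utf8mb4' === $charset ) {
		// Only utf8mb4 accepts 4-byte sequences such as emoji.
		$valid .= '|\xF0[\x90-\xBF][\x80-\xBF]{2}'
			. '|[\xF1-\xF3][\x80-\xBF]{3}'
			. '|\xF4[\x80-\x8F][\x80-\xBF]{2}';
	}

	// Keep only the byte runs that match a valid sequence; everything else is skipped.
	preg_match_all( '/' . $valid . '/', $value, $matches );

	return implode( '', $matches[0] );
}

/**
 * Shaped like a PHPUnit data provider: label => array( input, charset, expected ).
 */
function example_charset_cases(): array {
	return array(
		'valid utf8 is unchanged'      => array( "caf\xC3\xA9", 'utf8', "caf\xC3\xA9" ),
		'invalid byte is stripped'     => array( "a\xFFb", 'utf8', 'ab' ),
		'4-byte char stripped in utf8' => array( "a\xF0\x9F\x98\x80b", 'utf8', 'ab' ),
		'4-byte char kept in utf8mb4'  => array( "a\xF0\x9F\x98\x80b", 'utf8mb4', "a\xF0\x9F\x98\x80b" ),
		'latin1 keeps high bytes'      => array( "caf\xE9", 'latin1', "caf\xE9" ),
		'ascii drops high bytes'       => array( "caf\xC3\xA9", 'ascii', 'caf' ),
		'empty string'                 => array( '', 'utf8', '' ),
	);
}

// Run each case the way a test method fed by the provider would.
foreach ( example_charset_cases() as $label => list( $input, $charset, $expected ) ) {
	$actual = example_strip_invalid_text( $input, $charset );
	printf( "%s: %s\n", $label, $actual === $expected ? 'ok' : 'FAIL' );
}
```

In a real PHPUnit test, the provider would be attached to the test method with a `@dataProvider` annotation, and the method would receive each row's values as arguments instead of looping over them.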

### Improved Existing Tests

  • Refactored the existing test (test_strip_invalid_text_for_column) to utilize the new data provider, ensuring consistent and comprehensive testing.
  • Adjusted the expected output for test cases involving utf8mb4 to correctly allow 4-byte characters like emojis, reflecting the true behavior of the charset.

## Example Test Cases Added

  • Valid UTF-8 String: Confirms that valid sequences remain unchanged.
  • Mixed Characters: Verifies behavior when encountering a mix of valid and invalid sequences, including 4-byte characters.
  • Latin1 and ASCII Tests: Ensures compatibility and correct text stripping when using latin1 and ascii charsets.

## Benefits

  • Comprehensive Testing: The new data provider approach ensures consistent coverage across various scenarios, improving the robustness of the tests.
  • Charset Compatibility: Explicitly tests different character sets, helping to identify and prevent charset-specific issues.
  • Future Flexibility: By leveraging a data provider, future test additions and adjustments will be more straightforward.

## Testing

  • Run Tests: Executed the enhanced test suite to verify that the strip_invalid_text_for_column function behaves as expected.

#3 @desrosj
3 months ago

  • Milestone set to Future Release

Found this one in a list of tickets missing a milestone. @johnbillion would you be able to take a look at the newly added PR?
