Opened 10 years ago
Last modified 3 months ago
#32917 new enhancement
Tests_DB_Charset tests don't fully cover wpdb::strip_invalid_text_for_column()
Reported by: | | Owned by: | |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | 4.2 |
Component: | Charset | Keywords: | has-patch has-unit-tests |
Focuses: | | Cc: | |
Change History (3)
This ticket was mentioned in PR #7652 on WordPress/wordpress-develop by @debarghyabanerjee.
4 months ago
#2
- Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed
Trac Ticket: Core-32917
## Summary
- This PR extends the existing unit tests for the strip_invalid_text_for_column() method of the wpdb class. The changes check that the method behaves correctly across different character sets (utf8, utf8mb4, latin1, and ascii) and handles various combinations of valid and invalid characters.
## Changes Made
### Extended Test Coverage
- Added a data provider strip_invalid_text_provider to supply multiple input scenarios for the test cases.
- The data provider now covers:
  - Valid and invalid UTF-8 sequences: Tests to ensure invalid byte sequences are stripped, while valid sequences are retained.
  - Support for 4-byte characters: Valid 4-byte characters (e.g., emojis) are supported when the charset is utf8mb4. This is tested explicitly.
  - Different character sets: Added tests for utf8mb4, latin1, and ascii to verify behavior across multiple charsets.
  - Special cases: Tests include empty strings, special characters, and various invalid sequences.
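
The coverage listed above could be expressed as a provider-style table of cases. The following is a minimal, framework-free PHP sketch, not the code from the PR: `sketch_strip_invalid_text()` is a hypothetical stand-in for the method under test (the real method takes a table and column and looks up the column's charset), and the case names, byte strings, and the pass-through handling of latin1 are illustrative assumptions.

```php
<?php
/**
 * Hypothetical stand-in for the method under test. This is NOT wpdb's
 * implementation; it only drops bytes that are invalid for $charset.
 */
function sketch_strip_invalid_text( string $text, string $charset ): string {
    if ( 'latin1' === $charset ) {
        // Assumption: every byte is acceptable in this single-byte charset.
        return $text;
    }
    if ( 'ascii' === $charset ) {
        // Keep 7-bit bytes only.
        return (string) preg_replace( '/[^\x00-\x7F]/', '', $text );
    }

    // Any other value is treated as UTF-8. These alternatives match
    // well-formed sequences of one to three bytes.
    $valid = '[\x00-\x7F]'
        . '|[\xC2-\xDF][\x80-\xBF]'
        . '|\xE0[\xA0-\xBF][\x80-\xBF]'
        . '|[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}'
        . '|\xED[\x80-\x9F][\x80-\xBF]';

    if ( 'utf8mb4' === $charset ) {
        // Four-byte sequences (e.g. emoji) are only allowed for utf8mb4.
        $valid .= '|\xF0[\x90-\xBF][\x80-\xBF]{2}'
            . '|[\xF1-\xF3][\x80-\xBF]{3}'
            . '|\xF4[\x80-\x8F][\x80-\xBF]{2}';
    }

    // Keep runs of valid sequences (group 1); drop any other single byte.
    return (string) preg_replace( '/((?:' . $valid . '){1,40})|./s', '$1', $text );
}

/** Hypothetical provider: case name => array( charset, input, expected ). */
function sketch_strip_invalid_text_provider(): array {
    $emoji = "\xF0\x9F\x98\x80"; // U+1F600, a 4-byte UTF-8 sequence.

    return array(
        'valid utf8 unchanged'     => array( 'utf8', "caf\xC3\xA9", "caf\xC3\xA9" ),
        'invalid byte stripped'    => array( 'utf8', "H\xC3llo", 'Hllo' ),
        '4-byte stripped for utf8' => array( 'utf8', "a{$emoji}b", 'ab' ),
        '4-byte kept for utf8mb4'  => array( 'utf8mb4', "a{$emoji}b", "a{$emoji}b" ),
        'non-ascii stripped'       => array( 'ascii', "caf\xC3\xA9", 'caf' ),
        'latin1 bytes unchanged'   => array( 'latin1', "caf\xE9", "caf\xE9" ),
        'empty string'             => array( 'utf8mb4', '', '' ),
    );
}

// Run every case and report whether the stand-in matches the expectation.
foreach ( sketch_strip_invalid_text_provider() as $name => list( $charset, $input, $expected ) ) {
    $actual = sketch_strip_invalid_text( $input, $charset );
    echo ( $expected === $actual ? 'ok   ' : 'FAIL ' ) . $name . PHP_EOL;
}
```

In a PHPUnit test class, an array of this shape would instead be returned from a method named in a `@dataProvider` annotation, and each row would be passed to the test method as arguments.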
### Improved Existing Tests
- Refactored the existing test (`test_strip_invalid_text_for_column`) to utilize the new data provider, ensuring consistent and comprehensive testing.
- Adjusted the expected output for test cases involving utf8mb4 to correctly allow 4-byte characters like emojis, reflecting the true behavior of the charset.
## Example Test Cases Added
- Valid UTF-8 String: Confirms that valid sequences remain unchanged.
- Mixed Characters: Verifies behavior when encountering a mix of valid and invalid sequences, including 4-byte characters.
- Latin1 and ASCII Tests: Ensures compatibility and correct text stripping when using latin1 and ascii charsets.
## Benefits
- Comprehensive Testing: The new data provider approach ensures consistent coverage across various scenarios, improving the robustness of the tests.
- Charset Compatibility: Explicitly tests different character sets, helping to identify and prevent charset-specific issues.
- Future Flexibility: By leveraging a data provider, future test additions and adjustments will be more straightforward.
## Testing
- Run Tests: Executed the enhanced test suite to verify that the `strip_invalid_text_for_column` function behaves as expected.
#32279 is an example where test coverage here would have prevented issues.