Make WordPress Core


Timestamp: 05/14/2024 06:03:43 PM
Author: dmsnell
Message:

Normalize UTF-8 charset slug detection.

Several existing places in Core attempt to detect whether a blog charset
is UTF-8. Each performs the same check, but the logic is spread throughout
the codebase and no single function makes this determination in a
consistent way. The _canonical_charset() function exists, but it is marked
as private.

In this patch the new unicode module provides is_utf8_charset() as a method
taking an optional charset slug and indicating if it represents UTF-8,
examining all of the allowable variants of that slug. Associated code is
updated to use this new function, including _canonical_charset(). If no slug
is provided, it will look up the current get_option( 'blog_charset' ).
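
The behavior described above can be sketched roughly as follows. This is an illustrative approximation, not the code from the changeset. The names sketch_is_utf8_charset() and sketch_get_blog_charset() are hypothetical, the latter standing in for get_option( 'blog_charset' ) so the example runs outside WordPress. It also assumes the allowable variants are "UTF-8" and "UTF8" in any letter case.

```php
<?php
/**
 * Hypothetical stand-in for get_option( 'blog_charset' ).
 * Assumption: the site is configured with a UTF-8 charset.
 */
function sketch_get_blog_charset() {
	return 'UTF-8';
}

/**
 * Sketch: reports whether a charset slug represents UTF-8.
 * Falls back to the (stubbed) blog charset when no slug is given.
 */
function sketch_is_utf8_charset( $charset_slug = null ) {
	if ( null === $charset_slug ) {
		$charset_slug = sketch_get_blog_charset();
	}

	// Anything that is not a string cannot name a charset.
	if ( ! is_string( $charset_slug ) ) {
		return false;
	}

	// Assumed variants: "UTF-8" and "UTF8", compared case-insensitively.
	return (
		0 === strcasecmp( 'UTF-8', $charset_slug ) ||
		0 === strcasecmp( 'UTF8', $charset_slug )
	);
}

var_dump( sketch_is_utf8_charset( 'utf8' ) );       // bool(true)
var_dump( sketch_is_utf8_charset( 'ISO-8859-1' ) ); // bool(false)
var_dump( sketch_is_utf8_charset() );               // bool(true), via the stubbed fallback
```

Under these assumptions, a call site can replace an explicit in_array() list of accepted spellings with a single call.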

Finally, the test functions governing _canonical_charset() have been
rewritten as a single test with a data provider instead of as separate test
functions.

Developed in https://github.com/WordPress/wordpress-develop/pull/6535
Discussed in https://core.trac.wordpress.org/ticket/61182

Fixes #61182.
Props dmsnell, jonsurrell.

File:
1 edited

  • trunk/src/wp-includes/compat.php

    r57985 → r58147
    @@ -92,5 +92,5 @@
          * charset just use built-in substr().
          */
    -    if ( ! in_array( $encoding, array( 'utf8', 'utf-8', 'UTF8', 'UTF-8' ), true ) ) {
    +    if ( ! is_utf8_charset( $encoding ) ) {
             return is_null( $length ) ? substr( $str, $start ) : substr( $str, $start, $length );
         }
    @@ -177,5 +177,5 @@
          * just use built-in strlen().
          */
    -    if ( ! in_array( $encoding, array( 'utf8', 'utf-8', 'UTF8', 'UTF-8' ), true ) ) {
    +    if ( ! is_utf8_charset( $encoding ) ) {
             return strlen( $str );
         }