WordPress.org

Make WordPress Core

Opened 5 weeks ago

Last modified 13 days ago

#53019 new defect (bug)

The _sanitize_text_fields function strips percent-encoded octets, which breaks percentages written in Arabic and other RTL languages.

Reported by: wppunk
Owned by:
Milestone: Awaiting Review
Priority: normal
Severity: normal
Version:
Component: Security
Keywords:
Focuses: rtl
Cc:

Description (last modified by SergeyBiryukov)

%10 - %99 are valid percentages in Arabic languages (see the Arabic percentage usage).

As far as I can see here (https://core.trac.wordpress.org/browser/tags/5.7/src/wp-includes/formatting.php#L5409), the function removes all percent-encoded octets, but I'm not sure that this is really done for security reasons. Could anyone confirm that this code is really important here?

Change History (2)

#1 @SergeyBiryukov
5 weeks ago

  • Description modified (diff)

#2 @peterwilsoncc
13 days ago

I ran some strings through various sanitizing and escaping functions in wp-cli:

wp> sanitize_text_field( 'cats of %90 by recommended' );
string(22) "cats of by recommended"

wp> sanitize_text_field( 'recommend by 90% of cats' );
string(24) "recommend by 90% of cats"

wp> sanitize_text_field( 'cats of %900 by recommended' );
string(24) "cats of 0 by recommended"

wp> sanitize_text_field( 'cats of 💯 by recommended' );
string(27) "cats of 💯 by recommended"

wp> sanitize_text_field( 'cats of %90 < by recommended' );
string(27) "cats of &lt; by recommended"

wp> esc_attr( 'cats of %90 by recommended' );
string(26) "cats of %90 by recommended"

wp> esc_url( 'http://example.com/?s=20%' )
string(25) "http://example.com/?s=20%"

wp> esc_url( 'http://example.com/?s=20%25' )
string(27) "http://example.com/?s=20%25"

wp> esc_url( 'http://example.com/?s=%20' )
string(25) "http://example.com/?s=%20"

wp> global $wpdb
wp> $wpdb->prepare( 'post_type=%s', '%20' )
string(80) "post_type='{ad6df8669b87f3e7ce3f7b30446aeb270ddef911039b7c96abdd4e90e383dfe5}20'"

As a general rule, the sanitize_* functions are intended to run on data on the way in and the esc_* functions on display, so some difference is expected, but WP should certainly accommodate RTL languages.

It occurs to me that in faux-equations something like %aa + %bb = %cc could also be legitimate in some RTL languages.
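To make the stripping behaviour easier to reason about outside WordPress, here is a minimal standalone sketch modeled on the octet-removal loop at the line linked in the description. The function name strip_octets_sketch() is hypothetical, and the sketch deliberately leaves out the tag stripping and other steps that sanitize_text_field() also performs, so treat it as an approximation rather than the core implementation.

```php
<?php
// Standalone sketch; strip_octets_sketch() is a hypothetical name, not a WordPress API.
function strip_octets_sketch( string $text ): string {
    $found = false;

    // Remove every "%" followed by two hex digits, repeating until none remain.
    while ( preg_match( '/%[a-f0-9]{2}/i', $text, $match ) ) {
        $text  = str_replace( $match[0], '', $text );
        $found = true;
    }

    if ( $found ) {
        // Collapse the runs of spaces left behind, then trim the ends.
        $text = trim( preg_replace( '/ +/', ' ', $text ) );
    }

    return $text;
}

var_dump( strip_octets_sketch( 'cats of %90 by recommended' ) );  // string(22) "cats of by recommended"
var_dump( strip_octets_sketch( 'cats of %900 by recommended' ) ); // string(24) "cats of 0 by recommended"
var_dump( strip_octets_sketch( 'recommend by 90% of cats' ) );    // string(24) "recommend by 90% of cats"
var_dump( strip_octets_sketch( '%aa + %bb = %cc' ) );             // string(3) "+ ="
```

The pattern only requires a percent sign followed by two hex digits, so it cannot tell a URL-encoded byte from a leading-percent number such as %90 or a token such as %aa. A trailing percent sign (90%) survives because nothing hex-like follows it.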

The octet stripping was added in [11929] for #10751, but the reasoning is unclear.

--

WordPress ought to support RTL representations of percentages. For properly prepared SQL statements, WP replaces percent symbols with the value of $wpdb->placeholder_escape() and later swaps that placeholder back to a percent sign when running the query.
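The placeholder round trip can be illustrated with a small standalone sketch. The PlaceholderSketch class and its method names are hypothetical and only imitate the idea; the token format (a 64-character hex digest wrapped in braces) is an assumption based on the $wpdb->prepare() output shown earlier in this ticket, not a copy of the wpdb code.

```php
<?php
// Hypothetical stand-in for the $wpdb placeholder helpers; not WordPress code.
final class PlaceholderSketch {
    /** @var string Random token that temporarily stands in for "%". */
    private $placeholder;

    public function __construct() {
        // Assumed format: "{" + 64 hex characters + "}", unique per instance.
        $this->placeholder = '{' . hash_hmac( 'sha256', uniqid( '', true ), random_bytes( 16 ) ) . '}';
    }

    // Hide literal percent signs so later formatting steps cannot misread them.
    public function escape( string $value ): string {
        return str_replace( '%', $this->placeholder, $value );
    }

    // Restore the percent signs just before the query would be sent.
    public function unescape( string $query ): string {
        return str_replace( $this->placeholder, '%', $query );
    }
}

$sketch  = new PlaceholderSketch();
$escaped = "post_type='" . $sketch->escape( '%20' ) . "'";

var_dump( strlen( $escaped ) );           // int(80), matching the prepared string above
var_dump( $sketch->unescape( $escaped ) ); // string(15) "post_type='%20'"
```

The point of the sketch is that the percent sign is preserved end to end: it is masked while the statement is assembled and restored afterwards, rather than being dropped as sanitize_text_field() does.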
