﻿id,summary,reporter,owner,description,type,status,priority,milestone,component,version,severity,resolution,keywords,cc
11528,sanitize_text_field() issue with UTF-8 characters,SergeyBiryukov,,"{{{sanitize_text_field()}}} is a new function in {{{/wp-includes/formatting.php}}} that sanitizes a string from user input or from the database.

The following line of the function is not fully compatible with UTF-8:
{{{
$filtered = trim( preg_replace('/\s+/', ' ', $filtered) );
}}}
It causes problems with characters like Р (capital Cyrillic R), which is encoded as {{{D0 A0}}} (hexadecimal) in UTF-8 and becomes {{{D0 20}}} after the replacement, apparently because the pattern works on single bytes and treats {{{A0}}} as whitespace. To reproduce the issue, try to create a category named оРангутанг or САПР. The rest of the word after Р is not displayed, and the slug is incorrect too. If a title starts with Р, it is not displayed at all.
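
One possible way to avoid the byte-level replacement, as a minimal sketch only: {{{collapse_whitespace_utf8()}}} is a hypothetical helper name, not an existing WordPress function, and the sketch assumes that PCRE is built with UTF-8 support and that the file is saved as UTF-8.
{{{
#!php
<?php
// Hypothetical helper (not part of WordPress): collapses whitespace runs
// without touching bytes inside multi-byte UTF-8 sequences.
function collapse_whitespace_utf8( $text ) {
	// With the /u modifier PCRE treats the subject as UTF-8, so \s is
	// matched against whole code points rather than single bytes like 0xA0.
	$collapsed = preg_replace( '/\s+/u', ' ', $text );

	// preg_replace() returns null when the subject is not valid UTF-8;
	// in that case fall back to an explicit ASCII-only whitespace class.
	if ( null === $collapsed ) {
		$collapsed = preg_replace( '/[\r\n\t ]+/', ' ', $text );
	}

	return trim( $collapsed );
}

// The category name from the report should come back unchanged.
$name = 'оРангутанг';
var_dump( collapse_whitespace_utf8( $name ) === $name );
}}}
The ASCII-only class in the fallback branch never matches a byte above {{{7F}}}, so it cannot split a multi-byte character either.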

The problem was reported on the Russian support forums soon after the release. Currently the filter is included in local files to avoid this replacement; however, I think the issue is relevant to other languages using the Cyrillic alphabet.",defect (bug),closed,normal,2.9.1,Formatting,2.9,major,fixed,,SergeyBiryukov
