Opened 11 years ago
Closed 11 years ago
#24001 closed defect (bug) (fixed)
\s in the regexp destroys some UTF-8 characters in pingback_ping()
| Reported by: | tenpura | Owned by: | SergeyBiryukov |
|---|---|---|---|
| Milestone: | 3.6 | Priority: | normal |
| Severity: | normal | Version: | 1.5.2 |
| Component: | XML-RPC | Keywords: | has-patch commit |
| Focuses: | | Cc: | |
Description
\s in the regexp destroys some UTF-8 characters in pingback_ping(). Same issue as in #21625.
Steps to reproduce:
- Send a pingback from a post titled "САПР".
- The resulting pingback comment has no comment author (it is shown as Anonymous).
Solution:
Use [\r\n\t ] rather than [\s\r\n\t].
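The failure can be illustrated outside PHP. The following Python sketch is a simulation, not the WordPress code: it assumes a byte-oriented matcher whose \s class also covers the byte 0xA0 (as can happen with some locale character tables), and models that by decoding the UTF-8 bytes as Latin-1 before matching. The helper name `collapse` is hypothetical.

```python
import re

def collapse(raw: bytes, char_class: str) -> bytes:
    # Assumption: decoding as Latin-1 maps each byte to exactly one character,
    # so Python's Unicode-aware \s (which includes U+00A0) stands in for a
    # byte-wise \s that treats 0xA0 as whitespace.
    text = raw.decode("latin-1")
    # Replace every run of characters in char_class with a single space.
    return re.sub(char_class + "+", " ", text).encode("latin-1")

title = "САПР".encode("utf-8")  # the last letter "Р" is the byte pair D0 A0

broken = collapse(title, r"[\s\r\n\t]")  # class from the original code
safe = collapse(title, r"[\r\n\t ]")     # class proposed in this ticket

print(broken)         # the trailing A0 byte has been replaced by a space
print(safe == title)  # the proposed class leaves the title untouched

try:
    broken.decode("utf-8")
except UnicodeDecodeError as exc:
    print("no longer valid UTF-8:", exc)
```

Under these assumptions, the second byte of "Р" is swallowed as whitespace, leaving a dangling lead byte, while the explicit class leaves the title intact.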
Attachments (1)
Change History (6)
#1, 11 years ago
- Keywords commit added
- Milestone changed from Awaiting Review to 3.6
- Version changed from trunk to 1.5.2
#3 (follow-up: ↓ 4), 11 years ago
Yeah, we shouldn't be using \s in a regex that filters user-submitted or translatable text, as it can match bytes that are part of some multibyte (UTF-8 and other) characters. The [\r\n\t ] replacement seems to work properly. In theory, the best option would be to use the u modifier; however, that leaves installs with a charset other than UTF-8 in a "grey area".
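The trade-off with the u modifier can be sketched as follows. This is a Python analogy, not PHP itself: strict UTF-8 decoding plays the role of a matcher that requires valid UTF-8 input. The Windows-1251 sample and the helper name `collapse_utf8` are assumptions chosen for illustration.

```python
import re

def collapse_utf8(raw: bytes):
    # Collapse whitespace runs, but only when raw is valid UTF-8.
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None  # input is not valid UTF-8, so nothing can be matched
    return re.sub(r"\s+", " ", text)

utf8_title = "САПР\n".encode("utf-8")      # site using UTF-8
cp1251_title = "САПР\n".encode("cp1251")   # assumed site using Windows-1251

print(collapse_utf8(utf8_title))    # characters are matched as whole units
print(collapse_utf8(cp1251_title))  # None: the bytes do not decode as UTF-8
```

The UTF-8 input is handled correctly, while the same title in another charset is rejected outright. PHP's preg_* functions similarly fail on subjects that are not valid UTF-8 when the u modifier is set, which is why the explicit [\r\n\t ] class is the safer choice here.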
Introduced in [2619].