Make WordPress Core

Opened 17 years ago

Closed 15 years ago

#6233 closed defect (bug) (duplicate)

kses can cut unicode text from tags attributes

Reported by: humator
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 2.3.3
Component: Formatting
Keywords: kses unicode
Focuses:
Cc:

Description

The following code

$string = preg_replace('/\xad+/', '', $string); # deals with Opera "feature"

in the function wp_kses_bad_protocol can also break Unicode letters. For example, the Russian word Экран, used in an "alt" attribute, was mangled by this function. As a result, the whole post was cut off after that point.

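The problem is that the byte 0xAD can appear inside multi-byte UTF-8 text. For example, the letter Э (U+042D) is encoded as the two bytes 0xD0 0xAD, and the pattern above has no /u modifier, so it matches and removes the second byte.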
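A minimal sketch of the failure, written in Python rather than PHP purely for illustration. It assumes the post text is UTF-8 and that the stripping is applied byte-wise, as the PHP pattern is without the /u modifier. The helper name strip_0xad_bytes is hypothetical and not part of kses.

```python
import re


def strip_0xad_bytes(data: bytes) -> bytes:
    """Byte-wise stand-in for preg_replace('/\\xad+/', '', $string)."""
    return re.sub(rb"\xad+", b"", data)


word = "Экран"
encoded = word.encode("utf-8")        # "Э" (U+042D) becomes 0xD0 0xAD
stripped = strip_0xad_bytes(encoded)  # drops the 0xAD continuation byte

print(encoded.hex())   # d0add0bad180d0b0d0bd
print(stripped.hex())  # d0d0bad180d0b0d0bd

try:
    stripped.decode("utf-8")
except UnicodeDecodeError:
    # The lone 0xD0 lead byte is no longer followed by a continuation byte.
    print("result is not valid UTF-8")
```

Once the byte sequence is invalid, anything downstream that expects UTF-8 may reject or truncate the text, which is consistent with the post being cut off at that point.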

I would fix this bug, but unfortunately I don't know which Opera "feature" the comment refers to.

Change History (3)

#1 @takayukister
17 years ago

Hi humator,

From the kses 0.2.1 release notes:

For some reason, the Opera developers decided to make chr(173) a whitespace
character in URL protocols, both when it occurs raw and in an entity. kses
now handles this.

Japanese users also have the same problem.

kses has another issue; see #5917, which I opened. The problematic 0xAD stripping you mention runs as part of the bad-protocol check, so if #5917 is fixed, #6233 should be fixed at the same time.
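For illustration only, here is a sketch of stripping chr(173) at the code-point level instead of the byte level, again in Python rather than PHP. It is not the change made in WordPress or kses. The function name strip_soft_hyphens is hypothetical, and the sketch handles only the raw character, not the entity form mentioned in the release note.

```python
import re

SOFT_HYPHEN = "\u00ad"  # chr(173), the character Opera treated as whitespace


def strip_soft_hyphens(text: str) -> str:
    """Remove U+00AD code points without touching other characters."""
    return re.sub(SOFT_HYPHEN + "+", "", text)


# A soft hyphen hidden inside a protocol name is removed.
print(strip_soft_hyphens("java\u00adscript:alert(1)"))

# "Э" contains the byte 0xAD in UTF-8, but not the code point U+00AD,
# so the word is left unchanged.
print(strip_soft_hyphens("Экран"))
```

Because the match is made on decoded characters, a 0xAD byte that is only part of another character's encoding is never considered.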

#2 @Denis-de-Bernardy
16 years ago

  • Component changed from General to Formatting
  • Owner anonymous deleted

#3 @ryan
15 years ago

  • Milestone 2.9 deleted
  • Resolution set to duplicate
  • Status changed from new to closed

See #9823
