Ticket #5917 (closed defect (bug): fixed)

Opened 4 years ago

Last modified 3 years ago

Kses should apply bad-protocol check only to URI typed attributes

Reported by: takayukister Owned by: anonymous
Priority: normal Milestone: 2.7
Component: General Version: 2.5
Severity: normal Keywords: kses has-patch
Cc:

Description

Kses HTML filter (wp-includes/kses.php) applies "bad protocol" check to all attribute values now. It treats string including a colon (:) as URI, and if the string doesn't have an allowed protocol (http, https, ftp, ...), it delete the letters before colon as a bad protocol.

But this rule is too strict in many cases. For example, if you want to write

<img src="C-3PO.png" alt="Star Wars Episode IV: A New Hope" />

"Star Wars Episode IV:" will be deleted as a bad protocol.

<img src="R2-D2.png" alt="Fig 1: R2-D2" />

"Fig 1:" will be deleted as a bad protocol.

Alt attribute values are not URI. So bad protocol checking shouldn't be needed.

I wrote a patch which makes kses apply bad-protocol check only to URI typed attributes. I referred to HTML spec for attribute types.

 http://www.w3.org/TR/REC-html40/index/attributes.html

Attachments

kses_bad_protocol.diff Download (3.8 KB) - added by takayukister 4 years ago.
5917.diff Download (1.9 KB) - added by ryan 4 years ago.

Change History

comment:1   andy4 years ago

What would this look like as a whitelist?

Andy, what kind of whitelist do you mean?

Actually I was trying picking up attributes which can include colon safely, like <img alt="">, instead of picking away attributes with URI value like my first patch. But I eventually realized that most of attributes can include colon, not only CDATA and Text type, all ID and Name attributes can include colon as well [*], so I thought specifying all these attributes is not effective at that time.

comment:3   ryan4 years ago

See #6888

ryan4 years ago

comment:4   ryan4 years ago

5917.diff takes a bit different approach. It moves everything into wp_kses_hair(). The list of attributes to check for bad protocols was obtained by searching for %URI in the XHTML DTDs. Since these particular attributes are always used as a URI regardless of the element they are in, I skipped checking the element.

comment:5   ryan3 years ago

  • Milestone changed from 2.9 to 2.7

Objections to patch?

I agree with ryan's approach. It's simpler to understand.

Tested in WordPress 2.6.1. Working properly. +1

comment:8   ryan3 years ago

  • Status changed from new to closed
  • Resolution set to fixed

(In [8671]) Apply kses bad-protocol checks only to URI typed attributes. Props takayukister. fixes #5917 #6888 #6910 #7512

comment:9   ryan3 years ago

  • Status changed from closed to reopened
  • Resolution fixed deleted
  • Milestone changed from 2.7 to 2.6.2

Reopening for 2.6.2 consideration. Let's try this in trunk for awhile and then apply to 2.6 branch.

  • Status changed from reopened to closed
  • Resolution set to fixed

Not going into 2.6.2. Closing.

Note: See TracTickets for help on using tickets.