Opened 17 years ago
Closed 16 years ago
#5917 closed defect (bug) (fixed)
Kses should apply bad-protocol check only to URI typed attributes
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | 2.7 | Priority: | normal |
Severity: | normal | Version: | 2.5 |
Component: | General | Keywords: | kses has-patch |
Focuses: | Cc: |
Description
Kses HTML filter (wp-includes/kses.php) applies "bad protocol" check to all attribute values now. It treats string including a colon (:) as URI, and if the string doesn't have an allowed protocol (http, https, ftp, ...), it delete the letters before colon as a bad protocol.
But this rule is too strict in many cases. For example, if you want to write
<img src="C-3PO.png" alt="Star Wars Episode IV: A New Hope" />
"Star Wars Episode IV:" will be deleted as a bad protocol.
<img src="R2-D2.png" alt="Fig 1: R2-D2" />
"Fig 1:" will be deleted as a bad protocol.
Alt attribute values are not URI. So bad protocol checking shouldn't be needed.
I wrote a patch which makes kses apply bad-protocol check only to URI typed attributes. I referred to HTML spec for attribute types.
Attachments (2)
Change History (12)
#2
@
17 years ago
Andy, what kind of whitelist do you mean?
Actually I was trying picking up attributes which can include colon safely, like <img alt="">, instead of picking away attributes with URI value like my first patch. But I eventually realized that most of attributes can include colon, not only CDATA and Text type, all ID and Name attributes can include colon as well [*], so I thought specifying all these attributes is not effective at that time.
#4
@
17 years ago
5917.diff takes a bit different approach. It moves everything into wp_kses_hair(). The list of attributes to check for bad protocols was obtained by searching for %URI in the XHTML DTDs. Since these particular attributes are always used as a URI regardless of the element they are in, I skipped checking the element.
What would this look like as a whitelist?