Make WordPress Core

Opened 9 years ago

Closed 5 years ago

Last modified 5 years ago

#34406 closed defect (bug) (duplicate)

wp_kses_hair is too stringent redux

Reported by: travisnorthcutt's profile travisnorthcutt Owned by:
Milestone: Priority: normal
Severity: normal Version: 1.5
Component: Formatting Keywords: has-patch needs-unit-tests
Focuses: Cc:

Description

Attributes from custom xml name spaces may use numbers, underscores, en dashes, and em dashes, but the regex used inside wp_kses_hair() doesn't allow them through. Technically, there are other allowed characters, but that's probably getting into very fringe edge-case territory.

Admittedly, en and em dashes may be edge-case territory as well, so two patches are provided: one allowing numbers and underscores, and one additionally allowing en and em dashes.

Related: #17847

Here's a link to the W3C spec for reference.

Attachments (3)

34406.diff (606 bytes) - added by travisnorthcutt 9 years ago.
Allow numbers and underscores in html attribute names
34406.2.diff (618 bytes) - added by travisnorthcutt 9 years ago.
Allow numbers, underscores, and en and em dashes in html attribute names
34406.3.diff (2.0 KB) - added by travisnorthcutt 9 years ago.
Updated patch with tests

Download all attachments as: .zip

Change History (16)

@travisnorthcutt
9 years ago

Allow numbers and underscores in html attribute names

@travisnorthcutt
9 years ago

Allow numbers, underscores, and en and em dashes in html attribute names

#1 @travisnorthcutt
9 years ago

  • Keywords has-patch dev-feedback added

Also, here's a demo on RegExr if anyone wants to run checks against it: http://regexr.com/3c1uq

This ticket was mentioned in Slack in #core by travisnorthcutt. View the logs.


9 years ago

#3 @dd32
9 years ago

  • Keywords needs-unit-tests added

I see no reason why numbers and underscores shouldn't be allowed here, although I don't really see the point in whitelisting en/em dashes if we're not going to support the full set of characters.

Some unit tests would be good for this change I think.

@travisnorthcutt
9 years ago

Updated patch with tests

#4 follow-up: @miqrogroove
9 years ago

Needs more tests. This is the kind of ticket where the tests will get more traction than the patch ever will. :)

For example,

<img src="" data_at:2x="[audio]" alt="blah" />

We expect this shortcode to get stripped automatically upon display, which almost certainly fails in the proposed patch.

Last edited 9 years ago by miqrogroove (previous) (diff)

#5 in reply to: ↑ 4 @travisnorthcutt
9 years ago

Replying to miqrogroove:

Needs more tests. This is the kind of ticket where the tests will get more traction than the patch ever will. :)

For example,

<img src="" data_at:2x="[audio]" alt="blah" />

We expect this shortcode to get stripped automatically upon display, which almost certainly fails in the proposed patch.

I think that's not current behavior; without applying my patch, try this:

global $allowedposttags;
$test = '<img alt="[blah]" />';
echo wp_kses( $test, $allowedposttags );

In my testing, the brackets are allowed. So, in that case, your example fails with the current patch and on existing core, but only if we do indeed want/expect the shortcode to get stripped. Is that the case, though? From my reading of the Shortcode Roadmap Draft Three it sounds like that restriction shouldn't happen until 4.6.

#6 follow-up: @miqrogroove
9 years ago

Try it in the post editor. Because wp_kses itself doesn't handle shortcodes, there will be some unexpected behaviors without more test coverage here.

#7 in reply to: ↑ 6 @travisnorthcutt
9 years ago

Replying to miqrogroove:

Try it in the post editor. Because wp_kses itself doesn't handle shortcodes, there will be some unexpected behaviors without more test coverage here.

Without my patch applied, <img alt="[audio]" /> in the (text tab of the) post editor results in <img alt="" /> being output.
With my patch applied, <img alt="[audio]" /> in the (text tab of the) post editor results in <img alt="" /> being output.

My patch only affects parsing of attribute names, not attribute values, so I don't see how shortcodes in attribute values should be relevant here. I'm happy to change my mind, of course :).

This ticket was mentioned in Slack in #core by travisnorthcutt. View the logs.


9 years ago

This ticket was mentioned in Slack in #core by travisnorthcutt. View the logs.


9 years ago

#10 @chriscct7
9 years ago

  • Keywords dev-feedback removed

Bug Scrub 1/8/2015 notes:

  • The suggestion was made to add a unit test to avoid regression for the case mentioned in comment 7.
  • Further, the reporter should reach out to miqrogroove for help identifying additional edge cases to develop unit tests for.

#12 @whyisjake
5 years ago

  • Resolution set to duplicate
  • Status changed from new to closed

Duplicate of #49464.

#13 @whyisjake
5 years ago

In 48132:

Formatting: Extend wp_kses_hair and wp_kses_hair_parse to allow digits and underscores.

Fixes a lot of issues around parsing XML/HTML attributes.

Fixes #49464.

See #34406, #48608.

Props codeforest, zodiac1978, johnpgreen, dlh, ayeshrajans, johnpgreen, rilwis, travisnorthcutt, miqrogroove, chriscct7, whyisjake.

Note: See TracTickets for help on using tickets.