#6598 closed defect (bug) (invalid)
2.5 escapes widget names
Reported by: | omry | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 2.5 |
Component: | Widgets | Keywords: | reporter-feedback has-patch has-unit-tests |
Focuses: | Cc: |
Description
My FireStast widgets used to have an image tag with a small FireStats icon to make them easily identified.
now that 2.5 started to escape the widget names, this no longer works.
Since the code that adds widgets to WP is already privileged code that can do whatever it wants, I don't think that escaping the names is adding any kind of protection.
Change History (14)
#3
@
16 years ago
- Keywords reporter-feedback added
- Milestone 2.9 deleted
- Resolution set to invalid
- Status changed from new to closed
This ticket was mentioned in PR #6429 on WordPress/wordpress-develop by @dmsnell.
8 months ago
#4
- Keywords has-patch has-unit-tests added
Trac ticket: Core-61052
Alternative in #6598
Allow for additional custom data attributes in wp_kses_attr_check()
.
In this patch the set of allowable custom data attribute names is expanded to permit those including dashes in the dataset
name. The term dataset
name here means the name appearing to JavaScript in a browser. This is found by stripping away the data-
prefix, lowercasing, and then turning every dash-letter combination into a capital letter.
The Interactivity API relies on data attributes of the form data-wp-bind--enabled
. The corresponding dataset name here is wpBind-Enabled
, and is currently blocked by wp_kses_attr_check()
. This patch allows that and any other data attribute through which would include those dashes in the translated name.
This patch is structured in two parts:
- Ensure that Core can recognize whether a given attribute represents a custom data attribute.
- Once recognizing the custom data attributes, apply a WordPress-specific set of constraints to determine whether to allow it.
The effect of this change is allowing a double-dash when previously only a single dash after the initial data-
prefix was allowed. It's more complicated than this, because, as evidenced in the test suite, double dashes translate into different names based on where they appear and what follows them in the attribute name.
These are recommendations. If these naming recommendations are not followed, no errors will occur. The attributes will still be matched using CSS attribute selectors, with the attribute being case insensitive and any attribute value being case-sensitive. Attributes not conforming to these three recommendations will also still be recognized by the JavaScript HTMLElement.dataset property and user-agents will include the attribute in the DOMStringMap containing all the custom data attributes for an HTMLElement.
https://developer.mozilla.org/en-US/docs/Web/HTML/Global_attributes/data-*
8 months ago
#5
@sirreal what are your thoughts on this vs. #6598? and also about @peterwilsoncc's thoughts.
I'm a little unclear as to the need for the new function as it's the attribute name that KSES is concerned about, the validation regex within the existing function should be for convertible names.
Personally I feel like I could be convinced both ways, but there's this part of me, especially after working with so many issues in WordPress' HTML handling and kses
issues, that starting with the specification and then adding constraints is going to pay off over starting with a mix of special constraints and ad-hoc parsing.
That is, for example, maybe today kses
rejects anything that looks like a custom data attribute, but tomorrow some other code tries to reuse the logic and misses some valid ones, or lets through one that it shouldn't, because its test wasn't "is this a custom data attribute" but rather "does this attribute follow this regex?" (By the way, @peterwilsoncc, this is another phrasing of my run-around reason for including the function - to define a new semantic which can answer the question "is this a custom data attribute?")
@jonsurrell commented on PR #6429:
8 months ago
#6
what are your thoughts on this vs. https://github.com/WordPress/wordpress-develop/pull/6598?
I'm happy to see this PR move forward if we're happy with it, it seems more complete. My only intention with #6598 was to have a PR that was small and easy to land. I'm happy to close that PR.
@peterwilsoncc commented on PR #6429:
8 months ago
#7
@dmsnell I can see it leading to confusing if the new function doesn't return null
for a value that will end up being stripped:
wp> wp_kses_transform_custom_data_attribute_name( "data-love-a-little-🐮-love-a-little-tippin'" ) => string(34) "loveALittle-🐮LoveALittleTippin'" wp> wp_kses_post( "<div data-love-a-little-🐮-love-a-little-tippin'>Yes, a musical theatre reference</div>" ) => string(43) "<div>Yes, a musical theatre reference</div>"
As it's a kses function it suggests support that isn't available. Generally KSES sub-functions aren't intended to be called directly but that can change over time, as we've seen with safecss_filter_attr()
8 months ago
#8
@peterwilsoncc 🤔 many of the kses
sub-functions _are_ there just for parsing, like wp_kses_hair
and wp_kses_hair_parse
and wp_kses_attr_parse
would it make more sense to add this into the HTML API and then have kses
call it? it would have a clear home there as being a semantic of the HTML layer.
as a reminder, my purpose is to fully vet and adapt this solution for the feedback, not to push to adopt it or choose it in the end.
@peterwilsoncc commented on PR #6429:
8 months ago
#9
as a reminder, my purpose is to fully vet and adapt this solution for the feedback, not to push to adopt it or choose it in the end.
Yeah, I've got ya. We're working to the same end & it's a complex question :)
would it make more sense to add this into the HTML API and then have
kses
call it? it would have a clear home there as being a semantic of the HTML layer.
That works as the HTML API seems more about parsing without regard to kses. As the maintainer of the HTML API is that something you are happy with? If it's not something you had planned and don't think is a valid use of the feature then don't feel the need to wedge it in there on my account.
tomorrow some other code tries to reuse the logic and misses some valid ones, or lets through one that it shouldn't, because its test wasn't "is this a custom data attribute" but rather "does this attribute follow this regex?
Are you able to clarify this a little more?
I'm still trying to get clear on the purpose of the new function in my head and how it differs from the regex validation.
8 months ago
#10
thanks again @peterwilsoncc, and sorry for my late response, which is due to some transit that threw me off.
That works as the HTML API seems more about parsing without regard to kses.
I'm happy to put this in there since it relates to the spec issues. Originally we had a WP_HTML_Spec
class in the HTML API proposal but we removed it. It was basically an all-static class with knowledge like this from the HTML spec, and no other stateful or complicated logic.
The only challenge I see in moving into the HTML API right now is that it's not clear where to put it, and that would probably take some time to see how everyone feels.
As an alternative, we could inline the code for now and move it over when we have a stronger idea where it belongs. Or I guess we could still leave it as its own function so that we can leave in the tests for it, but communicate that it's not to be used or called from the outside, and that it will probably move. As in, maybe we can keep it here for now for practical reasons and defer the bigger design question until we can answer it.
Are you able to clarify this a little more?
Thank you for asking. I don't mean to be unclear, but I don't always get things out very well.
I could change the question here a bit. This entire patch and need to open up kses
is complicated by the fact that it's unclear what Core wants or expects. Here's was the existing code claims…
/**
* Allow `data-*` attributes.
*
* …
*
* Note: the attribute name should only contain `A-Za-z0-9_-` chars,
* double hyphens `--` are not accepted by WordPress.
*/
The existing code makes a claim for the purpose of the regex but then immediately contradicts that claim by specifying new rules. It's there to allow data-*
attributes, but then also it's there to only allow _some_ of them. It doesn't explain why these new restrictions are in place or who decided. It's evident that there are additional rules being imposed, but there's no explanation for a process to determine how and if it's safe to change those rules, or know if there's a bug in the original implementation.
For example, should Core allow data-________
? It's allowed by the existing regex, but it looks wrong. What makes it wrong? Looks?
Getting back to your question, by separating the question into two steps I find it easier to disambiguate situations like these: is this a custom data attribute? is this one Core allows?
We have a clear goalpost for proper parsing, so we don't have to ask ourselves if the regex was intentionally overlooking other custom data attributes vs. whether it was accidentally overlooking them. Since this entire patch is about changing what WordPress allows, we can focus on WordPress additional rules and not conflate that with how to properly detect legitimate ones. We can change our own sanitization without ever effecting the underlying parsing code.
Anyway I hope I haven't made this more confusing. This is all about making it easier to talk about and verify that code we write is correct and about separating the act of detecting things from allowing some of those things we detect.
It's similar to using the HTML API to find IMG
tags with a given class first by finding all of the legitimate IMG
tags and then applying further rules, vs. attempting to find the subset directly with some pattern like ~[[Image(https://[^)]]]>~
- it invites conflation of spec. vs. custom behaviors which open up opportunity to miss things.
@peterwilsoncc commented on PR #6429:
8 months ago
#11
Thanks for clarifying.
I've been considering this over the weekend and my strong inclination is to go with the source code changes in PR #6598 to allow for the lightest touch changes to land in core.
Doing some quick testing, I can't see it leading to problems but it's worth coming up with additional edge cases. (Also doing quick tests, I can see that some of the browser interpretations are interesting https://jsbin.com/devefix/edit?html,js,console )
That said, if you can think of problematic cases it would lead to, I am happy to listen. It's best to know about these things pre- rather than post-commit.
The only challenge I see in moving into the HTML API right now is that it's not clear where to put it, and that would probably take some time to see how everyone feels.
As an alternative, we could inline the code for now and move it over when we have a stronger idea where it belongs. Or I guess we could still leave it as its own function so that we can leave in the tests for it, but communicate that it's not to be used or called from the outside, and that it will probably move. As in, maybe we can keep it here for now for practical reasons and defer the bigger design question until we can answer it.
If we're to validate the dataset, I'd be inclined to inline it for the time being if the intent is to deprecate in the shortish term.
In the past Core has removed private delegations because it didn't reflect reality (List Tables being the poster-child for this) so we'd need to keep it around. Inlining also allows a more considered approach to inclusion in the HTML API while you figure out where best to place it.
8 months ago
#12
If you feel happy with #6598 then let's merge that in and leave this open for further consideration. The two aren't contradictory, but this one encompasses more choice. I'll approve that one, and if nobody has merged it by later today I'll try and do that. The most important thing to me is that we allow these where WordPress currently removes them for the Interactivity API's purposes.
In aa63e2b0756f21043b74a8cc1fe45bc4ceddf552 I added your additional test cases. Cases like these are exactly why I approached the problem the way I did, because intuition of the mapping from HTML to JavaScript can be fuzzy, meaning the rules we apply in PHP are in high risk for being different than we think they are unless we write the rules to the output of that transformation.
Thanks @peterwilsoncc!
@jonsurrell commented on PR #6429:
7 months ago
#13
We should update the trac ticket for this to https://core.trac.wordpress.org/ticket/61501. This would be an enhancement targeting the 6.7 release.
@jonsurrell commented on PR #6429:
3 months ago
#14
I shared some motivation for allowing more data attributes in #61501, namely Interactivity API class directives and the Tailwind CSS framework.
This could result in data attributes like data-wp-class--bottom-[-24rem]="context.isEditCard"
which kses strips out at the moment.
Please re-open with feedback if this is still occurring/needed in your opinion.