Make WordPress Core

Opened 9 years ago

Closed 9 years ago

#33102 closed defect (bug) (wontfix)

Shortcodes with Quoted Attributes Break Inside of Quoted HTML Attributes

Reported by: cgrymala's profile cgrymala Owned by:
Milestone: Priority: normal
Severity: normal Version: 4.2.3
Component: Shortcodes Keywords: close
Focuses: Cc:

Description

Use Case:
You're trying to use a shortcode that accepts string attributes as the value of an HTML element's attribute. For instance, you have a sample shortcode that outputs a URL, and accepts a few string attributes. If the quotes surrounding the HTML attribute match the quotes surrounding the shortcode attributes, the shortcode is never processed.

Example:
<a title="This is a test" href="/[currenturl sanitize=1 before="?ref="]">Testing some stuff.</a>

In this example, assume there is a shortcode called currenturl that accepts the sanitize and before parameters, where the before parameter is a string argument, so it needs to be quoted in order to be processed properly, and the shortcode returns the URL of the current page. In this example, the processed code should look something like:

<a title="This is a test" href="/?ref=http%3A%2F%2Fwordpress.org%2F">Testing some stuff.</a>

Instead, as of WordPress 4.2.3 and the latest nightly build, the output looks like:

<a title="This is a test" href="/[currenturl sanitize=1 before="?ref="]">Testing some stuff.</a>

So far, this only seems to be related to the shortcode being inside of HTML tags, and it only occurs when the HTML attribute's quote style is the same as the shortcode attribute's quote style.

For instance, the following works just fine:

<a title="This is a test" href="/[currenturl sanitize=1 before='?ref=']">Testing some stuff.</a>

As does the following:

<a title="This is a test" href='/[currenturl sanitize=1 before="?ref="]'>Testing some stuff.</a>

The following two examples do not work, though:

`
<a title='This is a test' href='/[currenturl sanitize=1 before='?ref=']'>Testing some stuff.</a>

<a title="This is a test" href="/[currenturl sanitize=1 before="?ref="]">Testing some stuff.</a>
`

If you'd like to test it with a native WordPress shortcode (to prove that it's not a plugin/theme thing), you can certainly test it with something like:

<a href="[gallery link="file" ids="9,10,11,12,13,14,15"]">Some stuff</a>

Of course, we all know that that would output invalid (and really messy) HTML if it was processed correctly, but it still illustrates the point.

In that specific example, when the shortcode is processed correctly within that context, it returns an empty string, so that the output looks like:

<a href="">Some stuff</a>

However, in the current versions of WordPress, it outputs:

<a href="[gallery link="file" ids="15,14,13,12,11,10,9"]">Some stuff</a>

This has broken quite a few features on quite a few sites I'm working with. At the moment, I am having to go through all of those sites and manually edit the shortcodes to use single-quotes instead of the standard double-quotes, just so that they don't clash with the quotes around the HTML attributes.

Attachments (2)

33102_kses.diff (616 bytes) - added by gitlost 9 years ago.
Adjust regex in wp_kses_hair_parse() to allow for quoted shortcode params.
33102_tests.diff (2.1 KB) - added by gitlost 9 years ago.
Unit tests.

Download all attachments as: .zip

Change History (51)

#1 follow-up: @cgrymala
9 years ago

According to forum reports, replacing the wp-includes/shortcodes.php file with the version that was in WordPress 4.2.2 fixes this issue, so I'll be working on diffing the two versions of the file to see if I can figure out where the bug was introduced. If it's actually a bug that I think can be fixed without introducing new security issues again, I'll work on a patch.

#2 in reply to: ↑ 1 ; follow-up: @markjaquith
9 years ago

Replying to cgrymala:

According to forum reports, replacing the wp-includes/shortcodes.php file with the version that was in WordPress 4.2.2 fixes this issue, so I'll be working on diffing the two versions of the file to see if I can figure out where the bug was introduced.

It was a big, complex change, so I'd recommend not doing that, for your own sanity. :-) The info about the particular cases that fail is very helpful, however. We're looking into this.

#3 @cgrymala
9 years ago

Thanks, Mark. I also just found that this particular set of issues is possibly/probably related to #15964 (as you probably were already aware).

#4 @kitchin
9 years ago

Correction to comment:3, it is #15694

#5 @samuelsidler
9 years ago

#33103 was marked as a duplicate.

#6 @cgrymala
9 years ago

Thanks for the correction, @kitchin.

I think I've narrowed the issue down to line 379 of wp-includes/shortcodes.php. The wp_kses_attr_parse() function seems to be returning false when the HTML attribute's quote style matches the shortcode attribute's quote style.

I'll start trying to debug the wp_kses_attr_parse() function to see if I can figure out where/why that's happening.

#7 @johnbillion
9 years ago

  • Keywords needs-patch added
  • Version changed from trunk to 4.2.3

#8 @ocean90
9 years ago

#33097 was marked as a duplicate.

#9 @cgrymala
9 years ago

Okay, I've narrowed it down to the wp_kses_hair_parse() function. It's parsing the HTML string as having incorrect attributes because it's stopping at the first quote inside the shortcode.

The way it's parsing it, it ends up thinking the HTML code breaks down like:

<a
title="This is a test"
href="/[currenturl sanitize=1 before="
?ref="]"
>

The solution seems to be to adjust the regex that parses the HTML tag looking for attributes. I'm open for ideas on how that might work, as I'm struggling with how to get the regex to tell the difference between a closing quote for the HTML attribute and an opening quote for a shortcode attribute.

Bottom-line, is, though, that we need some way for the regex to determine which quotes are wrapping the HTML attribute value, and which quotes are inside the HTML attribute value.

#10 @miqrogroove
9 years ago

@cgrymala The problem is well described, however it is not a bug. It is simply invalid input resulting in unexpected output.

Last edited 9 years ago by miqrogroove (previous) (diff)

#11 @iamfriendly
9 years ago

This also appears to happen with nested shortcodes. i.e.

[outer_shortcode with="some_attributes"]

<a href="[the_permalink]" title="[the_title]">[the_title]</a> <!-- does not work -->

[the_permalink] <!-- does work -->

[/outer_shortcode]

Line 3 produces an anchor tag with a blank href and title attribute, line 5 produces the permalink as expected.

Mixing the quotes in the outer shortcode attributes/inner shortcodes doesn't seem to help in this case.

Last edited 9 years ago by iamfriendly (previous) (diff)

#12 @scamartist26
9 years ago

We have been looking for a fix for this today.We were able to fix some things with a filter that runs prior to do_shortcode that runs a preg_replace on the double quotes. Our regex is specific to us, and wouldn't work for all, but maybe this introduces an idea to you all that are probably much better at reqex than me.

Thought I would share.

Cheers!

#13 follow-up: @cgrymala
9 years ago

Out of curiosity, would it be more effective or less effective as a security fix if we were to quarantine the HTML tag first, then process its shortcodes, then evaluate its validity/safety, rather than trying to evaluate its validity/safety prior to processing the shortcodes inside?

I honestly don't know the answer, as I'm not privy to a lot of the decision-making behind this change, but I am curious.

For instance, instead of running wp_kses_hair_parse() on the original code that includes un-processed shortcodes, run it after processing/parsing the shortcodes.

As another possibility, maybe we could quarantine/replace all of the shortcodes inside of the HTML tag first, replacing them with placeholders (as you're already doing above with the HTML tags themselves), run wp_kses_hair_parse() on the HTML tag with the placeholders, then put the shortcodes back in?

It seems like these ideas could potentially solve the problem people are experiencing, but I have no idea whether they'd potentially re-introduce the security issues that these changes fixed. Thanks.

Also, to clarify, I understand that my original example could be somewhat confusing (without knowing my specific use-case, it can seem like a pretty stupid way to write a shortcode), but there are plenty of valid, non-confusing examples out there.

#14 @scamartist26
9 years ago

To explain our quick fix, we had a shortcode that years ago we found useful on our network site:
<a href="[bf_ins field="home_url"]">Home</a>

function fix_do_shortcode( $text ) {
	$updated_text = preg_replace('/\[bf_ins\s*field=\"(\w*)\"\]/', "[bf_ins field='$1']", $text);
	return $updated_text;
}

add_filter( 'the_content', fix_do_shortcode' );

This fixes OUR use just fine since the default filter priority is 10 and CORE do_shortcode is run at priority 11 on the_content. Also, this specific regex is simply looking how we used it.

Would there be harm in a better more general filter like this in CORE?

Last edited 9 years ago by scamartist26 (previous) (diff)

#15 in reply to: ↑ 13 ; follow-up: @dnavarrojr
9 years ago

Regular Expressions have never been a strong suite, however I have written dozens of mathematical parsers and based on what I know from that process, I would have to agree with your assessment. The correct way to handle this is to parse the shortcodes first and ignore the HTML, which is how I *think* it worked before the patch. In fact, I'm not sure why it's doing ANY HTML parsing at all.

If someone else doesn't work this out, I'll take a few days to work on it myself in a week.

#16 in reply to: ↑ 2 ; follow-ups: @cooperator_JR
9 years ago

Buddies, please, look at my shortcode problem,started by 4.2.3 automatic update:No problems before... 3 years working flawlessly, but now... :-(

Here´s a very simple code to show the ACTUAL DAY/MONTH/YEAR on posts: ( days /months are in Portuguese language ) :
Example: Last updated Post on ""DAY/MONTH/YEAR"" .

<script type="text/javascript">// <![CDATA[
var mydate=new Date()
var year=mydate.getYear()
if (year < 1000) year+=1900
var day=mydate.getDay()
var month=mydate.getMonth()
var daym=mydate.getDate()
if (daym<10) daym="0"+daym
var dayarray=new Array("Domingo","Segunda","Terça","Quarta","Quinta","Sexta","Sábado")
var montharray=new Array("de Janeiro","de Fevereiro","de Março","de Abril","de Maio","de Junho","de Julho","de Agosto","de Setembro","de Outubro","de Novembro","de Dezembro")
document.write(""+daym+" "+montharray[month]+", "+year+"")
// ]]></script>

SCRIPT TESTED=> OK: Click to ZOOM:
http://snag.gy/4AgUt.jpg

There´a phrase in Brazil that says: " In a victorious soccer team, we should not replace great players"... something like "If it's not broke,don't fix it"

What about now?
If you look at NOW, hundreds of multiple issues because updating 4.2.3, it seems like putting fire in ROME again... https://wordpress.org/support/forum/how-to-and-troubleshooting
Please, P-L-E-A-SE! Don´t be proud, just return to the previous version, where everything was ok.

Sorry my bad English , many thanks.

Replying to markjaquith:

Replying to cgrymala:

According to forum reports, replacing the wp-includes/shortcodes.php file with the version that was in WordPress 4.2.2 fixes this issue, so I'll be working on diffing the two versions of the file to see if I can figure out where the bug was introduced.

It was a big, complex change, so I'd recommend not doing that, for your own sanity. :-) The info about the particular cases that fail is very helpful, however. We're looking into this.

#17 follow-up: @kitchin
9 years ago

Try this (not tested):

<script type="text/javascript">
var mydate=new Date();
var year=mydate.getYear();
if (year < 1000) year+=1900;
var day=mydate.getDay();
var month=mydate.getMonth();
var daym=mydate.getDate();
if (daym<10) daym="0"+daym;
var dayarray=new Array("Domingo","Segunda","Terça","Quarta","Quinta","Sexta","Sábado");
var montharray=new Array("de Janeiro","de Fevereiro","de Março","de Abril","de Maio","de Junho","de Julho","de Agosto","de Setembro","de Outubro","de Novembro","de Dezembro");
document.write(""+daym+" "+montharray[month]+", "+year+"");
</script>

#18 in reply to: ↑ 16 ; follow-up: @scamartist26
9 years ago

There is no context of your shortcode here. Can you elaborate where this script implemented?

#19 in reply to: ↑ 17 ; follow-up: @cooperator_JR
9 years ago

Tested. Nothing happened.More: Deleted Cache, minified CSS/JS,same problem.
As I told you, 4.2.2 and before, FINE! 4.2.3...Catastrophic not only for me, but for thousand users.
It Happens, unfortunately.
Many Thanks for give some light, Kitchin.

Replying to kitchin:

Try this (not tested):

<script type="text/javascript">
var mydate=new Date();
var year=mydate.getYear();
if (year < 1000) year+=1900;
var day=mydate.getDay();
var month=mydate.getMonth();
var daym=mydate.getDate();
if (daym<10) daym="0"+daym;
var dayarray=new Array("Domingo","Segunda","Terça","Quarta","Quinta","Sexta","Sábado");
var montharray=new Array("de Janeiro","de Fevereiro","de Março","de Abril","de Maio","de Junho","de Julho","de Agosto","de Setembro","de Outubro","de Novembro","de Dezembro");
document.write(""+daym+" "+montharray[month]+", "+year+"");
</script>

#20 in reply to: ↑ 19 @scamartist26
9 years ago

Replying to cooperator_JR:

Tested. Nothing happened.More: Deleted Cache, minified CSS/JS,same problem.
As I told you, 4.2.2 and before, FINE! 4.2.3...Catastrophic not only for me, but for thousand users.
It Happens, unfortunately.
Many Thanks for give some light, Kitchin.

Can you post the php and the shortcode in context? There is no way to tell how this is parsed, nor the output intent (shortcode inline with content). If you post these details it will help shed more light on what is happening.

#21 in reply to: ↑ 18 ; follow-up: @cooperator_JR
9 years ago

Hi scamartist26 ,
I said before and gave an example:
When a same post is edited by different people in different days a week, the last person to edit use the script Last updated Post on

""DAY/MONTH/YEAR"" .

Replying to scamartist26:

There is no context of your shortcode here. Can you elaborate where this script implemented?

#22 in reply to: ↑ 21 @scamartist26
9 years ago

Replying to cooperator_JR:

Hi scamartist26 ,
I said before and gave an example:
When a same post is edited by different people in different days a week, the last person to edit use the script Last updated Post on

""DAY/MONTH/YEAR"" .

Replying to scamartist26:

There is no context of your shortcode here. Can you elaborate where this script implemented?

Shortcodes are typically written like this: [shortcode argument=value] or [shortcode argument='value'] or the way in question in this thread [shortcode argument="value"] of course there is the absolute [shortcode].

I do not see this in your code example. I see a script tag, and JavaScript. I do not see where [shortcode] is input. This thread is specific to the code I mentioned above. If you have an example of that it would helpful. If you are inserting JavaScript tags the way you have mentioned inside of the_content then your problem is likely that do_shortcode is filtering the +montharray[month]+ because of it's brackets. Also, quotes need to be properly escaped, which is likely the reason for this security update. JavaScript was never intended to be directly inserted into the_content.

#23 @jeremyfelt
9 years ago

#33110 was marked as a duplicate.

#24 in reply to: ↑ 15 ; follow-up: @georgestephanis
9 years ago

Replying to dnavarrojr:

Regular Expressions have never been a strong suite, however I have written dozens of mathematical parsers and based on what I know from that process, I would have to agree with your assessment. The correct way to handle this is to parse the shortcodes first and ignore the HTML, which is how I *think* it worked before the patch. In fact, I'm not sure why it's doing ANY HTML parsing at all.

I believe some of the problem on that is a question of the kses'ing. Either you scan for nasty script injections before or after processing the shortcodes. Scanning after doesn't work, as many plugins and oembeds are written explicitly to convert shortcodes into safe, known quantities with safe script tags and such. Scanning before is the problem that you seem to be having here. After all, someone may actually want to do something like ...

<img src="#" title="[foo bar=" baz=" nope]">

Which is totes ambiguous, and could be parsed reasonably as either

<img 
    src="#" 
    title="[foo bar=" 
    baz=" nope]"
>

an image with three attributes -- src, title, baz -- or:

<img 
    src="#" 
    title="
        [foo bar=" baz=" nope]
    "
>

An image with two attributes, the second containing a shortcode foo with two attributes -- bar being set equal to baz= and nope which is just present.

Using different quotes is how to resolve the ambiguity. I don't think it's possible for WP to automatically resolve such ambiguity correctly and safely. (happy to be proven wrong, just trying to explain it in a way that I'm not sure I'd seen up above)

#25 in reply to: ↑ 16 @SergeyBiryukov
9 years ago

Replying to cooperator_JR:

Here´s a very simple code to show the ACTUAL DAY/MONTH/YEAR on posts

See #33106 for your JavaScript issue, this ticket is about using shortcodes in HTML attributes.

#26 @wpoets
9 years ago

In our case even this failed, even though it followed the suggested guideline of single and double code.

<a class="gray line-height-s10" href="[aw2_call_func site_url p1='business-services' /]/[aw2_loop_value slug /]-in-india">[aw2_loop_value name /]</a></li>

this is just for reference on usecase.

#27 @iamfriendly
9 years ago

I wanted to give an exact example using the Accordion Shortcode (https://wordpress.org/plugins/accordion-shortcode/) which is pretty popular;

[accordions collapsible=true active=false style=accordion-container]
 [accordion title="section 1" style="accordion-style1 [author]"]
  content [author]
 [/accordion]
 [accordion title="section 2" style="accordion-style2"]
  more content
 [/accordion]
[/accordions]

Here, what I would expect - and is happening on 4.2.2 -

<div id="random-accordion-id-400" class="accordion-shortcode accordion accordion-container">

	<div class="accordion-group">
		<div class="accordion-heading">
			<a href="#section-1-0" class="accordion-toggle accordion-style1 ADMIN" data-toggle="collapse" data-parent="#random-accordion-id-400">section 1</a>
		</div>
		<div id="section-1-0" class="accordian-shortcode-content accordion-body collapse" style="height: 0px;">
			<div class="accordion-inner">
				<p>content ADMIN</p>
			</div>
		</div>
	</div>

	<div class="accordion-group">
		<div class="accordion-heading">
			<a href="#section-2-1" class="accordion-toggle accordion-style2 focus" data-toggle="collapse" data-parent="#random-accordion-id-400">section 2</a>
		</div>
		<div id="section-2-1" class="accordian-shortcode-content accordion-body in collapse" style="height: auto;">
			<div class="accordion-inner">
				<p>more content</p>
			</div>
		</div>
	</div>

</div>

You can see that the [author] shortcode has been replaced by ADMIN (I added the caps to make it more easily read) in the two places - one as an attribute and one as part of the content. However, what I'm seeing in 4.2.3 is the following;

<div id="random-accordion-id-400" class="accordion-shortcode accordion accordion-container">

	<div class="accordion-group">
		<div class="accordion-heading">
			<a href="#section-1-0" class="accordion-toggle" data-toggle="collapse" data-parent="#random-accordion-id-400">section 1</a>
		</div>
		<div id="section-1-0" class="accordian-shortcode-content accordion-body collapse" style="height: 0px;">
			<div class="accordion-inner">
				<p>"]<br>content admin</p>
			</div>
		</div>
	</div>

	<div class="accordion-group">
		<div class="accordion-heading">
			<a href="#section-2-1" class="accordion-toggle accordion-style2 focus" data-toggle="collapse" data-parent="#random-accordion-id-400">section 2</a>
		</div>
		<div id="section-2-1" class="accordian-shortcode-content accordion-body in collapse" style="height: auto;">
			<div class="accordion-inner">
				<p>more content</p>
			</div>
		</div>
	</div>

</div>

The shortcode in the attribute is missed. the one in the content exists, however, the content is prefixed with "]<br>

This is an example of nested shortcodes breaking, regardless of how you use quotes (I've tried with both single and double quotes for the style attribute. Both produce the same result (apart from when using single quotes, the content is prefixed with ']<br> and not "]<br> )

Sadly, I'm aware that using shortcodes like this was never an expected behaviour. But, in our case (across several hundred sites) and many, many others, this is something that people are doing.

I'm really keen to help out with a fix on this, so if anyone can point me in the right direction I'll happily get my hands dirty.

#28 in reply to: ↑ 24 ; follow-up: @cgrymala
9 years ago

Replying to georgestephanis:

With shortcodes already being broken when we would like things to be interpreted as:

<img 
    src="#" 
    title="
        [foo bar=" baz=" nope]
    "
>

What if we were to add a parameter to the add_shortcode() function that allows us to specify whether the shortcode should be processed prior to kses or after kses? By default, the shortcode would be processed after kses, as is the new default anyway, but this would potentially allow us to specify that a shortcode should always be interpreted in the first manner (process the shortcode before parsing the HTML element for attributes) without breaking the new manner of handling things.

Is that a reasonable possibility?

It seems to be basically the way plugin authors are already trying to short-circuit this issue anyway (grabbing and processing various bits of code before it gets to the kses or even the do_shortcode() part of the process), and it would give us a uniform way to do it instead of having a lot of independent, possibly dangerous, one-off solutions to the issue.

#29 follow-up: @miqrogroove
9 years ago

  • Keywords close added; needs-patch removed

#30 in reply to: ↑ 29 ; follow-up: @georgestephanis
9 years ago

  • Keywords close removed

Replying to miqrogroove:

I don't think that closing this out is wise. Clearly it's affected a number of folks negatively, and if they want to try and devise a solution that resolves the issue without affecting their use cases, they should be encouraged, if for no reason other than it could get more folks interested in contributing to core, and should folks come up with something that hadn't been considered previously, could restore the feasibility of previous workflows.

#31 in reply to: ↑ 28 @georgestephanis
9 years ago

Replying to cgrymala:

What if we were to add a parameter to the add_shortcode() function that allows us to specify whether the shortcode should be processed prior to kses or after kses? By default, the shortcode would be processed after kses, as is the new default anyway, but this would potentially allow us to specify that a shortcode should always be interpreted in the first manner (process the shortcode before parsing the HTML element for attributes) without breaking the new manner of handling things.

Is that a reasonable possibility?

I don't think it's necessarily the best idea, largely as kses is run on the content on save and is user specific, and do_shortcode is run on output (front-end requests) and isn't dependent on the user that edited the content (whether the user is contributor or administrator), if that makes sense?

#32 in reply to: ↑ 30 ; follow-up: @miqrogroove
9 years ago

  • Keywords close added

Replying to georgestephanis:

I don't think that closing this out is wise. Clearly it's affected a number of folks negatively, and if they want to try and devise a solution that resolves the issue without affecting their use cases, they should be encouraged

We need to encourage everyone to write better plugin code rather than mislead them about the validity of this ticket.

#33 in reply to: ↑ 32 ; follow-up: @georgestephanis
9 years ago

Replying to miqrogroove:

We need to encourage everyone to write better plugin code rather than mislead them about the validity of this ticket.

Whether you like it or not, this is the status quo, and until the point release it was in significant enough usage to cause this many issues.

We care about supporting backward compatibility when possible. As Core contributors, we shoulder the burden of that technical debt to make the users and developers lives easier.

You've made your position very clear, however I fear that how you're saying it is coming off unkind and unwelcoming to folks whose sites and clients sites have been negatively affected by the change. I'm not saying that the change itself was bad -- it was necessary -- but to recommend shutting down a ticket aimed at supporting backward compatibility feels problematic at best.

We need to welcome contributors, and if they can devise a way to bring back support for a prior use case that is both secure and performant, then I feel strongly that needs to be encouraged.

Alternately, if they give it their best effort and fail to come up with a sufficient solution, they will then come to the conclusion that the Core teams working on the fix for 4.2.3 had in fact come to the best solution, and that they may be more willing to give Core the benefit of the doubt in the future.

Long story short, let them try. If they ace it, we all win. If they don't, they'll understand that maybe the problem was a lot harder than they thought.

Seems like a win-win to me.

Also, I'm not going to play a tag-untag-retag game with you. It's just petty. Please remove the close tag that you re-added yourself.

Last edited 9 years ago by georgestephanis (previous) (diff)

#34 in reply to: ↑ 33 ; follow-up: @miqrogroove
9 years ago

Replying to georgestephanis:

Whether you like it or not, this is the status quo, and until the point release it was in significant enough usage to cause this many issues.

I suggest starting a new ticket with generalized status quo enhancements. The description in the current ticket is unlikely to gain traction beyond complaints and "me toos".

#35 in reply to: ↑ 34 @georgestephanis
9 years ago

Replying to miqrogroove:

I suggest starting a new ticket with generalized status quo enhancements. The description in the current ticket is unlikely to gain traction beyond complaints and "me toos".

Honestly, it is what you make of it. Sure, there have been some less useful comments, but if less useful comments is a way to kill a ticket, that'd be rather exploitable.

Plus I think we all need to give @cgrymala props for a tremendously solid initial report describing precisely what the prior status quo was, and how it broke.

It reads as a very good ticket to me, and I don't understand why you're so eager to close it, instead of letting folks collaborate on what's already been established as a central focal ticket, with three other tickets closed as duplicates of this one.

Last edited 9 years ago by georgestephanis (previous) (diff)

#36 @minderdl
9 years ago

Guys, might we get back to the original problem, please?

While there is a solution for wrong quotes (i.e. use double quotes inside single quotes and vice versa) there is no solution for nested shortcodes yet (see comments 11 and 27 above). Or did I miss something?

Are nested shortcodes also considered "wrong use" of shortcodes? Then, the whole nesting would be practically useless...

Nested shortcodes are used for conditions, loops, etc. by several plugins. And inside it MUST be possbile to use other shortcodes, also inside quoted tag arguments.

#37 @pento
9 years ago

I'm fine with leaving this ticket open for a bit, if folks want to experiment with a solution for the original bug, "Shortcodes with Quoted Attributes Break Inside of Quoted HTML Attributes". Short of implementing a full parser (that isn't an option, by the way :-) ), I'm not confident of there being a proper solution. I will sing your praises from the rooftops if you prove me wrong, though!

Shortcodes inside of HTML attributes inside of shortcodes (or nested shortcodes, for people searching for that) are a separate issue, we've opened #33134 to track that.

#38 @boybawang
9 years ago

Hey Guys -

I'm pretty sure what I'm experiencing is related to this bug, but wanted to double check. My site also broke with the 4.2.3 upgrade, so I had to revert back to 4.2.2.

I have a text input field whose value calls a shortcode that processes some PHP, and returns the value. It looks something like this:

<input type="text" value="[php function=1]" />

Now, what's showing up in my text field is [php function=1] as opposed to what the PHP used to return. Seeing as this shortcode is within quoted HTML attributes, does this correctly reflect the bug for this ticket?

#39 @miqrogroove
9 years ago

Please reference

https://make.wordpress.org/core/2015/07/23/changes-to-the-shortcode-api/

and

http://codex.wordpress.org/Shortcode_API

Shortcodes inside of HTML elements (angle braces) are restricted by the new patches to protect your website. This is not a bug.

#40 follow-up: @miyarakira
9 years ago

To summarize this delightfully tangled thread and lead it to closure..

The original topic was Shortcodes with Quoted Attributes Break Inside of Quoted HTML Attributes.

<a href="[shortcode param="value"]">

To start with, the inner quotes should be 'single' or no quotes. However, that doesn't matter because from WP 4.2.3, all shortcodes inside HTML attributes are no longer valid, with or without shortcode parameters. They are removed, and the result is:

<a href="">

So, this is the answer to the original issue: "shortcodes with quoted attributes inside of quoted HTML attributes" are not allowed anymore. There will not be a fix or patch for this.

The recommended solution is to change the shortcode's behavior, to return the whole link or HTML element.

Case closed?


Several people gave variations of the same use case - shortcodes inside HTML attributes - including inside of nested shortcodes. These are not specifically related to the original topic, since they're not using quoted shortcode parameters. But the answer is the same as above. This latest comment# 38 is another example. In this case, the shortcode can return the whole input element, instead of just the value.

In the case of using a shortcode in a <div> class (comment# 27), I don't see an easy answer. Also in the same comment, a different topic was introduced: the use of shortcodes inside shortcode parameters: [shortcode param="[another-shortcode]"]. I doubt that this can be parsed correctly without serious regex magic.


There was a relevant point raised in comment# 38 by @cgrymala. There are themes and plugins which provided shortcodes for use in HTML attributes, most notably in links, and a significant number of websites depended on this use case. Since going through all previously developed sites to change these shortcode use cases is not realistic, some of the theme/plugin developers have resorted to processing their shortcodes before it gets stripped in do_shortcode().

This is inefficient, and, more importantly, risks reopening the vulnerability that was addressed by this change in the Shortcode API. The only feasible solution I see is that these themes/plugins must transition their user base and provide new shortcodes (or break backward compatibility with existing shortcodes) so that the whole HTML element is returned.

Either that, or loosen/improve the restriction to allow safe and valid use of shortcodes in HTML attributes.

Version 0, edited 9 years ago by miyarakira (next)

#41 follow-up: @miqrogroove
9 years ago

I don't think we have any good reasons to loosen restrictions on quotes usage in HTML at this point. But for the other cases, you are welcome to open an enhancement ticket. The Shortcodes API would need to undergo a partial re-design to do this, for example to loosen restrictions on a user or capability basis. The API currently doesn't have that.

#42 in reply to: ↑ 41 @miyarakira
9 years ago

Prior to WP 4.2.3, shortcode parameters could use single or no quotes inside HTML attributes:

<html attr="[shortcode param='value']">
<html attr="[shortcode param=value]">

I agree that it won't make sense to allow double quotes.

---

My point about loosening restrictions was more general, to allow shortcodes in HTML attributes in a safe and valid way. This would restore previously allowed use cases, while still keeping it secure. It's encouraging to hear that it may be possible by user or capability basis. I'm sure other people would be interested in this solution.

This seems to be a part of what's proposed in #33134. The example given there is about nested shortcodes, but there is a common point of how to handle this:

<html attr="[dependent]">

Prior to WP 4.2.3, do_shortcode() - in the_content filter or within another shortcode - resulted in:

<html attr="some dynamic value">

Since the change in the Shortcode API, shortcodes inside HTML attributes are stripped, regardless of being nested or not. The result being:

<html attr="">

I understand this is to prevent malicious use of the shortcode syntax. As you suggested, there could be a less drastic way, by allowing trusted users to continue using shortcodes in HTML attributes. I imagine it can be allowed inside posts whose author has sufficient capability. The same goes for nested shortcodes, and if do_shortcode() is used inside PHP templates, well, if they can run PHP then they already have sufficient privileges, so it should be safe to allow the use of shortcodes inside HTML attributes. (Edit: ..unless untrusted content is put through do_shortcode..hmm..)

The question is, how can do_shortcode() determine if the content came from an untrusted user with insufficient privileges. It seems to me that this is the only context when it's necessary to prevent this shortcode use case.

Last edited 9 years ago by miyarakira (previous) (diff)

#43 in reply to: ↑ 40 @miqrogroove
9 years ago

Replying to miyarakira:

all shortcodes inside HTML attributes are no longer valid

This is a false statement.

#44 @miqrogroove
9 years ago

  • Milestone Awaiting Review deleted

#45 @SergeyBiryukov
9 years ago

  • Milestone set to Awaiting Review

@gitlost
9 years ago

Adjust regex in wp_kses_hair_parse() to allow for quoted shortcode params.

@gitlost
9 years ago

Unit tests.

#46 @gitlost
9 years ago

  • Keywords has-patch added

The above patch seems to deal with the parsing aspect of same-quoted shortcode attributes in html attributes, which would be nice for backward compatibility.

#47 @miqrogroove
9 years ago

  • Keywords has-patch removed

Above patch only checks for the presence of square braces, without regard to HTML sanity.

#48 @gitlost
9 years ago

I don't understand that comment, could you elaborate please? The patch causes the OP's currently non-working

<a title="This is a test" href="/[currenturl sanitize=1 before="?ref="]">Testing some stuff.</a>

to be treated the same as the currently working

<a title="This is a test" href="/[currenturl sanitize=1 before='?ref=']">Testing some stuff.</a>

Which are both equally sane (or insane) as far as I can see...

#49 @miqrogroove
9 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to wontfix
  • Status changed from new to closed

I have explained repeatedly on the Core blog and in comments above that this invalid use of HTML is fixed and is no longer possible as of 4.2.3.

It is time to close this ticket.

Note: See TracTickets for help on using tickets.