#6969 closed enhancement (wontfix)
Don't apply wptexturize() to the insides of shortcode tags
Reported by: | Viper007Bond | Owned by: | tellyworth |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | 2.5.1 |
Component: | Formatting | Keywords: | needs-patch wptexturize |
Focuses: | Cc: |
Description
I have this for my post contents:
[code lang="php"]$foo = 'bar';[/code]
The problem is my shortcode function gets this passed to it for the content string:
$foo = ‘bar’;
wptexturize()
should be smart enough to not format the contents of registered shortcodes. If plugins want their contents formatted, they can pass the contents to the wptexturize()
function itself.
Attachments (5)
Change History (48)
#3
@
16 years ago
- Keywords has-patch needs-testing added; needs-patch removed
attachment 6969.diff added.
- Needs testing
- skips inards of registered shortcodes
- Splitting regex might need more attention.. not too sure.
Test Code:
<?php $in = '[code lang="php"] $foo = \'bar\'; [/code] <code lang="php">$foo = \'bar\'; $foo = \'bar\'</code>'; $GLOBALS['shortcode_tags']['code'] = ''; var_dump(time()); var_dump(wptexturize($in));
Without the 'code' registered:
[code lang="php"] $foo = ‘bar’; [/code] <code lang="php">$foo = 'bar'; $foo = 'bar'</code>
with 'code' registered:
[code lang="php"] $foo = 'bar'; [/code] <code lang="php">$foo = 'bar'; $foo = 'bar'</code>
#6
@
16 years ago
Don't shortcodes use [] only? Why don't you texturize text inside <>?
<...>
are HTML tags, So applying texturize to those would break all HTML inside a post (inc. formatting).
If you were to enter < "blah blah" >
inside a post in the visual mode, you would have > "blah..
when texturize came to it, So the content inside them IS texturized if entered via the visual mode.
#7
follow-up:
↓ 9
@
16 years ago
Right, sorry my bad. I'm still a bit puzzled by this change though.
- $textarr = preg_split('/(<.*>|\[.*\])/Us', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
+ $textarr = preg_split('/([<\[].*[>\]]|\[.*\])/Us', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
So instead of <> tags and [] tags, now also <] and [> tags are possible? What's the point of that? And why is the second part of the new regexp even needed ( \[.*\] ), shouldn't these tags all be matched by the first part anyways ( [<\[].*[>\]] ).
I'm not looking at the bigger picture here, so if I'm not making any sense at all, just ignore me =)
#9
in reply to:
↑ 7
@
16 years ago
- Keywords needs-patch added; has-patch removed
Replying to JakaJancar:
So instead of <> tags and [] tags, now also <] and [> tags are possible? What's the point of that? And why is the second part of the new regexp even needed ( \[.*\] ), shouldn't these tags all be matched by the first part anyways ( [<\[].*[>\]] ).
Hm.. Odd.. Why didnt i notice that at that the time.
Eitherway, This is currently broken then, Because while the code seems to read that it was to exclude shortcodes, In reality, it doesnt work. (Probably because splitting like that doesnt work as expected)
#10
@
16 years ago
I am working on a patch for this. I am at the point where I have the texturizing fixed, and am now working on wpautop (which is totally dysfunctional with any but the most basic use of shortcodes).
Excluding the shortcode content is fairly complex work and will likely slow things down a bit.
This leads me to wonder why these filters are applied before the shortcodes are handled. There could be security reasons -- I admit I didn't really dig into this. From my point of view, it seems we are making things needlessly complicated. Shortcodes are meant to alter or produce output, and it is that on that output that filters should be applied -- not on input. No?
#12
@
16 years ago
The above test case is pretty basic, but should do the job. It is not integrated into the WordPress tests, as I do not have time at the moment to get familiar with those.
#13
@
16 years ago
- Component changed from General to Formatting
- Keywords has-patch added; needs-patch removed
#14
follow-up:
↓ 16
@
16 years ago
A few thoughts:
- I'm scratching my head on the extra parameter in the get_shortcode_regexp(), when using the original one might be just as good.
- the start regexp is not the same as the beginning of the real regex
- the idea of changing the actual regexp using str_replace() seems a little weird. why not make it the first element and re-assign things as needed later on?
also, I foresee potential bugs or tough choices due to the fact that shortcodes can now be escaped like this:
[[shortcode]]
because of the escaping regex, the current patch seems potentially buggy:
foo[shortcode] "bar" [/shortcode]foo
#16
in reply to:
↑ 14
;
follow-up:
↓ 19
@
16 years ago
Replying to Denis-de-Bernardy:
- I'm scratching my head on the extra parameter in the get_shortcode_regexp(), when using the original one might be just as good.
This is just an optimization. Feel free to ditch it.
- the start regexp is not the same as the beginning of the real regex
Not sure what you mean exactly.
- the idea of changing the actual regexp using str_replace() seems a little weird. why not make it the first element and re-assign things as needed later on?
I agree, but I could not find another decent way to do it.
also, I foresee potential bugs or tough choices due to the fact that shortcodes can now be escaped like this:
[[shortcode]]because of the escaping regex, the current patch seems potentially buggy:
foo[shortcode] "bar" [/shortcode]foo
We need add test cases for this and fix it.
I continue to think that shortcodes should be treated before wptexturize and wpautop. In my mind, it makes no sense to do all that formatting work before shortcodes are treated. It's presumptuous -- why should we presume shortcode content is meant to be displayed? In fact, it most likely isn't -- it's meant to be interpreted, transformed, etc. The output of the shortcode is what should be formatted.
#19
in reply to:
↑ 16
;
follow-up:
↓ 20
@
16 years ago
Replying to mzizka:
Replying to Denis-de-Bernardy:
- the start regexp is not the same as the beginning of the real regex
Not sure what you mean exactly.
your start regexp doesn't start with (.?) like the real one. it's potentially problematic in the second use-case I outlined.
#20
in reply to:
↑ 19
@
16 years ago
your start regexp doesn't start with (.?) like the real one. it's potentially problematic in the second use-case I outlined.
I was not familiar with the shortcode escaping feature. I should have read the comments more closely. I do see your point now. This needs a fix, but the existing code is already broken in this regard anyway. All the full regexp will do is capture an extra character on either side of the shortcode. In our case we would want to NOT consider escaped shortcodes when splitting the text. This should not be difficult, but require a different regular expression from the "standard" one.
Regarding the str_replace
on the regexp, I'd like to bring a little more precision. The reason why I do this is because the entire regular expression has to be captured in parentheses (so that the full delimiters are all captured by preg_split
). This will increment by 1 any capturing index in the original shortcode regular expression, so I have to fix the backreferencing used therein. Short of refactoring the entire function, this was the only fix I could come up with. The other solution is to not call the function, and construct another regular expression locally. I think this is the path I will opt for.
I will change my patch and add some test cases for the case scenarios you outlined (as well as a few others) when time allows.
(However, I would still like to know why formatting is applied before processing of shortcodes. I know I'm annoying with this, but I'm having a hard time with the idea.)
#21
@
16 years ago
I've updated a new and somewhat simplified patch (though it does not call the shortcode regexp function, so no code reuse). I've updated my test case with the escaping. I can see it failing on fringe cases like
[[shortcode]
(i.e., two opening brackets and only one closing bracket), but I personally could live with that.
#22
@
16 years ago
Previous patch version did not take into account the following scenario:
foo[shortcode] "bar" [/shortcode]foo
This is fixed now (test added).
#24
@
16 years ago
- Keywords reporter-feedback 2nd-opinion dev-feedback added
Another possibility, for this and #6984, could be to make shortcodes occur before autop and texturize. I take it there were reasons for shortcodes to occur after them, but I can't seem to remember what it is... Maybe someone else does?
#25
@
16 years ago
I don't remember the exact reasoning, but it was discussed at extreme length. There was even the ability to chose when your shortcode ran (before or after wpautop).
In short, it's how it is now for a reason and it should not be moved, especially since so many plugins expect it to be where it is now.
#27
@
16 years ago
- Keywords needs-patch 2nd-opinion dev-feedback added; has-patch needs-testing removed
- Milestone changed from 2.8 to Future Release
@mzizka: here are a few nasty tests case to play with:
foo[caption id="attachment_31" align="alignleft" width="215" caption=""foo""]<img class="size-full wp-image-31" title="test2" src="http://localhost/~denis/wp/wp-content/uploads/2009/04/test2.jpg" alt=""foo"" width="215" height="300" />"bar"[/caption]bar
currently outputs (with your patch):
<p>foo<div id="attachment_31" class="wp-caption alignleft" style="width: 225px"><img class="size-full wp-image-31" title="test2" src="http://localhost/~denis/wp/wp-content/uploads/2009/04/test2.jpg" alt=""foo"" width="215" height="300" />"bar"<p class="wp-caption-text">"foo"</p></div>bar</p>
expected, either of:
<p>foo<div id="attachment_31" class="wp-caption alignleft" style="width: 225px"><img class="size-full wp-image-31" title="test2" src="http://localhost/~denis/wp/wp-content/uploads/2009/04/test2.jpg" alt=""foo"" width="215" height="300" />”bar″<p class="wp-caption-text">"foo"</p></div>bar</p> <p>foo<div id="attachment_31" class="wp-caption alignleft" style="width: 225px"><img class="size-full wp-image-31" title="test2" src="http://localhost/~denis/wp/wp-content/uploads/2009/04/test2.jpg" alt=""foo"" width="215" height="300" />"bar"<p class="wp-caption-text">"foo"</p></div>bar</p>
the same shortcode, but this time escaped:
[[caption id="attachment_31" align="alignleft" width="215" caption=""foo""]<img class="size-full wp-image-31" title="test2" src="http://localhost/~denis/wp/wp-content/uploads/2009/04/test2.jpg" alt=""foo"" width="215" height="300" />"bar"[/caption]]
currently outputs:
[caption id=”attachment_31″ align=”alignleft” width=”215″ caption=”"foo"”]<img class="size-full wp-image-31" title="test2" src="http://localhost/~denis/wp/wp-content/uploads/2009/04/test2.jpg" alt=""foo"" width="215" height="300" />“bar”[/caption]
expected, either of:
[caption id="attachment_31" align="alignleft" width="215" caption=""foo""]<img class="size-full wp-image-31" title="test2" src="http://localhost/~denis/wp/wp-content/uploads/2009/04/test2.jpg" alt=""foo"" width="215" height="300" />"bar&quot;[/caption] [caption id="attachment_31" align="alignleft" width="215" caption=""foo""]<img class="size-full wp-image-31" title="test2" src="http://localhost/~denis/wp/wp-content/uploads/2009/04/test2.jpg" alt=""foo"" width="215" height="300" />"bar"[/caption]
itching to close this as invalid or wontfix, myself. or then, we'd want to add some kind of special attribute that *tells* texturize what to do.
the reason is, in above, the outlined expected results could all be expected. I used the caption shortcode from lack of a plugin around that used shortcodes, but:
- on occasion, you'll actually want the texturizing
- on occasion, you're writing php code and don't want it to change a thing
- on occasion, you're writing about shortcodes, using the escape method that was recently implemented, and you expect *some* texturizing to happen (as is done in code block). in this case the shortcode arguably gets nested into a <pre> or <code> block (haven't tried that)
moving this to future pending dev feedback
#28
@
16 years ago
oh, and there might arguably be related bugs in the escape method that needs to be fixed, too. specifically, if I write about shortcodes, in something like:
<pre> [[php foo="bar"] echo "foo" . 'bar'; [/php]] </pre>
then the ultimate, expected, final result arguably is:
<pre> [php foo="bar"] echo "foo" . 'bar'; [/php] </pre>
#35
@
15 years ago
- Milestone changed from 3.0 to Future Release
No patch, unit tests, agreement or traction.
#36
follow-up:
↓ 38
@
15 years ago
Work-around for plugin authors BTW:
http://www.viper007bond.com/2009/11/22/wordpress-code-earlier-shortcodes/
#38
in reply to:
↑ 36
@
14 years ago
Replying to Viper007Bond:
Work-around for plugin authors BTW:
http://www.viper007bond.com/2009/11/22/wordpress-code-earlier-shortcodes/
This workaround seems really solid, though I'd like to chip in of the idea including third attribute for priority in add_shortcode
like this:
add_shortcode('foo', 'foo_shortcode_func', 7);
Where "7" is the priority as mentioned in viper's blog, check the workaround to understand.
#40
@
11 years ago
- Keywords needs-unit-tests removed
Added a robust, easy-to-read unit test at WordCamp Phoenix
This ticket was mentioned in IRC in #wordpress-dev by ericmann. View the logs.
11 years ago
#42
@
11 years ago
- Resolution set to wontfix
- Status changed from new to closed
As of version 2.8.0, there is a no_texturize_shortcodes filter that prevents wptexturize() from touching the text between specified code tags. Several patches are being added to 4.0 that will also fix bugs in the wptexturize() parsing of shortcodes.
It is therefore up to the author of the shortcode plugin whether or not to allow wptexturize() to modify shortcode content. There is no reason to change the default behavior for all shortcodes.
See also: #6984