Make WordPress Core

Opened 15 years ago

Closed 9 years ago

#10702 closed enhancement (wontfix)

Support for complex nested shorttags

Reported by: pepijndevos's profile pepijndevos Owned by:
Milestone: Priority: normal
Severity: normal Version: 2.9
Component: Shortcodes Keywords: needs-testing needs-refresh close
Focuses: Cc:


The documentations is misleading on this point.
It is said that you can have nested shorttags by calling do_shorttag on the content.

However the shorttag parser isn't really a parser, it's just a non-greedy regex of some sort. It does work for simple tags without content, but when we define a [div] tag, look at the following code:


Results is:

    <div>(tag without content)
    [/div](not parsed at all)

I think that if you want to stop people from writing their own parsers you should include some decent parser with Wordpress, not a regex I could do myself.

Workaround: Assign div2 and probably div3 and 4 to the same function, and use these for the nested tags. <--- very ugly!

Attachments (3)

10702.diff (3.2 KB) - added by kovshenin 12 years ago.
10702.unit-tests.diff (1.9 KB) - added by kovshenin 12 years ago.
10702.performance.csv (1.1 KB) - added by kovshenin 12 years ago.

Download all attachments as: .zip

Change History (22)

#1 @scribu
15 years ago

  • Component changed from Editor to Shortcodes
  • Keywords needs-patch added
  • Milestone changed from Unassigned to Future Release
  • Version set to 2.9

#2 @scribu
15 years ago

Well, if you're so dissatisfied, why don't you contribute with a patch?

#3 @pepijndevos
15 years ago

Because I can't.

I've never written a complex non-regex parser(and have no idea how to do it*) and never contributed a patch to Wordpress.

It'd be an interesting learning experience, but I doubt if it will work, be Wordpress coding style, backwards compatible, etc...
Looking at shortcode.php I'm not sure which parts are mandatory an what I can throw out to write something new(with all those comments).

  • If I had to write a parser I think I would do it with a lot of foreach loops starting a new loop for every new tag and ending a loop for every end tag(to create some sort of 'levels'). I don't know exactly how, else I would have written it.

#4 @scribu
15 years ago

Perhaps the code for bbCode Lite plugin for bbPress will be of help.

Shortcodes and bbCode are very similar.

#5 @pepijndevos
15 years ago


They stated in shortcode.php that it's based on textpatern:

 * WordPress API for creating bbcode like tags or what WordPress calls
 * "shortcodes." The tag and attribute parsing or regular expression code is
 * based on the Textpattern tag parser.

On the Textpattern site I found this article saying that before version X they also did not support nesting of the same tags, but now they do. Maybe that code is usable?

#6 follow-up: @pepijndevos
15 years ago

Forgot to say:

No, BBCode Lite is not usable, it's based on regex also.

#7 in reply to: ↑ 6 @scribu
15 years ago

Replying to pepijndevos:

No, BBCode Lite is not usable, it's based on regex also.

So what if it's based on regex? It gets the job done: nested tags.

#8 @pepijndevos
15 years ago

No, it has the same problem as Wordpress has.
A regex either looks for the widest match or the smallest, looking at the code above you can see that both cause trouble.

You somehow need to match tags on the same nesting level. But as I said, I never wrote a parser so I don't know how.

#9 @sirzooro
15 years ago


  • you can use regular expressions to split text into tokens;
  • use stack to store open tags (or write everything using recursion - IMO more tricky way).

#10 @pepijndevos
15 years ago

Ok, I think I understand now, but there ar still some blurry parts in my mind on the point where I need to put the content in the matched tags. But I think I'll get around that...

Another problem is that plugins now call do_shortcode recursively, and that I don't know which functions are private or can be changed. This means I can probably write the parser by now, but I have no clue yet how I can get it to fit the current situation.

Maybe it is easier to leave do_shortcode alone and make a new do_recursive_shortcode... or something like that?
It would not brake anyones code and leave the faster alternative for simple tags.

#11 @scribu
15 years ago

Or you could add a 'recursive' parameter to do_shortcode and call _do_shortcode() and _do_recursive_shortcode() internally.

#12 @Viper007Bond
12 years ago

Still a valid bug even with the new shortcode regex although the output has changed.

add_shortcode( 'div', 'shortcode_div' );
function shortcode_div( $atts, $content ) {
	return '<div>' . do_shortcode( $content ) . '</div>';

The first [/div] found gets matched with the first [div] found, resulting in [div]test1 being the $content and then that stand-alone shortcode getting expanded.

12 years ago

#13 @kovshenin
12 years ago

  • Keywords has-patch needs-testing added; needs-patch removed

Hi there! My name is Konstantin and I love regular expressions (not!). Everybody says we should not parse HTML using a regular expression, and they're right. However, nobody said anything about using two (or more) regular expressions to parse HTML, so let's just do a bunch of loops :)

10702.diff is an attempt to allow proper nested shortcodes in WordPress. The regex pattern has changed to use a positive lookahead and match the opening shortcode tag only if it's followed by its closing tag, meaning the search will match the inner tags first, which is why a loop has been introduced in do_shortcode and strip_shortcodes which will loop from the inner tags to the outer tags. This means that preg_replace_callback will run at least twice if a shortcode is present.

10702.unit-tests.diff adds some test coverage.

10702.performance.csv shows that there's a slight performance hit when the content has a lot of shortcodes. The "no shortcodes" and "some shortcodes" gains are probably due to the strstr( $content, '[' ) short circuit in do_shortcode. The plugin that ran these tests can be found here.

This is a draft and has room for improvement. Whatever we end up doing in do_shortcode should be equally done in strip_shortcodes. Also, escaping shortcodes might look a little weird with that string replacement, but since we're now replacing in a loop several times, replacing [[ with [ will no longer work. If you can think of a smarter way, I'm all ears!

Your feedback is highly appreciated :)

#14 follow-up: @Viper007Bond
12 years ago


If you only want to determine if a particular needle occurs within haystack, use the faster and less memory intensive function strpos() instead.


I can't comment on the actual patch though -- regex is scary for me. ;)

#15 in reply to: ↑ 14 @kovshenin
12 years ago

Replying to Viper007Bond:

Cool, thanks for the heads up! Will address this in the next patch!

#16 follow-up: @aaroncampbell
10 years ago

  • Keywords needs-refresh added; has-patch removed

Looking at this, it seems that the thing making it more difficult is allowing the self-closing tags to be identical to an opening tag. It means that our simpler (certainly not simple) regex isn't pairing the right open/close tags. The current behavior is as follows:

[test num="1" expected="matches #6" actual="matches #3"]
  [test num="2" expected="matches #3" actual="seen as self-closing"]test1[/test num="3" expected="matches #2" actual="matches #1"]
  [test num="4" expected="matches #5" actual="matches #5"]test2[/test num="5" expected="matches #4" actual="matches #4"]
[/test num="6" expected="matches #1" actual="ignored completely"]

After the patch, tags 1 & 6 from above were completely missed, but the internal pairs seemed to match correctly. Having said that, the patch is also 2 years old and I had to apply the changes manually.

#17 in reply to: ↑ 16 @kovshenin
10 years ago

Replying to aaroncampbell: Yeah, I don't think I like my patch anymore.

#18 @chriscct7
9 years ago

  • Keywords close added

#19 @miqrogroove
9 years ago

  • Milestone Future Release deleted
  • Resolution set to wontfix
  • Status changed from new to closed

At the 9 September meeting, participants agreed to close this ticket because fixing it would require unwanted changes to the shortcode syntax, or to parser performance.

For details, please see

Note: See TracTickets for help on using tickets.