#10566 closed defect (bug) (duplicate)
Possibly wrong regex to strip shortcodes
Reported by: | | Owned by: | |
---|---|---|---|
Milestone: | | Priority: | normal |
Severity: | normal | Version: | 2.8.3 |
Component: | Shortcodes | Keywords: | regex, shortcode |
Focuses: | | Cc: | |
Description
In file wp-includes/shortcodes.php, the function get_shortcode_regex() is generating the following regex:
return '(.?)\[('.$tagregexp.')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)';
The (.?) capture groups at the beginning and the end are meant to match a possible extra bracket on either side (double brackets), according to the comments above the function. If so, they should probably be replaced with (\[)? and (\])?. Otherwise, stripping shortcodes in strings such as
[caption foo="bar"]data[/caption]Text
causes the initial "T" in "Text" to be removed.
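To make the report easier to reproduce, here is a minimal sketch in Python rather than PHP. It assumes Python's re module treats these constructs the same way PCRE does, and that stripping replaces the whole match with an empty string under the /s (dot-all) modifier. TAGS is a hypothetical stand-in for $tagregexp, and strip is a helper invented for this illustration, not a WordPress function.

```python
import re

# Hypothetical stand-in for $tagregexp: a single registered tag.
TAGS = "caption"

# Pattern as quoted in the ticket, with (.?) at both ends.
ORIGINAL = r'(.?)\[(' + TAGS + r')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)'

# Variant suggested in the ticket: only an optional literal bracket at each end.
PROPOSED = r'(\[)?\[(' + TAGS + r')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(\])?'


def strip(pattern, text):
    # Assumption: stripping replaces the entire match with an empty string.
    return re.sub(pattern, '', text, flags=re.DOTALL)


sample = '[caption foo="bar"]data[/caption]Text'

print(strip(ORIGINAL, sample))  # ext  (the trailing (.?) consumed the "T")
print(strip(PROPOSED, sample))  # Text
```

With the original pattern, the final (.?) matches whatever single character follows the closing tag, so that character is deleted along with the shortcode. With the proposed pattern, the last group can only match a literal "]".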
Also, the (.+?) group in the middle is causing problems with blocks of text like this:
[shortcode foo="bar"][/shortcode]Text[shortcode foo="bar"][/shortcode]
Since there are no characters between the opening and closing tags, (.+?) cannot match the empty content of the first shortcode. Instead it expands until it reaches the closing tag of the second shortcode, so the complete string is removed. In this admittedly fringe case, with two consecutive shortcodes, the first of which closes just after opening, we remove too much text. The group should probably be (.*?) to allow the possibility of an empty string.
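The same behaviour can be checked with a short Python sketch, again assuming that Python's re module agrees with PCRE on these constructs. TAG is a hypothetical stand-in for $tagregexp, and build is a helper invented for this illustration that swaps only the content group.

```python
import re

# Hypothetical stand-in for $tagregexp.
TAG = "shortcode"


def build(content_group):
    # Same pattern as in the ticket; only the content group varies.
    return (r'(.?)\[(' + TAG + r')\b(.*?)(?:(\/))?\](?:'
            + content_group + r'\[\/\2\])?(.?)')


sample = '[shortcode foo="bar"][/shortcode]Text[shortcode foo="bar"][/shortcode]'

# (.+?) needs at least one character, so it runs on to the LAST closing tag.
current = re.search(build(r'(.+?)'), sample, flags=re.DOTALL)
print(current.group(0) == sample)  # True: the whole string is one match
print(current.group(5))            # [/shortcode]Text[shortcode foo="bar"]

# (.*?) accepts empty content, so the first shortcode closes immediately.
relaxed = re.search(build(r'(.*?)'), sample, flags=re.DOTALL)
print(repr(relaxed.group(5)))      # ''
print(relaxed.group(0))            # [shortcode foo="bar"][/shortcode]T
```

The trailing "T" in the last match still comes from the final (.?) group, which is the separate problem described earlier in this ticket.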
Feels like a duplicate of #10082 to me.