Make WordPress Core

Opened 11 years ago

Last modified 3 years ago

#24990 new defect (bug)

Nested Shortcode Inside [caption]

Reported by: prionkor's profile prionkor Owned by:
Milestone: Priority: normal
Severity: normal Version: 3.6
Component: Shortcodes Keywords: needs-unit-tests has-patch dev-feedback
Focuses: Cc:

Description

Nested shortcodes inside caption observation:

[caption]<a href=""><img alt="[shortcode]" title="[shortcode]"></a> Caption Text [shortcode][/caption]
  1. shortcode inside alt and title processed.
  2. Caption Text doesn't

Attachments (2)

24990.diff (367 bytes) - added by nofearinc 9 years ago.
do_shortcode on caption attr
img_caption_array_filter.diff (1.5 KB) - added by pputzer 8 years ago.
An attempt at a generic filter for matching the caption attribute and the media tag.

Download all attachments as: .zip

Change History (30)

This ticket was mentioned in IRC in #wordpress-dev by aubreypwd. View the logs.


10 years ago

#2 @nacin
10 years ago

  • Keywords needs-patch needs-unit-tests added
  • Milestone changed from Awaiting Review to Future Release

We've never called do_shortcode() on the caption, but I don't see why we couldn't do so.

#3 @pputzer
10 years ago

I've got an even more vexing problem in the media-credit plugin (currently solved by replacing wp_caption and caption with my own code, as well as a modified wpeditimage plugin for the visual editor).

[caption][shortcode]<a href=""><img></a>[/shortcode] Caption Text[/caption]

The opening shortcode get's thrown out completely because caption only begins to parse at <a ... :(

#4 @prionkor
10 years ago

Current workarounds:

http://wordpress.stackexchange.com/a/113417/3094

@pputzer: I guess currently wordpress assumes caption texts starts after <a><img></a>

#5 @pputzer
10 years ago

@prionkor: Yeah, it does. My workaround looks like this:

	function media_credit_caption_shortcode($attr, $content = null) {
	        // New-style shortcode with the caption inside the shortcode with the link and image tags.
	        if ( ! isset( $attr['caption'] ) ) {
	                if ( preg_match( '#((?:\[media-credit[^\]]+\]\s*)(?:<a [^>]+>\s*)?<img [^>]+>(?:\s*</a>)?(?:\s*\[/media-credit\])?)(.*)#is', $content, $matches ) ) {
	                        $content = $matches[1];
	                        $attr['caption'] = trim( $matches[2] );
	                }
	        }
	       
	        return img_caption_shortcode($attr, $content);
	}
	add_shortcode('wp_caption', 'media_credit_caption_shortcode');
	add_shortcode('caption', 'media_credit_caption_shortcode');	

Simply calling do_shortcode() on the contentblock would not help in this case, the problem lies in the way backwards compatibility was implemented for captions in 3.4 (i.e. the new-style text block caption is converted to the old-style attribute caption).

Last edited 10 years ago by pputzer (previous) (diff)

@nofearinc
9 years ago

do_shortcode on caption attr

#6 follow-up: @nofearinc
9 years ago

  • Keywords has-patch added; needs-patch removed

I'm not sure what's the issue with calling do_shortcode for the caption, whether it's set via the media popup or as an attribute, so a simple call added to the caption callback.

#7 in reply to: ↑ 6 ; follow-up: @pputzer
9 years ago

Replying to nofearinc:

I'm not sure what's the issue with calling do_shortcode for the caption, whether it's set via the media popup or as an attribute, so a simple call added to the caption callback.

I'm pretty sure that this patch is only a partial fix. The problem is the regex line if ( preg_match( '#((?:<a [^>]+>\s*)?<img [^>]+>(?:\s*</a>)?)(.*)#is', $content, $matches ) ) { that throws out any shortcode enclosing the image and/or link tag in the $content.

Last edited 9 years ago by pputzer (previous) (diff)

#8 in reply to: ↑ 7 ; follow-ups: @nofearinc
9 years ago

pputzer, I was able to replicate the issue above (wasn't clear for me at first) and the shortcode is not the problem there as a literal would be skipped as well.

IIRC we only support 2 versions before the current release, so probably removing the regex test at all?

#9 in reply to: ↑ 8 @pputzer
9 years ago

Replying to nofearinc:

pputzer, I was able to replicate the issue above (wasn't clear for me at first) and the shortcode is not the problem there as a literal would be skipped as well.

IIRC we only support 2 versions before the current release, so probably removing the regex test at all?

The regex converts the new syntax to the old one, so I don't think it can be removed completely (even if it were the other way around, old content would be broken by the change). To allow for at least one level of nested shortcodes in the $content, something like the following regex (not tested) could be used as a generalization of my specific workaround:

if ( preg_match( '#((?:\[([^\]]+)\]\s*)(?:<a [^>]+>\s*)?<img [^>]+>(?:\s*</a>)?(?:\s*\[/\2\])?)(.*)#is', $content, $matches ) ) {
   $content = $matches[1];
   $attr['caption'] = trim( $matches[3] );
}

#10 @kanedo
9 years ago

Another case worth consider is if there is no image tag at all inside the caption shortcode but another shortcode.
In my case I implemented a shortcode for hidpi images but it wont get parsed if the caption text is inserted after the shortcode like this:

[caption id="attachment_504" align="aligncenter" width="300"]
[responsive_image id='504' size='medium' width='300' alt='test' align='center'] Test
[/caption]

On the other hand, if one add the caption as attribute to the caption-shortcode (I think this is called old-style) it gets parsed just fine.

#11 follow-up: @kitwestneat
9 years ago

I also ran into the "img tag replaced by shortcode" issue described in comment:10, and worked around it by filtering the regex with a new filter in media.php

--- a/wordpress/wp-includes/media.php   Sat Nov 29 03:37:15 2014 -0500
+++ b/wordpress/wp-includes/media.php   Thu Dec 04 12:50:49 2014 -0500
@@ -797,7 +797,9 @@
 function img_caption_shortcode( $attr, $content = null ) {
        // New-style shortcode with the caption inside the shortcode with the link and image tags.
        if ( ! isset( $attr['caption'] ) ) {
-               if ( preg_match( '#((?:<a [^>]+>\s*)?<img [^>]+>(?:\s*</a>)?)(.*)#is', $content, $matches ) ) {
+        $regex = apply_filters('img_caption_shortcode_caption_regex',
+                        '#((?:<a [^>]+>\s*)?<img [^>]+>(?:\s*</a>)?)(.*)#is');
+               if ( preg_match( $regex, $content, $matches ) ) {
                        $content = $matches[1];
                        $attr['caption'] = trim( $matches[2] );
                }

Also I noticed that there's a spot where do_shortcode is not called on content - is that intentional?

@@ -831,7 +833,7 @@

        $atts['width'] = (int) $atts['width'];
        if ( $atts['width'] < 1 || empty( $atts['caption'] ) )
-               return $content;
+               return do_shortcode($content);

        if ( ! empty( $atts['id'] ) )
                $atts['id'] = 'id="' . esc_attr( $atts['id'] ) . '" ';

#12 @tychay
9 years ago

media.php line 889 and 915

        if ( current_theme_supports( 'html5', 'caption' ) ) {
                return '<figure ' . $atts['id'] . 'style="width: ' . (int) $atts['width'] . 'px;" class="' . esc_attr( $class ) . '">'
                . do_shortcode( $content ) . '<figcaption class="wp-caption-text">' . $atts['caption'] . '</figcaption></figure>';
        }
        return '<div ' . $atts['id'] . $style . 'class="' . esc_attr( $class ) . '">'
        . do_shortcode( $content ) . '<p class="wp-caption-text">' . $atts['caption'] . '</p></div>';

At first glance, it seems to be the usual way of supporting nesting shortcodes.

These look right, but they're not. (Or at least I think they're not.) It took me a while to notice this because $content is actually the <a><img /></a> and $atts['caption'] is the old $content minus the image. Why would you do_shortcode() on the part that surely must be valid HTML? One possibility: a shortcode to set properties, which I've never seen (but is doable). You want to actually do_shortcode() in the stuff other than the image.

Here's my thinking: when it was originally written a long time ago, the programmer didn't realize that an earlier line rewrites $content and keeps the image part in the content and puts the content part in the attribute. But in all normal cases if you want to support nesting of shortcodes, you run do_shortcode() on $content so they put that in thinking they're supporting nesting. It's an honest mistake, and I didn't notice it the first couple times I glanced at it until I finally decided to really look at why shortcodes inside captions are not being processed.

My suggestion is a very simple fix (much simpler than the regex stuff above). on these two lines replace replace $atts['caption'] with do_shortcode($atts['caption']). At this point, I think it's been too long to replace do_shortcode($content) with just $content as it might break some plugins.

I can do a patch, but I wanted to get feedback in case I missed something about the state of plugin development.

~

To the notes 7 months ago about reverse nesting shortcodes that output images so they can be put safely in captions. The way I handle that case in my plugin is the same way embed is handled to beat wpautop: register a dummy function and a the_content handler that beats it to do: save shortcodes, register the real one, do_shortcode, and restore shortcodes.

In fact the pattern is so common (embed, syntaxhighligher, etc.). I honestly think we should add a feature to core to improve add_shortcode() to create the pattern, like: add_shortcode( $tag, $function, $before_wpautop=false ) so people wouldn't have to constantly be writing all these extra handlers and functions to emulate this behavior. It'll also be slightly more efficient than running the trick serially. But that's another bug entirely :-D

(Ironically if we supported that, the codes that are run earlier cannot correctly support nesting do_shortcode, but I've thought about that, and they shouldn't. The only times you actually want to run earlier are to beat things like wpautop or another shortcode in which case you aren't nested, but more likely nested inside something.

I can do a patch for that also, if there's interest. But, we should open a different bug on that one.

#13 in reply to: ↑ 11 @tychay
9 years ago

Replying to kitwestneat:

I also ran into the "img tag replaced by shortcode" issue described in comment:10, and worked around it by filtering the regex with a new filter in media.php

I think it's a good idea to add a filter here because a lot of people seem to be shunting the entire function with the img_caption_shortcode filter.

However, I think it'd be more flexible to filter the $matches instead of just the preg, this way people can add onto the existing function, it'd be a little odd though, since you'd have to set $matches to false first and use that as the conditional. So who knows.

As for the second part, a weirder side effect of that line is that when it fails because width=0, it ends up pruning the caption entirely! I wonder if Justin's article linked in the codex will work now that a lot of oEmbed providers are dropping frames and <video> tags instead of image content?

#14 @tychay
9 years ago

The patch (24990.diff​ ) in the original bug report does cause a small side-effect. the filter img_caption_shortcode_width will be passed in a shortcode-processed caption attribute. I assume side effects like these are why do_shortcodes() are done at the very end the the processing chain in most shortcode functions.

Last edited 9 years ago by tychay (previous) (diff)

#15 @pputzer
8 years ago

I've just been thinking about this ticket again. I kinda like @tychay's idea about filtering $matches. This way, plugins can override the regex and modify the $content (via $matches[1]) at the same time. I'll try to come up with a patch proposal.

@pputzer
8 years ago

An attempt at a generic filter for matching the caption attribute and the media tag.

#16 @pputzer
8 years ago

I've added a patch based on the idea proposed by @tychay (it's against 4.3.1-alpha). It works fine for me and would be a great new hook for the Media Credit plugin. The filter would also enable new-style captions for media elements other than <img> (which currently are only supported if you use the old-style attribute syntax).

#17 @pputzer
8 years ago

  • Keywords dev-feedback added

#18 @johnbillion
8 years ago

  • Component changed from Media to Shortcodes

This ticket was mentioned in Slack in #core by pepe. View the logs.


8 years ago

#20 in reply to: ↑ 8 ; follow-up: @pputzer
8 years ago

Replying to nofearinc:

pputzer, I was able to replicate the issue above (wasn't clear for me at first) and the shortcode is not the problem there as a literal would be skipped as well.

IIRC we only support 2 versions before the current release, so probably removing the regex test at all?

@nofearinc, what do you think of my patch based on @tychay's idea? Would that be a valid fix for core?

#21 in reply to: ↑ 20 ; follow-ups: @nofearinc
8 years ago

Replying to pputzer:

@nofearinc, what do you think of my patch based on @tychay's idea? Would that be a valid fix for core?

Looks better based on the discussion above regarding img_caption_shortcode_content, although it would likely require more testing. Also, it's worth seeing what will the updated Shortcodes API look like.

#22 in reply to: ↑ 21 @pputzer
8 years ago

Replying to nofearinc:

Replying to pputzer:

@nofearinc, what do you think of my patch based on @tychay's idea? Would that be a valid fix for core?

Looks better based on the discussion above regarding img_caption_shortcode_content, although it would likely require more testing. Also, it's worth seeing what will the updated Shortcodes API look like.

I don't see how the proposed API change will impact the implementation of [caption]. The legacy caption attribute never supported HTML (which AFAIK was the reason for the switch to the enclosure syntax) and the newer syntax is still valid according to the proposed API.

What would we need to do to get such a filter included as soon as possible? The current workaround for [media-credit] (replacing img_caption_shortcode is a hack I'd be really glad to be rid of).

#23 in reply to: ↑ 21 @pputzer
8 years ago

Replying to nofearinc:

Replying to pputzer:

@nofearinc, what do you think of my patch based on @tychay's idea? Would that be a valid fix for core?

Looks better based on the discussion above regarding img_caption_shortcode_content, although it would likely require more testing. Also, it's worth seeing what will the updated Shortcodes API look like.

In light of that syntax being canned, what's the way to move this ticket forward?

Last edited 8 years ago by pputzer (previous) (diff)

#24 @pputzer
8 years ago

What is needed to move forward with this?

This ticket was mentioned in Slack in #core by pepe. View the logs.


8 years ago

#26 @pputzer
8 years ago

After a short discussion with @johnbillion at WCEU in Vienna, I've split off the problem with caption regex into a separate ticket (#37183) and updated the batch. (The issue initially described in this ticket and the new one are related, but not quite the same, so I think it's better to track them separately.)

Everyone that has been watching this, please have look at the new ticket :)

#27 @pputzer
4 years ago

Long time no see. Developer feedback is still needed.

#28 @pputzer
3 years ago

Is anyone still working on the Shortcodes component at all?

Note: See TracTickets for help on using tickets.