Opened 12 years ago
Closed 7 years ago
#23307 closed defect (bug) (fixed)
shortcode_parse_atts may return empty string
Reported by: |
|
Owned by: |
|
---|---|---|---|
Milestone: | 4.3 | Priority: | normal |
Severity: | minor | Version: | |
Component: | Shortcodes | Keywords: | |
Focuses: | docs | Cc: |
Description
See http://core.trac.wordpress.org/browser/tags/3.5.1/wp-includes/shortcodes.php#L243 .
If there are no attributes in a shortcode e.g. [foobar]
, then shortcode_parse_atts returns an empty string. This is not reflected in the long description or @return
tag.
Ideally though, it should return an empty array, so that it's not only consistent, but writing a shortcode function can type hint the $atts
argument to be an array e.g.
add_shortcode( 'foo', 'my_shortcode' ); function my_shortcode( array $atts ) { ... }
This might also be combined with checking if the incoming $text
is empty, and returning an empty array immediately.
Attachments (4)
Change History (35)
#1
@
12 years ago
- Keywords has-patch added
The patch captures when there is no space + characters after the shortcode tag, to return an empty array.
However, an opening tag, plus some characters / combination of characters that don't fall into the $pattern
will still currently return as a string. The left-trimmed string could be set as the first element of an array, or cast to an array, but I'm unsure what BC might break from what is presumably an edge case?
#3
follow-up:
↓ 4
@
11 years ago
- Keywords needs-unit-tests added
This needs unit tests. I'd also be curious when this was introduced — what's the history here?
Patch doesn't look complete — if the pattern fails, it'd also return a string ( ltrim( $text )
). We should make it as consistent as possible.
I don't think there would be a BC break of concern here. The function may also be able to return an empty array if given the right arguments. But I wouldn't expect someone to be asserting a string.
#4
in reply to:
↑ 3
@
11 years ago
Replying to nacin:
I'd also be curious when this was introduced — what's the history here?
Patch doesn't look complete — if the pattern fails, it'd also return a string (
ltrim( $text )
). We should make it as consistent as possible.
Looking at my comment above, I think this was indeed half a patch, as I wasn't sure what could / would be broken when potentially doing something with that left trim, so I just handled the simple case at the time.
#9
@
11 years ago
- Keywords needs-patch added; has-patch removed
Here's the latest tagged reference: https://core.trac.wordpress.org/browser/tags/3.8.1/src/wp-includes/shortcodes.php#L280
Are we happy with just doing:
$atts[] = ltrim($text);
so that the else
returns an array with a single value?
#10
follow-up:
↓ 11
@
11 years ago
An array with a single empty string in it feels wrong. Also, empty()
checks (which plugins might be doing to catch the empty string that's returned currently) would fail.
The correct return value should be an empty array.
#11
in reply to:
↑ 10
;
follow-up:
↓ 12
@
11 years ago
Replying to TobiasBg:
An array with a single empty string in it feels wrong. Also,
empty()
checks (which plugins might be doing to catch the empty string that's returned currently) would fail.
The correct return value should be an empty array.
If the string is empty (i.e. [foobar]
), then my attached patch already ensures an empty array is returned.
What that last suggestion covers is the edge case when some non-empty string fails the regex.
#12
in reply to:
↑ 11
;
follow-up:
↓ 13
@
11 years ago
Replying to GaryJ:
If the string is empty (i.e.
[foobar]
), then my attached patch already ensures an empty array is returned.
What that last suggestion covers is the edge case when some non-empty string fails the regex.
Ah, ok. Thanks for the clarification.
Without having actually tested your patch: What happens for [foobar ]
or [foobar /]
?
And do you happen to have an example where the regex would fail for a non-empty string? I couldn't find one yet.
#13
in reply to:
↑ 12
@
11 years ago
Replying to TobiasBg:
Without having actually tested your patch: What happens for
[foobar ]
or[foobar /]
?
And do you happen to have an example where the regex would fail for a non-empty string? I couldn't find one yet.
From the regex in get_shortcode_regex()
, both [foobar ]
and [foobar /]
give $m[3]
as ' '
(string containing a single space). It's $m[3]
that is provided as the argument to shortcode_parse_atts()
.
Here's the (slightly modified - see below) regex in a single string (with "foobar" added in place of the tag name variable):
\[(\[?)(foobar)(?![\w-])([^\]\/]*(?:\/(?!\])[^\]\/]*)*?)(?:(\/)\]|\](?:([^\[]*(?:\[(?!\/\2\])[^\[]*)*)\[\/\2\])?)(\]?)
The shortcode_parse_atts()
regex allows shortcodes attributes of the form:
- x=y
- x =y (optional space before =)
- x= y (optional space after =)
- x="y" (use of double-quotes, can have optional spaces around =)
- x='y' (use of single quotes, can have optional spaces around =)
- "y" (just attribute value in double quotes)
- y (non-whitespace character without quotes)
That means that a single or multiple consecutive spaces WILL fail the regex, and why there's a trim in the current else
.
So, [foobar ]
should act the same as [foobar]
.
One case where the string wouldn't be empty (if I'm reading everything right) would be something like [foobar "]
(single double quote) or [foobar '']
(two single quotes) - they give a string of '"'
and ''''
which are then run through stripcslashes()
.
As a side question, the Rad Software Regular Expression Designer that I was testing in reported that the get_shortcode_regex()
regex had nested quantifiers and wouldn't run without fixing them. Looking at the original regex, there does seem to be three instances of *+
, which seems odd. Can anyone explain that logic?
#14
@
11 years ago
- Milestone changed from 3.9 to Future Release
Great research so far, GaryJ! Shouldn't be too tough to turn this into some unit tests, which, ultimately, this will need to be considered.
For *+
, see http://www.regular-expressions.info/possessive.html.
This ticket was mentioned in Slack in #core by aaroncampbell. View the logs.
10 years ago
#16
@
10 years ago
- Keywords has-patch 2nd-opinion added; needs-unit-tests needs-patch removed
My thoughts:
1) I agree that it makes more sense to return an empty array for no attributes.
2) Most people are probably using empty( $atts )
but there might be some places where we break people's tests for empty attributes. I'm thinking this is a pretty minor breakage.
3) One thing we ARE breaking are our own unit tests. We check for ''
. My patches switch our tests from assertEquals( '', $atts )
to assertEmpty( $atts )
instead.
4) And lastly, the most important thing, I can't seem to trigger the else in shortcode_parse_atts()
. My first patch (23307.2.diff) returns an array in that case, but I couldn't come up with any content to test it with. My second patch (23307.3.diff) removes the else entirely. All unit tests still pass.
#17
@
10 years ago
Spent some time checking my earlier research, and I agree with point 4 - once the early trim()
is in place, there's nothing that triggers the else
. Even the most complicated of values only needs to satisfy \S*
.
#18
@
10 years ago
Saying that, we have one or two other else
/ default:
instances in core that are commented as "We should never hit this, but ...", so for the sake of one line, I think the else
should probably be kept in (and suitably commented), in case there's some odd character with odd encoding that doesn't get removed in a default trim()
but fails \S
.
#19
@
9 years ago
- Focuses docs added
- Keywords needs-patch added; has-patch 2nd-opinion removed
- Milestone changed from Future Release to 4.4
I am not in favor of changing this. It's baked in and had no traction here.
I would like to see the docblock corrected/updated in 4.4. It should describe any situation where the attributes regex failure causes a string return.
I am working on a roadmap where we might have a separate attribute parser for new shortcodes. It will be easier to make improvements in that future system.
#20
@
9 years ago
- Owner set to miqrogroove
- Status changed from new to assigned
@miqrogroove: Can you please work on a patch describing the issue for the DocBlock change?
#21
follow-up:
↓ 22
@
9 years ago
Currently researching the history of attribute name syntax. I had documented this as being restricted to "word" characters, but it looks like that changed to include hyphens at some point.
#22
in reply to:
↑ 21
@
9 years ago
Replying to miqrogroove:
Currently researching the history of attribute name syntax. I had documented this as being restricted to "word" characters, but it looks like that changed to include hyphens at some point.
Yeah, that happened just a couple months ago in [33118] (ticket #9405). It brought them more in line with shortcode names (#17657) and made them feel a bit more like the HTML attributes that people are used to (data-something, aria-something, etc).
#23
@
9 years ago
- Keywords has-patch added; needs-patch removed
23307.4.diff provides the needed information.
#24
@
9 years ago
- Keywords commit added
Just confirming that:
1) I ran some quick tests just to double check, and the documentation in 23307.4.diff is accurate (albeit somewhat confusing)
2) I'm in favor of not messing with this and instead making sure the new system is more consistent (always returning an array)
#25
@
9 years ago
The "separate attribute parser for new shortcodes" idea died with draft one of that roadmap. :) Since there is no longer a plan to have two separate shortcode systems, we need to either wontfix, or we can add this ticket to the roadmap as a planned change.
On the one hand, there were no objections to wontfix at the last meeting. On the other hand, I think the patch adds new information that we can consider. The return behavior did not match the old docs and also did not match the ticket description.
If we do decide to change the return type, I think we should also take the opportunity at the same time to disallow equal signs and quote chars from unnamed, unquoted values. It makes no sense to allow an attribute value like gar=bage"
#28
@
9 years ago
- Keywords has-patch removed
- Milestone changed from Future Release to Awaiting Review
It's on the agenda for 7 October meeting at #feature-shortcode. The available patches did not cover all cases and we need to review plugin impact.
#29
@
9 years ago
@miqrogroove What's the next step here? Are we closing as wontfix or updating the docs?
Immediate return of empty array if no attributes given