WordPress.org

Make WordPress Core

Opened 5 weeks ago

Last modified 4 weeks ago

#47863 reviewing defect (bug)

Fix odd, unexpected output from shortcode_parse_attts

Reported by: mauteri Owned by: SergeyBiryukov
Milestone: 5.3 Priority: normal
Severity: normal Version:
Component: Shortcodes Keywords: has-patch has-unit-tests
Focuses: Cc:

Description (last modified by SergeyBiryukov)

When passing a full shortcode to this function, I'm expecting only an array of key/value attributes of the shortcode to be returned, however this is not the case. Basic example (though my diff of unit tests have many), this shortcode:

[unittest title="unittest" link="https://unit.test/"]

will return an array that looks like this:

(
    0 => '[unittest'
    'title' => 'unittest'
    1 => 'link="https://unit.test/"]'
)

rather than one that looks like this

(
    'title' => 'unittest'
    'link' => 'https://unit.test/'
)

I've already created a patch for this. I will put it in two parts. First unit tests with a data provider of 6 unit test format examples. 5 out of 6 will fail. I will then include another patch that includes the unit tests plus the proposed fix to the function.

Attachments (10)

shortcode_parse_attts.patch (1.4 KB) - added by mauteri 5 weeks ago.
This patch is of failing unit tests
47863.patch (2.2 KB) - added by mauteri 5 weeks ago.
Patch for issue plus unit tests
47863.1.patch (3.1 KB) - added by mauteri 5 weeks ago.
Updates to patch per first round of reviews
47863-2.diff (4.1 KB) - added by birgire 5 weeks ago.
47863-3.patch (4.5 KB) - added by mauteri 5 weeks ago.
Additional tests and slight code adjustment.
47863-4.patch (4.5 KB) - added by mauteri 5 weeks ago.
Extra semi-colon in last patch.
47863-5.patch (4.5 KB) - added by mauteri 5 weeks ago.
47863-6.patch (4.5 KB) - added by mauteri 5 weeks ago.
47863-7.patch (4.7 KB) - added by mauteri 5 weeks ago.
Additional test of nested shortcode.
47863-8.patch (4.6 KB) - added by mauteri 5 weeks ago.

Download all attachments as: .zip

Change History (30)

@mauteri
5 weeks ago

This patch is of failing unit tests

@mauteri
5 weeks ago

Patch for issue plus unit tests

#1 @SergeyBiryukov
5 weeks ago

  • Description modified (diff)

#2 @SergeyBiryukov
5 weeks ago

  • Milestone changed from Awaiting Review to 5.3
  • Owner set to SergeyBiryukov
  • Status changed from new to reviewing

#3 follow-up: @birgire
5 weeks ago

Thanks for the patch @mauteri

@SergeyBiryukov I almost posted some reviewing comments, but then noticed you're currently set for reviewing, so I will just wait for your initial review :-)

#4 in reply to: ↑ 3 @SergeyBiryukov
5 weeks ago

Replying to birgire:

I almost posted some reviewing comments, but then noticed you're currently set for reviewing, so I will just wait for your initial review :-)

I just wanted to note that based on the current usage in core, the function appears to only accept key-value pairs. Extending it to accept a full shortcode makes sense to me and seems like a quick win. That's all I have for now :)

#5 @birgire
5 weeks ago

  • Keywords has-unit-tests added

Thanks @SergeyBiryukov

Reading the docblock's description of shortcode_parse_atts(),

https://core.trac.wordpress.org/browser/tags/5.2/src/wp-includes/shortcodes.php#L483

it seems to indicate that this should be supported, e.g.:

 * Retrieve all attributes from the shortcodes tag.

So at a first glance, it seems like a good enhancement.

Maybe the docblock could be enhanced as well, e.g. adding a descripton to the * @param string $text part?

Just curious about the \w+ choice in the patch, as I noticed that in get_shortcode_atts_regex()

https://core.trac.wordpress.org/browser/tags/5.2/src/wp-includes/shortcodes.php#L480

we have [\w-]+. I guess these should match?

I didn't find any explicit tests for shortcode_parse_atts(), so it seems good to add them.

In the description of shortcode_parse_atts() we have:

* @return array|string List of attribute values.
*                      Returns empty array if trim( $text ) == '""'.
*                      Returns empty string if trim( $text ) == ''.
*                      All other matches are checked for not empty().

so I would suggest adding these test cases to the data provider in 47863.patch.

And also test cases with - in the shortcode's tag name.

ps:

According to https://make.wordpress.org/core/2019/07/12/php-coding-standards-changes/

Arrays must be declared using long array syntax in WordPress Core.

so I would assume it applying to the tests as well.

Then we would need to add a @ticket 47863 in the docblock of the test method.

Last edited 5 weeks ago by birgire (previous) (diff)

#6 follow-up: @mauteri
5 weeks ago

Thanks for the feedback, @birgire. All sounds very reasonable. Curious about the long array syntax since PHP minimum requirement was just raised from 5.2 to 5.6.20 only in last major version. Could that standard now be dated since shorthand arrays in PHP was introduced in PHP 5.4 and maybe the WordPress standard should be updated to reflect what is possible in the new minimum version of PHP?

Also, I found the $text parameter naming to be a bit odd. Just from seeing the name of the function shortcode_parse_atts seems it should ONLY accept shortcodes to have their atts parsed (which made it very odd to me when it messed that up in the first place). Seems a better name for that argument would be $shortcode to make it very clear that a shortcode should be passed to it to have its atts parsed. I'm just thinking of other improvements that could be made if you and others agree.

#7 in reply to: ↑ 6 @SergeyBiryukov
5 weeks ago

Replying to mauteri:

Could that standard now be dated since shorthand arrays in PHP was introduced in PHP 5.4 and maybe the WordPress standard should be updated to reflect what is possible in the new minimum version of PHP?

The PHP Coding Standards Changes post linked above explains the reasoning:

Short Array Syntax
A little less controversial, but still with varying opinions, was the proposal to require short array syntax ( [ 1, 2, 3 ] ) instead of long array syntax ( array( 1, 2, 3 ) ) for declaring arrays.

While I’m personally partial to short array syntax, there were two particularly convincing arguments for using long array syntax:

  • It’s easier to distinguish from other forms of braces, particularly for those with vision difficulties.
  • It’s much more descriptive for beginners.

So, this change to the coding standards is the opposite of what was originally proposed, but is ultimately the more inclusive option.

Coding Standards Change
Arrays must be declared using long array syntax in WordPress Core.

Also, I found the $text parameter naming to be a bit odd. Just from seeing the name of the function shortcode_parse_atts seems it should ONLY accept shortcodes to have their atts parsed (which made it very odd to me when it messed that up in the first place). Seems a better name for that argument would be $shortcode to make it very clear that a shortcode should be passed to it to have its atts parsed.

Based on the current usage in core (a string of shortcode attributes, rather than a full shortcode), I would keep the $text name as is and clarify in the documentation that it can also be a shortcode. That said, I wouldn't object too strongly to changing the name.

#8 @birgire
5 weeks ago

I agree with @SergeyBiryukov in keeping the $text, if the documentation is improved.

It also comes to mind that it might be worth mentioning that this is meant for a single shortcode as we would not be able to find what key/value belongs to each shortcode, from the output of e.g. shortcode_parse_atts( '[a b="c"][d e="f"]' ).

ps:

I updated my comment above regarding the @ticket 47863 annotation.

#9 @mauteri
5 weeks ago

Great, thanks to both of you for the additional info and reasoning.

#10 @mauteri
5 weeks ago

@birgire this doesn't appear to be true from what I can tell...

* Returns empty string if trim( $text ) == ''.

I wrote a test for this and it failed. I then removed my code changes to see if I introduced a bug and it still failed.

This was my dataProvider for the test:

array(
	'\'\'',
	'',
),
There was 1 failure:

1) Tests_Shortcode::test_shortcode_parse_atts with data set #0 ('''', '')
Failed asserting that Array &0 () is identical to ''.

/srv/www/wordpress-trunk/public_html/tests/phpunit/tests/shortcode.php:1054

Possible this was wrong for a while and no one noticed due to lack of tests?

Curious if this is the case, that this function now only returns arrays and not strings. I can spend more time and see what kind of coverage we have for the function with the tests I've already written.

Added 47863.1.patch with most recent updates, though more changes to DocBlocks will definitely need to be added.

@mauteri
5 weeks ago

Updates to patch per first round of reviews

@birgire
5 weeks ago

#11 @birgire
5 weeks ago

I adjusted the inline docs and some WPCS in 47863-2.diff.

Added two trivial test cases.

I also think I would expect the inputs '""' and '\'\'' to return an empty array, as the inline docs mentioned:

* Returns empty array if trim( $text ) == '""'.

The tests in 47863-2.diff run successfully on my install.

OK (12 tests, 12 assertions)

#12 follow-up: @mauteri
5 weeks ago

@birgire oh, I think I misread the docblock on the return for ''. This looks to work as expected as I read through your additional tests. Thanks.

#13 in reply to: ↑ 12 @birgire
5 weeks ago

Replying to mauteri:

@birgire oh, I think I misread the docblock on the return for ''. This looks to work as expected as I read through your additional tests. Thanks.

I think we should also consider cases like:

[unittest link=https://unit.test/ /]
[unittest title https://unit.test/ /]

but then the trimming might remove too much.

What do you think?

#14 @mauteri
5 weeks ago

Thanks @birgire, good points there. I've adjusted the trimming a bit. Mostly just pulling off any hanging / at the end of the string with a preg_replace before trimming whitespace. I also added a few more tests you mentioned above with what I think the expected output would be given the circumstance. Let me know of you agree or if you have any comments on the additional tests.

@mauteri
5 weeks ago

Additional tests and slight code adjustment.

@mauteri
5 weeks ago

Extra semi-colon in last patch.

#15 @birgire
5 weeks ago

I wonder if we could simplify with e.g.:

if ( preg_match( '#^\[[\w-]+([^\]]*?)\/?\]#', $text, $matches ) ) {
	$text = $matches[1];
}

i.e. use non-greedy *? check to allow for the latter \/? check.

This seems to run the tests successfully.

ps: here's though another kind of edge case:

array(
	'[unittest link=https://unit.test/]Lorem Ipsum[/unittest]',
	array(
		'link'  => 'https://unit.test/',
	),
),

that will not pass.

#16 @mauteri
5 weeks ago

@birgire yeah, i don't think we are going to win with that edge case :-/ as that can go either way. probably best to assume if it is right before the ] that it was intentionally part of the self closing shortcode. what do you think?

yeah, we can probably tighten this up a bit by extending the preg_match.

@mauteri
5 weeks ago

@mauteri
5 weeks ago

#17 @mauteri
5 weeks ago

@birgire let me know what you think of latest patch. updated regex per your suggestion and simplified to just a trim afterwards. thx!

@mauteri
5 weeks ago

Additional test of nested shortcode.

#18 @birgire
5 weeks ago

Thanks for the update @mauteri

I don't think we need the count check part, isn't $matches[1] always set?

The tests also seem to run successfully without the trim().

Personally I prefer to use a different pattern delimiter than /, when there are many slashes and backslashes in the pattern, to make it more readable.

Last edited 5 weeks ago by birgire (previous) (diff)

@mauteri
5 weeks ago

#19 @mauteri
5 weeks ago

@birgire Updated. I'm usually overly cautious with $matches, but I see your point that not needed in this case as it would never get that far without a count of 2. I also removed the superfluous trim and changed the delimiter. Thanks.

#20 @mauteri
4 weeks ago

@birgire @SergeyBiryukov let me know if there's anything else you'd like updated/changed for this ticket or if it is good now as-is. Thanks!

Note: See TracTickets for help on using tickets.