WordPress.org

Make WordPress Core

Opened 8 years ago

Closed 17 months ago

Last modified 14 months ago

#9405 closed defect (bug) (fixed)

Shortcode parameter names can't have dashes in them

Reported by: Viper007Bond Owned by: johnbillion
Milestone: 4.3 Priority: lowest
Severity: minor Version: 2.8
Component: Shortcodes Keywords: has-patch
Focuses: Cc:

Description

I expected this to work, but it doesn't:

[tag my-param="value" foo="bar"]Hello world![/tag]

The passed array looks like this:

array( 
	0 => 'my-param="value"',
	'foo' => 'bar'
);

Attachments (4)

9405.diff (748 bytes) - added by izem 2 years ago.
Updated regex of shortcode_parse_atts function at src/wp-includes/shortcodes.php
9405.2.diff (768 bytes) - added by izem 2 years ago.
Updated regex for 3 types values (double quotes, single quotes, no quotes)
9405.3.diff (1.6 KB) - added by aaroncampbell 2 years ago.
9405.4.diff (1.8 KB) - added by tyxla 18 months ago.
Updating the patch by @aaroncampbell to add support for multiple dashes in the middle of the attribute, as well as multiple consecutive dashes.

Download all attachments as: .zip

Change History (24)

#1 @hakre
8 years ago

function is shortcode_parse_atts() in wp-includes/shortcodes.php

the parsing is done by regular expressions.

word limiters are used, I guess dashes are not supported by them.

I wonder if it makes sense. This problem can for easier besolved by clarifying the documentation, the regular expression pattern is already quite complex.

#2 @Denis-de-Bernardy
8 years ago

it's supposed to "look" like html attributes. curious to know if dash is a valid character for html attributes before closing as invalid...

#3 @hakre
8 years ago

Some HTML Trivia:

  • attribute names are not case sensitive. (HTML 2.0)
  • The `NAMELEN' parameter in the SGML declaration (section SGML Declaration for HTML) limits the length of an attribute value to 1024 characters.
  • HTML allows CDATA attributes to be unquoted provided the attribute value contains only letters (a to z and A to Z), digits (0 to 9), hyphens (ASCII decimal 45) or, periods (ASCII decimal 46)
  • An attribute specification consists of a name, an "=" and a value specification. The name refers to an item in an ATTLIST declaration.

I see no attribute name in HTML (2.0, 3.2, 4.1 & xhtml 1.0) that conatins a dash or dot.

#4 @Denis-de-Bernardy
8 years ago

  • Milestone Future Release deleted
  • Resolution set to invalid
  • Status changed from new to closed

#5 @sswells
2 years ago

  • Component changed from General to Shortcodes
  • Resolution invalid deleted
  • Status changed from closed to reopened

Since the reasoning behind closing this ticket was the html attribute mirroring, can it be reconsidered now that data-uid and other data-? parameters are widely used?

Line 308 of shortcodes.php could be changed from:

$pattern = '/(\w+)\s*=\s*"([^"]*)"(?:\s|$)|(\w+)\s*=\s*\'([^\']*)\'(?:\s|$)|(\w+)\s*=\s*([^\s\'"]+)(?:\s|$)|"([^"]*)"(?:\s|$)|(\S+)(?:\s|$)/';

to:

$pattern = '/([\w\-]+)\s*=\s*"([^"]*)"(?:\s|$)|([\w\-]+)\s*=\s*\'([^\']*)\'(?:\s|$)|([\w\-]+)\s*=\s*([^\s\'"]+)(?:\s|$)|"([^"]*)"(?:\s|$)|(\S+)(?:\s|$)/';
Last edited 2 years ago by SergeyBiryukov (previous) (diff)

@izem
2 years ago

Updated regex of shortcode_parse_atts function at src/wp-includes/shortcodes.php

#6 @izem
2 years ago

  • Keywords has-patch added; needs-patch removed

Added a patch that update the regex pattern of shortcode_parse_atts function at src/wp-includes/shortcodes.php
The name part of the regex was: (\w+)
I've changed it to: (\w+(?:\-\w+)*)
Now it support names with hyphens (but not names that end with a hyphen).

// [sc_test name="val01" name-with-hyphens="val02" name-end-with-hyphen-="val03"]
atts Array:
(
    [name] => val01
    [name-with-hyphens] => val02
    [0] => name-end-with-hyphen-="val03"
)

@izem
2 years ago

Updated regex for 3 types values (double quotes, single quotes, no quotes)

#7 @SergeyBiryukov
2 years ago

  • Milestone set to Awaiting Review

@aaroncampbell
2 years ago

#8 @aaroncampbell
2 years ago

  • Keywords needs-testing added
  • Milestone changed from Awaiting Review to Future Release

In #17657 we moved to allowing hyphens in shortcode names. I don't actually see a reason to limit hyphens from being at the beginning of a parameter name, so in 9405.3.diff I went with a simpler regex change that allows [\w-] for all parameter name characters. I also added a unit test.

#9 @aaroncampbell
18 months ago

Just updating to say that I just re-applied and tested the patch again. It still applies and works as expected. If anyone else could test it, it would be much appreciated.

This ticket was mentioned in Slack in #core by aaroncampbell. View the logs.


18 months ago

#11 @tyxla
18 months ago

  • Keywords needs-unit-tests added

I believe some unit tests would be the best way to test this.

#12 @aaroncampbell
18 months ago

  • Keywords needs-unit-tests removed

There's a unit test in the most recent patch. It covers all the parameters in this shortcode:
[test-shortcode-tag foo="foo" foo-bar="foo-bar" foo-bar-="foo-bar-" -foo-bar="-foo-bar" -foo-bar-="-foo-bar-" /]
Do you think there are more that need to be added?

#13 @tyxla
18 months ago

It appears that the recent patch from @aaroncampbell has quite an extensive test coverage.
The unit test works fine for me, and manual testing also works great.

I believe we can add several more to the test, just to cover the cases with two dashes in the middle and two consecutive dashes.

Patch coming up.

@tyxla
18 months ago

Updating the patch by @aaroncampbell to add support for multiple dashes in the middle of the attribute, as well as multiple consecutive dashes.

#14 @tyxla
18 months ago

  • Keywords needs-testing removed

#15 follow-up: @azaozz
18 months ago

I'm not sure this is a good idea. Why would we want dashes in shortcode attribute names, what (critical) problem is solved with that?

The shortcode regexps are pretty big/slow already, making them even more complex is undesirable imho. Also, imagine the billions of times this slower regexp will run every day. Is it really worth the extra kWh of electricity for the servers so we can have slightly fancier attribute names (that few would ever use)?

Last edited 18 months ago by azaozz (previous) (diff)

#16 in reply to: ↑ 15 @aaroncampbell
18 months ago

Replying to azaozz:

I'm not sure this is a good idea. Why would we want dashes in shortcode attribute names, what (critical) problem is solved with that?

I'm not sure it's "critical" but I do think it's a problem that should be solved. When you create a shortcode, try to use a hyphenated parameter name, and it doesn't work...it seems broke. These are only getting more popular now with modern HTML using things like data-*, aria-*, etc. There are certainly characters we don't need to support here, but I think that [\w-] is not only understandable, but expected.


The shortcode regexps are pretty big/slow already, making them even more complex is undesirable imho. Also, imagine the billions of times this slower regexp will run every day. Is it really worth the extra kWh of electricity for the servers so we can have slightly fancier attribute names (that few would ever use)?

This actually doesn't affect the main shortcode regex (the one we've struggled with keeping memory-friendly). That one actually got support for hyphens almost three years ago in #17657. It only affects the shortcode attributes regex, which is only ever run on the content inside the opening shortcode tag (match 3 from the shortcode regex). Since the change is one that affects only the attributes regex, and since the only change to that regex it to use [\w-] instead of \w for attribute names, I think it's a reasonable move.

Last edited 18 months ago by aaroncampbell (previous) (diff)

#17 @johnbillion
18 months ago

  • Milestone changed from Future Release to 4.3

Hyphens should definitely be allowed in shortcode attribute names.

I bet the kWhs of electricity saved by reducing the time developers spend at their computers scratching their heads will more than offset the extra kWhs of electricity spent on parsing the updated regex.

#18 @obenland
17 months ago

  • Owner set to johnbillion
  • Status changed from reopened to assigned

#19 @johnbillion
17 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 33118:

Allow shortcode parameter names (attributes) to contain dashes.

Props aaroncampbell, tyxla, izem
Fixes #9405

Note: See TracTickets for help on using tickets.