Make WordPress Core

Opened 16 years ago

Closed 10 years ago

Last modified 9 years ago

#9405 closed defect (bug) (fixed)

Shortcode parameter names can't have dashes in them

Reported by: viper007bond's profile Viper007Bond Owned by: johnbillion's profile johnbillion
Milestone: 4.3 Priority: lowest
Severity: minor Version: 2.8
Component: Shortcodes Keywords: has-patch
Focuses: Cc:

Description

I expected this to work, but it doesn't:

[tag my-param="value" foo="bar"]Hello world![/tag]

The passed array looks like this:

array( 
	0 => 'my-param="value"',
	'foo' => 'bar'
);

Attachments (4)

9405.diff (748 bytes) - added by izem 11 years ago.
Updated regex of shortcode_parse_atts function at src/wp-includes/shortcodes.php
9405.2.diff (768 bytes) - added by izem 11 years ago.
Updated regex for 3 types values (double quotes, single quotes, no quotes)
9405.3.diff (1.6 KB) - added by aaroncampbell 10 years ago.
9405.4.diff (1.8 KB) - added by tyxla 10 years ago.
Updating the patch by @aaroncampbell to add support for multiple dashes in the middle of the attribute, as well as multiple consecutive dashes.

Download all attachments as: .zip

Change History (24)

#1 @hakre
16 years ago

function is shortcode_parse_atts() in wp-includes/shortcodes.php

the parsing is done by regular expressions.

word limiters are used, I guess dashes are not supported by them.

I wonder if it makes sense. This problem can for easier besolved by clarifying the documentation, the regular expression pattern is already quite complex.

#2 @Denis-de-Bernardy
16 years ago

it's supposed to "look" like html attributes. curious to know if dash is a valid character for html attributes before closing as invalid...

#3 @hakre
16 years ago

Some HTML Trivia:

  • attribute names are not case sensitive. (HTML 2.0)
  • The `NAMELEN' parameter in the SGML declaration (section SGML Declaration for HTML) limits the length of an attribute value to 1024 characters.
  • HTML allows CDATA attributes to be unquoted provided the attribute value contains only letters (a to z and A to Z), digits (0 to 9), hyphens (ASCII decimal 45) or, periods (ASCII decimal 46)
  • An attribute specification consists of a name, an "=" and a value specification. The name refers to an item in an ATTLIST declaration.

I see no attribute name in HTML (2.0, 3.2, 4.1 & xhtml 1.0) that conatins a dash or dot.

#4 @Denis-de-Bernardy
16 years ago

  • Milestone Future Release deleted
  • Resolution set to invalid
  • Status changed from new to closed

#5 @sswells
11 years ago

  • Component changed from General to Shortcodes
  • Resolution invalid deleted
  • Status changed from closed to reopened

Since the reasoning behind closing this ticket was the html attribute mirroring, can it be reconsidered now that data-uid and other data-? parameters are widely used?

Line 308 of shortcodes.php could be changed from:

$pattern = '/(\w+)\s*=\s*"([^"]*)"(?:\s|$)|(\w+)\s*=\s*\'([^\']*)\'(?:\s|$)|(\w+)\s*=\s*([^\s\'"]+)(?:\s|$)|"([^"]*)"(?:\s|$)|(\S+)(?:\s|$)/';

to:

$pattern = '/([\w\-]+)\s*=\s*"([^"]*)"(?:\s|$)|([\w\-]+)\s*=\s*\'([^\']*)\'(?:\s|$)|([\w\-]+)\s*=\s*([^\s\'"]+)(?:\s|$)|"([^"]*)"(?:\s|$)|(\S+)(?:\s|$)/';
Last edited 11 years ago by SergeyBiryukov (previous) (diff)

@izem
11 years ago

Updated regex of shortcode_parse_atts function at src/wp-includes/shortcodes.php

#6 @izem
11 years ago

  • Keywords has-patch added; needs-patch removed

Added a patch that update the regex pattern of shortcode_parse_atts function at src/wp-includes/shortcodes.php
The name part of the regex was: (\w+)
I've changed it to: (\w+(?:\-\w+)*)
Now it support names with hyphens (but not names that end with a hyphen).

// [sc_test name="val01" name-with-hyphens="val02" name-end-with-hyphen-="val03"]
atts Array:
(
    [name] => val01
    [name-with-hyphens] => val02
    [0] => name-end-with-hyphen-="val03"
)

@izem
11 years ago

Updated regex for 3 types values (double quotes, single quotes, no quotes)

#7 @SergeyBiryukov
11 years ago

  • Milestone set to Awaiting Review

#8 @aaroncampbell
10 years ago

  • Keywords needs-testing added
  • Milestone changed from Awaiting Review to Future Release

In #17657 we moved to allowing hyphens in shortcode names. I don't actually see a reason to limit hyphens from being at the beginning of a parameter name, so in 9405.3.diff I went with a simpler regex change that allows [\w-] for all parameter name characters. I also added a unit test.

#9 @aaroncampbell
10 years ago

Just updating to say that I just re-applied and tested the patch again. It still applies and works as expected. If anyone else could test it, it would be much appreciated.

This ticket was mentioned in Slack in #core by aaroncampbell. View the logs.


10 years ago

#11 @tyxla
10 years ago

  • Keywords needs-unit-tests added

I believe some unit tests would be the best way to test this.

#12 @aaroncampbell
10 years ago

  • Keywords needs-unit-tests removed

There's a unit test in the most recent patch. It covers all the parameters in this shortcode:
[test-shortcode-tag foo="foo" foo-bar="foo-bar" foo-bar-="foo-bar-" -foo-bar="-foo-bar" -foo-bar-="-foo-bar-" /]
Do you think there are more that need to be added?

#13 @tyxla
10 years ago

It appears that the recent patch from @aaroncampbell has quite an extensive test coverage.
The unit test works fine for me, and manual testing also works great.

I believe we can add several more to the test, just to cover the cases with two dashes in the middle and two consecutive dashes.

Patch coming up.

@tyxla
10 years ago

Updating the patch by @aaroncampbell to add support for multiple dashes in the middle of the attribute, as well as multiple consecutive dashes.

#14 @tyxla
10 years ago

  • Keywords needs-testing removed

#15 follow-up: @azaozz
10 years ago

I'm not sure this is a good idea. Why would we want dashes in shortcode attribute names, what (critical) problem is solved with that?

The shortcode regexps are pretty big/slow already, making them even more complex is undesirable imho. Also, imagine the billions of times this slower regexp will run every day. Is it really worth the extra kWh of electricity for the servers so we can have slightly fancier attribute names (that few would ever use)?

Last edited 10 years ago by azaozz (previous) (diff)

#16 in reply to: ↑ 15 @aaroncampbell
10 years ago

Replying to azaozz:

I'm not sure this is a good idea. Why would we want dashes in shortcode attribute names, what (critical) problem is solved with that?

I'm not sure it's "critical" but I do think it's a problem that should be solved. When you create a shortcode, try to use a hyphenated parameter name, and it doesn't work...it seems broke. These are only getting more popular now with modern HTML using things like data-*, aria-*, etc. There are certainly characters we don't need to support here, but I think that [\w-] is not only understandable, but expected.


The shortcode regexps are pretty big/slow already, making them even more complex is undesirable imho. Also, imagine the billions of times this slower regexp will run every day. Is it really worth the extra kWh of electricity for the servers so we can have slightly fancier attribute names (that few would ever use)?

This actually doesn't affect the main shortcode regex (the one we've struggled with keeping memory-friendly). That one actually got support for hyphens almost three years ago in #17657. It only affects the shortcode attributes regex, which is only ever run on the content inside the opening shortcode tag (match 3 from the shortcode regex). Since the change is one that affects only the attributes regex, and since the only change to that regex it to use [\w-] instead of \w for attribute names, I think it's a reasonable move.

Last edited 10 years ago by aaroncampbell (previous) (diff)

#17 @johnbillion
10 years ago

  • Milestone changed from Future Release to 4.3

Hyphens should definitely be allowed in shortcode attribute names.

I bet the kWhs of electricity saved by reducing the time developers spend at their computers scratching their heads will more than offset the extra kWhs of electricity spent on parsing the updated regex.

#18 @obenland
10 years ago

  • Owner set to johnbillion
  • Status changed from reopened to assigned

#19 @johnbillion
10 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 33118:

Allow shortcode parameter names (attributes) to contain dashes.

Props aaroncampbell, tyxla, izem
Fixes #9405

Note: See TracTickets for help on using tickets.