WordPress.org

Make WordPress Core

Opened 5 years ago

Closed 4 years ago

Last modified 2 years ago

#17657 closed defect (bug) (fixed)

Shortcode regex doesn't allow for hyphens in shortcode name

Reported by: sivel Owned by: koopersmith
Milestone: 3.5 Priority: normal
Severity: normal Version: 2.9.2
Component: Shortcodes Keywords: needs-codex
Focuses: Cc:

Description

I found an issue where I wanted to use a string in a post for specifying a lightbox style "album" that looked like a shortcode.

The code I was using looked something like:

<a href="http://example.org/image.jpg" rel="lightbox[album-123]">Image</a>

Another plugin was installed that had registered a shortcode for "album". due to the '\b' in the regex the '-' is seen as the ending delimiter in the shortcode name, and the callback for 'album' was processed at this point.

I of course corrected the issue by prefixing the "album" name with a unique string, however I am thinking that we should perhaps update the regex so that some common characters often seen in strings for readability without spaces.

Plugin code to test:

add_shortcode('broken', 'broken');

function broken() {
	return 'Running the broken shortcode';
}

and then in a post:

[broken-shortcode]

Attachments (10)

17657-unit-tests.diff (2.4 KB) - added by aaroncampbell 5 years ago.
17657.diff (553 bytes) - added by solarissmoke 5 years ago.
17657.2.diff (552 bytes) - added by solarissmoke 5 years ago.
Even simpler still
17657.3.diff (754 bytes) - added by kovshenin 4 years ago.
Here's one that's broken down into separate lines like in trunk. I don't think the makes any difference here, but since the other lines use it I thought I'd add it for consistency.
17657-unit-tests.2.diff (2.1 KB) - added by kovshenin 4 years ago.
17657.4.diff (1.4 KB) - added by kovshenin 4 years ago.
Includes shortcode_unautop
scperf.php (6.7 KB) - added by bobbingwide 4 years ago.
shortcode performance analysis php file
scperf.csv (46.0 KB) - added by bobbingwide 4 years ago.
results from running scperf.php from Windows command prompt
17657.update-comment.diff (1.5 KB) - added by duck_ 4 years ago.
17657.5.diff (730 bytes) - added by SergeyBiryukov 4 years ago.

Download all attachments as: .zip

Change History (42)

#1 @aaroncampbell
5 years ago

I verified this, and even wrote some basic unit tests. It only seems to happen with a hyphen. Using realshortcode-notreal passes '-notreal' as the first element (index 0) of the attributes array, but realshortcode_notreal and realshortcodenotreal work as expected.

Last edited 5 years ago by aaroncampbell (previous) (diff)

#2 @solarissmoke
5 years ago

  • Keywords has-patch needs-testing added; needs-patch dev-feedback 2nd-opinion removed

The issue is in get_shortcode_regex() which uses the boundary assertion \b to check for the end of the shortcode tag. Unfortunately a hyphen counts as a word boundary. The solution seems quite simple - just replace that assertion with our own that allows hyphens - see patch.

@solarissmoke
5 years ago

@solarissmoke
5 years ago

Even simpler still

#3 @nacin
5 years ago

  • Version changed from 3.2 to 3.0

Bumping the version down a bit. Probably prevalent since introduction.

#4 @bobbingwide
5 years ago

  • Version changed from 3.0 to 2.9.2

I've just updated the codex saying this is probably a bug.

http://codex.wordpress.org/Shortcode_API#Hyphens

I don't really understand regular expressions so had no idea that the \b was the cause of the problem. But had performed tests showing that it also depended on the sequence of calls to add_shortcode.

The problem was present in version 2.9.2 as well.
When is this going to get fixed? How can I help?

#5 @redpixelstudios
4 years ago

We are also having issues with this bug in many of our shortcodes that use hyphens. Could we push to have this fixed in 3.5? At this point it would be difficult deprecating our hyphenated shortcodes in favor of non-hyphenated ones due to the user-base of some of our plugins. Besides, it would be beneficial for all plugin authors that deal with shortcodes regularly to have the hyphen available as an applicable delimiter.

#6 @redpixelstudios
4 years ago

  • Cc webtools@… added

#7 follow-up: @bobbingwide
4 years ago

There is a workaround for this problem. Register the longer names first.

whether or not a shortcode containing hyphen(s) is handled depends on when it's registered.
if it's registered before the shortcode that is the same as the prefix before the '-' it's OK
it it's registed after, then the shorter shortcode will take precedence

So if you have shortcodes wp, wp-1, wp-2 and wp-3 then you should register wp last.

#8 in reply to: ↑ 7 @redpixelstudios
4 years ago

Replying to bobbingwide:

There is a workaround for this problem. Register the longer names first.

whether or not a shortcode containing hyphen(s) is handled depends on when it's registered.
if it's registered before the shortcode that is the same as the prefix before the '-' it's OK
it it's registed after, then the shorter shortcode will take precedence

So if you have shortcodes wp, wp-1, wp-2 and wp-3 then you should register wp last.

Thanks for the reply, bobbingwide. We will take your workaround under advisement until the issue can be resolved entirely.

#9 @kovshenin
4 years ago

  • Cc kovshenin@… added

#10 follow-up: @nacin
4 years ago

17657.2.diff looks proper. Any noticeable performance difference?

#11 @nacin
4 years ago

  • Keywords needs-unit-tests added
  • Milestone changed from Awaiting Review to 3.5

@kovshenin
4 years ago

Here's one that's broken down into separate lines like in trunk. I don't think the
makes any difference here, but since the other lines use it I thought I'd add it for consistency.

#12 in reply to: ↑ 10 ; follow-up: @kovshenin
4 years ago

Sorry, the "I don't think the makes any difference here" in my above comment should read: "I don't think the \\ makes any difference here." Somebody stripped my slashes.

Replying to nacin:

Any noticeable performance difference?

Nope.

$shortcodes = '[my] [my-shortcode] [my-shortcode-tag] [my-shortcode-tag-tag]';
$duration = array();
for ( $i = 0; $i < 10; $i++ ) {
	$start = microtime( true );
	for ( $j = 0; $j < 10000; $j++ ) {
		do_shortcode( $shortcodes );
		strip_shortcodes( $shortcodes );
	}
	$end = microtime( true );
	$duration[] = $end - $start;
}
printf( "Duration: %f with %f on average.", array_sum( $duration ), array_sum( $duration ) / count( $duration ) );
Last edited 4 years ago by kovshenin (previous) (diff)

#13 follow-up: @kovshenin
4 years ago

17657-unit-tests.2.diff includes @aaroncampbell's tests and a test for hyphenated shortcodes. Two failures with current trunk, both succeed with 17657.3.diff.

I noticed that at least three of the shortcode tests are skipped, because their function doc contains a reference to a ticket with @ticket. I don't think that's normal, is it? If I run the tests in strict mode, the ones marked with @ticket error out with PHP_Invoker_TimeoutException. I'm using PHPUnit 3.6.11.

#14 in reply to: ↑ 13 @nacin
4 years ago

  • Keywords needs-unit-tests removed

Replying to kovshenin:

17657-unit-tests.2.diff includes @aaroncampbell's tests and a test for hyphenated shortcodes. Two failures with current trunk, both succeed with 17657.3.diff.

[UT930]

I noticed that at least three of the shortcode tests are skipped, because their function doc contains a reference to a ticket with @ticket. I don't think that's normal, is it?

Yeah, that is normal. If the referenced ticket is still open, then it is considered to be unit tests for a "known bug" — as in, it would fail — so it is skipped instead. You can run these by definining WP_TESTS_FORCE_KNOWN_BUGS as true in wp-tests-config.php, or more selectively by explicitly running that group. For example:

phpunit --group shortcode,6562,14050

Also, if you run phpunit --verbose --group shortcode, it'll tell you that the tests were skipped because tickets are not fixed.

So to run the tests for this ticket, you'll want:

phpunit --group shortcode,17657

#15 @kovshenin
4 years ago

Wow sweet! Didn't know about that, thanks @nacin!

#16 in reply to: ↑ 12 ; follow-up: @nacin
4 years ago

Replying to kovshenin:

I ran this code, but it didn't have any add_shortcode() calls, nor a lot of content to get through. When I started registering a few shortcodes and gave it some real content to look through, I began to see about a 30% slowdown.

#17 in reply to: ↑ 16 @kovshenin
4 years ago

Replying to nacin:

Sorry, didn't include the add_shortcode() calls in my paste. I altered the code to include some dummy text (18k characters) and four shortcodes which were used a total of 40 times inside the dummy text. I ran ten tests with the original code, and ten tests with the patched code, but still, can't see the performance difference you're talking about. Apparently I'm doing something wrong :)

// Original total: 181.271103, average: 18.127110, min: 13.958034, max: 24.967603.
// Patched total:  182.409497, average: 18.240950, min: 13.885041, max: 25.773981.

Here's the test I ran: https://gist.github.com/3150734

#18 @bobbingwide
4 years ago

Regarding performance, there are two scenarios.

  1. When there are no shortcodes in the content
  2. Otherwise

I ran a suite of tests where the number of registered shortcodes ranged from 1 to 200
and the number of shortcodes in the content ranged from 0 to 100
The shortcodes I used were [wpn] and [wp-n], where n ranged from 1 to 100.
All of them were handled by the same function wpsc() { return "wp"; }

There was no appreciable difference between the average time for the fix and the original code. However, as the number of registered shortcodes increased so the time taken to handle the content increased.

I added this code into the start of do_shortcode()- right at the front

if ( false === strpos( $content, "[" ) )

return $content;

when there were shortcodes in the content then this only increased the execution time insignificantly. But when there were no shortcodes in the content the execution time was constant at .000020 secs per invocation vs .000080 for 0 shortcodes and .000100 for 100.
4 to 5 times faster!

So, if you're concerned about performance please ADD the above code to cater for those sites where shortcodes are hardly ever used in the content.

.php and .csv available on request.


#19 @kovshenin
4 years ago

Hey @bobbingwide, thanks for taking the time to test this! I would appreciate if you shared your test .php, thanks! :)

#20 @mdawaffe
4 years ago

Latest patch seems sane. It's backward compatible since it doesn't change the matching groups.

shortcode_unautop() probably also needs changing.

@kovshenin
4 years ago

Includes shortcode_unautop

@bobbingwide
4 years ago

shortcode performance analysis php file

@bobbingwide
4 years ago

results from running scperf.php from Windows command prompt

#21 @bobbingwide
4 years ago

Run this test on an unchanged version of shortcodes.php
either either from wp-includes or copy wp-includes/shortcodes.php to the folder where you download scperf.php. redirect output to scperf.csv to get your own results.

#22 @ryan
4 years ago

  • Keywords commit added

#23 @ryan
4 years ago

  • Owner set to ryan
  • Resolution set to fixed
  • Status changed from new to closed

In 22382:

Allow hyphens in shortcode names.

Props kovshenin, solarissmoke, aaroncampbell
fixes #17657

#24 @duck_
4 years ago

attachment:17657.update-comment.diff​ updates the inline comment from "Word boundary" to "Not followed by word character or hyphen"

#25 @duck_
4 years ago

In 22401:

Update shortcode regular expression commentary. See #17657.

#26 follow-up: @koopersmith
4 years ago

  • Keywords has-patch needs-testing commit removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

We'll also need to update the regex in wp-includes/js/shortcode.js.

#27 @koopersmith
4 years ago

  • Owner changed from ryan to koopersmith
  • Status changed from reopened to accepted

#28 in reply to: ↑ 26 ; follow-up: @SergeyBiryukov
4 years ago

Replying to koopersmith:

We'll also need to update the regex in wp-includes/js/shortcode.js.

17657.5.diff is an attempt to do that (not sure how to test it though).

#29 in reply to: ↑ 28 @koopersmith
4 years ago

Replying to SergeyBiryukov:

Replying to koopersmith:

We'll also need to update the regex in wp-includes/js/shortcode.js.

17657.5.diff is an attempt to do that (not sure how to test it though).

Looks solid. This is testable by running:

wp.shortcode.next('test', 'a [test-hyphen] shortcode');

If it returns an object it matched, if it returns undefined, it didn't.

#30 @nacin
4 years ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 22522:

Update the JS version of the shortcode regex to match [22382]. props SergeyBiryukov. fixes #17657.

#31 @hozefze
2 years ago

Hello,

If this bug is fixed, you might want to update the Codex: http://codex.wordpress.org/Shortcode_API#Hyphens

Just add a note like "Since version *.* the hyphens bug described below was fixed."

Thank you

#32 @SergeyBiryukov
2 years ago

  • Keywords needs-codex added
Note: See TracTickets for help on using tickets.