WordPress.org

Make WordPress Core

Opened 4 years ago

Closed 2 years ago

Last modified 12 months ago

#17657 closed defect (bug) (fixed)

Shortcode regex doesn't allow for hyphens in shortcode name

Reported by: sivel Owned by: koopersmith
Milestone: 3.5 Priority: normal
Severity: normal Version: 2.9.2
Component: Shortcodes Keywords: needs-codex
Focuses: Cc:

Description

I found an issue where I wanted to use a string in a post for specifying a lightbox style "album" that looked like a shortcode.

The code I was using looked something like:

<a href="http://example.org/image.jpg" rel="lightbox[album-123]">Image</a>

Another plugin was installed that had registered a shortcode for "album". due to the '\b' in the regex the '-' is seen as the ending delimiter in the shortcode name, and the callback for 'album' was processed at this point.

I of course corrected the issue by prefixing the "album" name with a unique string, however I am thinking that we should perhaps update the regex so that some common characters often seen in strings for readability without spaces.

Plugin code to test:

add_shortcode('broken', 'broken');

function broken() {
	return 'Running the broken shortcode';
}

and then in a post:

[broken-shortcode]

Attachments (10)

17657-unit-tests.diff (2.4 KB) - added by aaroncampbell 4 years ago.
17657.diff (553 bytes) - added by solarissmoke 4 years ago.
17657.2.diff (552 bytes) - added by solarissmoke 4 years ago.
Even simpler still
17657.3.diff (754 bytes) - added by kovshenin 3 years ago.
Here's one that's broken down into separate lines like in trunk. I don't think the makes any difference here, but since the other lines use it I thought I'd add it for consistency.
17657-unit-tests.2.diff (2.1 KB) - added by kovshenin 3 years ago.
17657.4.diff (1.4 KB) - added by kovshenin 3 years ago.
Includes shortcode_unautop
scperf.php (6.7 KB) - added by bobbingwide 3 years ago.
shortcode performance analysis php file
scperf.csv (46.0 KB) - added by bobbingwide 3 years ago.
results from running scperf.php from Windows command prompt
17657.update-comment.diff (1.5 KB) - added by duck_ 2 years ago.
17657.5.diff (730 bytes) - added by SergeyBiryukov 2 years ago.

Download all attachments as: .zip

Change History (42)

comment:1 @aaroncampbell4 years ago

I verified this, and even wrote some basic unit tests. It only seems to happen with a hyphen. Using realshortcode-notreal passes '-notreal' as the first element (index 0) of the attributes array, but realshortcode_notreal and realshortcodenotreal work as expected.

Last edited 4 years ago by aaroncampbell (previous) (diff)

comment:2 @solarissmoke4 years ago

  • Keywords has-patch needs-testing added; needs-patch dev-feedback 2nd-opinion removed

The issue is in get_shortcode_regex() which uses the boundary assertion \b to check for the end of the shortcode tag. Unfortunately a hyphen counts as a word boundary. The solution seems quite simple - just replace that assertion with our own that allows hyphens - see patch.

@solarissmoke4 years ago

@solarissmoke4 years ago

Even simpler still

comment:3 @nacin4 years ago

  • Version changed from 3.2 to 3.0

Bumping the version down a bit. Probably prevalent since introduction.

comment:4 @bobbingwide4 years ago

  • Version changed from 3.0 to 2.9.2

I've just updated the codex saying this is probably a bug.

http://codex.wordpress.org/Shortcode_API#Hyphens

I don't really understand regular expressions so had no idea that the \b was the cause of the problem. But had performed tests showing that it also depended on the sequence of calls to add_shortcode.

The problem was present in version 2.9.2 as well.
When is this going to get fixed? How can I help?

comment:5 @redpixelstudios3 years ago

We are also having issues with this bug in many of our shortcodes that use hyphens. Could we push to have this fixed in 3.5? At this point it would be difficult deprecating our hyphenated shortcodes in favor of non-hyphenated ones due to the user-base of some of our plugins. Besides, it would be beneficial for all plugin authors that deal with shortcodes regularly to have the hyphen available as an applicable delimiter.

comment:6 @redpixelstudios3 years ago

  • Cc webtools@… added

comment:7 follow-up: @bobbingwide3 years ago

There is a workaround for this problem. Register the longer names first.

whether or not a shortcode containing hyphen(s) is handled depends on when it's registered.
if it's registered before the shortcode that is the same as the prefix before the '-' it's OK
it it's registed after, then the shorter shortcode will take precedence

So if you have shortcodes wp, wp-1, wp-2 and wp-3 then you should register wp last.

comment:8 in reply to: ↑ 7 @redpixelstudios3 years ago

Replying to bobbingwide:

There is a workaround for this problem. Register the longer names first.

whether or not a shortcode containing hyphen(s) is handled depends on when it's registered.
if it's registered before the shortcode that is the same as the prefix before the '-' it's OK
it it's registed after, then the shorter shortcode will take precedence

So if you have shortcodes wp, wp-1, wp-2 and wp-3 then you should register wp last.

Thanks for the reply, bobbingwide. We will take your workaround under advisement until the issue can be resolved entirely.

comment:9 @kovshenin3 years ago

  • Cc kovshenin@… added

comment:10 follow-up: @nacin3 years ago

17657.2.diff looks proper. Any noticeable performance difference?

comment:11 @nacin3 years ago

  • Keywords needs-unit-tests added
  • Milestone changed from Awaiting Review to 3.5

@kovshenin3 years ago

Here's one that's broken down into separate lines like in trunk. I don't think the
makes any difference here, but since the other lines use it I thought I'd add it for consistency.

comment:12 in reply to: ↑ 10 ; follow-up: @kovshenin3 years ago

Sorry, the "I don't think the makes any difference here" in my above comment should read: "I don't think the \\ makes any difference here." Somebody stripped my slashes.

Replying to nacin:

Any noticeable performance difference?

Nope.

$shortcodes = '[my] [my-shortcode] [my-shortcode-tag] [my-shortcode-tag-tag]';
$duration = array();
for ( $i = 0; $i < 10; $i++ ) {
	$start = microtime( true );
	for ( $j = 0; $j < 10000; $j++ ) {
		do_shortcode( $shortcodes );
		strip_shortcodes( $shortcodes );
	}
	$end = microtime( true );
	$duration[] = $end - $start;
}
printf( "Duration: %f with %f on average.", array_sum( $duration ), array_sum( $duration ) / count( $duration ) );
Last edited 3 years ago by kovshenin (previous) (diff)

comment:13 follow-up: @kovshenin3 years ago

17657-unit-tests.2.diff includes @aaroncampbell's tests and a test for hyphenated shortcodes. Two failures with current trunk, both succeed with 17657.3.diff.

I noticed that at least three of the shortcode tests are skipped, because their function doc contains a reference to a ticket with @ticket. I don't think that's normal, is it? If I run the tests in strict mode, the ones marked with @ticket error out with PHP_Invoker_TimeoutException. I'm using PHPUnit 3.6.11.

comment:14 in reply to: ↑ 13 @nacin3 years ago

  • Keywords needs-unit-tests removed

Replying to kovshenin:

17657-unit-tests.2.diff includes @aaroncampbell's tests and a test for hyphenated shortcodes. Two failures with current trunk, both succeed with 17657.3.diff.

[UT930]

I noticed that at least three of the shortcode tests are skipped, because their function doc contains a reference to a ticket with @ticket. I don't think that's normal, is it?

Yeah, that is normal. If the referenced ticket is still open, then it is considered to be unit tests for a "known bug" — as in, it would fail — so it is skipped instead. You can run these by definining WP_TESTS_FORCE_KNOWN_BUGS as true in wp-tests-config.php, or more selectively by explicitly running that group. For example:

phpunit --group shortcode,6562,14050

Also, if you run phpunit --verbose --group shortcode, it'll tell you that the tests were skipped because tickets are not fixed.

So to run the tests for this ticket, you'll want:

phpunit --group shortcode,17657

comment:15 @kovshenin3 years ago

Wow sweet! Didn't know about that, thanks @nacin!

comment:16 in reply to: ↑ 12 ; follow-up: @nacin3 years ago

Replying to kovshenin:

I ran this code, but it didn't have any add_shortcode() calls, nor a lot of content to get through. When I started registering a few shortcodes and gave it some real content to look through, I began to see about a 30% slowdown.

comment:17 in reply to: ↑ 16 @kovshenin3 years ago

Replying to nacin:

Sorry, didn't include the add_shortcode() calls in my paste. I altered the code to include some dummy text (18k characters) and four shortcodes which were used a total of 40 times inside the dummy text. I ran ten tests with the original code, and ten tests with the patched code, but still, can't see the performance difference you're talking about. Apparently I'm doing something wrong :)

// Original total: 181.271103, average: 18.127110, min: 13.958034, max: 24.967603.
// Patched total:  182.409497, average: 18.240950, min: 13.885041, max: 25.773981.

Here's the test I ran: https://gist.github.com/3150734

comment:18 @bobbingwide3 years ago

Regarding performance, there are two scenarios.

  1. When there are no shortcodes in the content
  2. Otherwise

I ran a suite of tests where the number of registered shortcodes ranged from 1 to 200
and the number of shortcodes in the content ranged from 0 to 100
The shortcodes I used were [wpn] and [wp-n], where n ranged from 1 to 100.
All of them were handled by the same function wpsc() { return "wp"; }

There was no appreciable difference between the average time for the fix and the original code. However, as the number of registered shortcodes increased so the time taken to handle the content increased.

I added this code into the start of do_shortcode()- right at the front

if ( false === strpos( $content, "[" ) )

return $content;

when there were shortcodes in the content then this only increased the execution time insignificantly. But when there were no shortcodes in the content the execution time was constant at .000020 secs per invocation vs .000080 for 0 shortcodes and .000100 for 100.
4 to 5 times faster!

So, if you're concerned about performance please ADD the above code to cater for those sites where shortcodes are hardly ever used in the content.

.php and .csv available on request.


comment:19 @kovshenin3 years ago

Hey @bobbingwide, thanks for taking the time to test this! I would appreciate if you shared your test .php, thanks! :)

comment:20 @mdawaffe3 years ago

Latest patch seems sane. It's backward compatible since it doesn't change the matching groups.

shortcode_unautop() probably also needs changing.

@kovshenin3 years ago

Includes shortcode_unautop

@bobbingwide3 years ago

shortcode performance analysis php file

@bobbingwide3 years ago

results from running scperf.php from Windows command prompt

comment:21 @bobbingwide3 years ago

Run this test on an unchanged version of shortcodes.php
either either from wp-includes or copy wp-includes/shortcodes.php to the folder where you download scperf.php. redirect output to scperf.csv to get your own results.

comment:22 @ryan2 years ago

  • Keywords commit added

comment:23 @ryan2 years ago

  • Owner set to ryan
  • Resolution set to fixed
  • Status changed from new to closed

In 22382:

Allow hyphens in shortcode names.

Props kovshenin, solarissmoke, aaroncampbell
fixes #17657

@duck_2 years ago

comment:24 @duck_2 years ago

attachment:17657.update-comment.diff​ updates the inline comment from "Word boundary" to "Not followed by word character or hyphen"

comment:25 @duck_2 years ago

In 22401:

Update shortcode regular expression commentary. See #17657.

comment:26 follow-up: @koopersmith2 years ago

  • Keywords has-patch needs-testing commit removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

We'll also need to update the regex in wp-includes/js/shortcode.js.

comment:27 @koopersmith2 years ago

  • Owner changed from ryan to koopersmith
  • Status changed from reopened to accepted

@SergeyBiryukov2 years ago

comment:28 in reply to: ↑ 26 ; follow-up: @SergeyBiryukov2 years ago

Replying to koopersmith:

We'll also need to update the regex in wp-includes/js/shortcode.js.

17657.5.diff is an attempt to do that (not sure how to test it though).

comment:29 in reply to: ↑ 28 @koopersmith2 years ago

Replying to SergeyBiryukov:

Replying to koopersmith:

We'll also need to update the regex in wp-includes/js/shortcode.js.

17657.5.diff is an attempt to do that (not sure how to test it though).

Looks solid. This is testable by running:

wp.shortcode.next('test', 'a [test-hyphen] shortcode');

If it returns an object it matched, if it returns undefined, it didn't.

comment:30 @nacin2 years ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 22522:

Update the JS version of the shortcode regex to match [22382]. props SergeyBiryukov. fixes #17657.

comment:31 @hozefze12 months ago

Hello,

If this bug is fixed, you might want to update the Codex: http://codex.wordpress.org/Shortcode_API#Hyphens

Just add a note like "Since version *.* the hyphens bug described below was fixed."

Thank you

comment:32 @SergeyBiryukov12 months ago

  • Keywords needs-codex added
Note: See TracTickets for help on using tickets.