WordPress.org

Make WordPress Core

Opened 4 years ago

Closed 4 years ago

Last modified 4 years ago

#33517 closed defect (bug) (fixed)

Crash After Registering 500+ Shortcodes

Reported by: thepolishlad Owned by: miqrogroove
Milestone: 4.4 Priority: normal
Severity: normal Version:
Component: Shortcodes Keywords: roadmap needs-testing wptexturize
Focuses: Cc:
PR Number:

Description

I have a Wishlist Member Site with a massive number of levels. Things were fine until I hit around 250 levels created. The the site started having all kinds of issues, with the nav menu in wp-admin not looking right and none of the pages looking right. Restoring to a previous backup brought me down to 200 levels, but I still had the problem.

I did all sorts of troubleshooting, and narrowed it down to Wishlist being the plugin causing the site to go wonky. I dug into the error_log and found this PHP error:
[23-Aug-2015 06:28:17 UTC] PHP Warning: preg_split(): Compilation failed: regular expression is too large at offset 33167 in /home/myhomedirectory/public_html/notmyrealwebsite.com/wp-includes/formatting.php on line 255

After hours of research, I came across this thread: http://stackoverflow.com/questions/31172837/regular-expression-is-too-large-error-in-php

I am going to continue to need to create Wishlist Member Levels, and it seems that Wishlist Member creates at least 2 shortcodes for each level. Can this regex bug get fixed?

Thanks, happy to answer any questions.

Attachments (4)

33517.patch (3.3 KB) - added by miqrogroove 4 years ago.
33517.2.patch (9.4 KB) - added by miqrogroove 4 years ago.
Covers same code in wptexturize
33517.3.patch (9.5 KB) - added by miqrogroove 4 years ago.
Shortcode name pattern aligned with #34090
33517.4.patch (9.5 KB) - added by miqrogroove 4 years ago.
Shortcode pattern aligned with #34092

Download all attachments as: .zip

Change History (22)

#1 @Clorith
4 years ago

  • Resolution set to wontfix
  • Status changed from new to closed

Hi @thepolishlad,

This sounds like an under-optimized plugin doing something the hard way (I'm afraid I do not know what plugin this is, so I can't tell you exactly what it's doing that's wrong, but if you rely on 500 different shortcodes from one plugin to display content, that's just way too many).

I'd contact the plugin developer about this and have them look into it as it sounds like a very unlikely edge case for WordPress to handle.

Alternatively, if you are really stuck on this, feel free to make a thread on the forums and we'll be happy to help you troubleshoot it from there (We will need more details on what plugin it is etc still though). I'll be closing this for now, but should you come upon more details feel free to supply them and we can reassess it all.

#2 @netweb
4 years ago

  • Milestone Awaiting Review deleted

#3 @miqrogroove
4 years ago

  • Keywords roadmap added
  • Milestone set to Future Release
  • Resolution wontfix deleted
  • Status changed from closed to reopened
  • Summary changed from Wishlist Member levels create 2 shortcodes for each membership Level. 200+ levels crash wp-includes/formatting.php, killing the site. to Crash Caused by Registering 500+ Shortcodes
  • Version 4.3 deleted

Coincidentally, I just started working on a roadmap proposal that would fix this with an alternative shortcode syntax. Let's keep it open for now.

#4 @thepolishlad
4 years ago

@Clorinth,

Thanks for the feedback. The plugin is Wishlist Member (http://member.wishlistproducts.com/). The plugin allows content to be protected to be shown to only members of a specific level. It does this through page protection, but also can allow admins to protect specific content on a page to be shown two ways:

  1. Content to be shown only to the specified member level(s)
  2. Content to be shown to those NOT in the specified member level(s)

Shortcodes are used in both scenarios. I am working with the plugin developer to troubleshoot this as well. I'll likely head over to the forums to post this situation to see if anyone can help troubleshoot.

@netweb, that's great news! I'm guessing that fix will take time to get released though. But good to know it's hopefully in the works.

#5 @thepolishlad
4 years ago

  • Summary changed from Crash Caused by Registering 500+ Shortcodes to Crash Caused by Registering Shortcodes Whose Total Characters Exceed 33k

I discovered that the regex on line 255 of formatting.php that processes the shortcodes can handle around 33k characters.

In my case, I am going to rename the membership levels in Wishlist Member to greatly reduce the # of characters each level has, thereby reducing the overall amount of characters in the shortcodes generated by Wishlist Member.

I updated the Summary as well to reflect the crash being caused by the # of characters, and not the # of shortcodes registered. Feel free to change it as appropriate.

Last edited 4 years ago by thepolishlad (previous) (diff)

#6 follow-up: @miqrogroove
4 years ago

  • Summary changed from Crash Caused by Registering Shortcodes Whose Total Characters Exceed 33k to Crash After Registering 500+ Shortcodes

Let's just keep it concise. "Total characters" is a bit vague.

#7 in reply to: ↑ 6 ; follow-up: @thepolishlad
4 years ago

Replying to miqrogroove:

Let's just keep it concise. "Total characters" is a bit vague.

It's not the # of shortcodes though, it crashes when the total registered shortcodes character count is over 33k.

#8 in reply to: ↑ 7 ; follow-up: @miqrogroove
4 years ago

Replying to thepolishlad:

It's not the # of shortcodes though

This is wishful thinking. Fact is, if anyone registers hundreds of shortcodes, the system becomes extremely inefficient with alternation patterns until it finally crashes the PCRE compiler. The available solutions are to use shortcode attributes correctly (which falls on the plugin author), or to redesign the system (which falls on contributors like me).

Last edited 4 years ago by miqrogroove (previous) (diff)

#9 in reply to: ↑ 8 ; follow-up: @thepolishlad
4 years ago

Replying to miqrogroove:

Replying to thepolishlad:

It's not the # of shortcodes though

This is wishful thinking. Fact is, if anyone registers hundreds of shortcodes, the system becomes extremely inefficient with alternation patterns until it finally crashes the PCRE compiler. The available solutions are to use shortcode attributes correctly (which falls on the plugin author), or to redesign the system (which falls on contributors like me).

I see. I was making a guess (I'm not a programmer) based on what I was seeing.

Do you have any idea of the max # of shortcodes able to be registered before the PCRE compiler crashes?

#10 in reply to: ↑ 9 @miqrogroove
4 years ago

Replying to thepolishlad:

Do you have any idea of the max # of shortcodes able to be registered before the PCRE compiler crashes?

This is going to vary from server to server. When we first tested 4.0 there were some getting crashes with no plugins. What we have now is an improvement on that but still doesn't scale well.

@miqrogroove
4 years ago

#11 @miqrogroove
4 years ago

  • Keywords has-patch needs-testing added
  • Milestone changed from Future Release to 4.4

The attached solution follows recent developments in the shortcode roadmap discussions. This patch will be reviewed during the Slack meeting next week. Patch appears to work, but it will need extensive performance testing and evaluation of backtrack counts in various PHP versions. Also needs to be expanded into wptexturize().

Last edited 4 years ago by miqrogroove (previous) (diff)

#12 @miqrogroove
4 years ago

  • Keywords wptexturize added; has-patch removed

@miqrogroove
4 years ago

Covers same code in wptexturize

#13 @miqrogroove
4 years ago

  • Owner set to miqrogroove
  • Status changed from reopened to accepted

I have a home made PCRE backtrack testing script that I've used in the past. I'm thinking it would be slick to take the new expressions, plus some or all of the expressions in wptexturize() and move them into separate functions similar to get_shortcode_regex(). Then, introduce my backtrack tests as phpunit methods. It would make future PCRE performance evaluations somewhat less labor intensive.

Last edited 4 years ago by miqrogroove (previous) (diff)

@miqrogroove
4 years ago

Shortcode name pattern aligned with #34090

@miqrogroove
4 years ago

Shortcode pattern aligned with #34092

#14 @wonderboymusic
4 years ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 34747:

Shortcodes: Fix PCRE performance bugs in get_shortcode_regexp() and related to wptexturize(), do_shortcode(), and strip_shortcodes()

Alters unit tests.

Props miqrogroove.
Fixes #33517.

#15 follow-up: @gitlost
4 years ago

The $matches[1] would be better wrapped in an array_unique() to avoid duplicates. Also I think it would be nice to put this logic in a function as apart from DRY it's generally useful and would be one small step to reduce references to the global shortcodes array and to APIize shortcodes, eg

/**
 * Return registered shortcode tagnames, reduced to those in $text if given.
 */
function get_shortcode_tagnames( $text = null ) {
	global $shortcode_tags;

	if ( ! $shortcode_tags || ! is_array( $shortcode_tags ) ) {
		return array();
	}
	if ( null === $text ) {
		return array_keys( $shortcode_tags );
	}
	if ( false === strpos( $text, '[' ) || ! preg_match_all( '@\[/?\K[^<>&/\[\]\x00-\x20]++@', $text, $matches ) ) {
		return array();
	}
	return array_values( array_intersect( array_unique( $matches[0] ), array_keys( $shortcode_tags ) ) );
}


#16 @miqrogroove
4 years ago

Duplicates would occur only if I had the array_intersect args in the wrong order. I thought I was careful enough, but I could do some performance testing with array_unique if you think it's important?

#17 in reply to: ↑ 15 @miqrogroove
4 years ago

Replying to gitlost:

Also I think it would be nice to put this logic in a function as apart from DRY it's generally useful and would be one small step to reduce references to the global shortcodes array and to APIize shortcodes, eg

Just so we're on the same page, this logic is going to be removed from wptexturize() in 4.5 per the current roadmap draft. After that, there would be no core use for this function outside of shortcodes.php. Plugins wont need it because it is already part of do_shortcode().

So, is there really a strong need for this in the API? Something I didn't think about?

#18 @gitlost
4 years ago

Duplicates would occur only if I had the array_intersect args in the wrong order. I thought I was careful enough, but I could do some performance testing with array_unique if you think it's important?

No, I missed that, so performance testing would be silly.

Just so we're on the same page

Evidently not!

So, is there really a strong need for this in the API? Something I didn't think about?

Probably not, but even internally it would make sense.

Note: See TracTickets for help on using tickets.