Make WordPress Core

Opened 16 years ago

Closed 13 years ago

#8553 closed defect (bug) (fixed)

preg_replace_callback in do_shortcode returns empty for large posts

Reported by: AaronCampbell Owned by:
Milestone: 3.3 Priority: high
Severity: normal Version:
Component: Shortcodes Keywords: has-patch needs-testing
Focuses: Cc:

Description

This is definitely related to #6877, though I don't know that I'd call it a dupe. Anyway, the problem is that on long posts the pcre.backtrack_limit is exceeded. I have a post that is 105k+ characters and I couldn't process even a single shortcode.

Setting the pcre.backtrack_limit to 1,000,000 worked (ini_set('pcre.backtrack_limit', 1000000);), but the default (100,000) should definitely work.

Here are some thoughts to kick around:

  • I wonder if it would help to break the post into chunks first: process from the first [ to the last ] for shortcodes then re-add the start and end. In many cases this would reduce the amount of text being processed.
  • Alternatively, maybe some of them could be handled with str_replace? (if [shortcode] exists and [/shortcode] doesn't). I know that this would only work for shortcodes with no attributes or content, but if that's a common usage (in the specific case I was working with, all the shortcodes were like this).
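
The first idea above can be sketched roughly as follows. This is only an illustration: `split_around_brackets()` is a hypothetical helper, not a WordPress function, and it assumes shortcodes can only live between the first `[` and the last `]` of the post.

```php
<?php
// Hypothetical helper (not part of WordPress): isolate the only span that
// can contain shortcodes, i.e. from the first "[" to the last "]".
function split_around_brackets(string $text): array {
    $first = strpos($text, '[');
    $last  = strrpos($text, ']');
    if ($first === false || $last === false || $last < $first) {
        // No complete bracket pair: nothing for the shortcode regex to do.
        return [$text, '', ''];
    }
    return [
        (string) substr($text, 0, $first),                  // untouched prefix
        (string) substr($text, $first, $last - $first + 1), // candidate span
        (string) substr($text, $last + 1),                  // untouched suffix
    ];
}

[$before, $middle, $after] = split_around_brackets('intro text [gallery id="1"] closing text');

// Only $middle would be handed to the shortcode regex; the rest is re-added as-is.
echo $before, '|', $middle, '|', $after, "\n";
```

This only helps when the brackets sit close together; a post with one shortcode at the top and one at the bottom would still be processed in full.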

Anyway, the database allows 4,294,967,296 characters for a post, but I run against this problem pretty regularly on posts over 100,000.

Attachments (6)

8553.diff (1.1 KB) - added by Denis-de-Bernardy 16 years ago.
8553.2.diff (494 bytes) - added by Denis-de-Bernardy 15 years ago.
8553.3.diff (164 bytes) - added by Brusdeylins 15 years ago.
working fix
example-post.txt (39.4 KB) - added by Brusdeylins 15 years ago.
a long article from brusdeylins.info for fix in 8553.3.diff
long-post.zip (16.8 KB) - added by xibe 15 years ago.
xibe's long post
long-post.txt (100.6 KB) - added by aaroncampbell 13 years ago.
Long post content with gallery and caption shortcodes

Change History (71)

#1 @DD32
16 years ago

Setting the pcre.backtrack_limit to 1,000,000 worked (ini_set('pcre.backtrack_limit', 1000000);)

(Just for anyone else who comes across this ticket: that requires PHP 5.2+.)

#2 @AaronCampbell
16 years ago

Good point. Also, I'm noticing that a lot of my tickets have been commented on and I never receive E-Mails. What happened to the notification?

#3 @mrmist
16 years ago

Some sample text for reproduction if required. (Fixed by the workaround). Looking at this you can see when it's going to happen because the word count stays at 0.

#4 @Denis-de-Bernardy
16 years ago

this backtrack stuff is soooo annoying. not to mention the complete lack of error messages that comes with it. it really makes one wonder why they added pcre into php. I mean, php5.2 renders pcre completely useless except for the most basic patterns.

In case it helps, I ended up fixing this on my end by doing less work in each regexp. That is, instead of matching exactly what I needed, and catching everything I needed, all in one go, I ended up using multiple calls to preg_match, preg_replace, and preg_replace_callback...

and yes, that's another way of saying gobbling more php resources to work around a limit on resources imposed by php. :D
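
On the lack of error messages: the preg_* functions fail silently, but `preg_last_error()` (available since PHP 5.2) reports why. The sketch below is a demonstration under assumed settings, not WordPress code: the pattern is a simplified stand-in with a single hypothetical tag `foo`, the limit is set to the old default of 100000, and the PCRE JIT is switched off so that the interpreter's limit applies.

```php
<?php
// Demonstration settings, chosen to mirror the ticket: interpreter-only
// matching and the old default limit of 100000.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '100000');

// Simplified stand-in for the shortcode pattern (hypothetical tag "foo").
$pattern = '/\[(foo)\b(.*?)\](?:(.+?)\[\/\1\])?/s';

// An unclosed shortcode followed by a long body.
$content = '[foo]' . str_repeat('a', 150000);

$result = preg_replace_callback($pattern, function ($m) {
    return '';
}, $content);

// NULL (not an empty string) signals an error; preg_last_error() says which.
$failed = ($result === null);
$error  = preg_last_error();

if ($failed) {
    $reason = ($error === PREG_BACKTRACK_LIMIT_ERROR)
        ? 'backtrack limit exceeded'
        : 'other PCRE error';
    echo "preg_replace_callback failed: $reason\n";
    $result = $content; // fall back to the unprocessed content
}
```

Falling back to the original content instead of returning the NULL result would at least avoid the blank post, at the cost of leaving shortcodes unexpanded.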

#5 @Denis-de-Bernardy
16 years ago

"fixing this on my end" was for my plugins, of course.

#6 @Denis-de-Bernardy
16 years ago

  • Keywords has-patch needs-testing added; needs-patch removed

Please try the attached fix. It splits shortcode workflow in two steps -- first fetch shortcodes, then match parameters.

#7 @mrmist
16 years ago

I applied the patch but it did not fix the issue for me (my sample text attached above still appears as a blank post.)

#8 @mrmist
16 years ago

Edit - I should say it's in the post editor where it appears blank (after pressing save / publish.)

#9 @tmcookies
16 years ago

is it perhaps a problem of mysql's longtext-implementation? I have the same problem here. When i write a long article i can't really save it. I've tried copying the whole text into the mysql-table directly and didn't work either!

#10 follow-up: @Denis-de-Bernardy
16 years ago

odd... when I tried using your sample text, it worked fine, even with the multiple shortcodes I inserted into it.

for the save that's not working, it's likely another bug. could it be a plugin you've installed or something?

#11 in reply to: ↑ 10 @mrmist
16 years ago

Replying to Denis-de-Bernardy:

for the save that's not working, it's likely another bug. could it be a plugin you've installed or something?

I was testing against trunk from the svn with the patch applied, so the install doesn't have any active plugins.

Just to clarify when you were able to save the sample text successfully was that with or without an increased pcre.backtrack_limit?

#12 follow-up: @Denis-de-Bernardy
16 years ago

With a higher pcre backtrack limit. Further investigation reveals the issue with your test text comes from wpautop.

#13 @Denis-de-Bernardy
16 years ago

there is a bug opened for wpautop: #6877

#14 in reply to: ↑ 12 @mrmist
16 years ago

Replying to Denis-de-Bernardy:

With a higher pcre backtrack limit. Further investigation reveals the issue with your test text comes from wpautop.

OK. Keep in mind though that if I increase the limit then my sample text goes through fine, without any patch. It's the default limit that poses a problem.

#15 follow-up: @Denis-de-Bernardy
16 years ago

yaya, but please try the patch in #6877 - I already requested that it be added to 2.7.1, but I (sadly) have no commit rights over here. :-)

#16 in reply to: ↑ 15 @mrmist
16 years ago

Replying to Denis-de-Bernardy:

yaya, but please try the patch in #6877 - I already requested that it be added to 2.7.1, but I (sadly) have no commit rights over here. :-)

I'm not sure which patch you mean. The patch attached to #6877 was committed way back in the 2.5 release, and still seems to be present in 10404. (So in other words doesn't fix the sample text.)

#17 follow-up: @Denis-de-Bernardy
16 years ago

this one:

http://trac.wordpress.org/attachment/ticket/6877/6877.2.diff

It was committed only a few weeks ago, but only into wp 2.8 -- not wp 2.7.x

#18 @mrmist
16 years ago

My local test follows trunk so yeah, it has that patch. It's not a fix for the sample text above.

I feel I should point out that I don't have this issue myself, the sample text was from an old thread on the support forums, but it serves to illustrate.

#19 in reply to: ↑ 17 @azaozz
16 years ago

Replying to Denis-de-Bernardy:

this one:

http://trac.wordpress.org/attachment/ticket/6877/6877.2.diff

It was committed only a few weeks ago, but only into wp 2.8 -- not wp 2.7.x

Actually it was committed about 3 months ago to 2.7 (then trunk) [9255].

#20 follow-up: @tmcookies
16 years ago

guys, could those of you for which the text doesn't work perhaps try inserting the text directly using phpmyadmin? For me even that doesn't work but if i upload a file with the content embedded in a sql-insert-statement and "mysql -e file" then it works. Could it be a $_POST-thing?

#21 @Denis-de-Bernardy
16 years ago

definitely not. see my comment on #6877 on the line that makes wpautop clunk on this piece of text.

#22 in reply to: ↑ 20 @mrmist
16 years ago

Replying to tmcookies:

guys, could those of you for which the text doesn't work perhaps try inserting the text directly using phpmyadmin?

Done. Inserted fine through phpmyadmin.

Still blank when viewed from edit post.

#23 @tmcookies
16 years ago

sorry. I just found out what the problem in my case was. For anyone else having the same problem as i have: i had the php-module suhosin enabled. This has some "max"-options enabled (mainly suhosin.request.max_value_length and suhosin.post.max_value_length). Those limit the size of the values php gets from $_POST and $_GET. So, either disable suhosin or change these values.

#24 @azaozz
16 years ago

(In [10527]) Reduce backtracking in autop, fixes #6877, see #8553

#25 @xibe
15 years ago

  • Cc xavier@… added

Sorry to burst any potential bubble here, but as far as I can tell it is still not fixed in the current trunk.

I told the long story on the support forums, but I'll make it short here: using 2.7, I could only get my loooong post to work by installing the Text Control plugin and setting it to nl2br rather than wpautop. I tested the same text (50 KB) in a trunk install, and it would fail all the same.

I'd be happy to do more tests if I can help this get fixed proper.

If I'm wrong anywhere in my assumptions, please let me know. Also, if this would be a better fit in a re-opened #6877, please tell me, and I'll oblige.

Thank you.

#26 @hakre
15 years ago

an idea on this one: shortcodes aren't that complex, it should be possible to write a parser that does it with php functions (instead of regex) and therefore shouldn't run into such problems.

can be used transparent through the current api.

the current docblocks in there are a bit unclear whether or not it is possible to put shortcodes into shortcodes. http://codex.wordpress.org/Shortcode_API states yes, but they won't be handled automatically. http://svn.automattic.com/wordpress-tests/wp-testcase/test_shortcode.php states that the same shortcode can enclose itself but leaves open where the closing code is.
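
To make the idea of a regex-free parser concrete, here is a very small sketch that only locates registered opening tags with plain string functions. `find_shortcode_tags()` is a hypothetical name; attribute parsing, closing tags and `[[escaped]]` shortcodes are deliberately left out.

```php
<?php
// Hypothetical scanner (not WordPress code): locate "[tag ...]" occurrences
// for a list of registered tag names using only string functions.
function find_shortcode_tags(string $text, array $tagnames): array {
    $found  = [];
    $offset = 0;
    while (($open = strpos($text, '[', $offset)) !== false) {
        $close = strpos($text, ']', $open + 1);
        if ($close === false) {
            break; // no closing bracket left in the text
        }
        $inner = substr($text, $open + 1, $close - $open - 1);
        // The tag name is the leading run of characters before whitespace or "/".
        $name = substr($inner, 0, strcspn($inner, " \t\r\n/"));
        if (in_array($name, $tagnames, true)) {
            $found[] = ['tag' => $name, 'start' => $open, 'end' => $close];
            $offset  = $close + 1;
        } else {
            $offset = $open + 1; // not a shortcode: resume after this "["
        }
    }
    return $found;
}

$found = find_shortcode_tags('a [foo x=1] b [bar] c [nope] [foo/]', ['foo', 'bar']);
print_r(array_column($found, 'tag'));
```

Each character is visited a bounded number of times per bracket, so there is no backtrack limit to run into, although a full replacement would still have to pair opening and closing tags.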

#27 @Denis-de-Bernardy
15 years ago

Could anyone with looong posts try the attached patch?

#28 @xibe
15 years ago

  • Keywords needs-patch added; has-patch needs-testing removed

I tested the patch on my trunk install, using the aforementioned long post of mine: it's not working.

One thing I don't remember noticing before, is that the post gets blank if saved in Visual mode - saving in HTML mode still leaves the post intact, it's just that online version that's blank.

---

Testing with a huge, generated Lorem Ipsum, the post does get properly published (2.8-bleeding w/ patch AND 2.7.2). But it fails if you add just one [caption]-ed image...

#29 @Denis-de-Bernardy
15 years ago

  • Component changed from General to Shortcodes
  • Milestone changed from 2.7.2 to Future Release
  • Owner anonymous deleted

patch needs to be refreshed. moving to future.

#31 @Denis-de-Bernardy
15 years ago

  • Milestone changed from Future Release to 2.9

#32 @Brusdeylins
15 years ago

  • Cc Brusdeylins added
  • Priority changed from normal to high

Hi,

the resolution is described on my website (in German):
http://www.brusdeylins.info/wordpress/probleme-mit-shortcodes/

The problem is in shortcodes.php in the function get_shortcode_regex().
Here the RegEx has a non-greedy part between the brackets, which should exclude the closing bracket. This would reduce the memory usage of the backtracking process in the RegEx-Engine (the reason of this problem).

My solution for the last line of this function looks like this (WordPress 2.8):
return '(.?)\[('.$tagregexp.')\b([^\]]*?)(\/)?\](?:(.+?)\[\/\2\])?(.?)';

Here I also removed the non-capturing brackets around the capturing brackets around the slash after the non-greedy area... I don't know why they were there...

#33 @Brusdeylins
15 years ago

Here my modified function in shortcodes.php:

function get_shortcode_regex() {
	global $shortcode_tags;
	$tagnames = array_keys($shortcode_tags);
	$tagregexp = join( '|', array_map('preg_quote', $tagnames) );

	//return '(.?)\[('.$tagregexp.')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)';
	  return '(.?)\[('.$tagregexp.')\b([^\]]*?)(\/)?\](?:(.+?)\[\/\2\])?(.?)';
	
}

#34 @Denis-de-Bernardy
15 years ago

@Brusdeylins - please try the alternative patch over at #9264

#35 @Brusdeylins
15 years ago

Hi,

Bug 9264 is NOT THE SOLUTION! (Of course, it is another topic...)
I tested it on my long article "Yahoo Finance API"
http://www.brusdeylins.info/projects/yahoo-finance-api/
(This article contains source codes with square brackets in it, you have to face this!)

In the attachment in 9264 you are still running over the first closing bracket of the first shortcode tag: (.*?)
This doesn't make sense if your shortcode names should not contain any brackets... So replace the (.*?) as described above...

About 9264:
the (\/)? allows you to use shortcode tags like [NAME/] at the beginning. And with (?:(.+?)\[\/\2\])? the closing tag is optional. So you can use tag combinations like:

[x]...[/x] 
[x/]...[/x] 
[x]
[x/]

#36 @Denis-de-Bernardy
15 years ago

Adding a /x param to make things more readable... The current one is:

"/
(.?)                # optionally catch an extra bracket used to escape shortcode
\[($tagregexp)\b    # begin shortcode
  (.*?)             # optional attributes (non-greedy, stops on close bracket)
  (?:(\/))?         # optional slash to indicate a closing tag, not always set
  \]                # closing bracket
(?:                 # optional content and closing shortcode
  (.+?)             # the content
  \[\/\2\]          # close shortcode with a back reference
)?
(.?)                # optionally catch an extra bracket used to escape shortcode
/x"

you're suggesting:

"/
(.?)                # optionally catch an extra bracket used to escape shortcode
\[($tagregexp)\b    # begin shortcode
  ([^\]]*?)         # optional attributes (non-greedy), equivalent to using a dot
  (\/)?             # optional slash to indicate a closing tag, always set
  \]                # closing bracket
(?:                 # optional content and closing shortcode
  (.+?)             # the content
  \[\/\2\]          # close shortcode with a back reference
)?
(.?)                # optionally catch an extra bracket used to escape shortcode
/x"

the suggestion in #9264 is the following:

"/
(.?)                # optionally catch an extra bracket used to escape shortcode
\[($tagregexp)\b    # begin shortcode
  (.*?)             # optional attributes (non-greedy, stops on close bracket)
  (?:(\/))?         # optional slash to indicate a closing tag, not always set
  \]                # closing bracket
(?(4)               # stop if we already have 4 back references
                    # it would mean we've a slash bracket
  |                 # else
  (?:               # optional content and closing shortcode
    (.+?)           # the content
    \[\/\2\]        # close shortcode with a back reference
  )?
)?
(.?)                # optionally catch an extra bracket used to escape shortcode
/x"

A few notes:

  • Using your patch would remove the possibility for the optimization in #9264 (since we'd always have a fourth back reference).
  • Your attribute-related regex might be a bit more efficient from a performance standpoint.

Personally, I was hoping that the optimization in the other ticket would allow you to do [foo/] in order to optimize your way through the mess.

All of these patches, however, leave a few fundamental issues behind:

  1. Something like [foo[bar] gets treated as a foo shortcode with [bar as an attribute
  2. We can get brackets in a shortcode's content, which can potentially lead to nested shortcodes:
[foo]bar[][/foo]
[foo][bar/][/foo]
  3. A non-closed shortcode can gobble up an entire document before the regex bails and decides that it has no closing shortcode:
[foo]
long document that runs into the backtrack limit starts here

It's the two last points which I believe are troubling. I take it that some users actually want brackets in shortcodes, so fixing point 2 probably isn't an option. Bailing after, say, 1000 characters after the start of the content might be a good bet.

Would you mind attaching your post's source in a text file, exactly as it is in the WP editor, and specify the plugins you're using?
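
The "bail after about 1000 characters" idea mentioned above could look roughly like this. It is a sketch under assumptions: a single hypothetical tag `foo`, a simplified pattern, and the cap expressed as a bounded quantifier.

```php
<?php
// Hypothetical tag "foo"; the cap of 1000 characters is the figure floated above.
// The bounded quantifier {1,1000}? stops the content scan after 1000 characters.
$pattern = '/\[(foo)\b([^\]]*)\](?:(.{1,1000}?)\[\/\1\])?/s';

preg_match($pattern, '[foo]short body[/foo]', $near);
preg_match($pattern, '[foo]' . str_repeat('x', 2000) . '[/foo]', $far);

var_dump($near[3] ?? null); // content found within reach of the opening tag
var_dump($far[3] ?? null);  // closing tag is more than 1000 characters away
```

The trade-off is visible in the second call: a legitimately long enclosed block is no longer recognised as content, and the opening tag is treated as a standalone shortcode.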

@Brusdeylins
15 years ago

working fix

@Brusdeylins
15 years ago

a long article from brusdeylins.info for fix in 8553.3.diff

#37 @Brusdeylins
15 years ago

OK, in my last post I wrote "This doesn't make sense if your shortcode names should not contain any brackets". Here I meant the parameters of the shortcodes, not the names themselves. Sorry… I want to explain how I understand the described regular expression.

--- The double brackets ---

First the double brackets (one non-capturing and one capturing):

(?:(\/))?         # optional slash to indicate a closing tag, not always set

My suggestion: you don't need the outer non-capturing brackets, because they don't deactivate the capturing brackets inside and they don't capture more than the brackets inside. They only produce more backtracking! The meaning of this part is: the result array holds the slash or "empty" at position 4, but always has the same number of array elements! And the meaning of "(\/)?" is the same, because the question mark is outside of both brackets…

Here is a code example with double brackets:

$TXT = "abcdefg";
$pattern = '/.*(?:(cd)).*/';
preg_match_all($pattern, $TXT, $array);
echo '<pre>', print_r($array, true), '</pre>';

And the result:

Array
(
    [0] => Array
        (
            [0] => abcdefg
        )

    [1] => Array
        (
            [0] => cd
        )
)


If you run the same program with the following regex:

$pattern = '/.*(cd).*/';

you get exact the same result:

Array
(
    [0] => Array
        (
            [0] => abcdefg
        )

    [1] => Array
        (
            [0] => cd
        )

)

So I think you can reduce this part of the regular expression. Or do you have another example where you get different results?
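
The same comparison can be run on the slash group itself. This is a reduced sketch with a hypothetical tag `x`, not the full shortcode pattern, and it only checks that the two forms return identical match arrays; it says nothing about how much backtracking each one causes.

```php
<?php
$wrapped = '/\[x(?:(\/))?\]/'; // capturing group inside a non-capturing one
$plain   = '/\[x(\/)?\]/';     // capturing group only

$same = [];
foreach (['[x]', '[x/]'] as $subject) {
    preg_match($wrapped, $subject, $a);
    preg_match($plain, $subject, $b);
    // Record whether both patterns produced the same $matches array.
    $same[$subject] = ($a === $b);
}

var_dump($same);
```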

--- The memory problem with non-greedy quantifiers ---

(.*?)             # optional attributes (non-greedy, stops on close bracket)

Here I don't think that my solution is equivalent to "using a dot". Not if you have limits in memory (and time)! And here is the problem. PHP uses a "traditional NFA RegEx-Engine". This means (in case of non-greedy quantifiers) that the engine likes "save states" and backtracking… if we use non-greedy quantifiers, we have a LIFO process… (This is what I learned years ago… maybe this means: trying all - returning the shortest one - if the memory can hold all temp. results?)

Try it. Replace the line like described in 8553.3.diff and you will see, that the example post will appear. With this fix, you don’t resolve your problem No. 3, as you wrote (there is still the pattern (.+?) in the expression)! I am using the WordPress plug-in “NextGen Gallery”. This plug-in uses shortcode tags like “[singlepic id=37 w=150 h=400 float=right]”. Here we don’t have (self) closing tags…

I attached the diff and the HTML-Code of the article I wrote. Happy debugging :-)

#38 @Denis-de-Bernardy
15 years ago

some more testdata:

[foo attr]bar[/foo]
[foo attr]bar[/foo][foo attr]bar[/foo]
[foo attr]bar[/foo][foo attr/]bar[/foo]
[foo attr]bar[/foo][foo attr/]bar
[foo attr]bar[/foo][foo attr]bar
[foo attr]bar[/foo][[foo attr]bar[/foo]
[foo attr]bar[/foo][[foo attr/]bar[/foo]
[foo attr]bar[/foo][[foo attr/]bar
[foo attr]bar[/foo][[foo attr]bar
[foo attr]bar[/foo][foo attr]bar[/foo]]
[foo attr]bar[/foo][foo attr/]]bar[/foo]
[foo attr]bar[/foo][foo attr/]bar]
[foo attr]bar[/foo][foo attr]bar]
[foo attr]bar[/foo][[foo attr]bar[/foo]]
[foo attr]bar[/foo][[foo attr/]]bar[/foo]
[foo attr]bar[/foo][[foo attr/]]bar
[foo attr]bar[/foo][[foo attr]]bar
[foo attr/]bar[/foo]
[foo attr/]bar[/foo][foo attr]bar[/foo]
[foo attr/]bar[/foo][foo attr/]bar[/foo]
[foo attr/]bar[/foo][foo attr/]bar
[foo attr/]bar[/foo][foo attr]bar
[foo attr/]bar[/foo][[foo attr]bar[/foo]
[foo attr/]bar[/foo][[foo attr/]bar[/foo]
[foo attr/]bar[/foo][[foo attr/]bar
[foo attr/]bar[/foo][[foo attr]bar
[foo attr/]bar[/foo][foo attr]bar[/foo]]
[foo attr/]bar[/foo][foo attr/]]bar[/foo]
[foo attr/]bar[/foo][foo attr/]bar]
[foo attr/]bar[/foo][foo attr]bar]
[foo attr/]bar[/foo][[foo attr]bar[/foo]]
[foo attr/]bar[/foo][[foo attr/]]bar[/foo]
[foo attr/]bar[/foo][[foo attr/]]bar
[foo attr/]bar[/foo][[foo attr]]bar
[foo attr/]bar
[foo attr/]bar[foo attr]bar[/foo]
[foo attr/]bar[foo attr/]bar[/foo]
[foo attr/]bar[foo attr/]bar
[foo attr/]bar[foo attr]bar
[foo attr/]bar[[foo attr]bar[/foo]
[foo attr/]bar[[foo attr/]bar[/foo]
[foo attr/]bar[[foo attr/]bar
[foo attr/]bar[[foo attr]bar
[foo attr/]bar[foo attr]bar[/foo]]
[foo attr/]bar[foo attr/]]bar[/foo]
[foo attr/]bar[foo attr/]bar]
[foo attr/]bar[foo attr]bar]
[foo attr/]bar[[foo attr]bar[/foo]]
[foo attr/]bar[[foo attr/]]bar[/foo]
[foo attr/]bar[[foo attr/]]bar
[foo attr/]bar[[foo attr]]bar
[foo attr]bar
[foo attr]bar[foo attr]bar[/foo]
[foo attr]bar[foo attr/]bar[/foo]
[foo attr]bar[foo attr/]bar
[foo attr]bar[foo attr]bar
[foo attr]bar[[foo attr]bar[/foo]
[foo attr]bar[[foo attr/]bar[/foo]
[foo attr]bar[[foo attr/]bar
[foo attr]bar[[foo attr]bar
[foo attr]bar[foo attr]bar[/foo]]
[foo attr]bar[foo attr/]]bar[/foo]
[foo attr]bar[foo attr/]bar]
[foo attr]bar[foo attr]bar]
[foo attr]bar[[foo attr]bar[/foo]]
[foo attr]bar[[foo attr/]]bar[/foo]
[foo attr]bar[[foo attr/]]bar
[foo attr]bar[[foo attr]]bar
[[foo attr]bar[/foo]
[[foo attr]bar[/foo][foo attr]bar[/foo]
[[foo attr]bar[/foo][foo attr/]bar[/foo]
[[foo attr]bar[/foo][foo attr/]bar
[[foo attr]bar[/foo][foo attr]bar
[[foo attr]bar[/foo][[foo attr]bar[/foo]
[[foo attr]bar[/foo][[foo attr/]bar[/foo]
[[foo attr]bar[/foo][[foo attr/]bar
[[foo attr]bar[/foo][[foo attr]bar
[[foo attr]bar[/foo][foo attr]bar[/foo]]
[[foo attr]bar[/foo][foo attr/]]bar[/foo]
[[foo attr]bar[/foo][foo attr/]bar]
[[foo attr]bar[/foo][foo attr]bar]
[[foo attr]bar[/foo][[foo attr]bar[/foo]]
[[foo attr]bar[/foo][[foo attr/]]bar[/foo]
[[foo attr]bar[/foo][[foo attr/]]bar
[[foo attr]bar[/foo][[foo attr]]bar
[[foo attr/]bar[/foo]
[[foo attr/]bar[/foo][foo attr]bar[/foo]
[[foo attr/]bar[/foo][foo attr/]bar[/foo]
[[foo attr/]bar[/foo][foo attr/]bar
[[foo attr/]bar[/foo][foo attr]bar
[[foo attr/]bar[/foo][[foo attr]bar[/foo]
[[foo attr/]bar[/foo][[foo attr/]bar[/foo]
[[foo attr/]bar[/foo][[foo attr/]bar
[[foo attr/]bar[/foo][[foo attr]bar
[[foo attr/]bar[/foo][foo attr]bar[/foo]]
[[foo attr/]bar[/foo][foo attr/]]bar[/foo]
[[foo attr/]bar[/foo][foo attr/]bar]
[[foo attr/]bar[/foo][foo attr]bar]
[[foo attr/]bar[/foo][[foo attr]bar[/foo]]
[[foo attr/]bar[/foo][[foo attr/]]bar[/foo]
[[foo attr/]bar[/foo][[foo attr/]]bar
[[foo attr/]bar[/foo][[foo attr]]bar
[[foo attr/]bar
[[foo attr/]bar[foo attr]bar[/foo]
[[foo attr/]bar[foo attr/]bar[/foo]
[[foo attr/]bar[foo attr/]bar
[[foo attr/]bar[foo attr]bar
[[foo attr/]bar[[foo attr]bar[/foo]
[[foo attr/]bar[[foo attr/]bar[/foo]
[[foo attr/]bar[[foo attr/]bar
[[foo attr/]bar[[foo attr]bar
[[foo attr/]bar[foo attr]bar[/foo]]
[[foo attr/]bar[foo attr/]]bar[/foo]
[[foo attr/]bar[foo attr/]bar]
[[foo attr/]bar[foo attr]bar]
[[foo attr/]bar[[foo attr]bar[/foo]]
[[foo attr/]bar[[foo attr/]]bar[/foo]
[[foo attr/]bar[[foo attr/]]bar
[[foo attr/]bar[[foo attr]]bar
[[foo attr]bar
[[foo attr]bar[foo attr]bar[/foo]
[[foo attr]bar[foo attr/]bar[/foo]
[[foo attr]bar[foo attr/]bar
[[foo attr]bar[foo attr]bar
[[foo attr]bar[[foo attr]bar[/foo]
[[foo attr]bar[[foo attr/]bar[/foo]
[[foo attr]bar[[foo attr/]bar
[[foo attr]bar[[foo attr]bar
[[foo attr]bar[foo attr]bar[/foo]]
[[foo attr]bar[foo attr/]]bar[/foo]
[[foo attr]bar[foo attr/]bar]
[[foo attr]bar[foo attr]bar]
[[foo attr]bar[[foo attr]bar[/foo]]
[[foo attr]bar[[foo attr/]]bar[/foo]
[[foo attr]bar[[foo attr/]]bar
[[foo attr]bar[[foo attr]]bar
[foo attr]bar[/foo]]
[foo attr]bar[/foo]][foo attr]bar[/foo]
[foo attr]bar[/foo]][foo attr/]bar[/foo]
[foo attr]bar[/foo]][foo attr/]bar
[foo attr]bar[/foo]][foo attr]bar
[foo attr]bar[/foo]][[foo attr]bar[/foo]
[foo attr]bar[/foo]][[foo attr/]bar[/foo]
[foo attr]bar[/foo]][[foo attr/]bar
[foo attr]bar[/foo]][[foo attr]bar
[foo attr]bar[/foo]][foo attr]bar[/foo]]
[foo attr]bar[/foo]][foo attr/]]bar[/foo]
[foo attr]bar[/foo]][foo attr/]bar]
[foo attr]bar[/foo]][foo attr]bar]
[foo attr]bar[/foo]][[foo attr]bar[/foo]]
[foo attr]bar[/foo]][[foo attr/]]bar[/foo]
[foo attr]bar[/foo]][[foo attr/]]bar
[foo attr]bar[/foo]][[foo attr]]bar
[foo attr/]]bar[/foo]
[foo attr/]]bar[/foo][foo attr]bar[/foo]
[foo attr/]]bar[/foo][foo attr/]bar[/foo]
[foo attr/]]bar[/foo][foo attr/]bar
[foo attr/]]bar[/foo][foo attr]bar
[foo attr/]]bar[/foo][[foo attr]bar[/foo]
[foo attr/]]bar[/foo][[foo attr/]bar[/foo]
[foo attr/]]bar[/foo][[foo attr/]bar
[foo attr/]]bar[/foo][[foo attr]bar
[foo attr/]]bar[/foo][foo attr]bar[/foo]]
[foo attr/]]bar[/foo][foo attr/]]bar[/foo]
[foo attr/]]bar[/foo][foo attr/]bar]
[foo attr/]]bar[/foo][foo attr]bar]
[foo attr/]]bar[/foo][[foo attr]bar[/foo]]
[foo attr/]]bar[/foo][[foo attr/]]bar[/foo]
[foo attr/]]bar[/foo][[foo attr/]]bar
[foo attr/]]bar[/foo][[foo attr]]bar
[foo attr/]bar]
[foo attr/]bar][foo attr]bar[/foo]
[foo attr/]bar][foo attr/]bar[/foo]
[foo attr/]bar][foo attr/]bar
[foo attr/]bar][foo attr]bar
[foo attr/]bar][[foo attr]bar[/foo]
[foo attr/]bar][[foo attr/]bar[/foo]
[foo attr/]bar][[foo attr/]bar
[foo attr/]bar][[foo attr]bar
[foo attr/]bar][foo attr]bar[/foo]]
[foo attr/]bar][foo attr/]]bar[/foo]
[foo attr/]bar][foo attr/]bar]
[foo attr/]bar][foo attr]bar]
[foo attr/]bar][[foo attr]bar[/foo]]
[foo attr/]bar][[foo attr/]]bar[/foo]
[foo attr/]bar][[foo attr/]]bar
[foo attr/]bar][[foo attr]]bar
[foo attr]bar]
[foo attr]bar][foo attr]bar[/foo]
[foo attr]bar][foo attr/]bar[/foo]
[foo attr]bar][foo attr/]bar
[foo attr]bar][foo attr]bar
[foo attr]bar][[foo attr]bar[/foo]
[foo attr]bar][[foo attr/]bar[/foo]
[foo attr]bar][[foo attr/]bar
[foo attr]bar][[foo attr]bar
[foo attr]bar][foo attr]bar[/foo]]
[foo attr]bar][foo attr/]]bar[/foo]
[foo attr]bar][foo attr/]bar]
[foo attr]bar][foo attr]bar]
[foo attr]bar][[foo attr]bar[/foo]]
[foo attr]bar][[foo attr/]]bar[/foo]
[foo attr]bar][[foo attr/]]bar
[foo attr]bar][[foo attr]]bar
[[foo attr]bar[/foo]]
[[foo attr]bar[/foo]][foo attr]bar[/foo]
[[foo attr]bar[/foo]][foo attr/]bar[/foo]
[[foo attr]bar[/foo]][foo attr/]bar
[[foo attr]bar[/foo]][foo attr]bar
[[foo attr]bar[/foo]][[foo attr]bar[/foo]
[[foo attr]bar[/foo]][[foo attr/]bar[/foo]
[[foo attr]bar[/foo]][[foo attr/]bar
[[foo attr]bar[/foo]][[foo attr]bar
[[foo attr]bar[/foo]][foo attr]bar[/foo]]
[[foo attr]bar[/foo]][foo attr/]]bar[/foo]
[[foo attr]bar[/foo]][foo attr/]bar]
[[foo attr]bar[/foo]][foo attr]bar]
[[foo attr]bar[/foo]][[foo attr]bar[/foo]]
[[foo attr]bar[/foo]][[foo attr/]]bar[/foo]
[[foo attr]bar[/foo]][[foo attr/]]bar
[[foo attr]bar[/foo]][[foo attr]]bar
[[foo attr/]]bar[/foo]
[[foo attr/]]bar[/foo][foo attr]bar[/foo]
[[foo attr/]]bar[/foo][foo attr/]bar[/foo]
[[foo attr/]]bar[/foo][foo attr/]bar
[[foo attr/]]bar[/foo][foo attr]bar
[[foo attr/]]bar[/foo][[foo attr]bar[/foo]
[[foo attr/]]bar[/foo][[foo attr/]bar[/foo]
[[foo attr/]]bar[/foo][[foo attr/]bar
[[foo attr/]]bar[/foo][[foo attr]bar
[[foo attr/]]bar[/foo][foo attr]bar[/foo]]
[[foo attr/]]bar[/foo][foo attr/]]bar[/foo]
[[foo attr/]]bar[/foo][foo attr/]bar]
[[foo attr/]]bar[/foo][foo attr]bar]
[[foo attr/]]bar[/foo][[foo attr]bar[/foo]]
[[foo attr/]]bar[/foo][[foo attr/]]bar[/foo]
[[foo attr/]]bar[/foo][[foo attr/]]bar
[[foo attr/]]bar[/foo][[foo attr]]bar
[[foo attr/]]bar
[[foo attr/]]bar[foo attr]bar[/foo]
[[foo attr/]]bar[foo attr/]bar[/foo]
[[foo attr/]]bar[foo attr/]bar
[[foo attr/]]bar[foo attr]bar
[[foo attr/]]bar[[foo attr]bar[/foo]
[[foo attr/]]bar[[foo attr/]bar[/foo]
[[foo attr/]]bar[[foo attr/]bar
[[foo attr/]]bar[[foo attr]bar
[[foo attr/]]bar[foo attr]bar[/foo]]
[[foo attr/]]bar[foo attr/]]bar[/foo]
[[foo attr/]]bar[foo attr/]bar]
[[foo attr/]]bar[foo attr]bar]
[[foo attr/]]bar[[foo attr]bar[/foo]]
[[foo attr/]]bar[[foo attr/]]bar[/foo]
[[foo attr/]]bar[[foo attr/]]bar
[[foo attr/]]bar[[foo attr]]bar
[[foo attr]]bar
[[foo attr]]bar[foo attr]bar[/foo]
[[foo attr]]bar[foo attr/]bar[/foo]
[[foo attr]]bar[foo attr/]bar
[[foo attr]]bar[foo attr]bar
[[foo attr]]bar[[foo attr]bar[/foo]
[[foo attr]]bar[[foo attr/]bar[/foo]
[[foo attr]]bar[[foo attr/]bar
[[foo attr]]bar[[foo attr]bar
[[foo attr]]bar[foo attr]bar[/foo]]
[[foo attr]]bar[foo attr/]]bar[/foo]
[[foo attr]]bar[foo attr/]bar]
[[foo attr]]bar[foo attr]bar]
[[foo attr]]bar[[foo attr]bar[/foo]]
[[foo attr]]bar[[foo attr/]]bar[/foo]
[[foo attr]]bar[[foo attr/]]bar
[[foo attr]]bar[[foo attr]]bar

#39 @Denis-de-Bernardy
15 years ago

Brusdeylins: be so kind to try the patch we'll attach to #10082 when it's ready. It'll be based on the following tests:

http://core.trac.wordpress.org/attachment/ticket/10082/shortcode-tests.php

#41 @hakre
15 years ago

The test-data should be extended to reflect multi-line shortcodes as well. according to some reports, there are issues as well and AFAIK regexes are aware of line endings (when commanded). Therefore looks reasonable to test that as well.

#42 @Brusdeylins
15 years ago

Hi all,

I tested the patch in #10082, but it doesn't fix my problem.
My long article doesn't appear with this diff.

Sorry.
Changed back to my solution :)

#43 @xibe
15 years ago

  • Keywords has-patch needs-testing added; needs-patch removed

Sorry I didn't notice earlier there was a patch for this.
I applied it to my current trunk, and in my case it does work!

Before applying patch:

  • post updated in HTML mode would display fine in both editor and online
  • same post updated in Visual mode would become blank

After applying patch:

  • In both cases, the post does get published, and is still editable in Visual mode after clicking the update button.

For the sake of helping others test under my conditions, I'm adding it to Trac, zipped. In current 2.9 trunk, copy the code into the HTML editor, publish & check online presence; switch to Visual editor, update & witness blankness both in editor and online.

Note: I don't seem to have an issue with Brusdeylins' test post, even without applying the patch, which sounds weird to me.

@xibe
15 years ago

xibe's long post

#44 @jamescollins
15 years ago

We have encountered this on several WordPress installations after upgrading PHP from v5.1.6 to 5.2.10.

The large pages (which contain various shortcodes) have no output displayed on the page when it is viewed.

Our pcre.backtrack_limit is set to the default value (100000).

Adding the following to wp-config.php allows the pages to be parsed successfully.

ini_set('pcre.backtrack_limit', 1000000);

Although I'm not sure of the side effects of increasing this value.
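
If the worry is side effects on the rest of the request, one option is to raise the limit only around the call that needs it and restore it afterwards. The helper below is a hypothetical sketch (`with_backtrack_limit()` is not a WordPress or PHP function).

```php
<?php
// Hypothetical helper: run $fn with a temporarily raised backtrack limit,
// then restore whatever value was configured before.
function with_backtrack_limit(int $limit, callable $fn) {
    $previous = ini_set('pcre.backtrack_limit', (string) $limit);
    try {
        return $fn();
    } finally {
        if ($previous !== false) {
            ini_set('pcre.backtrack_limit', $previous);
        }
    }
}

$before = ini_get('pcre.backtrack_limit');
$inside = with_backtrack_limit(1000000, function () {
    // The expensive preg_* call would go here.
    return ini_get('pcre.backtrack_limit');
});
$after = ini_get('pcre.backtrack_limit');

echo "before=$before inside=$inside after=$after\n";
```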

#45 @hakre
15 years ago

Side effects are probably an increase of memory usage and/or CPU load.

#46 @usermrpapa
15 years ago

  • Cc usermrpapa added

We too have several users who have encountered this bug using shortcodes on posts/pages with a lot of content (or images)... increasing the backtrack limit does mitigate the problem (at least for now), but it would be great if this could be optimized to not require the increased limit...

#47 follow-up: @Brusdeylins
15 years ago

This Problem still exists in WP 2.8.6.
My solution in ./wp-includes/shortcodes.php :

function get_shortcode_regex() {
  global $shortcode_tags;
  $tagnames = array_keys($shortcode_tags);
  $tagregexp = join( '|', array_map('preg_quote', $tagnames) );
  
  // defect for long posts
  // return '(.?)\[('.$tagregexp.')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)'; 
	
  // solution for brusdeylins.info
  return '(.?)\[('.$tagregexp.')\b([^\]]*?)(\/)?\](?:(.+?)\[\/\1\])?(.?)'; 	
}

#48 @ryan
15 years ago

  • Milestone changed from 2.9 to 3.0

#49 in reply to: ↑ 47 @jalenack
15 years ago

Replying to Brusdeylins:

This Problem still exists in WP 2.8.6.
My solution in ./wp-includes/shortcodes.php :

function get_shortcode_regex() {
  global $shortcode_tags;
  $tagnames = array_keys($shortcode_tags);
  $tagregexp = join( '|', array_map('preg_quote', $tagnames) );
  
  // defect for long posts
  // return '(.?)\[('.$tagregexp.')\b(.*?)(?:(\/))?\](?:(.+?)\[\/\2\])?(.?)'; 
	
  // solution for brusdeylins.info
  return '(.?)\[('.$tagregexp.')\b([^\]]*?)(\/)?\](?:(.+?)\[\/\1\])?(.?)'; 	
}

I was suffering from this issue in 2.8.x and 2.9 on several posts (7600 words with images etc). I added the pcre ini_set('pcre.backtrack_limit', 1000000); and that fixed the problem. Wordpress was taking a really long time to save so I tried using the above get_shortcode_regex() code, and it broke my [caption] shortcode that was generated by visual mode:

[caption id="attachment_9522" align="alignright" width="350" caption="___desc__"]<a href="__url__"><img class="size-full wp-image-9522" title="___desc___" src="____url___" alt="___desc___" width="350" height="351" /></a>[/caption]

broke as in failed to parse and left a hanging [/caption]. I didn't try any other patches besides the pcre.backtrack_limit part, which did fix my problem.
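
A possible explanation for the hanging [/caption], offered as an observation rather than a confirmed diagnosis: the pattern quoted above ends its closing tag with the back reference \1, whereas the earlier versions in this ticket use \2. In this pattern group 1 is the optional escape character and group 2 is the tag name. The sketch assumes a single registered tag and the `s` modifier.

```php
<?php
// Pattern as quoted above, with one hypothetical registered tag.
$tagregexp = 'caption';
$with_1 = '/(.?)\[(' . $tagregexp . ')\b([^\]]*?)(\/)?\](?:(.+?)\[\/\1\])?(.?)/s';
$with_2 = '/(.?)\[(' . $tagregexp . ')\b([^\]]*?)(\/)?\](?:(.+?)\[\/\2\])?(.?)/s';

$post = '[caption id="1"]hello[/caption]';

preg_match($with_1, $post, $m1);
preg_match($with_2, $post, $m2);

var_dump($m1[0]); // what the \1 variant consumes
var_dump($m2[0]); // what the \2 variant consumes
var_dump($m2[5]); // enclosed content seen by the \2 variant
```

With \1 the closing tag would have to read `[/` followed by the escape character and `]`, so a normal [/caption] is never consumed and stays behind in the output.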

#50 @mrmist
15 years ago

I didn't try any other patches besides the pcre.backtrack_limit part, which did fix my problem.

Yeah but increasing the backtrack limit is not an actual fix, just a workaround. This ticket has twisted on for a while now but I reckon there is stuff in here that could probably fix it.

#51 @nacin
14 years ago

  • Milestone changed from 3.0 to Future Release

No activity for months. Punting.

#52 @Denis-de-Bernardy
14 years ago

  • Milestone changed from Future Release to 3.0

Looks like this is occurring on wordpress.com....

See: http://hakre.wordpress.com/

#53 @Denis-de-Bernardy
14 years ago

  • Severity changed from normal to blocker

#54 @aaroncampbell
14 years ago

It looks like all his posts are coming back as empty. They can't all be that big, right?

#55 @aaroncampbell
14 years ago

Well, now they're all showing, and none of them seem anywhere near big enough to have triggered this bug.

#56 @nacin
14 years ago

  • Milestone changed from 3.0 to Future Release
  • Severity changed from blocker to normal

It was an unrelated wp.com bug.

#57 @holizz
14 years ago

  • Cc tom@… added

#58 @mrmist
14 years ago

I was inspired to check this out again with the sample text I posted 22 months ago, and the other sample text added to the ticket and it all now worksforme on stock 3.0.1 and trunk. (Without patch)

#59 @msafi
13 years ago

I think ini_set('pcre.backtrack_limit', 1000000); can also be set in functions.php of the theme or a plugin file—it doesn't have to be in wp-config.php.

I set mine to 200000 instead of 1000000 and it did the trick.

How come there's no fix for this yet? It wasn't easy tracing down the source of this problem!

#60 @mdawaffe
13 years ago

Is this a dupe of #15600?

If not, is this still a bug as of [18952]?

#61 @aaroncampbell
13 years ago

Yes, it looks like #15600 fixed get_shortcode_regex() which is where the issue was for this ticket. I don't actually have a test case for this one at the moment, but I think it was fixed. I'm fine with closing at this point.

#62 follow-up: @xibe
13 years ago

Well, according to my aforementioned long post (using the latest nightly), this is still an issue in some cases.

@aaroncampbell
13 years ago

Long post content with gallery and caption shortcodes

#63 in reply to: ↑ 62 @aaroncampbell
13 years ago

Replying to xibe:

Well, according to my aforementioned long post (using the latest nightly), this is still an issue in some cases.

Are you sure it's the same problem (pcre.backtrack_limit)?

I created a test post that uses the gallery and caption shortcodes (about 1500-1600 shortcodes total). I added the post to a 3.2.1 install and a trunk install. On the trunk install it displays as expected and on the 3.2.1 install it shows as blank (which is what was happening when this ticket was filed).

I think this is fixed.
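
For readers wondering why a rewritten regex can cope with the same post, the sketch below contrasts a lazy content match with an "unrolled loop" built from possessive quantifiers. It is an illustration only and not the code committed for #15600, which is not quoted in this ticket. The tag `foo`, both patterns and the settings (JIT off, limit 100000) are assumptions for the demonstration.

```php
<?php
// Demonstration settings: interpreter-only matching, old default limit.
ini_set('pcre.jit', '0');
ini_set('pcre.backtrack_limit', '100000');

// Lazy content: the engine takes one backtracking step per character.
$lazy = '/\[(foo)\b([^\]]*?)\](?:(.+?)\[\/\1\])?/s';

// Unrolled loop: runs of non-"[" characters are consumed possessively,
// and a "[" is accepted only if it does not start the closing tag.
$unrolled = '/\[(foo)\b([^\]]*)\](?:([^\[]*+(?:\[(?!\/\1\])[^\[]*+)*+)\[\/\1\])?/s';

$post = '[foo]' . str_repeat('a', 150000) . '[/foo]';

$lazy_ok     = preg_match($lazy, $post, $m_lazy);         // false on PCRE error
$unrolled_ok = preg_match($unrolled, $post, $m_unrolled); // 1 on success

var_dump($lazy_ok, $unrolled_ok);
```

The possessive form gives up the ability to backtrack into the content, which is acceptable here because the content can never contain its own closing tag.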

#64 @xibe
13 years ago

You're very right, I mixed the issues in my head, sorry :/

#65 @nacin
13 years ago

  • Milestone changed from Future Release to 3.3
  • Resolution set to fixed
  • Status changed from new to closed