WordPress.org

Make WordPress Core

Opened 8 months ago

Last modified 6 months ago

#25851 new defect (bug)

post_content lost when inserting Posts with large base64-encoded images

Reported by: ctayloroomphinc Owned by:
Milestone: Future Release Priority: low
Severity: critical Version: 3.8
Component: Formatting Keywords:
Focuses: Cc:

Description

post_content will be silently blanked-out during a wp_insert_post() call that contains large base64-encoded images within Post HTML content.

I've found the source of this problem to be the wp_pre_kses_less_than() function. In the scenario described above, the call to preg_replace_callback() fails, causing a NULL to be returned from wp_pre_kses_less_than(), instead of a string. In this case preg_replace_callback() returns NULL because of a PREG_BACKTRACK_LIMIT_ERROR.

The error can be worked around by setting a higher pcre.backtrack_limit in php.ini, but I still consider this to be a bug because the wp_pre_kses_less_than() should at least emit a warning to the debug.log informing someone that there was an internal PHP error (silent in this case, regardless of error_reporting settings).

Beyond this, the function should check the return value of preg_replace_callback(), and if it is NULL, perhaps consider returning the original string as it was passed by the caller, so the caller does not end up with unexpectedly empty post_content.

Another option would be to look in to tweaking the regex so that it does need exceed the default resources as configured by pcre.backtrack_limit, which is 100000 by default.

More info: http://www.php.net/manual/en/pcre.configuration.php#ini.pcre.backtrack-limit

I have attached a file that shows how to re-produce this bug. Run the file via WP-CLI like so:

wp eval-file wp_pre_kses_less_than_bug_repro.php

I'm setting the Severity of this bug to critical, since it results in unexpected data loss.

Attachments (1)

wp_pre_kses_less_than_bug_repro.php (588.5 KB) - added by ctayloroomphinc 8 months ago.

Download all attachments as: .zip

Change History (2)

comment:1 nacin6 months ago

  • Milestone changed from Awaiting Review to Future Release
  • Priority changed from normal to low

Thanks for the report, Corey. This is a tough one. It would be great to optimize kses, but given its nature (security), it isn't easy to do. Many have tried, only a few have come out alive.

One workaround would be to run wp_insert_post() as a user privileged enough to not have kses run. Or simply kses_remove_filters() followed by a kses_init(). If the data is trusted, of course.

I will say, that's a *ton* of data (and it's looking through to find exploitative protocols, among other things), but it's possible that for certain kinds of data, we can simply bypass it due to a whitelist. For example, at some point, all we have to deal with are 64 characters, none of them dangerous. It could be one approach to take. I could also totally go for increasing the backtrack limit if we can't come up with a better way to do it (we've done it elsewhere) — out of curiosity, how much did you need to increase it to get it to work?

Note: See TracTickets for help on using tickets.