Opened 18 months ago

Last modified 14 months ago

#19354 new defect (bug)

wp_allowed_protocols() does not allow data URI scheme

Reported by: hardy101
Owned by:
Priority: normal
Milestone: Awaiting Review
Component: Editor
Version: 3.2.1
Severity: normal
Keywords: dev-feedback has-patch
Cc: kpayne@…, azizur

Description

When inserting images into a post via copy-paste, Firefox will paste a base64 text string (using the Data URI scheme) into the post editor. The result will look something like:

<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA
AAAFCAYAAACNbyblAAAAHElEQVQI12P48/w38GIAXDIBKE0DHxgljNBAAO
9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot">

When the post is saved, the "data:" portion of the src attribute is stripped away by wp_kses_hair() via the line:

    if ( in_array(strtolower($attrname), $uris) )
        $thisval = wp_kses_bad_protocol($thisval, $allowed_protocols);

"data:" is treated as a protocol prefix, and is not seen as part of the src attribute.
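The stripping behaviour can be illustrated with a minimal, self-contained sketch. This is not the WordPress implementation: sketch_strip_bad_protocol() and its regex are hypothetical stand-ins that only mimic the idea of checking the text before the first colon against an allow-list.

```php
<?php
// Hypothetical, simplified stand-in for the protocol check described above.
// It is NOT the kses code; the function name and regex are assumptions.
function sketch_strip_bad_protocol(string $value, array $allowed): string
{
    // Treat everything before the first colon as a candidate scheme.
    if (!preg_match('/^\s*([a-z][a-z0-9+.\-]*):/i', $value, $m)) {
        return $value; // no scheme prefix, nothing to check
    }
    if (in_array(strtolower($m[1]), $allowed, true)) {
        return $value; // scheme is on the allow-list
    }
    // Scheme not allowed: drop the "scheme:" prefix and keep the rest.
    return substr($value, strlen($m[0]));
}

$allowed = ['http', 'https', 'ftp', 'mailto'];
$src     = 'data:image/png;base64,iVBORw0KGgo=';

// Without "data" on the list, the prefix is removed.
echo sketch_strip_bad_protocol($src, $allowed), "\n";
// With "data" added, the value survives unchanged.
echo sketch_strip_bad_protocol($src, array_merge($allowed, ['data'])), "\n";
```

With "data" missing from the list, what remains ("image/png;base64,...") no longer looks like a URI with a scheme, which matches the relative-path side effect described in this ticket.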

To reproduce this error, try the following in Firefox:

1) Do a Google image search for a random image.
2) Right-click -> "Copy Image"
3) Paste into rich text editor
4) Save post
5) View HTML tab of the editor and notice that the "data:" scheme has been removed.

A side effect of this issue is that the image src is treated as a relative image path on the server (in subdirectory "image/png", with the long string of characters as the "file name"). The server will typically log an error because the request URI is too long.

Attachments (2)

19354.diff (639 bytes) - added by solarissmoke 18 months ago.
Allow data: protocol
19354.2.patch (1.1 KB) - added by kurtpayne 18 months ago.
Adding ini_set for pcre backtrack limit

Change History (10)

  • Summary changed from wMulti-site wp_kses_hair() strips "data:" from base64-encoded images pasted into rich editior with Data URI scheme to Multi-site wp_kses_hair() strips "data:" from base64-encoded images pasted into rich editior with Data URI scheme

  • Keywords has-patch added; needs-patch removed
  • Summary changed from Multi-site wp_kses_hair() strips "data:" from base64-encoded images pasted into rich editior with Data URI scheme to wp_allowed_protocols() does not allow data URI scheme

Happens in single-site too. wp_allowed_protocols() doesn't currently allow data: as a protocol.
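As a possible site-level workaround (not the attached patch, whose contents are not reproduced here), a plugin could add the scheme through the kses_allowed_protocols filter. The callback name below is hypothetical, and the sketch assumes the filter is registered before the protocol list is first built.

```php
<?php
// Hypothetical callback name; 'kses_allowed_protocols' is the filter hook
// applied to the list of allowed protocols.
function sketch_allow_data_protocol(array $protocols): array
{
    if (!in_array('data', $protocols, true)) {
        $protocols[] = 'data';
    }
    return $protocols;
}

// Only register the callback when running inside WordPress.
if (function_exists('add_filter')) {
    add_filter('kses_allowed_protocols', 'sketch_allow_data_protocol');
}
```

Note that allowing data: for every filtered attribute is broader than allowing inline images only; a data URI can carry non-image content such as text/html, so this is a trade-off rather than a drop-in fix.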

comment:3 follow-up: ↓ 4   kurtpayne 18 months ago

  • Cc kpayne@… added

I was not able to duplicate your results in 3.3 RC1 (single and multisite).

I used FF 8. The pasted image was interpreted as a data URI, but the "data:" portion was not stripped when the post was saved.

Can you still reproduce this on 3.3?

comment:4 in reply to: ↑ 3   solarissmoke 18 months ago

Replying to kurtpayne:

> Can you still reproduce this on 3.3?

Yes. You need to test as a user without the unfiltered_html capability, i.e., not an administrator/editor.

In my testing, I encountered an image string that was too large for the regex to handle. I was getting a PREG_BACKTRACK_LIMIT_ERROR from a 26K string. The PHP documentation states that the default value for pcre.backtrack_limit is 1000000 (1 million), but the stock installs of PHP I've tested show it to be 100000 (one hundred thousand). Raising the backtrack limit via ini_set() allows the code to work on the test string.
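A rough sketch of the ini_set() approach, for illustration only: the helper name, the pattern, and the limit value are assumptions, not the code in 19354.2.patch. It raises pcre.backtrack_limit for a single call, checks preg_last_error(), and restores the previous setting.

```php
<?php
// Hypothetical helper: run preg_replace() with a temporarily raised
// pcre.backtrack_limit and report whether PCRE hit that limit.
function sketch_replace_with_limit(string $pattern, string $replacement, string $subject, int $limit): array
{
    $previous = ini_get('pcre.backtrack_limit');
    ini_set('pcre.backtrack_limit', (string) $limit);

    $result = preg_replace($pattern, $replacement, $subject);
    $error  = preg_last_error();

    // Restore the original setting so other code is unaffected.
    ini_set('pcre.backtrack_limit', (string) $previous);

    return [
        'result'              => $result, // null when PCRE reported an error
        'hit_backtrack_limit' => $error === PREG_BACKTRACK_LIMIT_ERROR,
    ];
}

// Illustrative input of roughly the size mentioned above (about 26K).
$subject = 'data:image/png;base64,' . str_repeat('A', 26000);
$out     = sketch_replace_with_limit('/^data:/', '', $subject, 1000000);
var_dump($out['hit_backtrack_limit']);
```

The pattern here is deliberately trivial and is not the regex used in kses. On PHP builds with the PCRE JIT enabled, the backtrack limit may not behave the same way, so checking preg_last_error() after the call is the more dependable signal.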

I was able to duplicate the original problem. After applying patch 19354.diff, I was able to embed an image as an unprivileged author that survived saving.

Related/duplicate: #19886

  • Cc azizur added

#19866 was closed as a dupe, as this ticket has patches, but it was assigned to the 3.4 milestone. May want to move this one if it's still a candidate, but I'll leave that to somebody else.

Version 0, edited 14 months ago by helenyhou (next)