WordPress.org

Make WordPress Core

Opened 2 years ago

Closed 7 months ago

Last modified 7 months ago

#19354 closed defect (bug) (wontfix)

wp_allowed_protocols() does not allow data URI scheme

Reported by: hardy101 Owned by:
Milestone: Priority: normal
Severity: normal Version: 3.2.1
Component: General Keywords: dev-feedback has-patch
Focuses: Cc:

Description

When inserting images into a post via copy-paste, Firefox will paste a base64 text string (using the Data URI scheme) into the post editor. The result will look something like:

<img src="data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAUA
AAAFCAYAAACNbyblAAAAHElEQVQI12P48/w38GIAXDIBKE0DHxgljNBAAO
9TXL0Y4OHwAAAABJRU5ErkJggg==" alt="Red dot">

When the post is saved, the "data:" portion of the src attribute is stripped away by wp_kses_hair() via the line:

if ( in_array(strtolower($attrname), $uris) )

$thisval = wp_kses_bad_protocol($thisval, $allowed_protocols);

"data:" is treated as a protocol prefix, and is not seen as part of the src attribute.

To reproduce this error, try the following in Firefox:

1) Do a Google image search for a rendom image.
2) Right-click -> "Copy Image"
3) Paste into rich text editor
4) Save post
5) View HTML tab of the editor and notice that the "data:" scheme has been removed.

A side effect of this issue is that the image src is treated as a relative image path on the server (in subdirectory "image/png" with long string of characters as the "file name." The server will typically report an error in its log file about the request length of the URI being too long.

Attachments (2)

19354.diff (639 bytes) - added by solarissmoke 2 years ago.
Allow data: protocol
19354.2.patch (1.1 KB) - added by kurtpayne 2 years ago.
Adding ini_set for pcre backtrack limit

Download all attachments as: .zip

Change History (14)

comment:1 hardy1012 years ago

  • Summary changed from wMulti-site wp_kses_hair() strips "data:" from base64-encoded images pasted into rich editior with Data URI scheme to Multi-site wp_kses_hair() strips "data:" from base64-encoded images pasted into rich editior with Data URI scheme

solarissmoke2 years ago

Allow data: protocol

comment:2 solarissmoke2 years ago

  • Keywords has-patch added; needs-patch removed
  • Summary changed from Multi-site wp_kses_hair() strips "data:" from base64-encoded images pasted into rich editior with Data URI scheme to wp_allowed_protocols() does not allow data URI scheme

Happens in single-site too. wp_allowed_protocols() doesn't currently allow data: as a protocol.

comment:3 follow-up: kurtpayne2 years ago

  • Cc kpayne@… added

I was not able to duplicate your results in 3.3 RC1 (single and multisite).

I used FF 8. The pasted image was interpreted as a data uri, but "data:" portion was not stripped when the post was saved.

Can you still reproduce this on 3.3?

comment:4 in reply to: ↑ 3 solarissmoke2 years ago

Replying to kurtpayne:

Can you still reproduce this on 3.3?

Yes. You need to test as a user without unfiltered HTML capability - i.e., not an administrator/editor.

comment:5 kurtpayne2 years ago

In my testing, I encountered an image string that was too large for the regex to handle. I was getting a PREG_BACKTRACK_LIMIT_ERROR from a 26K string. The php documentation states that the default value for pcre.backtrack_limit is 100000 (1 million), but the stock installs of php I've tested show it to be 100000 (one-hundred thousand). Raising the backtrack limit via ini_set() allow the code to work on the test string.

I was able to duplicate the original problem. After applying patch 19354.diff, I was able to embed an image as an unprivileged author that survived saving.

kurtpayne2 years ago

Adding ini_set for pcre backtrack limit

comment:6 SergeyBiryukov2 years ago

Related/duplicate: #19886

comment:7 azizur2 years ago

  • Cc azizur added

comment:8 helenyhou2 years ago

#19886 was closed as a dupe, as this ticket has patches, but it was assigned to the 3.4 milestone. May want to move this one if it's still a candidate, but I'll leave that to somebody else.

Last edited 2 years ago by helenyhou (previous) (diff)

comment:9 SergeyBiryukov8 months ago

  • Component changed from Editor to General
  • Milestone changed from Awaiting Review to 3.7

Related: #25147

comment:10 follow-up: nacin8 months ago

I'm not sure we can trust the contents of the data URI scheme. Anyone have any links/white papers arguing either way?

comment:11 in reply to: ↑ 10 duck_7 months ago

  • Milestone 3.7 deleted
  • Resolution set to wontfix
  • Status changed from new to closed

Replying to nacin:

I'm not sure we can trust the contents of the data URI scheme. Anyone have any links/white papers arguing either way?

Nope, we cannot trust data URIs.

Backup evidence: "several pseudo-schemes exist specifically to enable scripting or URL-contained data rendering in the security context inherited from the caller" from https://code.google.com/p/browsersec/wiki/Part1#Pseudo_URL_schemes

comment:12 nacin7 months ago

In 25301:

wp_allowed_protocols() should not contain 'data', as it is not safe. see #19354.

Note: See TracTickets for help on using tickets.