Make WordPress Core

Opened 10 years ago

Closed 10 years ago

Last modified 9 years ago

#31214 closed defect (bug) (fixed)

oEmbed: Newlines within <pre> tags get removed

Reported by: cweiske's profile cweiske Owned by: wonderboymusic's profile wonderboymusic
Milestone: 4.2 Priority: normal
Severity: normal Version: 4.1
Component: Embeds Keywords: has-patch
Focuses: Cc:

Description

An oEmbed's HTML content gets all \n newlines removed through

add_filter( 'oembed_dataparse', array($this, '_strip_newlines'), 10, 3 );

WP_oEmbed::strip_newlines() unfortunately removes every newline, even those within <pre> tags - but they are crucial there. My oEmbedded code pastes all only have a single line currently: http://fotostore.cweiske.de/screenshots/2015-02-02%20wordpress%20oembed%20newlines.png

Please fix this method to not remove newlines within <pre> tags.

Attachments (2)

31214.diff (1.2 KB) - added by wonderboymusic 10 years ago.
31214.2.diff (1.7 KB) - added by wonderboymusic 10 years ago.

Download all attachments as: .zip

Change History (15)

#1 @wonderboymusic
10 years ago

  • Keywords needs-patch added
  • Milestone changed from Awaiting Review to Future Release

#2 @wonderboymusic
10 years ago

  • Keywords has-patch added; needs-patch removed
  • Milestone changed from Future Release to 4.2

31214.diff does this if DOMDocument exists, which falls in line with other places that bow out if DOMDocument isn't available (please don't disable this, for crying out loud).

The tokens have to be sequential to account for HTML nodes that may be identical

#3 @wonderboymusic
10 years ago

  • Owner set to wonderboymusic
  • Resolution set to fixed
  • Status changed from new to closed

In 31415:

Protect <pre> tags when parsing oEmbed responses in WP_oEmbed::_strip_newlines() in DOMDocument is available.

Fixes #31214.

#4 @wonderboymusic
10 years ago

In 31416:

After [31415], make sure str_replace() only occurs once for each matched tag to avoid overwriting until <pre>s.

See #31214.

#5 @cweiske
10 years ago

The applied patch does not work when the <pre> tag contains HTML entities like &#160;.

$dom->saveHTML() replaces the entities with their real characters, which in turn leads to a failing following str_replace call.

Try to embed http://p.cweiske.de/158 - it is what I have the problem with.

Last edited 10 years ago by cweiske (previous) (diff)

#6 @wonderboymusic
10 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

#7 @cweiske
10 years ago

I tried to use html_entity_decode on the input HTML and the saveHTML()'ed HTML, but it seems that the saved <pre> tag is UTF-8, while the input html is a different encoding.

#8 @wonderboymusic
10 years ago

it's the DOMDocument internals, trying a few different things now...

#9 @wonderboymusic
10 years ago

  1. DOMDocument is awesome and completely sucks at the same time
  2. 31214.2.diff uses RegEx instead, please test.

#11 @wonderboymusic
10 years ago

In 31423:

Use RegEx instead of DOMDocument when protecting <pre> tags in WP_oEmbed::_strip_newlines(). It is incredibly difficult to maintain character encoding and whitespace when parsing via DOMDocument.

See #31214.

#12 @wonderboymusic
10 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

Calling this fixed - I have been filtering oembed responses for 3 weeks now and adding <pre> tags. All works as expected.

This ticket was mentioned in Slack in #feature-respimg by joemcgill. View the logs.


9 years ago

Note: See TracTickets for help on using tickets.