Opened 10 years ago
Last modified 5 years ago
#31695 new defect (bug)
Enclosing oEmbed using `embed_oembed_html` is generating invalid HTML
Reported by: | vaurdan | Owned by: | |
---|---|---|---|
Milestone: | Priority: | normal | |
Severity: | normal | Version: | |
Component: | Formatting | Keywords: | needs-unit-tests wpautop needs-patch |
Focuses: | Cc: |
Description
This bug was detected when trying to preview a post that had a Twitter oembed URL. There was no error on the log, and the previewed content was empty ( the_content()
was printing an empty string ).
After some debugging, I found that something was adding an extra <p>
and never closing that tag, on the oembed returned markup, right before <script>
:
<blockquote class="twitter-tweet" width="550"> <p>Lisbon with #WordPressVIP @ Terreiro do Paço http://instagram.com/p/vGw5IchocB/ </p> <p>— Henrique Mouta (@vaurdan) <a href="https://twitter.com/vaurdan/status/530762912462557185">November 7, 2014</a></blockquote> <p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></div>
I did some tracing and found out that it was wpautop
fault. Even though WP_oEmbed::_strip_newlines()
was successfully stripping the new lines from the returned oEmbed HTML, something was adding an extra newline before the <script>
tag.
$allblocks = '(?:table|thead|tfoot|caption|col|colgroup|tbody|tr|td|th|div|dl|dd|dt|ul|ol|li|pre|form|map|area|blockquote|address|math|style|p|h[1-6]|hr|fieldset|legend|section|article|aside|hgroup|header|footer|nav|figure|details|menu|summary)'; $pee = preg_replace('!(<' . $allblocks . '[^>]*>)!', "\n$1", $pee); $pee = preg_replace('!(</' . $allblocks . '>)!', "$1\n\n", $pee);
And that was happening because <blockquote>
is on the $allblocks
, making a newline between blockquote
and script
.
To fix this, I used the same logic that wpautop
uses for <object>
, <option>
, <source>
, and <track>
: have no <p>
or <br/>
around the <script>
tag:
if ( strpos( $pee, '<script' ) !== false ) { // no P/BR around script $pee = preg_replace( '|\s*<script|', '<script', $pee ); $pee = preg_replace( '|</script>\s*|', '</script>', $pee ); }
Now the oembed is fully functional and doesn't have that unclosed <p>
tag:
<blockquote class="twitter-tweet" width="550"> <p>Lisbon with #WordPressVIP @ Terreiro do Paço http://instagram.com/p/vGw5IchocB/ </p> <p>— Henrique Mouta (@vaurdan) <a href="https://twitter.com/vaurdan/status/530762912462557185">November 7, 2014</a></blockquote><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></div>
Attachments (2)
Change History (21)
#1
@
10 years ago
- Keywords needs-unit-tests added
Thanks for the report and initial patch @vaurdan.
Was this something you found in the current trunk version or in 4.2 beta1?
This Is something that we will need unit tests for. wpautop
can be a bit of a beast, so making sure that it has automated tests needs to be a high priority.
#2
@
10 years ago
Hey @jorbin,
Sorry, that diff related to the WordPress.com codebase, I'm attaching the trunk patch right now.
Regarding the unit testing, I didn't wrote any new testing, but running phpunit
passes all the tests:
OK, but incomplete or skipped tests! Tests: 3266, Assertions: 12636, Skipped: 34.
Should I write an unit test for this?
#3
@
10 years ago
Actually, my patch isn't fixing the issue ( props to @paulschreiber for the heads up ). There is still an unclosed <p>
tag.
I'm trying to fix this by adding a regex that will search for a <p>
that doesn't have a correspondent </p>
tag, and strip the tag.
Do you think this could be the best way?
#6
@
9 years ago
- Keywords needs-patch added
I've been trying to work on a fix for this during the last month, but I couldn't find a way to do it. Can someone try to make a patch for it?
#7
@
9 years ago
After some testing, we're able to discover what was making the closing </p> to disappear. We've found that if the user has Jetpack, and the theme has support for jetpack-responsive-videos
, like twentyfourteen have, the closing </p>
tag would disappear from the Twitter oEmbed HTML, making it invalid (as reported above).
This is happening here: https://github.com/Automattic/jetpack/blob/a5d57deb765944f207edc6c1c2482a43ede21101/modules/theme-tools/responsive-videos.php#L43
We can isolate only the "bad" code, and we will have this filter:
function break_embed( $html ) { return '<div> ' . $html . '</div>'; } add_filter( 'embed_oembed_html', 'break_embed' );
If you test with that filter, you will see that the </p>
tag will disappear, making the HTML invalid.
This ticket was mentioned in Slack in #core by henriquemouta. View the logs.
9 years ago
#9
@
9 years ago
I was able to reproduce the bug in the last trunk version, in a clean install with twentyfifteeen and no plugins. Here are the steps:
- Create a new post with a Twitter link.
- Save the post
- Check the HTML that was generated for the tweet. It will contain a valid HTML:
(...) <p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script></p>
- Edit
tentyfifteen/functions.php
and add the following code:function break_embed( $html ) { return '<div> ' . $html . '</div>'; } add_filter( 'embed_oembed_html', 'break_embed' );
- Check the HTML again, you will see that it is now missing the final
</p>
:(...) <p><script async src="//platform.twitter.com/widgets.js" charset="utf-8"></script>
This ticket was mentioned in Slack in #core by henriquemouta. View the logs.
9 years ago
#11
follow-up:
↓ 12
@
9 years ago
I'm a little confused. You had to hack a theme file to reproduce a bug?
#12
in reply to:
↑ 11
@
9 years ago
Replying to miqrogroove:
I'm a little confused. You had to hack a theme file to reproduce a bug?
It's not a hack, it's just an example. That's just a piece of code that can be in any plugin - it was actually copy-pasted from JetPack - and that is making the twitter oEmbed HTML invalid. I've added it to the functions.php
just to exemplify.
Just because it happens with custom code, it doesn't mean it isn't a bug, right? :)
#13
follow-ups:
↓ 14
↓ 18
@
9 years ago
There are of course bugs in wpautop that need to be addressed. I think it's fair to say this is a valid core bug, but the reported symptoms are plugin related. We will prioritize accordingly.
#14
in reply to:
↑ 13
@
9 years ago
Replying to miqrogroove:
There are of course bugs in wpautop that need to be addressed. I think it's fair to say this is a valid core bug, but the reported symptoms are plugin related. We will prioritize accordingly.
Oh, yes, of corse! Thanks for working on it.
#15
@
9 years ago
Is there any update on this ticket? We've a few more clients from WordPress.com VIP reporting this issue.
#16
@
9 years ago
- Summary changed from Extra unclosed <p> tag before <script> breaking the_content() to Enclosing oEmbed using `embed_oembed_html` is generating invalid HTML
#17
@
9 years ago
We're planning to re-factor wpautop in 4.5. This should get some attention during that chaos :)
#18
in reply to:
↑ 13
@
9 years ago
The same problem happens when you add a Facebook video to a post. An unclosed <p>
is added before the <script>
tag. The following code:
<div id="fb-root"></div><script>(function(d, s, id) { var js, fjs = d.getElementsByTagName(s)[0]; if (d.getElementById(id)) return; js = d.createElement(s); js.id = id; js.src = "//connect.facebook.net/pt_BR/sdk.js#xfbml=1&version=v2.3"; fjs.parentNode.insertBefore(js, fjs);}(document, 'script', 'facebook-jssdk'));</script><div class="fb-video" data-allowfullscreen="1" data-href="/CatracaLivre/videos/vl.344619742400665/907235222646823/?type=3"><div class="fb-xfbml-parse-ignore"></div></div>
Becomes (note the added unclosed <p>
before <script>
on the second line):
<div id="fb-root"></div> <p><script>(function(d, s, id) { var js, fjs = d.getElementsByTagName(s)[0]; if (d.getElementById(id)) return; js = d.createElement(s); js.id = id; js.src = "//connect.facebook.net/pt_BR/sdk.js#xfbml=1&version=v2.3"; fjs.parentNode.insertBefore(js, fjs);}(document, 'script', 'facebook-jssdk'));</script> <div class="fb-video" data-allowfullscreen="1" data-href="/CatracaLivre/videos/vl.344619742400665/907235222646823/?type=3"> <div class="fb-xfbml-parse-ignore"></div>
Contrary to the case @vaurdan reported, this is happening on a clean WordPress install with no plugins enabled running the latest version from master.
Diff file with the patch