Make WordPress Core

Opened 6 years ago

Last modified 2 months ago

#44571 new defect (bug)

force_balance_tags breaks JavaScript

Reported by: yellowafterlife's profile yellowafterlife Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 4.9.7
Component: Shortcodes Keywords:
Focuses: Cc:

Description

Input:

<?php
echo force_balance_tags('<script>console.log("extest:", 0<=1);</script>')
?>

Output:

<script>console.log("extest:", 0< =1);</script></script>

(no longer valid JavaScript)

Context:

I use tiny (<20KB) JS programs for interactive illustrations inside tutorials on a WP blog. These would most commonly break on pages where excerpts are shown (post list, search) and it took a little while to narrow this down to force_balance_tags running for !--more splitter in get_the_content.

Workarounds: enclosing JS code in a comment

<script><!-- code --></script>

which is a legacy part of spec https://html.spec.whatwg.org/multipage/scripting.html#restrictions-for-contents-of-script-elements

Could the script-tags be treated like the comments in this regard? This is a rather unobvious behaviour.

Attachments (1)

44571.tests.diff (1.1 KB) - added by soulseekah 6 years ago.
Tests for force_balance_tags

Download all attachments as: .zip

Change History (3)

#1 @soulseekah
6 years ago

This is a very common issue among formatting functions in WordPress, wherein "stray" brackets break tag detection. This happens in scripts, styles and even regular HTML attributes with angle brackets in them. There are a bunch of tickets open regarding this.

The issue here is that <=1);</script> is detected as a tag.

Also there are no unit tests. Let's start writing some.

@soulseekah
6 years ago

Tests for force_balance_tags

#2 @dmsnell
2 months ago

Hi @soulseekah - I'm exploring replacing the logic from force_balance_tags() with the new WP_HTML_Processor::normalize() and it seems to handle the cases you've raised here.

<?php
echo force_balance_tags( '<script>console.log("extest:", 0<=1);</script>' );
// <script>console.log("extest:", 0<=1);</script>

echo force_balance_tags( '<p>hello' );
// <p>hello</p>

echo force_balance_tags( '<p>hello</p>' );
// <p>hello</p>

since the logic is based on the HTML API it should be robust regardless of the input. no more guessing! as of WordPress 6.7.0 there will see be a marginal number of input HTML documents not supported by the HTML API; these will fall back to the legacy behavior of force_balance_tags() (or perhaps we'll replace the custom parsing with the Tag Processor which is also based on the HTML spec but doesn't know HTML structure).

https://github.com/WordPress/wordpress-develop/pull/7409

Note: See TracTickets for help on using tickets.