Make WordPress Core

Opened 8 months ago

Closed 8 months ago

Last modified 8 months ago

#60320 closed defect (bug) (fixed)

JSON is corrupted when output via wp_get_inline_script_tag() and HTML5 theme support absent

Reported by: westonruter's profile westonruter Owned by: westonruter's profile westonruter
Milestone: 6.5 Priority: normal
Severity: normal Version: 6.4
Component: Script Loader Keywords: has-patch has-unit-tests
Focuses: javascript Cc:

Description

The Speculation Rules plugin as part of Performance Lab outputs JSON in a script tag via:

<?php
wp_print_inline_script_tag(
        wp_json_encode( $rules ),
        array( 'type' => 'speculationrules' )
);

We found that when html5 theme support is absent (specifically scripts), this results in the following being printed:

<script type="speculationrules">
/* <![CDATA[ */
[]
/* ]]> */
</script>

Note the CDATA wrappers which are added for XML/XHTML compatibility. These break the JSON parsing. (Note in #59883 that these CDATA comments are also proposed to be eliminated entirely with the removal of legacy XHTML compat.) Ironically, wp_get_inline_script_tag()/wp_print_inline_script_tag() didn't actually used to print the CDATA wrapper comments. They were missing until 6.4 in which they were "fixed" (ahem, by me) in [56687] as part of #58664.

This will also be an issue for the new script modules in 6.4 (#56313). When HTML5 theme support is absent, an import map script is currently printed as:

<script type="importmap" id="wp-importmap">
/* <![CDATA[ */
{"imports":{"bar":"http:\/\/localhost:10023\/bar.js?ver=6.5-alpha-57321"}}
/* ]]> */
</script>

Example plugin code to reproduce:

<?php
add_action(
        'after_setup_theme',
        static function () {
                remove_theme_support( 'html5' );
        },
        PHP_INT_MAX
);

add_action(
        'wp_enqueue_scripts',
        static function () {
                wp_register_module( 'bar', home_url( '/bar.js' ) );
                wp_enqueue_module( 'foo', home_url( '/foo.js' ), array( 'bar' ) );
        }
);

In short, the CDATA wrapper comments need to only ever be printed when the type of the script will be JavaScript. They should be omitted for every other type (e.g. importmap, speculationrules, application/json).

Change History (10)

This ticket was mentioned in PR #5925 on WordPress/wordpress-develop by @westonruter.


8 months ago
#1

  • Keywords has-patch has-unit-tests added; needs-patch removed

@dmsnell commented on PR #5925:


8 months ago
#2

The only situation where we could break out of a SCRIPT tag is when the JS contents include the string </script and then some content and then >. Would it make more sense simply to rip out the fake CDATA?

We would only need it I think in XML circumstances, meaning inside RSS feeds, which don't get SCRIPT elements.

Since these pages are truly delivering HTML5 even without theme support, this seems a bit of a practical choice. We could reject JavaScript code containing </script, or more robustly check for content that would exit the SCRIPT element and reject it. Reject, because we can't know how to split it, if it's in a JavaScript string, comment, etc…

@westonruter commented on PR #5925:


8 months ago
#3

@dmsnell AFAIK the only purpose of the CDATA wrappers was to prevent scripts from breaking XML when they use <, for example in an if statement. The CDATA wrapper allows inline scripts to use a literal < instead of &lt; when the browser is in XML parsing mode. I don't think the intention was ever to guard against a script breaking out.

So I think more protection against accidentally (or not) breaking out of a script would be a good idea, but should probably be done in a separate ticket. And since json_encode() escapes slashes by default, I think this is already usually mitigated.

@dmsnell commented on PR #5925:


8 months ago
#4

@westonruter "AFAIK the only purpose of the CDATA wrappers was to prevent scripts from breaking XML when they use <, for example in an if statement." this was my point exactly. I'm suggesting that given how near-impossible it is to trigger the XML parsing mode within WordPress, this code isn't practically necessary and the breakage it introduces more than outweighs a hypothetical breakage we could see without any of it.

In fact, on the pages this serves, the CDATA wrappers _are not_ letting inline scripts use &lt; or <. Scripts may already use < and &lt; will come across as the verbatim four-character string in the script. That is, based on your web querying and what I've found in my own tests, adding CDATA is more harmful than removing it.

Thus my question is _should we rip it all out_ since it's not fulfilling the purpose for which it's there? (Okay technically it's fulfilling the purpose _if_ and only if the page is properly sent as XML, which you confirmed "it is not." In those cases the burden is on the extender to properly escape their own JavaScript code. This affects _a fraction_ of the less-than 0.0001% of sites serving XML that also contain script data that would break without the CDATA. On the other hand, it seems like it's breaking a noticeable number of normative sites where the CDATA's presence or absence has zero impact on how the SCRIPT contents are interpreted.

@westonruter commented on PR #5925:


8 months ago
#5

I think you're right that the CDATA wrappers should be removed. But maybe we should do it in a follow-up ticket so there is more visibility?

@westonruter commented on PR #5925:


8 months ago
#6

Something else we should do is make it clear that these script functions are not exclusively for JavaScript. Currently it explicitly is mentioning "JavaScript" in the function description and in the initial argument.

@flixos90 commented on PR #5925:


8 months ago
#7

+1 to discussing removal of CDATA wrappers separately. This PR is about a bug with the CDATA wrappers rather than removing them.

#8 @westonruter
8 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 57341:

Script Loader: Only emit CDATA wrapper comments in wp_get_inline_script_tag() for JavaScript.

This avoids erroneously adding CDATA wrapper comments for non-JavaScript scripts, including those for JSON such as the importmap for script modules in #56313.

Props westonruter, flixos90, mukesh27, dmsnell.
See #56313.
Fixes #60320.

@westonruter commented on PR #5925:


8 months ago
#9

Committed in r57341 (5139923)

@westonruter commented on PR #5925:


8 months ago
#10

Something else we should do is make it clear that these script functions are not exclusively for JavaScript. Currently it explicitly is mentioning "JavaScript" in the function description and in the initial argument.

See Core-60331.

Note: See TracTickets for help on using tickets.