Make WordPress Core

Opened 6 years ago

Last modified 18 months ago

#47420 new defect (bug)

Block markup containing HTML in block attributes is corrupted when using wp_insert_post

Reported by: modernnerd
Owned by:
Milestone: Awaiting Review
Priority: normal
Severity: normal
Version:
Component: General
Keywords: 2nd-opinion
Focuses:
Cc:

Description

Issue
Some blocks allow HTML in their block attributes, which display correctly in the editor and the front end. An example is the Pricing block in Atomic Blocks: https://wordpress.org/plugins/atomic-blocks/

If block content copied from the block editor as raw code contains HTML in its block attributes, inserting that content with wp_insert_post() results in corrupt blocks.

This appears to be due to the way block attributes like this:

{"price":"<strong>49</strong>","currency":"$","term":"/mo"}

are encoded like this when saved:

{"price":"\u003cstrong\u003e49\u003c/strong\u003e","currency":"$","term":"/mo"}
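Both forms are valid JSON for the same value, which suggests the encoding on its own is not what corrupts the block. A minimal sketch in plain PHP (no WordPress functions; the variable names are illustrative) that compares the two:

```php
<?php
// Plain-PHP sketch: the escaped and unescaped attribute strings decode to the same data.
// Single-quoted strings keep the backslashes literal, as they appear in saved post content.
$unescaped = '{"price":"<strong>49</strong>","currency":"$","term":"/mo"}';
$escaped   = '{"price":"\u003cstrong\u003e49\u003c/strong\u003e","currency":"$","term":"/mo"}';

$from_unescaped = json_decode( $unescaped, true );
$from_escaped   = json_decode( $escaped, true );

// Compare the decoded arrays key by key and value by value.
var_dump( $from_unescaped === $from_escaped );
```

If the two decode identically, the difference in behavior has to come from what happens to the backslashes between the call to wp_insert_post() and the database.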

To reproduce

  1. Install Atomic Blocks and activate Twenty Nineteen.
  2. Add this code to your theme's functions.php, refresh any page to trigger the code, then remove the code:
<?php
$post_content = <<<CONTENT
<!-- wp:atomic-blocks/ab-pricing -->
<div class="wp-block-atomic-blocks-ab-pricing ab-pricing-columns-2"><div class="ab-pricing-table-wrap ab-block-pricing-table-gap-2"><!-- wp:atomic-blocks/ab-pricing-table -->
<div class="wp-block-atomic-blocks-ab-pricing-table ab-block-pricing-table-center ab-block-pricing-table" itemscope itemtype="http://schema.org/Product"><div class="ab-block-pricing-table-inside" style="border-width:2px;border-style:solid"><!-- wp:atomic-blocks/ab-pricing-table-title {"title":"\u003cstrong\u003ePrice Title\u003c/strong\u003e","fontSize":"medium","paddingTop":30} -->
<div itemprop="name" style="padding-top:30px;padding-right:20px;padding-bottom:10px;padding-left:20px" class="wp-block-atomic-blocks-ab-pricing-table-title ab-pricing-table-title has-medium-font-size"><strong>Price Title</strong></div>
<!-- /wp:atomic-blocks/ab-pricing-table-title -->

<!-- wp:atomic-blocks/ab-pricing-table-subtitle {"subtitle":"Price Subtitle Description","customFontSize":20} -->
<div class="wp-block-atomic-blocks-ab-pricing-table-subtitle ab-pricing-table-subtitle" style="font-size:20px;padding-top:10px;padding-right:20px;padding-bottom:10px;padding-left:20px">Price Subtitle Description</div>
<!-- /wp:atomic-blocks/ab-pricing-table-subtitle -->

<!-- wp:atomic-blocks/ab-pricing-table-price {"price":"\u003cstrong\u003e49\u003c/strong\u003e","currency":"$","term":"/mo"} -->
<div class="wp-block-atomic-blocks-ab-pricing-table-price ab-pricing-table-price-wrap ab-pricing-has-currency" style="padding-top:10px;padding-right:20px;padding-bottom:10px;padding-left:20px"><div itemprop="offers" itemscope itemtype="http://schema.org/Offer"><span itemprop="priceCurrency" class="ab-pricing-table-currency" style="font-size:24px">$</span><div itemprop="price" class="ab-pricing-table-price" style="font-size:60px"><strong>49</strong></div><span class="ab-pricing-table-term" style="font-size:24px">/mo</span></div></div>
<!-- /wp:atomic-blocks/ab-pricing-table-price -->

<!-- wp:atomic-blocks/ab-pricing-table-features {"customFontSize":20,"paddingTop":15,"paddingBottom":15} -->
<ul itemprop="description" class="wp-block-atomic-blocks-ab-pricing-table-features ab-pricing-table-features ab-list-border-none ab-list-border-width-1" style="font-size:20px;padding-top:15px;padding-right:20px;padding-bottom:15px;padding-left:20px"><li>Product Feature One</li><li>Product Feature Two</li><li>Product Feature Three</li></ul>
<!-- /wp:atomic-blocks/ab-pricing-table-features -->

<!-- wp:atomic-blocks/ab-pricing-table-button {"buttonText":"Buy Now","buttonBackgroundColor":"#272c30","paddingTop":15,"paddingBottom":35} -->
<div class="wp-block-atomic-blocks-ab-pricing-table-button ab-pricing-table-button" style="padding-top:15px;padding-right:20px;padding-bottom:35px;padding-left:20px"><div class="ab-block-button"><a class="ab-button ab-button-shape-rounded ab-button-size-medium" style="color:#ffffff;background-color:#272c30">Buy Now</a></div></div>
<!-- /wp:atomic-blocks/ab-pricing-table-button --></div></div>
<!-- /wp:atomic-blocks/ab-pricing-table -->

<!-- wp:atomic-blocks/ab-pricing-table -->
<div class="wp-block-atomic-blocks-ab-pricing-table ab-block-pricing-table-center ab-block-pricing-table" itemscope itemtype="http://schema.org/Product"><div class="ab-block-pricing-table-inside" style="border-width:2px;border-style:solid"><!-- wp:atomic-blocks/ab-pricing-table-title {"title":"\u003cstrong\u003ePrice Title\u003c/strong\u003e","fontSize":"medium","paddingTop":30} -->
<div itemprop="name" style="padding-top:30px;padding-right:20px;padding-bottom:10px;padding-left:20px" class="wp-block-atomic-blocks-ab-pricing-table-title ab-pricing-table-title has-medium-font-size"><strong>Price Title</strong></div>
<!-- /wp:atomic-blocks/ab-pricing-table-title -->

<!-- wp:atomic-blocks/ab-pricing-table-subtitle {"subtitle":"Price Subtitle Description","customFontSize":20} -->
<div class="wp-block-atomic-blocks-ab-pricing-table-subtitle ab-pricing-table-subtitle" style="font-size:20px;padding-top:10px;padding-right:20px;padding-bottom:10px;padding-left:20px">Price Subtitle Description</div>
<!-- /wp:atomic-blocks/ab-pricing-table-subtitle -->

<!-- wp:atomic-blocks/ab-pricing-table-price {"price":"\u003cstrong\u003e49\u003c/strong\u003e","currency":"$","term":"/mo"} -->
<div class="wp-block-atomic-blocks-ab-pricing-table-price ab-pricing-table-price-wrap ab-pricing-has-currency" style="padding-top:10px;padding-right:20px;padding-bottom:10px;padding-left:20px"><div itemprop="offers" itemscope itemtype="http://schema.org/Offer"><span itemprop="priceCurrency" class="ab-pricing-table-currency" style="font-size:24px">$</span><div itemprop="price" class="ab-pricing-table-price" style="font-size:60px"><strong>49</strong></div><span class="ab-pricing-table-term" style="font-size:24px">/mo</span></div></div>
<!-- /wp:atomic-blocks/ab-pricing-table-price -->

<!-- wp:atomic-blocks/ab-pricing-table-features {"customFontSize":20,"paddingTop":15,"paddingBottom":15} -->
<ul itemprop="description" class="wp-block-atomic-blocks-ab-pricing-table-features ab-pricing-table-features ab-list-border-none ab-list-border-width-1" style="font-size:20px;padding-top:15px;padding-right:20px;padding-bottom:15px;padding-left:20px"><li>Product Feature One</li><li>Product Feature Two</li><li>Product Feature Three</li></ul>
<!-- /wp:atomic-blocks/ab-pricing-table-features -->

<!-- wp:atomic-blocks/ab-pricing-table-button {"buttonText":"Buy Now","buttonBackgroundColor":"#272c30","paddingTop":15,"paddingBottom":35} -->
<div class="wp-block-atomic-blocks-ab-pricing-table-button ab-pricing-table-button" style="padding-top:15px;padding-right:20px;padding-bottom:35px;padding-left:20px"><div class="ab-block-button"><a class="ab-button ab-button-shape-rounded ab-button-size-medium" style="color:#ffffff;background-color:#272c30">Buy Now</a></div></div>
<!-- /wp:atomic-blocks/ab-pricing-table-button --></div></div>
<!-- /wp:atomic-blocks/ab-pricing-table --></div></div>
<!-- /wp:atomic-blocks/ab-pricing -->
CONTENT;

wp_insert_post([
        'post_title' => 'Test block import with HTML in attributes',
        'post_content' => $post_content
]);
  3. View the newly imported post. You'll see “this block contains unexpected or invalid content” where you expect to see blocks.

If you repeat the above steps but use unencoded attributes as follows (find and replace '\u003c' with '<' and '\u003e' with '>'), the blocks import as expected:

{"price":"<strong>49</strong>","currency":"$","term":"/mo"}
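One likely explanation, consistent with the later comments on this ticket: wp_insert_post() expects slashed input and unslashes it before saving, so the backslash in each \u003c sequence is removed. The sketch below uses PHP's stripslashes() as a stand-in for that step. That substitution is an assumption made for illustration, not a trace of core's code path.

```php
<?php
// Stand-in for the unslashing applied to post data on insert
// (assumption: stripslashes() approximates that step for a plain string).
$attributes = '{"price":"\u003cstrong\u003e49\u003c/strong\u003e","currency":"$","term":"/mo"}';

$after_unslash = stripslashes( $attributes );

// The backslashes are gone, so "u003c" is now ordinary text rather than a JSON escape.
echo $after_unslash, PHP_EOL;

// The unescaped form has no backslashes to lose, so it passes through unchanged.
$unescaped = '{"price":"<strong>49</strong>","currency":"$","term":"/mo"}';
var_dump( stripslashes( $unescaped ) === $unescaped );
```

After unslashing, the attribute value no longer matches the `<strong>49</strong>` markup saved in the block's HTML, which would explain the "unexpected or invalid content" notice.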

Environment
WordPress 5.2.1, Twenty Nineteen, no plugins active except for Atomic Blocks.
macOS/Chrome.

Further info
This isn't limited to Atomic Blocks, as other blocks use HTML in block attributes. This issue was originally reported against the Gutenberg repo by another user who encountered the same problem, but it was suggested the issue belongs in Trac. I couldn't find a corresponding ticket here.

https://github.com/WordPress/gutenberg/issues/14068

Change History (5)

#1 @jeremyfelt
6 years ago

  • Keywords 2nd-opinion added

Hi @modernnerd, thanks for opening a ticket.

As you've noticed, the HTML stored in a block attribute needs to be encoded properly. WordPress doesn't necessarily have a core function for doing this directly, but a combination of things will help:

<?php
// String from non-Gutenberg data source.
$string = 'Hey, a <a href="https://jeremyfelt.com/">link to my website</a>.';

$attribute = wp_json_encode( $string, JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP | JSON_UNESCAPED_UNICODE );

// And $attribute is now:
// "Hey, a \u003Ca href=\u0022https:\/\/jeremyfelt.com\/\u0022\u003Elink to my website\u003C\/a\u003E."

// If storing with wp_insert_post(), etc...
$attribute = addslashes( $attribute );

$block = '<!-- wp:jf/custom { "data":' . $attribute . ' } /-->';

I'd support a helper function for this like wp_block_attribute_encode(), though I could also see it being solved with documentation that explains the extra flags on wp_json_encode().
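A rough sketch of what such a helper might look like. The function name example_block_attribute_encode() is hypothetical and not part of WordPress core, and json_encode() stands in for wp_json_encode() so the snippet runs without WordPress loaded:

```php
<?php
/**
 * Hypothetical helper (not a WordPress core function): encode one attribute
 * value for use inside a block comment delimiter.
 */
function example_block_attribute_encode( $value ) {
	// Same flags as the snippet above; json_encode() stands in for wp_json_encode().
	$flags = JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP | JSON_UNESCAPED_UNICODE;
	return json_encode( $value, $flags );
}

$attribute = example_block_attribute_encode( '<strong>49</strong>' );

// The encoded string contains no literal "<" or ">" and decodes back to the original HTML.
echo $attribute, PHP_EOL;
var_dump( json_decode( $attribute ) === '<strong>49</strong>' );
```

The result would still need slashing before it is passed to wp_insert_post(), as in the snippet above.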

#2 follow-up: @modernnerd
6 years ago

Thanks for the fast reply and for your advice, @jeremyfelt.

It seems that HTML in block attributes is already encoded by WordPress. At least, I see content like this stored in the database already as part of the post content:

{"title":"\u003cstrong\u003ePackage #1\u003c/strong\u003e","fontSize":"larger","paddingTop":30}

Are you suggesting that these attributes need to be handled differently by the block developer, or that anyone wanting to import such content via wp_insert_post() needs to somehow first decode block attributes that could contain HTML?

My goal is to be able to copy code from the block editor and insert it via wp_insert_post() during the theme activation process in order to reduce theme setup steps.

The way HTML in block attributes is currently stored as Unicode escape sequences by WordPress appears to prevent that workflow. I understand that escape sequences are part of the JSON spec and exist to prevent some browsers interpreting code as HTML, so the content probably needs to be stored the way it currently is. The issue seems to be with decoding that stored content for import, rather than with encoding it.

#3 in reply to: ↑ 2 @jeremyfelt
6 years ago

Replying to modernnerd:

> It seems that HTML in block attributes is already encoded by WordPress. At least, I see content like this stored in the database already as part of the post content:
>
> {"title":"\u003cstrong\u003ePackage #1\u003c/strong\u003e","fontSize":"larger","paddingTop":30}

Exactly. Client-side JavaScript in Gutenberg does some additional work beyond JSON.stringify() to encode the data before it is sent to the server for storage.

> Are you suggesting that these attributes need to be handled differently by the block developer, or that anyone wanting to import such content via wp_insert_post() needs to somehow first decode block attributes that could contain HTML?

Anything processed in PHP should effectively mimic what is done in JavaScript. So, if the block developer is encoding on the client side, they can use Gutenberg's serializeAttributes() (or something like it). But if they're encoding on the server side, they'll likely need to pass additional flags to wp_json_encode() so that those additional characters are encoded properly.

(Edit for clarification) These flags: JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP | JSON_UNESCAPED_UNICODE are what instruct wp_json_encode() to handle those additional characters properly.

Last edited 6 years ago by jeremyfelt

#4 @modernnerd
6 years ago

Thanks, @jeremyfelt.

Just to clarify, the encoded snippet I posted in my previous reply was from the plugin in question.

Isn't that already encoded correctly since it's been stored that way (presumably due to passing through serializeAttributes() in WP)? How are you suggesting it be encoded instead to solve this issue?

#5 @Jules Colle
18 months ago

I'm doing practically the same thing as @modernnerd. I'm copying code from the block editor to an HTML file, which I use later to dynamically insert pages in my testing/development environment.

At some point, I'm creating the page from the HTML file like this:

$page_content = file_get_contents('/path/to/test-page.html');
$page_id = wp_insert_post([
  //...
  'post_content' => $page_content,
]);

With this code I ran into the problem where the actual content on the page had u003c/strongu003e all over the place.

Simply adding addslashes($page_content) seems to fix all these problems for me.

Working code:

$page_content = file_get_contents('/path/to/test-page.html');
$page_content = addslashes($page_content);
$page_id = wp_insert_post([
  //...
  'post_content' => $page_content,
]);
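WordPress also provides wp_slash() for preparing data that will later be unslashed, and it may be the more idiomatic choice here. The sketch below is an illustration under assumptions: the sample content and the wp:example/price block name are made up, stripslashes() stands in for the unslashing done on insert, and the addslashes() fallback exists only so the snippet runs without WordPress loaded.

```php
<?php
// Sample content standing in for the file read above (illustrative block name and markup).
$page_content = '<!-- wp:example/price {"price":"\u003cstrong\u003e49\u003c/strong\u003e"} /-->';

// Prefer wp_slash() when WordPress is loaded; fall back to addslashes() so this runs standalone.
$slashed = function_exists( 'wp_slash' ) ? wp_slash( $page_content ) : addslashes( $page_content );

// Unslashing the slashed string (stripslashes() as a stand-in for what happens on insert)
// restores the original content, escape sequences included.
$restored = stripslashes( $slashed );
var_dump( $restored === $page_content );
```

Slashing first means the unslash step removes only the added backslashes, leaving the original \u003c sequences in place.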