Make WordPress Core

Opened 3 weeks ago

Closed 2 weeks ago

#64531 closed defect (bug) (fixed)

assertEqualHTML may ignore leading whitespace text

Reported by: jonsurrell's profile jonsurrell Owned by: jonsurrell's profile jonsurrell
Milestone: 6.9.1 Priority: normal
Severity: normal Version: 6.9
Component: Build/Test Tools Keywords: has-patch has-unit-tests dev-feedback fixed-major
Focuses: Cc:

Description (last modified by jonsurrell)

Some leading whitespace is trimmed from text nodes resulting in incorrect assertions like the following (passes, but should not):

<?php
$this->assertEqualHTML(
  "<p> x</p>",
  "<p>\nx</p>"
);

The following tree is constructed by build_visual_html_tree() in both cases:

<p>
  "x"

The expected trees are:

<p>
  " x"
<p>
  "
x"

Change History (23)

#1 @jonsurrell
3 weeks ago

  • Milestone changed from Awaiting Review to 6.9.1

#2 @jonsurrell
3 weeks ago

  • Milestone 6.9.1 deleted

I've been investigating the background. The whitespace trim is not part of the HTML5lib test suite which was the inspiration for the HTML tree representation.

Whitespace trimming was added as part of the block delimiter handling. For example:

<!-- wp:block -->
  <!-- wp:nested-void /-->
  <div></div>
<!-- /wp:block -->

Produces this tree with the whitespace trimmed:

BLOCK["core/block"]
  BLOCK["core/nested-void"]
  <div>

This would be the resulting tree of the same input with whitespace included:

BLOCK["core/block"]
  "
  "
  BLOCK["core/nested-void"]
  "
  "
  <div>
  "
"

It seems clear to me that whitespace trimming _should not occure_ when block delimiters are not involved. There are several cases to consider for whitespace and block delimiters:

  • Between nested block delimiters
  • Leading whitespace inside an opening block delimiter
  • Trailing whitespace inside a closing block delimiter

From an HTML perspective, the whitespace should be present in all of them.

I believe the whitespace is preserved when the content is passed to a block's render_callback.

It's simplest to remove the unexpected trimming everywhere. The only reason to avoid that is to ensure this remains ergonomic for writing block tests. It's worth noting that the handling just needs to match in most cases, so blocks could run into problems for cases where the HTML inside block delimiters is not indented equally. Whether that's relevant or not really depends on the block and is beyond the scope of what this assertion handles.

This ticket was mentioned in PR #10765 on WordPress/wordpress-develop by @jonsurrell.


3 weeks ago
#3

  • Keywords has-patch has-unit-tests added

Ensure that leading text whitespace is included in assertEqualHTML comparisons. See the Trac ticket description:

…(ticket description)

#4 @jonsurrell
3 weeks ago

  • Description modified (diff)

@jonsurrell commented on PR #10765:


3 weeks ago
#5

I've pushed commit `9f672ba` that updates a test to add the whitespace inside block delimiters.

This is a good example for discussion, I'd like to hear other opinions: @ockham @dmsnell

@jonsurrell commented on PR #10765:


3 weeks ago
#6

I was surprised when whitespace wasn't detected recently but didn't think to look further. This explains it and the changes in the scripts tests are important.

@jonsurrell commented on PR #10765:


3 weeks ago
#7

If we consider block delimiters as analogous to HTML tags, then it makes complete sense to preserve their inner whitespace. I'm more and more convinced that it should be preserved.

@jonsurrell commented on PR #10765:


3 weeks ago
#8

Commit `951307e` is informative.

It's a bit surprising because HTML is being compared with rendered block markup. This includes odd whitespace combinations like <div>\n\t\n\t<div> which is caused by whitespace around block-delimiter contents (which are removed) and rendered block contents. Many code editors (mine included) are configured to remove trailing whitespace, which makes it tricky to actually print \t\n inside the block markup (and I used the character reference &#x9;).

  • tests/phpunit/tests/blocks/wpBlock.php

    diff --git a/tests/phpunit/tests/blocks/wpBlock.php b/tests/phpunit/tests/blocks/wpBlock.php
    index b3c11ad084386..85fa6e7151d46 100644
    a b public function data_provider_test_render_enqueues_scripts_and_styles(): array { 
    387387                        'all_printed'                             => array(
    388388                                'set_up'                  => null,
    389389                                'block_markup'            => $block_markup,
    390                                 'expected_rendered_block' => '
    391                                         <div class="static">
    392                                                 <div class="static-child">First child</div>
    393                                                 <p class="dynamic">Hello World!</p>
    394                                                 <div class="static-child">Last child</div>
    395                                         </div>
    396                                 ',
     390                                'expected_rendered_block' => <<<'HTML'
     391
     392<div class="static">
     393&#x9;
     394        <div class="static-child">First child</div>
     395&#x9;
     396        <p class="dynamic">Hello World!</p>
     397&#x9;
     398        <div class="static-child">Last child</div>
     399&#x9;
     400</div>
     401
     402HTML
     403                                ,
    397404                                'expected_styles'         => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ),
    398405                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ),
    399406                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),
    static function ( $content ) { 
    414421                                        );
    415422                                },
    416423                                'block_markup'            => $block_markup,
    417                                 'expected_rendered_block' => '
    418                                         <div class="static">
    419                                                 <div class="static-child">First child</div>
    420                                                 <p class="dynamic filtered">Hello World!</p>
    421                                                 <div class="static-child">Last child</div>
    422                                         </div>
    423                                 ',
     424                                'expected_rendered_block' => <<<'HTML'
     425
     426<div class="static">
     427&#x9;
     428        <div class="static-child">First child</div>
     429&#x9;
     430        <p class="dynamic filtered">Hello World!</p>
     431&#x9;
     432        <div class="static-child">Last child</div>
     433&#x9;
     434</div>
     435
     436HTML
     437                                ,
    424438                                'expected_styles'         => array( 'static-view-style', 'dynamic-extra', 'static-child-view-style', 'dynamic-view-style' ),
    425439                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ),
    426440                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),
    static function ( $content ) { 
    430444                                        add_filter( 'render_block_core/dynamic', '__return_empty_string' );
    431445                                },
    432446                                'block_markup'            => $block_markup,
    433                                 'expected_rendered_block' => '
    434                                         <div class="static">
    435                                                 <div class="static-child">First child</div>
    436                                                 <div class="static-child">Last child</div>
    437                                         </div>
    438                                 ',
     447                                'expected_rendered_block' => <<<'HTML'
     448
     449<div class="static">
     450&#x9;
     451        <div class="static-child">First child</div>
     452&#x9;
     453&#x9;
     454&#x9;
     455        <div class="static-child">Last child</div>
     456&#x9;
     457</div>
     458
     459HTML
     460                                ,
    439461                                'expected_styles'         => array( 'static-view-style', 'static-child-view-style' ),
    440462                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script' ),
    441463                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module' ),
    static function ( $enqueue, $block_name ) { 
    456478                                        );
    457479                                },
    458480                                'block_markup'            => $block_markup,
    459                                 'expected_rendered_block' => '
    460                                         <div class="static">
    461                                                 <div class="static-child">First child</div>
    462                                                 <div class="static-child">Last child</div>
    463                                         </div>
    464                                 ',
     481                                'expected_rendered_block' => <<<'HTML'
     482
     483<div class="static">
     484&#x9;
     485        <div class="static-child">First child</div>
     486&#x9;
     487&#x9;
     488&#x9;
     489        <div class="static-child">Last child</div>
     490&#x9;
     491</div>
     492
     493HTML
     494                                ,
    465495                                'expected_styles'         => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ),
    466496                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ),
    467497                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),
    static function ( $content ) { 
    488518                                        add_filter( 'render_block_core/static-child', '__return_empty_string' );
    489519                                },
    490520                                'block_markup'            => $block_markup,
    491                                 'expected_rendered_block' => '
    492                                         <div class="static">
    493                                                 <p class="dynamic">Hello World!</p>
    494                                         </div>
    495                                 ',
     521                                'expected_rendered_block' => <<<'HTML'
     522
     523<div class="static">
     524&#x9;
     525        <p class="dynamic">Hello World!</p>
     526&#x9;
     527</div>
     528
     529HTML
     530                                ,
    496531                                'expected_styles'         => array( 'static-view-style', 'dynamic-view-style' ),
    497532                                'expected_scripts'        => array( 'static-view-script', 'dynamic-view-script' ),
    498533                                'expected_script_modules' => array( 'static-view-script-module', 'dynamic-view-script-module' ),
    static function ( $content ) { 
    512547                                        );
    513548                                },
    514549                                'block_markup'            => $block_markup,
    515                                 'expected_rendered_block' => '
    516                                         <div class="static">
    517                                                 <div class="static-child">First child</div>
    518                                                 <p class="dynamic">Hello World!</p>
    519                                         </div>
    520                                 ',
     550                                'expected_rendered_block' => <<<'HTML'
     551
     552<div class="static">
     553&#x9;
     554        <div class="static-child">First child</div>
     555&#x9;
     556        <p class="dynamic">Hello World!</p>
     557&#x9;
     558</div>
     559
     560HTML
     561                                ,
    521562                                'expected_styles'         => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ),
    522563                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ),
    523564                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),

A simple alternative in this case is to remove the tabs from the block markup in the test.

@jonsurrell commented on PR #10765:


3 weeks ago
#9

Here's a similar diff of the test fix without the block indentation:

  • tests/phpunit/tests/blocks/wpBlock.php

    diff --git a/tests/phpunit/tests/blocks/wpBlock.php b/tests/phpunit/tests/blocks/wpBlock.php
    index b3c11ad084..ae067e7d79 100644
    a b class Tests_Blocks_wpBlock extends WP_UnitTestCase { 
    371371                $block_markup = <<<'HTML'
    372372
    373373<div class="static">
    374        
    375         <div class="static-child">First child</div>
    376        
    377        
    378        
    379         <div class="static-child">Last child</div>
    380        
     374
     375<div class="static-child">First child</div>
     376
     377
     378
     379<div class="static-child">Last child</div>
     380
    381381</div>
    382382
    383383HTML;
    HTML; 
    387387                        'all_printed'                             => array(
    388388                                'set_up'                  => null,
    389389                                'block_markup'            => $block_markup,
    390                                 'expected_rendered_block' => '
    391                                         <div class="static">
    392                                                 <div class="static-child">First child</div>
    393                                                 <p class="dynamic">Hello World!</p>
    394                                                 <div class="static-child">Last child</div>
    395                                         </div>
    396                                 ',
     390                                'expected_rendered_block' => <<<'HTML'
     391
     392<div class="static">
     393
     394<div class="static-child">First child</div>
     395
     396<p class="dynamic">Hello World!</p>
     397
     398<div class="static-child">Last child</div>
     399
     400</div>
     401
     402HTML
     403                                ,
    397404                                'expected_styles'         => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ),
    398405                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ),
    399406                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),
    HTML; 
    414421                                        );
    415422                                },
    416423                                'block_markup'            => $block_markup,
    417                                 'expected_rendered_block' => '
    418                                         <div class="static">
    419                                                 <div class="static-child">First child</div>
    420                                                 <p class="dynamic filtered">Hello World!</p>
    421                                                 <div class="static-child">Last child</div>
    422                                         </div>
    423                                 ',
     424                                'expected_rendered_block' => <<<'HTML'
     425
     426<div class="static">
     427
     428<div class="static-child">First child</div>
     429
     430<p class="dynamic filtered">Hello World!</p>
     431
     432<div class="static-child">Last child</div>
     433
     434</div>
     435
     436HTML
     437                                ,
    424438                                'expected_styles'         => array( 'static-view-style', 'dynamic-extra', 'static-child-view-style', 'dynamic-view-style' ),
    425439                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ),
    426440                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),
    HTML; 
    430444                                        add_filter( 'render_block_core/dynamic', '__return_empty_string' );
    431445                                },
    432446                                'block_markup'            => $block_markup,
    433                                 'expected_rendered_block' => '
    434                                         <div class="static">
    435                                                 <div class="static-child">First child</div>
    436                                                 <div class="static-child">Last child</div>
    437                                         </div>
    438                                 ',
     447                                'expected_rendered_block' => <<<'HTML'
     448
     449<div class="static">
     450
     451<div class="static-child">First child</div>
     452
     453
     454
     455<div class="static-child">Last child</div>
     456
     457</div>
     458
     459HTML
     460                                ,
    439461                                'expected_styles'         => array( 'static-view-style', 'static-child-view-style' ),
    440462                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script' ),
    441463                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module' ),
    HTML; 
    456478                                        );
    457479                                },
    458480                                'block_markup'            => $block_markup,
    459                                 'expected_rendered_block' => '
    460                                         <div class="static">
    461                                                 <div class="static-child">First child</div>
    462                                                 <div class="static-child">Last child</div>
    463                                         </div>
    464                                 ',
     481                                'expected_rendered_block' => <<<'HTML'
     482
     483<div class="static">
     484
     485<div class="static-child">First child</div>
     486
     487
     488
     489<div class="static-child">Last child</div>
     490
     491</div>
     492
     493HTML
     494                                ,
    465495                                'expected_styles'         => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ),
    466496                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ),
    467497                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),
    HTML; 
    488518                                        add_filter( 'render_block_core/static-child', '__return_empty_string' );
    489519                                },
    490520                                'block_markup'            => $block_markup,
    491                                 'expected_rendered_block' => '
    492                                         <div class="static">
    493                                                 <p class="dynamic">Hello World!</p>
    494                                         </div>
    495                                 ',
     521                                'expected_rendered_block' => <<<'HTML'
     522
     523<div class="static">
     524
     525<p class="dynamic">Hello World!</p>
     526
     527</div>
     528
     529HTML
     530                                ,
    496531                                'expected_styles'         => array( 'static-view-style', 'dynamic-view-style' ),
    497532                                'expected_scripts'        => array( 'static-view-script', 'dynamic-view-script' ),
    498533                                'expected_script_modules' => array( 'static-view-script-module', 'dynamic-view-script-module' ),
    HTML; 
    512547                                        );
    513548                                },
    514549                                'block_markup'            => $block_markup,
    515                                 'expected_rendered_block' => '
    516                                         <div class="static">
    517                                                 <div class="static-child">First child</div>
    518                                                 <p class="dynamic">Hello World!</p>
    519                                         </div>
    520                                 ',
     550                                'expected_rendered_block' => <<<'HTML'
     551
     552<div class="static">
     553
     554<div class="static-child">First child</div>
     555
     556<p class="dynamic">Hello World!</p>
     557
     558</div>
     559
     560HTML
     561                                ,
    521562                                'expected_styles'         => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ),
    522563                                'expected_scripts'        => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ),
    523564                                'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),

#10 follow-up: @dmsnell
3 weeks ago

The expected trees are:

@jonsurrell I think there might be a mistake in this declaration, as I would consider the two inputs identical.

<?php
$this->assertEqualHTML(
  "<p> x</p>",
  "<p>\nx</p>"
);

Should produce the following tree in both cases.

<p>
  " x"

This is because of whitespace normalization. If we wanted the newline we’d need a character reference instead of the actual newline character.

#11 in reply to: ↑ 10 @jonsurrell
3 weeks ago

Replying to dmsnell:

Should produce the following tree in both cases.

<p>
  " x"

Both produce this tree, they're both wrong in a way that makes them agree:

<p>
  "x"

I disagree that they're identical … This is because of whitespace normalization. If we wanted the newline we’d need a character reference instead of the actual newline character.

I disagree that they're identical. Different whitespace is not treated identically nor is whitespace removed anywhere else in text node. The following two tags are not and have never been considered matches by assertEqualHTML():

<p>
x
</p>
<p> x </p>

The text in the tree representation is x\n in the first example and x in the second. The leading whitespace is trimmed (which I believe is a bug).


Edit: I've corrected this statement.

Unfortunately, the history of how this came to be is not open source. The actual intent of the whitespace trimming is unclear, but it does not normalize. It simply removes some whitespace. The change that introduced the white space trimming was related the block delimiters appearing the the visual tree. They typically include a lot of whitespace that doesn't seem relevant.

Last edited 3 weeks ago by jonsurrell (previous) (diff)

#12 @dmsnell
3 weeks ago

Okay well I had a slip of mind. When in doubt: trust the HTML API. I was just previously looking at some HTML-to-Markdown code and had rendering concerns in my head rather than parsing concerns.

#13 @jonsurrell
3 weeks ago

  • Milestone set to 6.9.1

This ticket was mentioned in Slack in #core by jorbin. View the logs.


3 weeks ago

@jonsurrell commented on PR #10765:


3 weeks ago
#16

it would be much smaller perhaps if those didn’t move. From the HTML perspective it shouldn’t matter, as those all have at least one whitespace character (the newline).

I don't think there's a solution that produces smaller output. it's a question of picking between the version without indentation (what I've proposed) and a version where the indentation appears in the result (because it _is_ present in the rendered block's HTML).

I chose this version because the sequence of mixed newlines and tabs led to lines where the only text was a tab. Lines like that are often flagged by linters or removed by editors and are generally surprising. My solution to that (in a previous commit in this PR) was to include a tab character reference in the expected HTML (see `951307e`):

<div class="static">
&#x9;
        <div class="static-child">First child</div>
&#x9;
        <p class="dynamic">Hello World!</p>
&#x9;
        <div class="static-child">Last child</div>
&#x9;
</div>

I preferred removing the indentation from the source rather than attempting to maintain it in the expected output.

#17 @jonsurrell
3 weeks ago

  • Owner set to jonsurrell
  • Resolution set to fixed
  • Status changed from new to closed

In 61519:

Build/Test Tools: Ensure assertEqualHTML() recognizes whitespace text.

Ensure whitespace text nodes are correctly represented by build_visual_html_tree().

The build_visual_html_tree() function used by assertEqualHTML() would remove some leading whitespace from text nodes. Some whitespace-only text nodes were omitted from the tree.

Developed in https://github.com/WordPress/wordpress-develop/pull/10765.

Follow-up to [60295].

Props jonsurrell, dmsnell, bernhard-reiter.
Fixes #64531.

#18 @jonsurrell
3 weeks ago

  • Keywords dev-feedback added

Reopening to consider [61519] for 6.9 backport.

#19 @jonsurrell
3 weeks ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

This ticket was mentioned in PR #10787 on WordPress/wordpress-develop by @jonsurrell.


2 weeks ago
#20

# Backport testing PR for r61519

Ensure whitespace text nodes are correctly represented by build_visual_html_tree().

The build_visual_html_tree() function used by assertEqualHTML() would remove some leading whitespace from text nodes. Some whitespace-only text nodes were omitted from the tree.

Developed in https://github.com/WordPress/wordpress-develop/pull/10765.

Follow-up to [60295].

Props jonsurrell, dmsnell, bernhard-reiter.
Fixes #64531.

Backport of [61519].

Trac ticket: https://core.trac.wordpress.org/ticket/64531

#21 @jonsurrell
2 weeks ago

  • Keywords fixed-major added

The backport had a lot of conflicts thanks to the changes with HTML5 support in script tags. I'm happy to have PR10787 prepared to run the 6.9 tests. It also makes a 6.9 diff easy to access.

Last edited 2 weeks ago by jonsurrell (previous) (diff)

@wildworks commented on PR #10787:


2 weeks ago
#22

The failing unit test is not related to this PR. See https://wordpress.slack.com/archives/C08MVR1MU2K/p1769495036842279

This ticket was mentioned in Slack in #core by wildworks. View the logs.


2 weeks ago

#24 @jonsurrell
2 weeks ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 61536:

Build/Test Tools: Ensure assertEqualHTML() recognizes whitespace text.

Ensure whitespace text nodes are correctly represented by build_visual_html_tree().

The build_visual_html_tree() function used by assertEqualHTML() would remove some leading whitespace from text nodes. Some whitespace-only text nodes were omitted from the tree.

Developed in https://github.com/WordPress/wordpress-develop/pull/10765.

Follow-up to [60295].

Reviewed by wildworks.
Merges [61519] to the 6.9 branch.

Props jonsurrell, dmsnell, bernhard-reiter.
Fixes #64531.

Note: See TracTickets for help on using tickets.