Opened 3 weeks ago
Closed 2 weeks ago
#64531 closed defect (bug) (fixed)
assertEqualHTML may ignore leading whitespace text
| Reported by: |
|
Owned by: |
|
|---|---|---|---|
| Milestone: | 6.9.1 | Priority: | normal |
| Severity: | normal | Version: | 6.9 |
| Component: | Build/Test Tools | Keywords: | has-patch has-unit-tests dev-feedback fixed-major |
| Focuses: | Cc: |
Description (last modified by )
Some leading whitespace is trimmed from text nodes resulting in incorrect assertions like the following (passes, but should not):
<?php $this->assertEqualHTML( "<p> x</p>", "<p>\nx</p>" );
The following tree is constructed by build_visual_html_tree() in both cases:
<p> "x"
The expected trees are:
<p> " x"
<p> " x"
Change History (23)
This ticket was mentioned in PR #10765 on WordPress/wordpress-develop by @jonsurrell.
3 weeks ago
#3
- Keywords has-patch has-unit-tests added
Ensure that leading text whitespace is included in assertEqualHTML comparisons. See the Trac ticket description:
…(ticket description)
@jonsurrell commented on PR #10765:
3 weeks ago
#5
I've pushed commit `9f672ba` that updates a test to add the whitespace inside block delimiters.
This is a good example for discussion, I'd like to hear other opinions: @ockham @dmsnell
@jonsurrell commented on PR #10765:
3 weeks ago
#6
I was surprised when whitespace wasn't detected recently but didn't think to look further. This explains it and the changes in the scripts tests are important.
@jonsurrell commented on PR #10765:
3 weeks ago
#7
If we consider block delimiters as analogous to HTML tags, then it makes complete sense to preserve their inner whitespace. I'm more and more convinced that it should be preserved.
@jonsurrell commented on PR #10765:
3 weeks ago
#8
Commit `951307e` is informative.
It's a bit surprising because HTML is being compared with rendered block markup. This includes odd whitespace combinations like <div>\n\t\n\t<div> which is caused by whitespace around block-delimiter contents (which are removed) and rendered block contents. Many code editors (mine included) are configured to remove trailing whitespace, which makes it tricky to actually print \t\n inside the block markup (and I used the character reference 	).
-
tests/phpunit/tests/blocks/wpBlock.php
diff --git a/tests/phpunit/tests/blocks/wpBlock.php b/tests/phpunit/tests/blocks/wpBlock.php index b3c11ad084386..85fa6e7151d46 100644
a b public function data_provider_test_render_enqueues_scripts_and_styles(): array { 387 387 'all_printed' => array( 388 388 'set_up' => null, 389 389 'block_markup' => $block_markup, 390 'expected_rendered_block' => ' 391 <div class="static"> 392 <div class="static-child">First child</div> 393 <p class="dynamic">Hello World!</p> 394 <div class="static-child">Last child</div> 395 </div> 396 ', 390 'expected_rendered_block' => <<<'HTML' 391 392 <div class="static"> 393 	 394 <div class="static-child">First child</div> 395 	 396 <p class="dynamic">Hello World!</p> 397 	 398 <div class="static-child">Last child</div> 399 	 400 </div> 401 402 HTML 403 , 397 404 'expected_styles' => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ), 398 405 'expected_scripts' => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ), 399 406 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ), … … static function ( $content ) { 414 421 ); 415 422 }, 416 423 'block_markup' => $block_markup, 417 'expected_rendered_block' => ' 418 <div class="static"> 419 <div class="static-child">First child</div> 420 <p class="dynamic filtered">Hello World!</p> 421 <div class="static-child">Last child</div> 422 </div> 423 ', 424 'expected_rendered_block' => <<<'HTML' 425 426 <div class="static"> 427 	 428 <div class="static-child">First child</div> 429 	 430 <p class="dynamic filtered">Hello World!</p> 431 	 432 <div class="static-child">Last child</div> 433 	 434 </div> 435 436 HTML 437 , 424 438 'expected_styles' => array( 'static-view-style', 'dynamic-extra', 'static-child-view-style', 'dynamic-view-style' ), 425 439 'expected_scripts' => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ), 426 440 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ), … … static function ( $content ) { 430 444 add_filter( 'render_block_core/dynamic', '__return_empty_string' ); 431 445 }, 432 446 'block_markup' => $block_markup, 433 'expected_rendered_block' => ' 434 <div class="static"> 435 <div class="static-child">First child</div> 436 <div class="static-child">Last child</div> 437 </div> 438 ', 447 'expected_rendered_block' => <<<'HTML' 448 449 <div class="static"> 450 	 451 <div class="static-child">First child</div> 452 	 453 	 454 	 455 <div class="static-child">Last child</div> 456 	 457 </div> 458 459 HTML 460 , 439 461 'expected_styles' => array( 'static-view-style', 'static-child-view-style' ), 440 462 'expected_scripts' => array( 'static-view-script', 'static-child-view-script' ), 441 463 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module' ), … … static function ( $enqueue, $block_name ) { 456 478 ); 457 479 }, 458 480 'block_markup' => $block_markup, 459 'expected_rendered_block' => ' 460 <div class="static"> 461 <div class="static-child">First child</div> 462 <div class="static-child">Last child</div> 463 </div> 464 ', 481 'expected_rendered_block' => <<<'HTML' 482 483 <div class="static"> 484 	 485 <div class="static-child">First child</div> 486 	 487 	 488 	 489 <div class="static-child">Last child</div> 490 	 491 </div> 492 493 HTML 494 , 465 495 'expected_styles' => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ), 466 496 'expected_scripts' => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ), 467 497 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ), … … static function ( $content ) { 488 518 add_filter( 'render_block_core/static-child', '__return_empty_string' ); 489 519 }, 490 520 'block_markup' => $block_markup, 491 'expected_rendered_block' => ' 492 <div class="static"> 493 <p class="dynamic">Hello World!</p> 494 </div> 495 ', 521 'expected_rendered_block' => <<<'HTML' 522 523 <div class="static"> 524 	 525 <p class="dynamic">Hello World!</p> 526 	 527 </div> 528 529 HTML 530 , 496 531 'expected_styles' => array( 'static-view-style', 'dynamic-view-style' ), 497 532 'expected_scripts' => array( 'static-view-script', 'dynamic-view-script' ), 498 533 'expected_script_modules' => array( 'static-view-script-module', 'dynamic-view-script-module' ), … … static function ( $content ) { 512 547 ); 513 548 }, 514 549 'block_markup' => $block_markup, 515 'expected_rendered_block' => ' 516 <div class="static"> 517 <div class="static-child">First child</div> 518 <p class="dynamic">Hello World!</p> 519 </div> 520 ', 550 'expected_rendered_block' => <<<'HTML' 551 552 <div class="static"> 553 	 554 <div class="static-child">First child</div> 555 	 556 <p class="dynamic">Hello World!</p> 557 	 558 </div> 559 560 HTML 561 , 521 562 'expected_styles' => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ), 522 563 'expected_scripts' => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ), 523 564 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),
A simple alternative in this case is to remove the tabs from the block markup in the test.
@jonsurrell commented on PR #10765:
3 weeks ago
#9
Here's a similar diff of the test fix without the block indentation:
-
tests/phpunit/tests/blocks/wpBlock.php
diff --git a/tests/phpunit/tests/blocks/wpBlock.php b/tests/phpunit/tests/blocks/wpBlock.php index b3c11ad084..ae067e7d79 100644
a b class Tests_Blocks_wpBlock extends WP_UnitTestCase { 371 371 $block_markup = <<<'HTML' 372 372 373 373 <div class="static"> 374 375 <div class="static-child">First child</div>376 377 378 379 <div class="static-child">Last child</div>380 374 375 <div class="static-child">First child</div> 376 377 378 379 <div class="static-child">Last child</div> 380 381 381 </div> 382 382 383 383 HTML; … … HTML; 387 387 'all_printed' => array( 388 388 'set_up' => null, 389 389 'block_markup' => $block_markup, 390 'expected_rendered_block' => ' 391 <div class="static"> 392 <div class="static-child">First child</div> 393 <p class="dynamic">Hello World!</p> 394 <div class="static-child">Last child</div> 395 </div> 396 ', 390 'expected_rendered_block' => <<<'HTML' 391 392 <div class="static"> 393 394 <div class="static-child">First child</div> 395 396 <p class="dynamic">Hello World!</p> 397 398 <div class="static-child">Last child</div> 399 400 </div> 401 402 HTML 403 , 397 404 'expected_styles' => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ), 398 405 'expected_scripts' => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ), 399 406 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ), … … HTML; 414 421 ); 415 422 }, 416 423 'block_markup' => $block_markup, 417 'expected_rendered_block' => ' 418 <div class="static"> 419 <div class="static-child">First child</div> 420 <p class="dynamic filtered">Hello World!</p> 421 <div class="static-child">Last child</div> 422 </div> 423 ', 424 'expected_rendered_block' => <<<'HTML' 425 426 <div class="static"> 427 428 <div class="static-child">First child</div> 429 430 <p class="dynamic filtered">Hello World!</p> 431 432 <div class="static-child">Last child</div> 433 434 </div> 435 436 HTML 437 , 424 438 'expected_styles' => array( 'static-view-style', 'dynamic-extra', 'static-child-view-style', 'dynamic-view-style' ), 425 439 'expected_scripts' => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ), 426 440 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ), … … HTML; 430 444 add_filter( 'render_block_core/dynamic', '__return_empty_string' ); 431 445 }, 432 446 'block_markup' => $block_markup, 433 'expected_rendered_block' => ' 434 <div class="static"> 435 <div class="static-child">First child</div> 436 <div class="static-child">Last child</div> 437 </div> 438 ', 447 'expected_rendered_block' => <<<'HTML' 448 449 <div class="static"> 450 451 <div class="static-child">First child</div> 452 453 454 455 <div class="static-child">Last child</div> 456 457 </div> 458 459 HTML 460 , 439 461 'expected_styles' => array( 'static-view-style', 'static-child-view-style' ), 440 462 'expected_scripts' => array( 'static-view-script', 'static-child-view-script' ), 441 463 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module' ), … … HTML; 456 478 ); 457 479 }, 458 480 'block_markup' => $block_markup, 459 'expected_rendered_block' => ' 460 <div class="static"> 461 <div class="static-child">First child</div> 462 <div class="static-child">Last child</div> 463 </div> 464 ', 481 'expected_rendered_block' => <<<'HTML' 482 483 <div class="static"> 484 485 <div class="static-child">First child</div> 486 487 488 489 <div class="static-child">Last child</div> 490 491 </div> 492 493 HTML 494 , 465 495 'expected_styles' => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ), 466 496 'expected_scripts' => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ), 467 497 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ), … … HTML; 488 518 add_filter( 'render_block_core/static-child', '__return_empty_string' ); 489 519 }, 490 520 'block_markup' => $block_markup, 491 'expected_rendered_block' => ' 492 <div class="static"> 493 <p class="dynamic">Hello World!</p> 494 </div> 495 ', 521 'expected_rendered_block' => <<<'HTML' 522 523 <div class="static"> 524 525 <p class="dynamic">Hello World!</p> 526 527 </div> 528 529 HTML 530 , 496 531 'expected_styles' => array( 'static-view-style', 'dynamic-view-style' ), 497 532 'expected_scripts' => array( 'static-view-script', 'dynamic-view-script' ), 498 533 'expected_script_modules' => array( 'static-view-script-module', 'dynamic-view-script-module' ), … … HTML; 512 547 ); 513 548 }, 514 549 'block_markup' => $block_markup, 515 'expected_rendered_block' => ' 516 <div class="static"> 517 <div class="static-child">First child</div> 518 <p class="dynamic">Hello World!</p> 519 </div> 520 ', 550 'expected_rendered_block' => <<<'HTML' 551 552 <div class="static"> 553 554 <div class="static-child">First child</div> 555 556 <p class="dynamic">Hello World!</p> 557 558 </div> 559 560 HTML 561 , 521 562 'expected_styles' => array( 'static-view-style', 'static-child-view-style', 'dynamic-view-style' ), 522 563 'expected_scripts' => array( 'static-view-script', 'static-child-view-script', 'dynamic-view-script' ), 523 564 'expected_script_modules' => array( 'static-view-script-module', 'static-child-view-script-module', 'dynamic-view-script-module' ),
#10
follow-up:
↓ 11
@
3 weeks ago
The expected trees are:
@jonsurrell I think there might be a mistake in this declaration, as I would consider the two inputs identical.
<?php $this->assertEqualHTML( "<p> x</p>", "<p>\nx</p>" );
Should produce the following tree in both cases.
<p> " x"
This is because of whitespace normalization. If we wanted the newline we’d need a character reference instead of the actual newline character.
#11
in reply to:
↑ 10
@
3 weeks ago
Replying to dmsnell:
Should produce the following tree in both cases.
<p> " x"
Both produce this tree, they're both wrong in a way that makes them agree:
<p> "x"
I disagree that they're identical … This is because of whitespace normalization. If we wanted the newline we’d need a character reference instead of the actual newline character.
I disagree that they're identical. Different whitespace is not treated identically nor is whitespace removed anywhere else in text node. The following two tags are not and have never been considered matches by assertEqualHTML():
<p> x </p> <p> x </p>
The text in the tree representation is x\n in the first example and x in the second. The leading whitespace is trimmed (which I believe is a bug).
Edit: I've corrected this statement.
Unfortunately, the history of how this came to be is not open source. The actual intent of the whitespace trimming is unclear, but it does not normalize. It simply removes some whitespace. The change that introduced the white space trimming was related the block delimiters appearing the the visual tree. They typically include a lot of whitespace that doesn't seem relevant.
#12
@
3 weeks ago
Okay well I had a slip of mind. When in doubt: trust the HTML API. I was just previously looking at some HTML-to-Markdown code and had rendering concerns in my head rather than parsing concerns.
This ticket was mentioned in Slack in #core by jorbin. View the logs.
3 weeks ago
@jonsurrell commented on PR #10765:
3 weeks ago
#16
it would be much smaller perhaps if those didn’t move. From the HTML perspective it shouldn’t matter, as those all have at least one whitespace character (the newline).
I don't think there's a solution that produces smaller output. it's a question of picking between the version without indentation (what I've proposed) and a version where the indentation appears in the result (because it _is_ present in the rendered block's HTML).
I chose this version because the sequence of mixed newlines and tabs led to lines where the only text was a tab. Lines like that are often flagged by linters or removed by editors and are generally surprising. My solution to that (in a previous commit in this PR) was to include a tab character reference in the expected HTML (see `951307e`):
<div class="static"> 	 <div class="static-child">First child</div> 	 <p class="dynamic">Hello World!</p> 	 <div class="static-child">Last child</div> 	 </div>
I preferred removing the indentation from the source rather than attempting to maintain it in the expected output.
#17
@
3 weeks ago
- Owner set to jonsurrell
- Resolution set to fixed
- Status changed from new to closed
In 61519:
This ticket was mentioned in PR #10787 on WordPress/wordpress-develop by @jonsurrell.
2 weeks ago
#20
# Backport testing PR for r61519
Ensure whitespace text nodes are correctly represented by
build_visual_html_tree().
The
build_visual_html_tree()function used byassertEqualHTML()would remove some leading whitespace from text nodes. Some whitespace-only text nodes were omitted from the tree.
Developed in https://github.com/WordPress/wordpress-develop/pull/10765.
Follow-up to [60295].
Props jonsurrell, dmsnell, bernhard-reiter.
Fixes #64531.
Backport of [61519].
Trac ticket: https://core.trac.wordpress.org/ticket/64531
#21
@
2 weeks ago
- Keywords fixed-major added
The backport had a lot of conflicts thanks to the changes with HTML5 support in script tags. I'm happy to have PR10787 prepared to run the 6.9 tests. It also makes a 6.9 diff easy to access.
@wildworks commented on PR #10787:
2 weeks ago
#22
The failing unit test is not related to this PR. See https://wordpress.slack.com/archives/C08MVR1MU2K/p1769495036842279
I've been investigating the background. The whitespace trim is not part of the HTML5lib test suite which was the inspiration for the HTML tree representation.
Whitespace trimming was added as part of the block delimiter handling. For example:
Produces this tree with the whitespace trimmed:
This would be the resulting tree of the same input with whitespace included:
It seems clear to me that whitespace trimming _should not occure_ when block delimiters are not involved. There are several cases to consider for whitespace and block delimiters:
From an HTML perspective, the whitespace should be present in all of them.
I believe the whitespace is preserved when the content is passed to a block's
render_callback.It's simplest to remove the unexpected trimming everywhere. The only reason to avoid that is to ensure this remains ergonomic for writing block tests. It's worth noting that the handling just needs to match in most cases, so blocks could run into problems for cases where the HTML inside block delimiters is not indented equally. Whether that's relevant or not really depends on the block and is beyond the scope of what this assertion handles.