Make WordPress Core

Opened 7 months ago

Closed 6 months ago

Last modified 3 months ago

#61617 closed enhancement (fixed)

HTML API: Add method to replace modifiable text.

Reported by: dmsnell's profile dmsnell Owned by: dmsnell's profile dmsnell
Milestone: 6.7 Priority: normal
Severity: normal Version: 6.6
Component: HTML API Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

The HTML API has provided get_modifiable_text() since WordPress 6.5, but it has never provided a method to change the modifiable text for a given token in the document.

It should provide this function to make it possible to safely replace HTML text nodes.

Change History (14)

This ticket was mentioned in PR #7007 on WordPress/wordpress-develop by @dmsnell.


7 months ago
#1

  • Keywords has-unit-tests added

Trac ticket: Core-61617

This patch introduces a new method, set_modifiable_text() to the Tag Processor, which makes it possible and safe to replace text nodes within an HTML document, performing the appropriate escaping.

This method can be used in conjunction with other code to modify the text content of a document, and can be used for transforming HTML in a streaming fashion.

@zieladam commented on PR #7007:


6 months ago
#2

get_updated_html() might need this adjustment:

		if ($this->get_token_type() === '#tag') {
			$this->bytes_already_parsed = $before_current_tag;
		}

Without it, replacing a longer string with a shorter one can make $this->bytes_already_parsed negative.

@dmsnell commented on PR #7007:


6 months ago
#3

get_updated_html() might need this adjustment:

Thanks @adamziel - this level of review is so helpful! Unfortunately, when I added a test to try and trigger the condition, I couldn't. What do you think? Do you see any problem with this?

https://github.com/WordPress/wordpress-develop/pull/7007/files#diff-ef817e9b61dcdc18d5e1a84bfbb55370916b179cfed01eab03121456d7ce847cR42-R86

#4 @dmsnell
6 months ago

  • Owner set to dmsnell
  • Resolution set to fixed
  • Status changed from new to closed

In 58829:

HTML API: Add set_modifiable_text() for replacing text nodes.

This patch introduces a new method, set_modifiable_text() to the
Tag Processor, which makes it possible and safe to replace text nodes
within an HTML document, performing the appropriate escaping.

This method can be used in conjunction with other code to modify the
text content of a document, and can be used for transforming HTML
in a streaming fashion.

Developed in https://github.com/wordpress/wordpress-develop/pull/7007
Discussed in https://core.trac.wordpress.org/ticket/61617

Props: dmsnell, gziolo, zieladam.
Fixes #61617.

@zieladam commented on PR #7007:


6 months ago
#6

Thanks @adamziel - this level of review is so helpful! Unfortunately, when I added a test to try and trigger the condition, I couldn't. What do you think? Do you see any problem with this?

That test looks good. I'd also check for regular tags like DIV and then for HTML comments. I ran into that in https://github.com/adamziel/site-transfer-protocol, although now can't find the commit where I did it. Eventually I've switched to a different method of setting modifiable text, and now I'm happy to migrate to this now that it's merged.

@dmsnell commented on PR #7007:


6 months ago
#7

@adamziel if you ever find this problem again I'll be happy to fix it ASAP

This ticket was mentioned in PR #7150 on WordPress/wordpress-develop by @dmsnell.


6 months ago
#8

Trac ticket: Core-61617

When set_modifiable_text() was added to the Tag Processor, it was overlooked that the same information could be queried after setting its value and before proceeding to the next token.

This means that reads following writes return the wrong value.

In this patch, get_modifiable_text() will read any enqueued values before reading from the input HTML document to ensure consistency.

Follow-up to [58829].
Props dmsnell, ramonopoly.

#9 @dmsnell
6 months ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Re-opening to address bug discovered during the development of #61828

This ticket was mentioned in Slack in #core-committers by dmsnell. View the logs.


6 months ago

#11 @dmsnell
6 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 58866:

HTML API: Ensure that get_modifiable_text() reads enqueued updates.

When set_modifiable_text() was added to the Tag Processor, it was considered that the same information could be queried after setting its value and before proceeding to the next token, but unfortunately overlooked that if the starting modifiable text length was zero, then the read in get_modifiable_text() would ignore enqueued updates.

In this patch, get_modifiable_text() will read any enqueued values before reading from the input HTML document to ensure consistency.

Follow-up to [58829].
Props dmsnell, jonsurrell, ramonopoly.
Fixes #61617.

@zieladam commented on PR #7007:


3 months ago
#13

Thanks @adamziel - this level of review is so helpful! Unfortunately, when I added a test to try and trigger the condition, I couldn't. What do you think? Do you see any problem with this?

I just isolated the problem @dmsnell:

$p = new WP_HTML_Tag_Processor('Hello there');
$p->next_token();
$p->set_modifiable_text('Short');
echo $p->get_updated_html();

It's not occurring with this patch:

-               $this->bytes_already_parsed = $before_current_tag;
+               if ($this->get_token_type() === '#tag') {
+                       $this->bytes_already_parsed = $before_current_tag;
+               }

@gziolo commented on PR #7007:


3 months ago
#14

@adamziel, thank you for the report. I was able to create 3 failing tests where it doesn't work as expected:

There were 3 failures:

1) Tests_HtmlApi_WpHtmlTagProcessorModifiableText::test_get_modifiable_text_is_consistent_after_writes_when_text_shorter
Should have found updated text.
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'shorter text'
+'xt'

/var/www/tests/phpunit/tests/html-api/wpHtmlTagProcessorModifiableText.php:116
phpvfscomposer:///var/www/vendor/phpunit/phpunit/phpunit:106

2) Tests_HtmlApi_WpHtmlTagProcessorModifiableText::test_get_modifiable_text_is_consistent_after_writes_when_text_longer
Should have found updated text.
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'a bit longer text'
+'it longer text'

/var/www/tests/phpunit/tests/html-api/wpHtmlTagProcessorModifiableText.php:155
phpvfscomposer:///var/www/vendor/phpunit/phpunit/phpunit:106

3) Tests_HtmlApi_WpHtmlTagProcessorModifiableText::test_get_modifiable_text_is_consistent_after_writes_when_text_after_closed_tag_element
Should have found updated text.
Failed asserting that two strings are identical.
--- Expected
+++ Actual
@@ @@
-'<p>some content</p>a bit longer text'
+'it longer text'

/var/www/tests/phpunit/tests/html-api/wpHtmlTagProcessorModifiableText.php:197
phpvfscomposer:///var/www/vendor/phpunit/phpunit/phpunit:106

I will work on the patch and improve the tests I quickly drafted:

Note: See TracTickets for help on using tickets.