Make WordPress Core

Opened 5 days ago

Last modified 5 days ago

#64776 new defect (bug)

HTML API: set_modifiable_text() ignores leading newlines in PRE, LISTING

Reported by: jonsurrell Owned by:
Milestone: Future Release Priority: normal
Severity: normal Version: 6.7
Component: HTML API Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

The ::set_modifiable_text() method on WP_HTML_Tag_Processor and WP_HTML_Processor fails to preserve a leading newline in the provided plaintext content when it is set on the first text node of a PRE or LISTING element.

This is due to special rules for these elements that cause a single leading newline to be ignored immediately following the open tag.

<?php
$html_processor = WP_HTML_Processor::create_fragment('<pre>X</pre>');
$html_processor->next_token();
$html_processor->next_token(); // on the text node
$html_processor->set_modifiable_text( "\nAFTER NEWLINE" );
var_dump( $html_processor->get_modifiable_text() );
// string(13) "AFTER NEWLINE"
echo $html_processor->get_updated_html();
/* Prints:
<pre>
AFTER NEWLINE</pre>
*/

Note that the newline is present in the updated HTML; however, the HTML parsing rules cause the leading newline to be ignored.

When rendered in the browser, this PRE element will report its .textContent as AFTER NEWLINE (with no leading newline) and the rendered element shows AFTER NEWLINE on its first line.

In order for the PRE to begin with a newline in its content, an additional newline must be included.
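
The parsing rule described above can be approximated in plain PHP without WordPress loaded. The sketch below is a simplified model, not the HTML API's implementation: simulate_pre_text_parse() is a hypothetical helper that takes the raw text written directly after a PRE or LISTING open tag, normalizes newlines, and drops one leading line feed, roughly as an HTML parser would. It ignores character references and every other aspect of real parsing.

```php
<?php
/**
 * Hypothetical helper, for illustration only: models what an HTML parser
 * reports as the text that immediately follows a <pre> or <listing> open tag.
 */
function simulate_pre_text_parse( string $raw_text ): string {
	// Newline normalization: CRLF and lone CR both become LF.
	$normalized = str_replace( array( "\r\n", "\r" ), "\n", $raw_text );

	// A single LF immediately after the open tag is ignored.
	if ( '' !== $normalized && "\n" === $normalized[0] ) {
		return (string) substr( $normalized, 1 );
	}

	return $normalized;
}

// One leading newline in the markup is lost on parse.
var_dump( simulate_pre_text_parse( "\nAFTER NEWLINE" ) === 'AFTER NEWLINE' ); // bool(true)

// Two leading newlines in the markup leave one in the parsed content.
var_dump( simulate_pre_text_parse( "\n\nAFTER NEWLINE" ) === "\nAFTER NEWLINE" ); // bool(true)
```

Under this model, the markup has to carry two leading newlines for the parsed content to start with one, which matches the observation above.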

This is a follow-up to #64609 and [61754], which handled TEXTAREA.

Change History (2)

This ticket was mentioned in PR #10879 on WordPress/wordpress-develop by @jonsurrell.


5 days ago
#1

  • Keywords has-patch has-unit-tests added

Detect cases where ::set_modifiable_text() would omit a leading newline from its input and adjust accordingly. This is done by adding an additional leading newline (which HTML parsers ignore) whenever the input starts with a newline.

This follows the HTML parsing rules for TEXTAREA, PRE, and LISTING elements, which ignore a single U+000A LINE FEED character immediately following the open tag. It also respects the guidelines on newline normalization, so a leading U+000D CARRIAGE RETURN also triggers the extra newline.
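
The adjustment described above amounts to a small check on the input before it is written into the document. The following is a minimal sketch of that idea in plain PHP, not the patch from the PR: pad_leading_newline() is a hypothetical name, and it leaves out the escaping and the decision about whether the current token is affected at all, both of which the real ::set_modifiable_text() has to handle.

```php
<?php
/**
 * Hypothetical helper, for illustration only: returns the text that would be
 * written right after a <textarea>, <pre>, or <listing> open tag so that
 * parsers report $plaintext with its leading newline intact.
 */
function pad_leading_newline( string $plaintext ): string {
	if ( '' === $plaintext ) {
		return $plaintext;
	}

	// A leading CR is normalized to LF by parsers, so it triggers the rule too.
	if ( "\n" === $plaintext[0] || "\r" === $plaintext[0] ) {
		// The parser ignores this extra LF and keeps the original newline.
		return "\n" . $plaintext;
	}

	return $plaintext;
}

var_dump( pad_leading_newline( "\nAFTER NEWLINE" ) === "\n\nAFTER NEWLINE" ); // bool(true)
var_dump( pad_leading_newline( 'AFTER NEWLINE' ) === 'AFTER NEWLINE' );       // bool(true)
```

Text that does not start with a newline passes through unchanged, so the extra character is only emitted when it is needed.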

Textarea was handled in r61754.

The ::set_modifiable_text() method on WP_HTML_Tag_Processor and WP_HTML_Processor will fail to include a leading newline in the provided plaintext content on the first text of TEXTAREA, PRE, and LISTING elements.

This is due to special rules for these elements that cause a single leading newline to be ignored immediately following the open tag.

$html_processor = WP_HTML_Processor::create_fragment('<textarea></textarea>');
$html_processor->next_token();
$html_processor->set_modifiable_text( "\nAFTER NEWLINE" );
var_dump( $html_processor->get_modifiable_text() );
// string(13) "AFTER NEWLINE"
echo $html_processor->get_updated_html();
/* Prints:
<textarea>
AFTER NEWLINE</textarea>
*/

Note that the newline is present in the updated HTML; however, the HTML parsing rules cause the leading newline to be ignored.

When rendered in the browser, this TEXTAREA element will report its .textContent as AFTER NEWLINE (with no leading newline) and the rendered element shows AFTER NEWLINE on the first line of the input box.

In order for the TEXTAREA to begin with a newline in its content, an additional newline must be included.

This is closely related to #64607.

Trac ticket: https://core.trac.wordpress.org/ticket/64776

(Previously https://core.trac.wordpress.org/ticket/64609)

@jonsurrell commented on PR #10879:


5 days ago
#2

The main issue here is determining whether the current position is a text node that is the first child of PRE (or LISTING). In that position, it's likely possible to always add a newline.

The tag processor has very limited information in that regard. No stack of elements and no awareness of siblings.

The HTML processor has a stack of open elements to inspect, but it's still difficult to determine whether the current token is the first text node child or not. For example, in <pre>find-me<hr>not-me</pre> only find-me is affected.
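
To make the condition concrete, here is a sketch in plain PHP that assumes something neither processor is described as offering here: a flat list of already-scanned tokens that can be indexed backwards. The $tokens array shape and the is_text_right_after_pre_opener() helper are both hypothetical. The sketch relies on the parsing rule that only a line feed in the token immediately following the PRE or LISTING open tag is ignored.

```php
<?php
/**
 * Hypothetical helper, for illustration only. $tokens is a made-up flat list
 * of tokens in document order. Each token is an array with a 'type' key
 * ('tag', 'closer', or 'text') and, for tags and closers, a 'name' key.
 * This is not the shape of any HTML API data structure.
 */
function is_text_right_after_pre_opener( array $tokens, int $index ): bool {
	if ( $index < 1 || ! isset( $tokens[ $index ] ) || 'text' !== $tokens[ $index ]['type'] ) {
		return false;
	}

	$previous = $tokens[ $index - 1 ];

	return 'tag' === $previous['type']
		&& in_array( $previous['name'], array( 'PRE', 'LISTING' ), true );
}

// Tokens for <pre>find-me<hr>not-me</pre>.
$tokens = array(
	array( 'type' => 'tag', 'name' => 'PRE' ),
	array( 'type' => 'text' ), // find-me
	array( 'type' => 'tag', 'name' => 'HR' ),
	array( 'type' => 'text' ), // not-me
	array( 'type' => 'closer', 'name' => 'PRE' ),
);

var_dump( is_text_right_after_pre_opener( $tokens, 1 ) ); // bool(true)
var_dump( is_text_right_after_pre_opener( $tokens, 3 ) ); // bool(false)
```

The check itself is trivial once the previous token is known. The difficulty raised in this comment is that the processors do not readily expose that history at the point where ::set_modifiable_text() is called.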
