Opened 2 months ago
Closed 4 weeks ago
#63854 closed defect (bug) (fixed)
Handle a non-string passed to WP_HTML_Tag_Processor gracefully
| Reported by: |
|
Owned by: |
|
|---|---|---|---|
| Milestone: | 6.9 | Priority: | normal |
| Severity: | normal | Version: | 6.2 |
| Component: | HTML API | Keywords: | has-unit-tests has-patch commit |
| Focuses: | Cc: |
Description
The WP_HTML_Tag_Processor constructor assigns the passed value directly to $this->html. The method get_updated_html() is typed only to return a string, resulting in a runtime fatal if null is passed to the constructor.
I proposed adding a doing_it_wrong and setting the value to an empty string when a non-string is passed to it.
Discovered by a filter hooked onto the_content that passed the value to the processor. Earlier code must have nulled out the value or something similar.
wp> $html = new WP_HTML_Tag_Processor(null);
=> object(WP_HTML_Tag_Processor)#3418 (26) {
["html":protected]=>
NULL
["last_query":"WP_HTML_Tag_Processor":private]=>
NULL
["sought_tag_name":"WP_HTML_Tag_Processor":private]=>
NULL
["sought_class_name":"WP_HTML_Tag_Processor":private]=>
NULL
["sought_match_offset":"WP_HTML_Tag_Processor":private]=>
NULL
["stop_on_tag_closers":"WP_HTML_Tag_Processor":private]=>
NULL
["parser_state":protected]=>
string(11) "STATE_READY"
["compat_mode":protected]=>
string(14) "no-quirks-mode"
["parsing_namespace":"WP_HTML_Tag_Processor":private]=>
string(4) "html"
["comment_type":protected]=>
NULL
["text_node_classification":protected]=>
string(15) "TEXT_IS_GENERIC"
["bytes_already_parsed":"WP_HTML_Tag_Processor":private]=>
int(0)
["token_starts_at":"WP_HTML_Tag_Processor":private]=>
NULL
["token_length":"WP_HTML_Tag_Processor":private]=>
NULL
["tag_name_starts_at":"WP_HTML_Tag_Processor":private]=>
NULL
["tag_name_length":"WP_HTML_Tag_Processor":private]=>
NULL
["text_starts_at":"WP_HTML_Tag_Processor":private]=>
NULL
["text_length":"WP_HTML_Tag_Processor":private]=>
NULL
["is_closing_tag":"WP_HTML_Tag_Processor":private]=>
NULL
["attributes":"WP_HTML_Tag_Processor":private]=>
array(0) {
}
["duplicate_attributes":"WP_HTML_Tag_Processor":private]=>
NULL
["classname_updates":"WP_HTML_Tag_Processor":private]=>
array(0) {
}
["bookmarks":protected]=>
array(0) {
}
["lexical_updates":protected]=>
array(0) {
}
["seek_count":protected]=>
int(0)
["skip_newline_at":"WP_HTML_Tag_Processor":private]=>
NULL
}
wp> $html->get_updated_html();
Fatal error: Uncaught TypeError: WP_HTML_Tag_Processor::get_updated_html(): Return value must be of type string, null returned in /srv/htdocs/__wp__/wp-includes/html-api/class-wp-html-tag-processor.php:4146
Stack trace:
#0 phar:///usr/local/bin/wp-cli/vendor/wp-cli/shell-command/src/WP_CLI/Shell/REPL.php(46) : eval()'d code(1): WP_HTML_Tag_Processor->get_updated_html()
#1 phar:///usr/local/bin/wp-cli/vendor/wp-cli/shell-command/src/WP_CLI/Shell/REPL.php(46): eval()
#2 phar:///usr/local/bin/wp-cli/vendor/wp-cli/shell-command/src/Shell_Command.php(52): WP_CLI\Shell\REPL->start()
#3 [internal function]: Shell_Command->__invoke(Array, Array)
#4 phar:///usr/local/bin/wp-cli/vendor/wp-cli/wp-cli/php/WP_CLI/Dispatcher/CommandFactory.php(100): call_user_func(Array, Array, Array)
#5 [internal function]: WP_CLI\Dispatcher\CommandFactory::WP_CLI\Dispatcher\{closure}(Array, Array)
#6 phar:///usr/local/bin/wp-cli/vendor/wp-cli/wp-cli/php/WP_CLI/Dispatcher/Subcommand.php(497): call_user_func(Object(Closure), Array, Array)
#7 phar:///usr/local/bin/wp-cli/vendor/wp-cli/wp-cli/php/WP_CLI/Runner.php(470): WP_CLI\Dispatcher\Subcommand->invoke(Array, Array, Array)
#8 phar:///usr/local/bin/wp-cli/vendor/wp-cli/wp-cli/php/WP_CLI/Runner.php(493): WP_CLI\Runner->run_command(Array, Array)
#9 phar:///usr/local/bin/wp-cli/vendor/wp-cli/wp-cli/php/WP_CLI/Runner.php(1295): WP_CLI\Runner->run_command_and_exit()
#10 phar:///usr/local/bin/wp-cli/vendor/wp-cli/wp-cli/php/WP_CLI/Bootstrap/LaunchRunner.php(28): WP_CLI\Runner->start()
#11 phar:///usr/local/bin/wp-cli/vendor/wp-cli/wp-cli/php/bootstrap.php(84): WP_CLI\Bootstrap\LaunchRunner->process(Object(WP_CLI\Bootstrap\BootstrapState))
#12 phar:///usr/local/bin/wp-cli/vendor/wp-cli/wp-cli/php/wp-cli.php(35): WP_CLI\bootstrap()
#13 phar:///usr/local/bin/wp-cli/php/boot-phar.php(20): include('phar:///usr/loc...')
#14 /usr/local/bin/wp-cli(4): include('phar:///usr/loc...')
#15 {main}
thrown in /srv/htdocs/__wp__/wp-includes/html-api/class-wp-html-tag-processor.php on line 4146
Change History (11)
This ticket was mentioned in PR #9545 on WordPress/wordpress-develop by @kraftbj.
2 months ago
#1
- Keywords has-patch has-unit-tests added
#2
@
2 months ago
What is the function/method that is constructing WP_HTML_Tag_Processor which is passing null to begin with?
Personally I advocate to always start filter code with a type check since plugins can do bad things with violating the "contract" for the filtered value's type. In other words, I do the same thing as you propose, but in the filter callback:
<?php add_filter( 'the_content', function ( $value ): string { if ( ! is_string( $value ) ) { $value = ''; } // ... return $value; } );
That's not to exclude what you're proposing in your PR.
#3
@
2 months ago
@westonruter I agree with you ( https://github.com/Automattic/jetpack/pull/44874 to pass through the value when it's unexpected instead of trying to process it ).
I'm thinking of a change in Core since there is a strict type on the method to return a value out of it, but constructing (and interacting, at least how we are in the Jetpack CDN code) with the tag processor with an invalid construction arg occurs without complaint.
So, I agree that the filtering function should be defensive, and Core could be more defensive too.
#4
@
2 months ago
The change seems like a good direction.
I think WP_HTML_Processor::create_fragment() and WP_HTML_Processor::create_full_parser() are also susceptible. Those can also _doing_it_wrong(), but they're simpler in that they can return null. Do you mind including that in your PR?
@jonsurrell commented on PR #9545:
2 months ago
#5
I think WP_HTML_Processor::create_fragment() and WP_HTML_Processor::create_full_parser() are also susceptible. Those can also _doing_it_wrong(), but they're simpler in that they can return null. Do you mind adding those changes here?
#6
@
2 months ago
- Keywords has-patch removed
- Owner set to kraftbj
- Status changed from new to accepted
Will do! Some personal things came up, so I'm delayed on circling back on this. I'll update the other methods and review the rest. Will re-add the has-patch keyword once it's been refreshed.
@jonsurrell commented on PR #9545:
5 weeks ago
#7
@kraftbj Just checking in, I'd love to get this in 6.9. Are you able to get back to it or would you like help to finish up?
Adds handling of non-string values passed to the WP_HTML_Tag_Processor by throwing a _doing_it_wrong and setting it to an empty string to avoid further issues.