#63891 closed defect (bug) (fixed)
Processing directives function in interactivity API class may cause fatal error
| Reported by: |
|
Owned by: |
|
|---|---|---|---|
| Milestone: | 6.9 | Priority: | normal |
| Severity: | normal | Version: | 6.5 |
| Component: | Interactivity API | Keywords: | has-patch has-unit-tests commit |
| Focuses: | php-compatibility | Cc: |
Description
This may be an edge case for posts created with Classic editor or classic editor block.
_process_directives function within WP_Interactivity API class will process html and check for closer tags.
if there's a wrong tag like </br> (in my case) will result in a TypeError:
Fatal error: Uncaught TypeError: count(): Argument #1 ($value) must be of type Countable|array, null given in wp-includes/interactivity-api/class-wp-interactivity-api.php on line 442
by checking the line of code, count() function is applied to get_attribute_names_with_prefix() method from WP_HTML_Tag_Processor which can eventually return null
for newer PHP version (8.0.0 and later) this will throw a fatal error.
probably worth getting an extra check if the funcion is returning a countable variable
Change History (21)
This ticket was mentioned in PR #9641 on WordPress/wordpress-develop by @hugosolar.
6 months ago
#1
- Keywords has-patch added; needs-patch removed
This ticket was mentioned in PR #9657 on WordPress/wordpress-develop by @jonsurrell.
6 months ago
#2
- Keywords has-unit-tests added
</br> gets special treatment in the HTML standard. It creates a BR element with no attributes (regardless of whether it _appears_ to have attributes in the HTML like </br attr-name>).
- Stop on
</br>in::next_tag()like other openers. - Return
array()from::get_attribute_names_with_prefix()on</br>instead ofnulllike true close tags. - Add tests.
Trac ticket: https://core.trac.wordpress.org/ticket/63891
#3
@
6 months ago
Checking for null as proposed in PR 9641 is a good idea. I have some other thoughts as this is related to HTML API behavior.
The behavior may be exceptional with </br>, I don't believe there are other cases where the tag processor will report an open tag and not return an array of attributes.
This stems from how </br> is a bit unusual. It appears to be a close tag, but produces a BR element with no attributes (</br class="c"> does not have a class attribute).
This is handled in WP_HTML_Tag_Processor::is_tag_closer(), but :: get_attribute_names_with_prefix() checks is_closing_tag directly.
<?php public function get_attribute_names_with_prefix( $prefix ): ?array { if ( self::STATE_MATCHED_TAG !== $this->parser_state || $this->is_closing_tag ) { return null; } $comparable = strtolower( $prefix ); $matches = array(); foreach ( array_keys( $this->attributes ) as $attr_name ) { if ( str_starts_with( $attr_name, $comparable ) ) { $matches[] = $attr_name; } } return $matches; }
The behavior around </br> is somewhat inconsistent in that regard.
::get_attribute_names_with_prefix() could return an empty array in this case, making it consistent with other open tags. Swapping $this->is_closing_tag for $this->is_tag_closer() in the snippet above seems sufficient in that regard.
Other methods like ::set_attribute() continue to check $this->is_closing_tag and always return false. HTML modification methods would require more involved HTML changes (eliminate the closing / from </br>), the type of modification that the HTML API has avoided so far although it doesn't seem unreasonable.
#5
@
6 months ago
Here are a couple of notes about the history of this usage, as the issue at hand lies with the Interactivity API’s use of the HTML API, not with the defined and commented and annotated behaviors of the HTML API.
::get_attribute_names_with_prefix() may return null, so it’s always the case that we need to assume it may do that and write our code accordingly. it essentially indicates the difference between this has no attributes and this cannot have any attributes in any situation. So for opening tags we find an empty set of arrays, but HTML comments, text nodes, etc…, which cannot have attributes — those return null. That nuance is not well documented in the function interface, and we could also go about updating it to be clearer. Regardless, null is still a valid return type.
The issue with </br> is identical to with an unexpected </p>. These tags cannot contain HTML attributes and so return null. We discussed the role of the Tag Processor in these cases of whether they should report as closing tags or as opening tags and we decided at the time to try and keep the Tag Processor focused as a lexer and the HTML Processor as the machine of imbuing semantic interpretation to the tokens (@zieladam was part of these discussions and I can’t remember his thoughts on the matter).
It’s probably fine/safe to consider </br> an opening tag because it’s atomic, but if we do that it does then raise concerns about inconsistency with </p> because _sometimes_ the </p> is an opening tag and _sometimes_ it’s a closing tag, but the Tag Processor lacks the knowledge to make that decision. In the same way there are a plethora of cases where the low-level tokenization remains a bit less expected because the nuance is if you want to know semantic information about the HTML you need to use the HTML Processor. Here are some other examples:
- in the Tag Processor,
<image>is an opening tag whose name isIMAGE. In the HTML Processor it’s a void element whose name isIMG, unless it’s in the SVG or MathML context and then it’s still anIMAGE.get_tag()reports the name difference. - in the Tag Processor,
</sarcasm>is a closing tag whose name isSARCASM. In the HTML Processor it’s non-existent; completely skipped-over. - in the Tag Processor,
</p><p>contains a closingPtag and then an openingPtag. In the HTML Processor it contains an openingPtag, a closingPtag, a second openingPtag, and a second closingPtag, in that order. the Tag Processor has no way of knowing whether</p>should open or close aPelement.
</br> is one of very few exceptions I would feel comfortable incorporating into the HTML API, but I am also very reluctant to move some semantic exceptions into the Tag Processor when we know that someone needs to rely on the HTML Processor if they want that. we might already make some exceptions like this and this may all seem duplicitous, but I want to stress that just because something is unexpected or because very-low-level primitives are severely delegated between the two classes, that does not imply we necessarily have to change the interface.
My recommendation is that code calling get_attribute_names_with_prefix() which does not check for null simply do this for convenience.
<?php $count = count( $processor->get_attribute_names_with_prefix() ?? array() );
We can step back and note that this reported problem does not manifest when using the HTML Processor.
#6
@
6 months ago
To clarify, special </br> handling already exists in the Tag Processor, which is why this is exceptional. I believe it's the only case where you can do this:
<?php $p = new WP_HTML_Tag_Processor('</br>'); $p->next_tag( array( 'tag_closers' => 'visit' ) ); var_dump($p->get_tag()); if ( ! $p->is_tag_closer() ) { assert( null !== $p->get_attribute_names_with_prefix('') ); // Throws with </br>, not with </p> }
It seems inconsistent to
- Not stop on the
</br>if not visiting closers - On the
</br>returnfalsefrom::is_tag_closer(). - Return
nullafter reporting that it's a tag opener.
The appropriate way to handle this and what it means to be consistent is debatable.
The HTML Processor uses "virtual tokens" to address this for things like </p> where a P element may be pushed to the stack and then popped. In that case, there's a "virtual" <p> followed by a "real" </p> and everything works out. </br> is exceptional for a few reasons:
- It's not virtual, that token really behaves like a tag opener.
- It's propagated all the way from the "before html" insertion mode.
An end tag whose tag name is "br"
Drop the attributes from the token, and act as described in the next entry; i.e. act as if this was a "br" start tag token with no attributes, rather than the end tag token that it actually is.
What's absolutely clear is that get_attribute_names_with_prefix() should have a null check on its return value. The rest of this is nuance around HTML that has some issues to begin with. I don't feel strongly about the best path forward, but I do see inconsistencies in the way it's currently handled.
#7
follow-up:
↓ 8
@
6 months ago
agreed @jonsurrell we should make the Tag Processor consistent. I don’t know what that implicates for these specific instances.
I propose we focus this ticket on updating the Interactivity API calling code, which is actually where the bug is in place, since the HTML API is weirdly upholding its contract, even if in an unexpected way.
#8
in reply to:
↑ 7
@
6 months ago
- Milestone changed from Awaiting Review to 6.9
Replying to dmsnell:
I propose we focus this ticket on updating the Interactivity API calling code, which is actually where the bug is in place, since the HTML API is weirdly upholding its contract, even if in an unexpected way.
Agreed.
#9
@
6 months ago
@jonsurrell @dmsnell thanks for all the context here!
I agree with addressing Interactivity API issue to avoid the fatal error
I'll get this updated in my PR ASAP, and also add unit tests
#10
@
6 months ago
The issue with </br> is identical to with an unexpected </p>. These tags cannot contain HTML attributes and so return null.
AFAIR closing tags can contain attributes. The HTML spec has this parsing path:
Tag open state -> End tag open state -> Tag Name State -> before attribute name state -> attribute name state.
I remember we've discussed that in context of the inline token / funky comments and decided not to expose them.
@jonsurrell commented on PR #9641:
5 months ago
#11
👋 @hugosolar Just checking in, are you able to finish this up or would you like some help? The next release is approaching and it would be nice to get this fix in place.
#12
@
4 months ago
- Milestone changed from 6.9 to 7.0
The RC1 release is approaching, and since the feedback on the pull request has not yet been incorporated, I'm punting this ticket to 7.0.
If this issue significantly impacts the 6.9 release, please submit an updated pull request, and I would appreciate it if it could be resolved before RC1.
#13
@
4 months ago
@wildworks yes! sorry for the delay
a lot going on here
I'l make sure to update my PR before EOW
This ticket was mentioned in PR #10485 on WordPress/wordpress-develop by @samueljseay.
4 months ago
#14
Trac ticket: https://core.trac.wordpress.org/ticket/63891
@hugosolar commented on PR #9641:
4 months ago
#15
@sirreal I've pushed the fix that seems to work better regarding the tests I've made here
Also, added one test to verify if </br> is breaking _process_directive function
I tested it locally and it's passing but not sure why it's failing in the pipelines
would you help me to figure what's the issue here?
The error is throwing this
Parse error: syntax error, unexpected ')' in /var/www/tests/phpunit/tests/interactivity-api/wpInteractivityAPI.php on line 1272
checked that line but can't see anything wrong with it
@hugosolar commented on PR #9641:
4 months ago
#16
Thanks for the help @sirreal !
I think this is ready for a CR
This ticket was mentioned in Slack in #core by welcher. View the logs.
4 months ago
#18
@
4 months ago
- Milestone changed from 7.0 to 6.9
@jonsurrell @luisherranz if this can be committed before RC 1 tomorrow, we can get it into WordPress 6.9. I'll move it back into the milestone and follow up then.
This may be an edge case for posts created with Classic editor or classic editor block.
_process_directivesfunction withinWP_Interactivity_APIclass will process html and check for closer tags.if there's a wrong tag like
</br>(in my case) will result in a TypeError:by checking the line of code,
count()function is applied toget_attribute_names_with_prefix()method fromWP_HTML_Tag_Processorwhich can eventually returnnullfor newer PHP version (8.0.0 and later) this will throw a fatal error.
probably worth getting an extra check if the funcion is returning a countable variable