#63694 closed enhancement (fixed)
HTML Processing Improvements in 6.9
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | 6.9 | Priority: | normal |
| Severity: | normal | Version: | 6.9 |
| Component: | HTML API | Keywords: | has-patch has-unit-tests |
| Focuses: | | Cc: | |
Description
This ticket was created as a placeholder for various efforts during the 6.9 release cycle to improve the reliability of WordPress Core when handling and processing HTML.
Change History (53)
This ticket was mentioned in PR #9248 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#1
- Keywords has-patch added
This ticket was mentioned in PR #9264 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#2
- Keywords has-unit-tests added
Trac ticket: Core-63694
Prep work for #9248.
This ticket was mentioned in PR #9259 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#3
Trac ticket: Core-63694
Prep work for #9248.
This ticket was mentioned in PR #9258 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#4
Trac ticket: Core-63694
Prep work for #9248.
This ticket was mentioned in PR #9257 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#5
Trac ticket: Core-63694
Prep work for #9248
This ticket was mentioned in PR #9255 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#6
This ticket was mentioned in PR #9270 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#7
Trac ticket: Core-63694
This probably improves the performance in terms of both CPU time and memory compared to the old PCRE-based approach.
This ticket was mentioned in PR #9271 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#8
Trac ticket: Core-63694
This ticket was mentioned in PR #9272 on WordPress/wordpress-develop by @dmsnell.
6 months ago
#9
Trac ticket: Core-63694
This also decodes the URL whereas the previous code didn’t, so a URL whose characters are written as HTML character references will be properly decoded to its plain form, such as http://.
This ticket was mentioned in Slack in #core by benjamin_zekavica. View the logs.
6 months ago
#17
@
6 months ago
GB50050 may be an interesting candidate for improvement. It seems related to these efforts.
@jonsurrell commented on PR #9270:
6 months ago
#18
I believe this would fix https://core.trac.wordpress.org/ticket/45387.
#19
@
5 months ago
- Owner set to nerrad
- Resolution set to fixed
- Status changed from new to closed
In 60665:
#21
@
5 months ago
- Resolution fixed deleted
- Status changed from closed to reopened
As there's an is_string() check already, the ! empty( $href ) should be changed to '' !== $href.
This ticket was mentioned in PR #9809 on WordPress/wordpress-develop by @TobiasBg.
4 months ago
#23
As there's an is_string() check already, the ! empty( $href ) can be simplified to a string comparison, as other variable types that are checked in empty() won't appear.
empty() also returns true for the string "0", so ! empty( $href ) rejects it, even though "0" is a valid (relative) URL and thus should be detectable by the function.
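The difference between the two checks can be seen in plain PHP. The sketch below is only an illustration: both helper names are hypothetical and are not part of the patch. The string "0" is the input where the two checks diverge.

```php
<?php
// Hypothetical helpers for illustration; neither name exists in WordPress.

// Mirrors the original check: is_string() plus ! empty().
function sketch_has_href_loose( $href ) {
	return is_string( $href ) && ! empty( $href );
}

// Mirrors the suggested check: is_string() plus a strict string comparison.
function sketch_has_href_strict( $href ) {
	return is_string( $href ) && '' !== $href;
}

// Print how each check treats a few candidate values.
foreach ( array( '', '0', 'page.html', null ) as $candidate ) {
	printf(
		"%-12s loose=%-5s strict=%s\n",
		var_export( $candidate, true ),
		var_export( sketch_has_href_loose( $candidate ), true ),
		var_export( sketch_has_href_strict( $candidate ), true )
	);
}
```

Because is_string() has already excluded arrays, null, and numbers, the strict comparison only needs to rule out the empty string.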
Trac ticket: https://core.trac.wordpress.org/ticket/63694#comment:21
4 months ago
#27
@github-actions why don’t I come in and mess with all of your work unsolicited, huh?
4 months ago
#28
@github-actions: You are older than two weeks, therefore you have been deemed obsolete. Closing your bot.
4 months ago
#29
If I had a dime, @github-actions, for every time you barged in and interrupted my work, I wouldn’t need to work.
Oh if only you gave some indication of where to go to restrain your over-zealous ideology, forcing your mindset arbitrarily on those around you, oh what a bug report or patch I would love to provide. But no, you are faceless, left only to deny and reject and delete. You are @github-actions-I-will-destroy bot, born to raze and raised to burn.
This ticket was mentioned in PR #9850 on WordPress/wordpress-develop by @dmsnell.
4 months ago
#31
Trac ticket: Core-63694
See: #9270,
This ticket was mentioned in PR #9851 on WordPress/wordpress-develop by @dmsnell.
4 months ago
#32
This ticket was mentioned in Slack in #core by welcher. View the logs.
4 months ago
This ticket was mentioned in PR #10043 on WordPress/wordpress-develop by @dmsnell.
4 months ago
#34
Trac ticket: Core-63694.
This patch introduces a new CSS helper module containing a new function, wp_split_class_names(). This function wraps some code to rely on the HTML API to take an HTML class attribute value and return a Generator to iterate over the classes in that value.
Many existing functions perform ad-hoc parsing of CSS class names, usually by splitting on a space character. However, there are issues with this approach:
- There is no decoding of HTML character references, which is normative inside HTML attributes.
- There is no handling of null bytes.
- Class names can be split by more than just the space character.
- There is no handling of duplicates, and while mostly benign, code forgetting to account for duplicates can lead to defects.
The new function handles the nuances to let developers focus on reading CSS class names, adding new class names, and removing class names. This serves as a middle ground between legacy code interacting with CSS class names in isolation and code processing full HTML documents.
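The four issues listed above can be illustrated without WordPress. What follows is a rough sketch, not the implementation from the PR: sketch_split_class_names() is a hypothetical name, html_entity_decode() is used as an approximate stand-in for the HTML API's attribute decoding, and replacing null bytes with U+FFFD is an assumption about the intended handling.

```php
<?php
/**
 * Hypothetical sketch: iterate over the unique class names in a raw
 * HTML class attribute value. Not the function proposed in the PR.
 */
function sketch_split_class_names( string $attribute_value ): Generator {
	// 1. Decode character references (approximation of HTML attribute decoding).
	$decoded = html_entity_decode( $attribute_value, ENT_QUOTES | ENT_HTML5, 'UTF-8' );

	// 2. Assumed handling of null bytes: substitute the replacement character.
	$decoded = str_replace( "\0", "\u{FFFD}", $decoded );

	// 3. Split on any run of HTML ASCII whitespace, not only the space character.
	$names = preg_split( '/[ \t\n\f\r]+/', $decoded, -1, PREG_SPLIT_NO_EMPTY );

	// 4. Skip class names that have already been produced.
	$seen = array();
	foreach ( $names as $name ) {
		if ( isset( $seen[ $name ] ) ) {
			continue;
		}
		$seen[ $name ] = true;
		yield $name;
	}
}

// Usage: a tab, a double space, a duplicate, and a numeric character reference.
foreach ( sketch_split_class_names( "wp-block-image\talignleft  wp-block-image is&#45;resized" ) as $class_name ) {
	echo $class_name, "\n";
}
```

A naive explode( ' ', $value ) on the same input would keep the tab inside the first token, emit an empty string for the double space, repeat the duplicate, and leave the character reference undecoded.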
@westonruter commented on PR #10043:
3 months ago
#35
- The name isn’t great.
What about wp_parse_css_class_names()? I think this would be more clear. Mentioning “css” makes it clear you're not talking about PHP class names somehow. And “parse” implies it's not as simple as just splitting on whitespace tokens.
@westonruter commented on PR #10043:
3 months ago
#36
- Should it be more useful to people wanting to conditionally add class names? Something more akin to classnames() in JS? We could pass varargs which are string|false or an array of additional class names to add.
Seems cool, but do we have any use cases for this in core PHP? It would be nice to include some example implementations in the core codebase for this function to actually leverage it.
@dmsnell commented on PR #10043:
3 months ago
#37
What about wp_parse_css_class_names()?
I like this, though I still like split since it communicates the intent. parse here feels like it communicates more than it performs. I am changing it to wp_split_css_class_list() — maybe something like wp_explode_css_class_names() would also work, at the cost of getting long.
Would love to continue stewing on the name. Overly-short, overly-long, it’s hard to find one that’s just right.
@dmsnell commented on PR #10043:
3 months ago
#38
I’ve turned this into a static method on the Tag Processor, but I instantly don’t like it because it lost the nuance of decoding HTML character references.
This is a conundrum, however, because existing code mixes decoded and non-decoded class names. For example, code will read the class attribute on an HTML string, but then add new raw class names to a list. While it’s unlikely that someone adds a class whose name _should_ be &, if they do so, there’s a discrepancy between the existing classes and this new one — what should be escaped or unescaped?
---
I may revert the last commit. While it’s helpful that this function properly splits and deduplicates the class names, decoding the HTML character references was an important piece as well, and I think that’s a bit harder to merge into the Tag Processor’s interface.
@dmsnell commented on PR #10043:
3 months ago
#39
@westonruter I tossed out some refactors in #10215. They highlight two things to me:
- there needs to be more clarity around whether the inputs are HTML escaped or not.
- the functions should return an array and not an iterator.
It also leads me to feel like having a new separate function is best and exporting the internals of the HTML API is a mistake. Perhaps there is room for two new functions:
- wp_parse_html_class_attribute()
- wp_split_decoded_class_list()
Something like this would more clearly communicate whether things like null bytes and character references shall be transformed, or whether it’s assumed that the class names are the “raw” and unescaped class names built within source code.
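To make the proposed split concrete, here is a minimal sketch of how two such functions might relate. The sketch_ names and bodies are hypothetical (no implementation of either proposed function appears in this thread), and html_entity_decode() again stands in for the HTML API's decoding. Both return arrays rather than iterators, following the second point above.

```php
<?php
/**
 * Hypothetical: split a list of already-decoded ("raw") class names.
 * No character-reference decoding happens here.
 */
function sketch_split_decoded_class_list( string $class_list ): array {
	$names = preg_split( '/[ \t\n\f\r]+/', $class_list, -1, PREG_SPLIT_NO_EMPTY );
	return array_values( array_unique( $names ) );
}

/**
 * Hypothetical: parse a class attribute value as it appears in HTML source,
 * decoding character references before splitting.
 */
function sketch_parse_html_class_attribute( string $html_value ): array {
	$decoded = html_entity_decode( $html_value, ENT_QUOTES | ENT_HTML5, 'UTF-8' );
	return sketch_split_decoded_class_list( $decoded );
}

// The same bytes yield different class names depending on the assumed input form.
var_export( sketch_parse_html_class_attribute( 'a&amp;b a&amp;b c' ) );
var_export( sketch_split_decoded_class_list( 'a&amp;b c' ) );
```

Keeping the decoding step in exactly one of the two functions makes the caller state up front whether the input came from HTML source or from class names built in PHP code.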
This ticket was mentioned in PR #10218 on WordPress/wordpress-develop by @dmsnell.
3 months ago
#40
Trac ticket: Core-63694.
See wordpress/gutenberg#72264.
For classic themes, image blocks need to create a DIV wrapper which contains alignment classes from the inner FIGURE. This has been processed using PCRE matching.
With this change the HTML API is used instead of PCRE functions to provide more semantic transformation, clearer intent, and eliminate possible parsing issues.
@dmsnell commented on PR #10218:
3 months ago
#41
@tellthemachines I updated this patch; it still had the wrong negation in it that you found on the Gutenberg side. Here is the diff of the diffs between this patch and the one applied in Gutenberg.
--- /var/folders/lv/12zyh9p565q7mmycrw6zqkvw0000gn/T//.psub.C4OXyR 2025-10-17 15:55:08
+++ /var/folders/lv/12zyh9p565q7mmycrw6zqkvw0000gn/T//.psub.zmgqZj 2025-10-17 15:55:09
@@ -1,11 +1,19 @@
-diff --git a/src/wp-includes/block-supports/layout.php b/src/wp-includes/block-supports/layout.php
-index 454eea3c80..63eb384e77 100644
---- a/src/wp-includes/block-supports/layout.php
-+++ b/src/wp-includes/block-supports/layout.php
-@@ -1074,50 +1074,53 @@ add_filter( 'render_block_core/group', 'wp_restore_group_inner_container', 10, 2
+diff --git a/lib/block-supports/layout.php b/lib/block-supports/layout.php
+index bc6da575724..667c7b5c614 100644
+--- a/lib/block-supports/layout.php
++++ b/lib/block-supports/layout.php
+@@ -1113,7 +1113,6 @@ if ( function_exists( 'wp_restore_group_inner_container' ) ) {
+ }
+ add_filter( 'render_block_core/group', 'gutenberg_restore_group_inner_container', 10, 2 );
+
+-
+ /**
+ * For themes without theme.json file, make sure
+ * to restore the outer div for the aligned image block
+@@ -1124,50 +1123,53 @@ add_filter( 'render_block_core/group', 'gutenberg_restore_group_inner_container'
* @return string Filtered block content.
*/
- function wp_restore_image_outer_container( $block_content, $block ) {
+ function gutenberg_restore_image_outer_container( $block_content, $block ) {
- $image_with_align = "
-/# 1) everything up to the class attribute contents
-(
@@ -90,4 +98,4 @@
+ return "{$wrapper_processor->get_updated_html()}{$figure_processor->get_updated_html()}</div>";
}
- add_filter( 'render_block_core/image', 'wp_restore_image_outer_container', 10, 2 );
+ if ( function_exists( 'wp_restore_image_outer_container' ) ) {
I’m going to merge this, based on a high confidence that the changes are identical now. But we might want to confirm during the Beta phase that I did this right 😄
@dmsnell commented on PR #10218:
3 months ago
#43
#49
@
3 months ago
The 6.9 Beta1 release is coming soon and I would like to know the status of this ticket. Should we close this ticket?
#50
@
3 months ago
- Resolution set to fixed
- Status changed from reopened to closed
As the Beta1 release begins, I will close this ticket. If there are any other issues that need to be addressed, please leave a comment.
This ticket was mentioned in Slack in #core by wildworks. View the logs.
3 months ago
This ticket was mentioned in Slack in #core by desrosj. View the logs.
3 months ago
@jonsurrell commented on PR #9248:
10 days ago
#53
This function had quirks that change with this PR and I want to understand them.
I created a test suite for wp_kses_hair(), then I merged this branch and updated to get a diff of test changes. I also looked at several of the most popular results from WP Directory to understand usage.
My review of the most common usages suggests that _this change is safe to make and would not negatively impact plugin authors_.
- Historically the value and whole properties of the returned array indicate the raw parsed bytes from the HTML (with some exceptions). This means that HTML character references are not decoded. This represents an abstraction leak between the HTML and structural return value.
- Should this refactor leave the messy return values in place, or should it decode the attribute values to enforce the view of the world developers are imagining when calling it (that all values are normal PHP strings and not HTML text node strings)?
This is a tricky question. It doesn't _seem_ like folks rely on specifics of the input representation being present in the output, however it's certainly possible.
In one of the examples from plugins, esc_attr() is called on the attribute value to construct a new HTML string. This should be perfectly fine because the original HTML was re-encoded in this PR and esc_attr() will avoid double-encoding. They also statically wrap with ", which made the esc_attr() necessary because the attribute value could have contained "!
After some reflection, I believe the behavior you've implemented here _is a good decision_. Consider that the input is HTML and the output (value and whole) have always been some form of HTML. The difference here is a _normalization_ of the HTML in the output.
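The plugin pattern described above (escape the value, then wrap it in double quotes) can be sketched in plain PHP. This is an approximation only: sketch_build_attribute() is a hypothetical name, and htmlspecialchars() with its double-encode flag set to false stands in for esc_attr(), which is not available outside WordPress.

```php
<?php
/**
 * Hypothetical stand-in for the plugin pattern: escape an attribute value
 * and statically wrap it in double quotes.
 *
 * Passing false as the fourth argument tells htmlspecialchars() not to
 * re-encode existing character references, approximating how esc_attr()
 * avoids double-encoding.
 */
function sketch_build_attribute( string $name, string $value ): string {
	$escaped = htmlspecialchars( $value, ENT_QUOTES, 'UTF-8', false );
	return $name . '="' . $escaped . '"';
}

// A value that already carries encoded quotes is left alone...
echo sketch_build_attribute( 'title', 'He said &quot;hello&quot;' ), "\n";

// ...while a value with literal quotes is escaped once.
echo sketch_build_attribute( 'title', 'He said "hello"' ), "\n";
```

Both calls produce the same attribute, which is why a normalized, re-encoded value coming out of wp_kses_hair() should not break code written this way.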
---
<details>
<summary>behavior diff</summary>
diff --git a/tests/phpunit/tests/kses/wpKsesHair.php b/tests/phpunit/tests/kses/wpKsesHair.php
index 2ed83679f2e3d..05d573bc070bc 100644
--- a/tests/phpunit/tests/kses/wpKsesHair.php
+++ b/tests/phpunit/tests/kses/wpKsesHair.php
@@ -57,7 +57,7 @@ public function data_attribute_parsing() {
 				'title' => array(
 					'name' => 'title',
 					'value' => 'My Title',
-					'whole' => "title='My Title'",
+					'whole' => 'title="My Title"',
 					'vless' => 'n',
 				),
 			),
@@ -188,8 +188,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name' => 'title',
-					'value' => '& #60;test>',
-					'whole' => 'title="& #60;test>"',
+					'value' => '<test>',
+					'whole' => 'title="<test>"',
 					'vless' => 'n',
 				),
 			),
@@ -200,8 +200,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name' => 'title',
-					'value' => '& #x3C;hex>',
-					'whole' => 'title="& #x3C;hex>"',
+					'value' => '<hex>',
+					'whole' => 'title="<hex>"',
 					'vless' => 'n',
 				),
 			),
@@ -212,8 +212,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name' => 'title',
-					'value' => '& #X3C;HEX>',
-					'whole' => 'title="& #X3C;HEX>"',
+					'value' => '<HEX>',
+					'whole' => 'title="<HEX>"',
 					'vless' => 'n',
 				),
 			),
@@ -224,8 +224,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name' => 'title',
-					'value' => '& invalid; &#; &#x;',
-					'whole' => 'title="& invalid; &#; &#x;"',
+					'value' => '&invalid; &#; &#x;',
+					'whole' => 'title="&invalid; &#; &#x;"',
 					'vless' => 'n',
 				),
 			),
@@ -249,7 +249,7 @@ public function data_attribute_parsing() {
 				'data-text' => array(
 					'name' => 'data-text',
 					'value' => 'Single quoted value',
-					'whole' => "data-text='Single quoted value'",
+					'whole' => 'data-text="Single quoted value"',
 					'vless' => 'n',
 				),
 			),
@@ -267,7 +267,7 @@ public function data_attribute_parsing() {
 				'alt' => array(
 					'name' => 'alt',
 					'value' => 'single',
-					'whole' => "alt='single'",
+					'whole' => 'alt="single"',
 					'vless' => 'n',
 				),
 				'id' => array(
@@ -284,8 +284,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name' => 'title',
-					'value' => "It's working",
-					'whole' => 'title="It \'s working"',
+					'value' => 'It's working',
+					'whole' => 'title="It's working"',
 					'vless' => 'n',
 				),
 			),
@@ -296,8 +296,8 @@ public function data_attribute_parsing() {
 			array(
 				'title' => array(
 					'name' => 'title',
-					'value' => 'He said "hello"',
-					'whole' => 'title= \'He said "hello"\'',
+					'value' => 'He said "hello"',
+					'whole' => 'title="He said "hello""',
 					'vless' => 'n',
 				),
 			),
@@ -327,12 +327,32 @@ public function data_attribute_parsing() {
 
 		yield 'invalid attribute name starting with number' => array(
 			'1invalid="value"',
-			array(),
+			array(
+				'1invalid' => array(
+					'name' => '1invalid',
+					'value' => 'value',
+					'whole' => '1invalid="value"',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'invalid attribute name special chars' => array(
 			'@invalid="value" $bad="value"',
-			array(),
+			array(
+				'@invalid' => array(
+					'name' => '@invalid',
+					'value' => 'value',
+					'whole' => '@invalid="value"',
+					'vless' => 'n',
+				),
+				'$bad' => array(
+					'name' => '$bad',
+					'value' => 'value',
+					'whole' => '$bad="value"',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'duplicate attributes first wins' => array(
@@ -355,7 +375,20 @@ public function data_attribute_parsing() {
 
 		yield 'malformed unclosed double quote' => array(
 			'title="unclosed class="test"',
-			array(),
+			array(
+				'title' => array(
+					'name' => 'title',
+					'value' => 'unclosed class=',
+					'whole' => 'title="unclosed class="',
+					'vless' => 'n',
+				),
+				'test"' => array(
+					'name' => 'test"',
+					'value' => '',
+					'whole' => 'test"',
+					'vless' => 'y',
+				),
+			),
 		);
 
 		yield 'very long attribute value' => array(
@@ -610,7 +643,7 @@ public function data_attribute_parsing() {
 				'alt' => array(
 					'name' => 'alt',
 					'value' => '',
-					'whole' => "alt=''",
+					'whole' => 'alt=""',
 					'vless' => 'n',
 				),
 				'class' => array(
@@ -625,7 +658,7 @@ public function data_attribute_parsing() {
 		yield 'forward slashes between attributes' => array(
 			'att / att2=2 /// att3="3"',
 			array(
-				'att' => array(
+				'att' => array(
 					'name' => 'att',
 					'value' => '',
 					'whole' => 'att',
@@ -652,13 +685,13 @@ public function data_attribute_parsing() {
 				'att' => array(
 					'name' => 'att',
 					'value' => 'val',
-					'whole' => "att='val'",
+					'whole' => 'att="val"',
 					'vless' => 'n',
 				),
 				'att2' => array(
 					'name' => 'att2',
 					'value' => 'val2',
-					'whole' => "att2='val2'",
+					'whole' => 'att2="val2"',
 					'vless' => 'n',
 				),
 			),
@@ -670,13 +703,13 @@ public function data_attribute_parsing() {
 				'att' => array(
 					'name' => 'att',
 					'value' => 'val',
-					'whole' => "att='val'",
+					'whole' => 'att="val"',
 					'vless' => 'n',
 				),
 				'att2' => array(
 					'name' => 'att2',
 					'value' => 'val2',
-					'whole' => "att2='val2'",
+					'whole' => 'att2="val2"',
 					'vless' => 'n',
 				),
 			),
@@ -688,13 +721,13 @@ public function data_attribute_parsing() {
 				'att' => array(
 					'name' => 'att',
 					'value' => 'val',
-					'whole' => "att='val'",
+					'whole' => 'att="val"',
 					'vless' => 'n',
 				),
 				'att2' => array(
 					'name' => 'att2',
 					'value' => 'val2',
-					'whole' => "att2='val2'",
+					'whole' => 'att2="val2"',
 					'vless' => 'n',
 				),
 			),
@@ -706,13 +739,13 @@ public function data_attribute_parsing() {
 				'att' => array(
 					'name' => 'att',
 					'value' => 'val',
-					'whole' => "att='val'",
+					'whole' => 'att="val"',
 					'vless' => 'n',
 				),
 				'att2' => array(
 					'name' => 'att2',
 					'value' => 'val2',
-					'whole' => "att2='val2'",
+					'whole' => 'att2="val2"',
 					'vless' => 'n',
 				),
 			),
@@ -739,34 +772,67 @@ public function data_attribute_parsing() {
 		// Malformed Equals Patterns.
 		yield 'multiple equals signs' => array(
 			'att=="val"',
-			array(),
+			array(
+				'att' => array(
+					'name' => 'att',
+					'value' => '="val"',
+					'whole' => 'att="="val""',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'equals with strange spacing' => array(
 			'att= ="val"',
-			array(),
+			array(
+				'att' => array(
+					'name' => 'att',
+					'value' => '="val"',
+					'whole' => 'att="="val""',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'triple equals signs' => array(
 			'att==="val"',
-			array(),
+			array(
+				'att' => array(
+					'name' => 'att',
+					'value' => '=="val"',
+					'whole' => 'att="=="val""',
+					'vless' => 'n',
+				),
+			),
 		);
 
 		yield 'equals echo pattern' => array(
 			"att==echo 'something'",
 			array(
-				'att' => array(
+				'att' => array(
 					'name' => 'att',
 					'value' => '=echo',
 					'whole' => 'att="=echo"',
 					'vless' => 'n',
 				),
+				"'something'" => array(
+					'name' => "'something'",
+					'value' => '',
+					'whole' => "'something'",
+					'vless' => 'y',
+				),
 			),
 		);
 
 		yield 'attribute starting with equals' => array(
 			'= bool k=v',
 			array(
+				'=' => array(
+					'name' => '=',
+					'value' => '',
+					'whole' => '=',
+					'vless' => 'y',
+				),
 				'bool' => array(
 					'name' => 'bool',
 					'value' => '',
@@ -785,18 +851,43 @@ public function data_attribute_parsing() {
 		yield 'mixed quotes and equals chaos' => array(
 			'k=v ="' . "' j=w",
 			array(
-				'k' => array(
+				'k' => array(
 					'name' => 'k',
 					'value' => 'v',
 					'whole' => 'k="v"',
 					'vless' => 'n',
 				),
+				'="' . "'" => array(
+					'name' => '="' . "'",
+					'value' => '',
+					'whole' => '="' . "'",
+					'vless' => 'y',
+				),
+				'j' => array(
+					'name' => 'j',
+					'value' => 'w',
+					'whole' => 'j="w"',
+					'vless' => 'n',
+				),
 			),
 		);
 
 		yield 'triple equals quoted whitespace' => array(
 			'===" "',
-			array(),
+			array(
+				'=' => array(
+					'name' => '=',
+					'value' => '="',
+					'whole' => '=="=""',
+					'vless' => 'n',
+				),
+				'"' => array(
+					'name' => '"',
+					'value' => '',
+					'whole' => '"',
+					'vless' => 'y',
+				),
+			),
 		);
 
 		yield 'boolean with contradictory value' => array(
@@ -820,7 +911,13 @@ public function data_attribute_parsing() {
 		yield 'empty attribute name with value' => array(
 			'="value" class="test"',
 			array(
-				'class' => array(
+				'="value"' => array(
+					'name' => '="value"',
+					'value' => '',
+					'whole' => '="value"',
+					'vless' => 'y',
+				),
+				'class' => array(
 					'name' => 'class',
 					'value' => 'test',
 					'whole' => 'class="test"',
@@ -890,7 +987,7 @@ public function data_protocol_filtering() {
 				'href' => array(
 					'name' => 'href',
 					'value' => 'alert(1)',
-					'whole' => "href='alert(1)'",
+					'whole' => 'href="alert(1)"',
 					'vless' => 'n',
 				),
 			),
@@ -925,8 +1022,8 @@ public function data_protocol_filtering() {
 			array(
 				'src' => array(
 					'name' => 'src',
-					'value' => 'text/html, <script>alert(1)</script>',
-					'whole' => 'src="text/html, <script>alert(1)</script>"',
+					'value' => 'text/html,<script>alert(1)</script>',
+					'whole' => 'src="text/html,<script>alert(1)</script>"',
 					'vless' => 'n',
 				),
 			),
</details>
Here are two examples from the most popular plugins in the WP Directory search:
From YITH (this appears to be part of the yith library used in many of their plugins):
/**
 * Transform attributes array to HTML attributes string.
 * If using a string, the attributes will be escaped.
 * Prefer using arrays.
 *
 * @param array|string $attributes The attributes.
 * @param bool         $echo       Set to true to print it directly; false otherwise.
 *
 * @return string
 * @since 3.7.0
 * @since 3.8.0 Escaping attributes when using strings; allow value-less attributes by setting value to null.
 */
function yith_plugin_fw_html_attributes_to_string( $attributes = array(), $echo = false ) {
	$html_attributes = '';
	if ( ! ! $attributes ) {
		if ( is_string( $attributes ) ) {
			$parsed_attrs = wp_kses_hair( $attributes, wp_allowed_protocols() );
			$attributes   = array();
			foreach ( $parsed_attrs as $attr ) {
				$attributes[ $attr['name'] ] = 'n' === $attr['vless'] ? $attr['value'] : null;
			}
		}
		if ( is_array( $attributes ) ) {
			$html_attributes = array();
			foreach ( $attributes as $key => $value ) {
				if ( ! is_null( $value ) ) {
					$html_attributes[] = esc_attr( $key ) . '="' . esc_attr( $value ) . '"';
				} else {
					$html_attributes[] = esc_attr( $key );
				}
			}
			$html_attributes = implode( ' ', $html_attributes );
		}
	}
	if ( $echo ) {
		// Already escaped above.
		echo $html_attributes; // phpcs:ignore WordPress.Security.EscapeOutput.OutputNotEscaped
	}
	return $html_attributes;
}

$params = wp_kses_hair( $params, array( 'http' ) );
$width  = isset( $params['width'] ) ? (int) $params['width']['value'] : 0;
$height = isset( $params['height'] ) ? (int) $params['height']['value'] : 0;
$wh     = '';
if ( $width && $height ) {
	$wh = "&w=$width&h=$height";
}
$url = esc_url_raw( "https://www.youtube.com/watch?v={$match[3]}{$wh}" );
Trac ticket: Core-63694
wp_kses_hair() is built around an impressive state machine for parsing the $attr of an HTML tag, that is, the span of text after the tag name and before the closing >. Unfortunately, that parsing code doesn’t fully implement the HTML specification and may be prone to mis-parsing.
This patch replaces the existing state machine with a straightforward use of the HTML API to parse the attributes for us, constructing a shell tag for the $attr string and reading the attributes structurally. This shell is necessary because a previous stage of the pipeline has already separated what it thinks is the so-called “attribute list” from a tag.