Make WordPress Core

Opened 6 months ago

Last modified 3 months ago

#63426 new defect (bug)

wptexturize breaks markup

Reported by: madhazelnut's profile madhazelnut Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 6.6.2
Component: Formatting Keywords: wptexturize needs-patch dev-feedback
Focuses: Cc:

Description

If wptexturize is fed the following [valid](https://validator.w3.org/) HTML:

<!DOCTYPE html>
<html>
<head><title>Hello World</title></head>
<body>
<iframe srcdoc="<body></body>"></iframe>
</body>
</html>

It'll mess it up incorrectly converting the closing quote of the srdoc like such:

<iframe id="deeplawn-frame" srcdoc="<body></body>&#8221; ></iframe>

Change History (9)

#1 @sabernhardt
6 months ago

  • Component changed from General to Formatting
  • Keywords wptexturize added

#2 follow-up: @smahjoob
6 months ago

Testing Instructions

These steps define how to reproduce the wptexturize() srcdoc bug, and indicate the expected behavior.

Steps to Reproduce

In a clean install of WordPress trunk (r60231, May 11 2025) with Twenty Twenty-Five active and no plugins, load a PHP script or post that outputs:

html
Copy
Edit
<iframe srcdoc="<body></body>"></iframe>
Pass that HTML through wptexturize(), e.g. via:

php
Copy
Edit
echo wptexturize( '<iframe srcdoc="<body></body>"></iframe>' );
🐞 Observe that the output becomes:

html
Copy
Edit
<iframe srcdoc="<body></body>&#8221; ></iframe>
—the closing quote of srcdoc is replaced with &#8221;, corrupting the attribute.

Expected Results

When testing a patch to validate it works as expected:

✅ wptexturize() leaves the srcdoc attribute intact, producing:

html
Copy
Edit
<iframe srcdoc="<body></body>"></iframe>
When reproducing the bug (no patch applied):

❌ The closing " in srcdoc is not converted to a smart quote, and valid HTML remains valid.

Supplemental Artifacts

Add Inline: No image "REPLACE_WITH_DIAGNOSTIC_SCREENSHOT_URL" attached to Ticket #63426

Test Report Icons:
🐞 <= Indicates where issue (“bug”) occurs.
✅ <= Behavior is expected.
❌ <= Behavior is NOT expected.

Version 3, edited 6 months ago by smahjoob (previous) (next) (diff)

#3 in reply to: ↑ 2 ; follow-up: @madhazelnut
6 months ago

Replying to smahjoob:

Re-tested on trunk r60231 (May 11 2025) with Twenty Twenty-Five and no plugins.
Both /wp-admin/ and /wp-admin/edit.php?view=embed only highlight the correct menu item (Dashboard and Posts, respectively) with no stray current-menu-item classes. WP_DEBUG shows no related notices.
Result: Cannot reproduce the reported issue in core.

was this intended for a differnt issue?

#4 in reply to: ↑ 3 ; follow-up: @smahjoob
6 months ago

Replying to madhazelnut:

Replying to smahjoob:

Re-tested on trunk r60231 (May 11 2025) with Twenty Twenty-Five and no plugins.
Both /wp-admin/ and /wp-admin/edit.php?view=embed only highlight the correct menu item (Dashboard and Posts, respectively) with no stray current-menu-item classes. WP_DEBUG shows no related notices.
Result: Cannot reproduce the reported issue in core.

was this intended for a differnt issue?

I apologize for the mistakes. It was corrected.

#5 in reply to: ↑ 4 ; follow-up: @SirLouen
6 months ago

  • Keywords needs-patch added

Replying to smahjoob:

was this intended for a differnt issue?

I apologize for the mistakes. It was corrected.

You are welcome to submit a patch with your code. Check these instructions

@sabernhardt what is wptexturize workflow keyword? 👀

Last edited 6 months ago by SirLouen (previous) (diff)

#6 in reply to: ↑ 5 @sabernhardt
6 months ago

The workflow keywords generally should not be used for "tags" like wptexturize, but fixing wptexturize is one of the projects that spanned multiple related tickets (and the project was not completed).

This ticket was mentioned in Slack in #core-test by oglekler. View the logs.


6 months ago

#8 @SirLouen
6 months ago

  • Keywords dev-feedback added

In today's test-triage we were discussing if wptexturize uses were being taken too far, maybe. Judging by the docs, it's difficult to understand the scope of this function. It appears that it's open for too many use cases. You can use wptexturize from a whole HTML document to just for a paragraph (ranging from maximum complexity to something relatively trivial)

@oglekler was suggesting that current exclusions are still very loose. So, for example, iframe should be, by default, in the list of exclusions (default_no_texturize_tags)

I'm still not convinced for such exclusion, but I believe that docs should consider better explaining the scope of this function. Is it really intended to be a Swiss army for text styling almost any doc given? Or its better intended to just over trivial scenarios?

Probably the looseness of this function is the main reason of the chain of tickets related to wptexturize. Adding dev-feedback for further discussion about this topic.

#9 @madhazelnut
3 months ago

I've resorted to completely bypassing wptexturize for now, by calling get_content() instead of the usual the_content() for this very reason. I too find the scope of this function unclear, and thus think it should be refactored into multiple things.

Note: See TracTickets for help on using tickets.