#8912 closed defect (bug) (fixed)
wptexturize malforms HTML comments that contain HTML tags
Reported by: | | Owned by: | |
---|---|---|---|
Milestone: | 4.0 | Priority: | normal |
Severity: | normal | Version: | 2.7 |
Component: | Formatting | Keywords: | has-patch wptexturize |
Focuses: | Cc: |
Description
Because it's replacing -- with &#8211; (an en dash), a comment like <!-- whatever --> put into the HTML part of a post gets broken.
This makes it difficult for people writing special HTML in posts (like people putting in object tags, or javascript, or whatever) to do that sort of thing.
What is needed is to recognize --> as different from -- and not replace it with the en dash in that case.
Attachments (2)
Change History (43)
#3
@
15 years ago
Note: Visual mode doesn't allow HTML comments; the <>'s get altered to &lt; and &gt; and so on.
But in HTML mode, comments work fine in 2.9.2, at least.
#4
@
15 years ago
- Component changed from General to Formatting
- Milestone set to 3.0
- Resolution worksforme deleted
- Status changed from closed to reopened
- Version changed from 2.7 to 2.9.2
Have a confirmed failure.
This code fails:
<ul><li>Hello.</li><!--<li>Goodbye.</li>--></ul>
It gets converted to this:
<ul> <li>Hello.</li> <p><!-- <li>Goodbye.</li> <p>–></ul>
Happens on a default installation.
#5
@
15 years ago
You're including extra filters in that, Otto.
Here's what wptexturize() returns for that example string:
<ul><li>Hello.</li><!--<li>Goodbye.</li>–></ul>
#7
@
15 years ago
- Summary changed from wp_texturize breaks HTML comments in posts to wptexturize malforms HTML comments that contain HTML tags
BTW, it would have been better to open a new ticket on this as this is a separate issue. ;)
#8
follow-up:
↓ 9
@
15 years ago
How is it a separate issue? It does exactly what I said it did last year. The comment got converted to the 8211 incorrectly. It's the same issue.
#9
in reply to:
↑ 8
@
15 years ago
Replying to otto42:
How is it a separate issue? It does exactly what I said it did last year. The comment got converted to the 8211 incorrectly. It's the same issue.
<!-- foobar -->
works though, right? That was the original issue.
The reason -- is being converted to – in your above example is that the HTML tags in the comment are breaking wptexturize()'s comment detection (it doesn't realize it's an HTML comment). It's totally a valid issue, but this is a bug in the HTML comment detection code rather than a problem with all HTML comments.
In short, I'm just nitpicking minor technicalities. Don't mind me. :)
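To make the detection failure concrete, here is a minimal Python sketch. It assumes the tokenizer treats any non-greedy `<...>` run as a tag, so a token ends at the first `>` after a `<`. The names `TAG` and `naive_texturize` are hypothetical, and this is an illustration of the idea rather than the code in core.

```python
import re

# Assumption: tags are found with a non-greedy "<...>" pattern, so a token
# ends at the first ">" after "<". This mimics the idea, not the core code.
TAG = re.compile(r"(<.*?>)", re.S)

def naive_texturize(html):
    parts = TAG.split(html)
    # With one capturing group, even indexes are text and odd indexes are tags.
    parts[0::2] = [text.replace("--", "\u2013") for text in parts[0::2]]
    return "".join(parts)

# A comment with tags inside: "<!--<li>" is taken as one tag, so the
# trailing "-->" ends up in a text token and is texturized.
print(naive_texturize("<ul><li>Hello.</li><!--<li>Goodbye.</li>--></ul>"))
# <ul><li>Hello.</li><!--<li>Goodbye.</li>–></ul>

# A simple comment is a single "<...>" token and is left alone.
print(naive_texturize("<ul><li>Hello.</li><!--Goodbye.--></ul>"))
# <ul><li>Hello.</li><!--Goodbye.--></ul>
```

Under that assumption, the closing `-->` is only at risk when an earlier `>` inside the comment ends the token early.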
#14
@
14 years ago
On production sites, HTML comments in post_content seem to be extremely rare, and commented-out HTML, like the example above, is virtually nonexistent. Currently wptexturize() supports simple HTML comments properly:
<ul><li>Hello.</li><!--Goodbye.--></ul>
works as expected.
The only use case for supporting commented-out HTML seems to be when a plugin developer wants to test the plugin's output. I don't think it's worth adding a redundant regexp that will run on hundreds of millions of posts every day just for that. Perhaps it would be better to note in the plugin developers' part of the Codex that commented-out HTML is not supported in post content.
Suggesting: wontfix.
#15
@
14 years ago
Responding to @azaozz:
I have a plugin that is affected by this in some cases. It's been an outstanding bug for a long time now, and I can't fix it within the plugin itself. See the Graceful Pull-Quotes plugin. Users can input an "alternate" text within an HTML comment, but with this bug they can't use tags at all. A similar (the same?) bug prevents HTML entities in comments -- they get malformed in the same way as tags. Non-English speakers contact me all the time with this.
#16
@
14 years ago
I can also see situations where an author wants to temporarily "deactivate", but not delete, a chunk of text, and puts it in a comment. He better not have any tags in that text! So, yes I think this is a legitimate problem.
#19
@
14 years ago
- Milestone changed from Future Release to 3.3
Moving to 3.3, as #16060 was marked 3.2-early but didn't make it into 3.2.
#21
@
13 years ago
- Keywords needs-refresh added
Since [17636], the result for my example from #16060 is a bit different:
<!-- Sample list <ul> <li>Sample item</li> </ul> -->
Output:
<!-- Sample list <ul> <li>Sample item</li> -->
The comment closing tag is preserved, but the </ul> tag is missing.
#23
@
13 years ago
- Keywords needs-refresh needs-unit-tests removed
I have a workaround. Just before the close comment tag, put another open comment tag. That second open comment tag will be commented out, but it forces the close tag to be recognized and not be reformatted as an en-dash.
Example that fails:
<!-- <b> text </b> -->
Example that works:
<!-- <b> text </b> <!-- -->
In other words, if you have comments that enclose other tags, just make sure the last tag is another open comment tag.
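One possible reading of why this workaround helps, sketched in Python: if tags are tokenized as non-greedy `<...>` runs (an assumption about the tokenizer, not a quote of the core code), the extra `<!--` makes the final `<!-- -->` a tag token of its own. The helper name `tokens` is hypothetical.

```python
import re

def tokens(html):
    # Assumption: a simplified tokenizer that splits on non-greedy "<...>" runs.
    return [t for t in re.split(r"(<.*?>)", html, flags=re.S) if t]

# Failing example: the closing " -->" is left over as plain text.
print(tokens("<!-- <b> text </b> -->"))
# ['<!-- <b>', ' text ', '</b>', ' -->']

# Workaround: the closing "-->" is now inside a "<...>" token.
print(tokens("<!-- <b> text </b> <!-- -->"))
# ['<!-- <b>', ' text ', '</b>', ' ', '<!-- -->']
```

Only text tokens would be texturized in this model, so the second form keeps its closing delimiter intact.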
#25
@
13 years ago
- Cc harrismw added
- Version changed from 2.9.2 to 3.3.1
This still exists in the latest version of WP (3.3.1), which I just updated to from 2.9.something.
The code which sets it off is:
<!-- <li><a href="/news-and-events">News & Events</a>: A record of news and events related to philosophy of religion.</li> -->
I think I must have edited core before to work around this bug. Which, funny thing, with the update ...
(Also, where's the "bloody annoying" severity level? I don't think it's "major", but it's certainly not "normal".)
#26
@
13 years ago
- Version changed from 3.3.1 to 2.7
Version number indicates when the bug was initially introduced/reported.
#27
follow-up:
↓ 28
@
13 years ago
- Cc curtiss@… added
I've come across a new wrinkle in this issue that I haven't seen mentioned yet. HTML comments around entire lines (or groups of lines) occasionally cause those lines to be deleted altogether (and the HTML structure to get messed up) when switching between the HTML editor and the Visual Editor.
Take the following code for example:
<ol> <li>List Item 1</li> <li>List Item 2</li> <li>List Item 3</li> </ol>
Now, add an HTML comment like so:
<ol> <li>List Item 1</li> <!-- <li>List Item 2</li>--> <li>List Item 3</li> </ol>
Then, switch to the visual editor and switch back to the HTML editor. You're left with the following:
<ol> <ol> <li>List Item 1</li> </ol> </ol> <ol> <li>List Item 3</li> </ol>
As another example, if you take the following HTML:
List Item 1 <strong>List Item 2</strong> List Item 3
Then, add an HTML comment like:
List Item 1 <!--<strong>List Item 2</strong>--> List Item 3
Then, switch back to the visual editor and back to the HTML editor again, and you're left with the following code in the HTML editor:
List Item 1 List Item 3
#28
in reply to:
↑ 27
@
13 years ago
Replying to cgrymala:
I am also seeing this. Furthermore, the comments are being completely removed, even if there are no tags inside. For example, the following text, set up for use with the 'Graceful Pull-Quotes' plugin
<span class="pullquote"><!--Joan Cuneos successes, against prominent male racers in the USA, led to women being banned--></span>
completely disappears after switching to visual then back to HTML. This bug would appear to completely prohibit the use of hidden text for special formatting.
#31
@
12 years ago
Got a 'weird' bug on my plug-in "Cimy User Extra Fields", which in the end was due to exactly this issue. It took me an hour of investigation to understand what was going on.
@ericaentertainment
Thanks for the workaround. It works great, but may break in the future. A real fix from the WP developers would be nice. I know I can avoid HTML comments, but you could also avoid messing with them :)
Thank you.
#32
@
12 years ago
Actually my issue was even more subtle:
Basically, I have an HTML comment in my plugin with just the plugin's name, version and author, and no HTML tags.
So how did that trigger this bug?
Certain themes were first applying wpautop() to the_content(), which transformed all newlines into HTML break tags, and then applying wptexturize().
So in that case, even though I had a totally harmless comment, because it spanned multiple lines, the combination of the two functions changed the end of my HTML comment, and the whole page broke.
This is the piece found in some themes, like "Modular" and "Emporium" for "WooCommerce":
# Format and append to content
$new_content .= wptexturize(wpautop($piece_intl));
To fix the problem I had to remove all newlines from my HTML comment.
Just to give an idea of how a bug can be amplified by pure "luck" :)
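The interaction described in this comment can be reproduced with a small Python sketch. Two loud assumptions: `nl2br_stand_in` is a hypothetical stand-in that only models the newline-to-`<br />` step (the real wpautop() does much more), and `naive_texturize` models tag detection as non-greedy `<...>` matching rather than reproducing the core code.

```python
import re

def nl2br_stand_in(text):
    # Hypothetical stand-in for wpautop(): only the newline -> <br /> step.
    return text.replace("\n", "<br />\n")

def naive_texturize(html):
    # Assumption: tags are non-greedy "<...>" runs; only text gets "--" replaced.
    parts = re.split(r"(<.*?>)", html, flags=re.S)
    parts[0::2] = [text.replace("--", "\u2013") for text in parts[0::2]]
    return "".join(parts)

comment = "<!-- My Plugin\nVersion 1.0 -->"

# On its own, the multi-line comment is one token and survives.
print(naive_texturize(comment) == comment)  # True

# After the break tags are inserted, the ">" of "<br />" ends the token
# early and the closing "-->" is texturized.
print(repr(naive_texturize(nl2br_stand_in(comment))))
# '<!-- My Plugin<br />\nVersion 1.0 –>'
```

In this model the comment itself contains no tags; the `<br />` added by the first filter is what introduces the early `>`.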
#33
@
11 years ago
- Keywords needs-unit-tests removed
We already "disable" wptexturize() when we encounter certain shortcodes and tags. If we added support for HTML comments (disable texturize when we see <!--, re-enable it after -->), I think that'd solve our problems and shouldn't be too much of a nightmare to implement.
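A rough Python sketch of the "disable inside comments" idea, under the assumption that it can be done at the tokenizing step by matching a whole `<!-- ... -->` span before ordinary tags. The names `TOKEN` and `texturize_skip_comments` are hypothetical, and this is not the eventual core patch.

```python
import re

# Comments are tried first, so "<!-- ... -->" is one token even when it
# contains tags. Ordinary tags are matched by the second alternative.
TOKEN = re.compile(r"(<!--.*?-->|<[^>]*>)", re.S)

def texturize_skip_comments(html):
    parts = TOKEN.split(html)
    # Even indexes are text between tokens; only those get "--" replaced.
    parts[0::2] = [text.replace("--", "\u2013") for text in parts[0::2]]
    return "".join(parts)

print(texturize_skip_comments("<ul><li>Hello.</li><!--<li>Goodbye.</li>--></ul>"))
# <ul><li>Hello.</li><!--<li>Goodbye.</li>--></ul>

print(texturize_skip_comments("a -- b <!-- <b>x -- y</b> -->"))
# a – b <!-- <b>x -- y</b> -->
```

The sketch leaves unterminated comments to fall through as text, which a real implementation would need to decide how to handle.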
An alternative, cheap fix would be to not do the '--' en-dash replacement if the next character is >. We'd never want an en dash there.
[17636] made this patch stale.
Removing needs-unit-tests as some exist, though more would of course be nice.
#34
follow-up:
↓ 37
@
11 years ago
- Keywords needs-refresh removed
8912.diff is a cheap patch. It simply does a lookahead for > before replacing -- with an en dash. So, the closing HTML comment is at least no longer broken. This only allows an assertion to pass, which I've broken off into a discrete test.
The bug still remains that inside HTML comments, things are still texturized. (As seen by the failed test.) Given the nature of HTML comments, this isn't a big deal, but it sure would be a nicer fix overall.
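For illustration, the lookahead approach might look like the following in Python. This is a sketch of the technique only, not the contents of 8912.diff; the function name `dash_with_lookahead` is hypothetical, and it is meant to run on text tokens rather than on whole documents.

```python
import re

def dash_with_lookahead(text):
    # Replace "--" with an en dash unless the very next character is ">".
    return re.sub(r"--(?!>)", "\u2013", text)

# A stray comment terminator in a text token is left alone.
print(dash_with_lookahead(" -->"))  # " -->" is unchanged

# Ordinary double hyphens are still converted, including ones that sit
# inside an undetected comment body.
print(dash_with_lookahead(" x -- y -->"))
#  x – y -->
```

As written, the pattern only looks at the single character after `--`, so any whitespace before `>` would not be covered.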
#37
in reply to:
↑ 34
@
11 years ago
Replying to nacin:
It simply does a lookahead for > before replacing --
Does WP allow any space between those? I believe it's permitted in HTML.
#38
@
11 years ago
8912.diff would break xn-- which is magical.
Let's visit this ticket only after closing #23185 to avoid confusion.
#39
@
11 years ago
Alternatively, we could roll this up with #12690 because there is no reason to texturize inside HTML comments.
Just tested with trunk. No problems.