Opened 18 years ago
Last modified 5 months ago
#2691 accepted defect (bug)
HTML comments in posts aren't handled properly.
Reported by: | gord | Owned by: | adamsilverstein |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | 2.8.5 |
Component: | Formatting | Keywords: | wpautop dev-feedback has-patch needs-unit-tests needs-refresh close |
Focuses: | Cc: |
Description
When an HTML comment is added in a post, autop adds paragraph ( <p> ) tags around the comment and for multi-line comments line breaks ( <br /> ) are added after every line. This should not happen in HTML comments.
This ticket is similar to #712 which was closed with wontfix. I would like to know why this isn't seen as an issue? It prevents the addition of RDF and other metadata, not to mention just plain old HTML comments in posts.
Attachments (9)
Change History (48)
#2
@
18 years ago
Maybe a bandaid on it? (e.g.)
$pee = preg_replace('!<p><!--(.*?)--></p>!ise', " '<!--' . stripslashes(clean_pre('$1')) . '-->' ", $pee);
clean_pre() will ruin any comments which are supposed to have <p> and <br /> in them, but that's an even edgier case.
Gord, you could also try a plugin replacement for wpautop -- I know Markdown handles comments fine.
#3
@
17 years ago
- Resolution set to wontfix
- Status changed from new to closed
No traction in over a year, so closing as wontfix.
Feel free to reopen if you have patches/suggestions/...
#4
@
15 years ago
- Cc vikingjs added
- Resolution wontfix deleted
- Status changed from closed to reopened
- Version changed from 2.0 to 2.8.5
I can't believe this got no traction - I use plugins that conscientiously mark where their added content begins and ends using HTML comments, and those comments are adding empty paragraphs in my posts, which in some cases affect layout and undermine the aesthetics of the page.
This issue affects anyone who uses plugins that modify the text of the blog post, e.g., SexyBookmarks.
I'll look at what it would take to patch it, but someone who knows how and where the unwanted behavior is occurring could do it in a fraction of the time.
#5
follow-up:
↓ 6
@
15 years ago
- Keywords wpautop needs-patch added; autop removed
- Milestone set to Future Release
The function responsible is called wpautop() and can be found in /wp-includes/formatting.php
#6
in reply to:
↑ 5
@
15 years ago
Replying to scribu:
The function responsible is called wpautop() and can be found in /wp-includes/formatting.php
I took a look and I'm probably not the guy to be fixing this. As far as I can tell the band-aid advised by random above is a good fix. The reason I think I'm not the guy is because I don't know why stripslashes() and clean_pre() are needed for a comment, so I bow to the superior understanding of others.
random calls his patch a band-aid, but it would be just another in a long list of band-aids in the function. The philosophy seems to be to add paragraph tags and then go back and repair special cases where they broke things. This would be just another special case. Ideally it would be better simply not to apply the wpautop routines to comments (or some of the other cases like embed tags) in the first place.
#7
@
14 years ago
- Cc mrclay added
I've created a more resilient autop implementation and a more rigorous testcase cobbled from various WP tickets. It avoids the dangerous regex operations by doing most work in a DOMDocument object.
This is undoubtedly slower than current wpautop (and may choke on extremely invalid HTML), but happily handles markup that the current wpautop breaks. Since there's no real spec for wpautop, it's hard to duplicate the style of markup wpautop outputs when it does work, but a lot of testing will surely be necessary.
#9
@
10 years ago
- Keywords dev-feedback added
Hey All,
2691.2.diff allows my unit tests for HTML comments to pass, however it caused some others related to video and embed objects to fail. Can I get another set of eyes on this to make sure I'm not missing anything?
#10
@
9 years ago
I'm working on a new patch for this and want some fresh comments regarding @collinsinternet patch which attempts to avoid adding BR directly after an HTML comment.
Is the BR after the comment part of this bug? Or is it only the BR that gets added inside a comment?
Would it be a step in the right direction for wpautop to start avoiding HTML comments but still add a BR when the comment is followed by a single newline character?
What is the desired behavior when an HTML comment is surrounded by multiple newline characters?
#11
@
8 years ago
In 2691.3.diff :
- refactor unit tests using dataProvider
- only remove the tailing </p> after an HTML comment when the paragraph starts as a comment
*adjust the regex for adding breaks after comments - exclude cases where a closing </p> tag is already present
all of the formatting group unit tests now pass. appreciate testing & feedback.
#12
@
7 years ago
2691.4.diff updates unit tests for test_multiline_comment_with_embeds
to pass after this change, removing some stray <br />
tags in the expected results.
#13
@
7 years ago
- Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed
@nacin or @azaozz any feedback here? does 2691.4.diff look good?
This is an 11 year old ticket and we should get this fixed! This might affect stored block content for Gutenberg (which uses HTML comments to store blocks).
This ticket was mentioned in Slack in #core by adamsilverstein. View the logs.
7 years ago
This ticket was mentioned in Slack in #core by adamsilverstein. View the logs.
7 years ago
This ticket was mentioned in Slack in #core by pento. View the logs.
7 years ago
#20
@
7 years ago
- Milestone changed from 4.9 to 5.0
Punting from 4.9. This needs more testing and feedback.
#21
@
7 years ago
I've just had to develop a workaround for this long term bug.
In my opinion wpautop shouldn't create any <br /> tags when dealing with new lines following comments.
It should produce the same results as if the comment were not present.
This ticket was mentioned in Slack in #core-js by adamsilverstein. View the logs.
7 years ago
#23
@
6 years ago
Who likes tests? I like tests. I'm going to use this ticket to test a tool I'm writing, you might see attachments and comments appear, I'll delete them as I go. 🙂
#25
@
6 years ago
- Milestone changed from 5.0 to 5.1
Gutenberg doesn't need this, as autop is turned off when block comments exist in the post content.
#26
@
6 years ago
Ran into this problem today, though it was my code at first, but found this ticket. WordPress 4.9.8 and Gutenberg 4.1.1. I created a dynamic block using only a title.
$ ./bin/wp.sh post get 75846 --field=post_content <!-- wp:paragraph --> <p><del>This is a paragraph.</del></p> <!-- /wp:paragraph --> <!-- wp:client/custom-block {"title":"This is the title"} /--> <!-- wp:paragraph --> <p></p> <!-- /wp:paragraph -->
When I view source on the page:
<!-- wp:paragraph --> <p><del>This is a paragraph.</del></p> <!-- /wp:paragraph --> <!-- wp:paragraph --> <p></p> <!-- /wp:paragraph -->
Changing the priority of the do_blocks
call from 7 to 5 in the Gutenberg plugin fixes it. My assumption is autop is running before Gutenberg and stripping out the dynamic block.
#27
@
6 years ago
- Milestone changed from 5.1 to 5.2
Punting to 5.2 because we don't have complete test coverage yet. Would also like to test the last report from @GunGeekATX in 5.0.3/5.1.
#28
@
5 years ago
2691.6.diff includes updated tests.
#29
@
5 years ago
- Milestone changed from 5.2 to 5.3
Punting to 5.3 and will try to land that very early in the cycle.
#30
@
5 years ago
2691.8.diff refresh against trunk and correct a failing test, fixes for phpcs - all tests now pass - https://travis-ci.com/WordPress/wordpress-develop/builds/117891005
#32
@
5 years ago
- Keywords commit removed
Nice work on the patch, @adamsilverstein!
Is there any reason it's checking for <!-
and ->
, instead of <!--
and -->
? This seems to cause issues sometimes. For example, adding this test to data_html_comments
fails:
array(
"foo\n\n<!-- HTML comment ->\n\n still in the comment... -->",
"<p>foo</p>\n<!-- HTML comment ->\n\n still in the comment... -->",
),
Also, can you explain the if
on line 607 of formatting.php
? It seems like it will only remove the trailing </p>
if the entire content starts with <!-
, rather than if the paragraph starts with it, as the comment suggests.
#33
@
5 years ago
Also, this regex is possibly wrong.
$pee = preg_replace( "|(\-\->.*[^</p>])\n|", "$1<br />\n", $pee );
I think it's intended to match everything that isn't a </p>
tag, until it reaches a \n
? It will actually match everything that isn't one of the characters <
, /
, p
, or >
, followed by a \n
.
Given that this is a fairly simple check, I think it'd be better to do it in the $pees
loop, instead. If a $tinkle
starts and ends with HTML comment syntax, don't wrap it in <p>
tags.
#34
@
5 years ago
@pento Thanks for the feedback and regex deciphering :) I'll work on addressing your feedback and add an additional test case to cover the first issue.
#35
@
5 years ago
- Keywords needs-unit-tests added; has-unit-tests removed
- Milestone changed from 5.3 to Future Release
This ticket could use a bunch of unit tests to verify the code works as expected. Milestoning this as 'future release' until there is some more progress.
This ticket was mentioned in Slack in #core by audrasjb. View the logs.
20 months ago
#37
@
20 months ago
- Keywords needs-refresh added
This ticket was mentioned in today's Old Ticket Triage Session :)
To summarize, the ticket:
- has a good approach in the most recent patches
- needs a new, refreshed patch
- needs refreshed unit tests + more test cases (like CDATA comments)
- needs testing on both Classic and Block editors
Thanks @poena, @SergeyBiryukov, @maigret, @wongjn.
#38
@
20 months ago
Also, from @poena:
Kind of related: https://github.com/WordPress/gutenberg/issues/47212 it discusses how broken html “block markup” is handled.
#39
@
20 months ago
- Keywords close added
Frankly I'd be quite reluctant to patch autop at this stage. It is "on its way out" and chances to introduce new bugs there are pretty high.
Looking at the patch, this seems problematic:
if ( strpos( $pee, '<!--' ) !== false ) { $pee = preg_replace( '|<p>\s*<\!--|', '<!--', $pee );
It will remove <p>
from strings that start with a comment like <p><!-- 123 -->text goes here</p>
.
If this must be patched, a preferable way would be to avoid wrapping lines that contain only HTML comments. I.e. something like (untested):
- $pee .= '<p>' . trim( $tinkle, "\n" ) . "</p>\n"; + $part = trim( $tinkle, "\n" ); if ( strpos( $part, '<!--' ) === 0 && strpos( $part, '-->' ) === ( strlen( $part ) - 3 ) ) { $pee .= "$part\n"; // Don't wrap stand-alone HTML comments in <p>. } else { $pee .= "<p>$part</p>\n"; }
Support for multi-line HTML comments would have to be added similarly to how <pre>
tags are treated, i.e. completely excluded from autop.
Still, the chance to introduce regressions here even by fixing the behavior are high. Perhaps better to leave this and similar cases as-is. If a fix is desirable, better to add in a plugin so the users can enable or disable it at will.
This is what is implied: nobody who has seen this has found it important enough to spend their time rewriting autop to work around comments. The applicable mantra is "edge case." You are welcome to work up a patch and submit it for testing and consideration.