Opened 11 years ago
Last modified 7 months ago
#27350 assigned defect (bug)
Invalid HTML Output
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | 3.8 |
Component: | Formatting | Keywords: | wpautop has-patch has-unit-tests |
Focuses: | Cc: |
Description
Paste into editor's Text tab:
<div>hello <pre>test</pre> world</div>
Preview output:
<div>hello</p> <pre>test</pre> <p>world</p></div>
Seen in 3.8 and in trunk. Doesn't seem to be an editor bug because switching tabs will improve the output. Something wrong with wpautop()?
Attachments (7)
Change History (48)
#3
@
11 years ago
Yes, it is wpautop()
wreaking havoc. FWIW, one way to prevent it is to switch to the visual tab, which inserts a huge amount of white space. The extra white space, while being offensive to those that hand code HTML, does allow wpautop()
to work correctly.
I've traced the problem to the block of code starting at line 265 of wp-includes/formatting.php. The block starting with:
$pee = preg_replace('|<p>\s*</p>|', '', $pee); // under certain strange conditions it could create a P of entirely whitespace
I've no idea how to correct for the errors though without potentially messing up other HTML.
The function essentially adds opening and closing P tags at the front and end of every line, then removes those tags not adjacent to multiple newlines. However, there's a slew of regexp replacements to account for special situations which makes the whole thing very complex and prone to unanticipated errors like you've found..
#5
@
11 years ago
Submitted the unit test for both cases discussed here. I also worked out a solution, but it fails on the other tests. I'll spend more time to understand if it is failing because of hardcoded assumptions in the prior tests or if the fix is incomplete.
#6
@
11 years ago
I took a deeper look and found that my assumptions for the unit tests were faulty. I updated the test for #27350 along with the tests I added for #18136 and #20444. For clarity Autop-27350-fixed.diff
includes the updated unit tests for all three tickets (#27350, #18136, and #20444).
I submitted the fix (formatting-27350.diff
`) I developed for this ticket (#27350) which also fixes #18136 and #20444. In addition to passing the new unit tests, it passes all existing unit tests.
I found that the $allblocks
preg fragment allowed tr
in it to match <track>
elements and the p
in the preg fragment matched <param>
elements. Appending \b
outside of the subpattern matches the word boundary to prevent unintented elements from matching the pattern.
Adding the ltrim
to the preg_split
prevents an empty element in the $pees
blocks which can cause some other odd cases (empty <p></p>
blocks).
The primary fix to the unit tests adds preg_match
inside the $pees
loop. It avoids unnecessarily wrapping block elements in <p>
tags and handles the special case of <blockquote>
tags.
I imagine this will cause a performance penalty. I have not yet run benchmarks to understand if a signficant hit is going to occur. I will report back on this.
Understanding that all of this will work will be supplanted by #25856, it is still advantageous to clean up these edge cases now and add coverage to the unit tests.
#7
@
11 years ago
I ran performance tests using a 4K word test document (with <blockquote>
, <video>
, <table>
and other block elements sprinkled throughout) that was run through wpautop
1000x. I used the exact same methodology for both tests (for
loop calling wpautop
)and captured results through xdebug profiling. The results show a modest slowdown as expected. Results:
Original:
Runtime: 15034 milliseconds
Function: wpautop
Invocation Count: 1000
Total Self Cost: 12.60
Total Inclusive Cost: 98.21
Modified:
Runtime: 18752 milliseconds
Function: wpautop
Invocation Count: 1000
Total Self Cost: 24.40
Total Inclusive Cost: 95.96
Test text is included in the wpautop_profiler_reports.zip
#8
@
11 years ago
I suggest checking the performance impact of each part of the patch. For example, the \b solution may not be optimal, but I'd bet the slowdown you're seeing is related to the multiple PCRE function calls added to the inner loop.
Also note the return value of preg_match() is never > 1.
#9
@
11 years ago
Yes, of course, simply confused the return with preg_match_all
.
As for performance, it's not much in the way of slow down at all. Even scaled up at 1000x, we're ultimately looking at an average slow down of +3ms on ~4K-word texts. It's not up to me to decide if the fixes for 3 tickets is worth +3 ms.
#10
@
11 years ago
+1 for implementing this. Looks like a well thought out solution that fixes multiple issues.
#20
@
6 years ago
- Keywords has-unit-tests added; dev-feedback needs-unit-tests removed
- Milestone set to 5.3
- Owner set to pento
- Status changed from new to assigned
27350.2.diff combines the test cases from all of the duplicate tickets. The solution takes inspiration from the patch on this ticket, as well as the attempted fixes on the other tickets.
This ticket was mentioned in Slack in #meta by dd32. View the logs.
6 years ago
#30
@
6 years ago
Sadly, the fix in [45585] had severe performance issues on PHP 7.2 and earlier. I've reverted it for now, to give it some more time to percolate.
This ticket was mentioned in Slack in #meta by ipstenu. View the logs.
6 years ago
This ticket was mentioned in Slack in #meta by sergey. View the logs.
6 years ago
#34
@
6 years ago
- Milestone changed from 5.3 to Future Release
This is one of the "peculiar ways" wpautop works... The "Text" tab in the classic editor has a visual component to it. It is not a "pure HTML" input, it converts line breaks to paragraphs and <br>
tags.
In this case the error is that the first closing </p>
should be a <br>
. It may be possible to add another regex to search for this exact case and fix it, but chances are something else will break. The bad part about such breakage is that it gets saved to the DB and then introduces problems on every page load on the front-end. As wpautop use is on its way out, not sure it's worth the risks of fixing these rare edge cases.
Here's another test case:
Preview output: