Make WordPress Core

Opened 14 years ago

Last modified 5 years ago

#14050 assigned defect (bug)

shortcode_unautop() should also remove the <br /> added after shortcodes

Reported by: aaroncampbell's profile aaroncampbell Owned by: aaroncampbell's profile aaroncampbell
Milestone: Priority: normal
Severity: normal Version: 3.0
Component: Formatting Keywords: needs-refresh wpautop
Focuses: Cc:

Description

Currently wpautop() wraps a shortcode in <p> tags as well as adding a <br /> tag after the shortcode. We then use shortcode_unautop() to remove the <p> tags, but the <br /> stays.

To replicate, just drop a few caption shortcodes into a post and set them all to align left or right. You'll see that even though they all float (assuming that's how your theme handles them) they stair step down because of the extra <br />

I'm not a regex expert so someone should probably double check my patch, but it seems to work for me.

Attachments (10)

14050.001.diff (586 bytes) - added by aaroncampbell 14 years ago.
14050.002.diff (586 bytes) - added by aaroncampbell 13 years ago.
shortcode_unautop-unit-tests.diff (908 bytes) - added by aaroncampbell 13 years ago.
Unit Tests
t14050-shortcode-unautop-upgrade-tests.diff (2.1 KB) - added by lkraav 10 years ago.
t14050-shortcode-unautop-upgrade.diff (4.6 KB) - added by lkraav 10 years ago.
is-your-regular-expressions-workspace-better.png (90.7 KB) - added by lkraav 10 years ago.
I'd really like to know what your toolset is, whoever you are deciding on whether to merge the work. Maybe this RegexBuddy setup demo I did my patches on is helpful to someone.
plugin.php (3.3 KB) - added by lkraav 10 years ago.
Drop-in plugin v1 to demonstrate and test current patchwork
14050-unittests.diff (860 bytes) - added by MikeHansenMe 9 years ago.
see #30284
14050.diff (615 bytes) - added by jorbin 9 years ago.
Unit test removed as a part of #30284
14050-tests.diff (1.1 KB) - added by aaroncampbell 9 years ago.

Download all attachments as: .zip

Change History (47)

#1 @hakre
14 years ago

  • Keywords has-patch dev-feedback needs-testing needs-unit-tests added

Please take note of existing wpautop() tickets that are releated to the <p> and <br> tag. I just compiled a list of related tickets in here: #13340 maybe it's of use, and I think there is one ticket related to unautop() already but don't blame me if not.

#2 @nacin
13 years ago

  • Milestone changed from Awaiting Review to Future Release

#3 @aaroncampbell
13 years ago

  • Cc aaroncampbell added
  • Keywords 3.2-early added

#4 @bobbingwide
13 years ago

Hi, I'm new to this Trac system but I have a couple of comments.

  1. This problem is documented as having been fixed in WP 2.5.1.

extract from http://codex.wordpress.org/Shortcode_API#Output

wpautop recognizes shortcode syntax and will attempt not to wrap p or br tags around shortcodes that stand alone on a line by themselves. Shortcodes intended for use in this manner should ensure that the output is wrapped in an appropriate block tag such as <p> or <div>.

Note: in WordPress 2.5.0, shortcodes were parsed before post formatting was applied, so the shortcode output HTML was sometimes wrapped in p or br tags. This was incorrect behaviour that has been fixed in 2.5.1.

  1. Shouldn't someone update the above text to say it's still a problem? Or is there some etiquette that says WordPress should not air its dirty washing.
  2. I've applied the fix and it's fine for me.
  3. I'm not sure if I agree with the comments in #13340 as they suggest that this fix will need a lot of regression testing. I agree that there may be some sites that will appear to break by the removal of the unwanted <br />'s BUT in my opinion their continued presence is particularly annoying given the above documentation.
  4. I would like to see this fix as a candidate for 3.0.6 (3.2 is far too late)
  5. If not then I would like to see the workaround documented. Actually I've had a go myself. See http://www.bobbingwidewebdesign.com/2011/02/21/shortcode-breaks-and-wordpress-bug-14050/

#5 @aaroncampbell
13 years ago

  1. This is similar to the problem that was fixed in 2.5.1, but it's not quite the same thing. At that point all shortcode content was being run through autop and it caused ALL THE CONTENT of the shortcode to be treated accordingly. Thus if your shortcode returned 3 paragraphs of text all three paragraphs would be wrapped in p tags. Right now if you return the same three paragraphs, they are left untouched except that a br tag is added at the end.
  2. You are more than welcome to update the codex as long as you are clear in describing what exactly is happening, AND clear that this behavior *may* change in the future (as this ticket is not yet "accepted" you shouldn't say *will*).
  3. I'm glad this fix works for you!
  4. In #13340 hakre is talking about changes to wpautop() and in this ticket we're not suggesting a change to that. Instead I want to make an adjustment to shortcode_unautop(). The former is used in many places and a small change can GREATLY affect things on the front end of a thousands (or even hundreds of thousands) of sites. It's used on such a vast array of content that it's almost mind blowing to consider. However, shortcode_unautop() is used far less, and since it's specifically meant to undo what wpautop() does to shorcodes, I think it will be safe to effect this change.
  5. This fix will definitely not make it into 3.0.6 because there won't be a 3.0.6 (and if for some reason there is, it will only be for security issues). Likewise, this was punted from 3.1, which I agree with. There just isn't time to get something like this in there. Likewise it probably won't go in 3.1.1 because while I consider it a bug, it's not a regression. This is how it's worked ever since shortcode_unautop() was introduced. In my opinion 3.2 is the perfect time to try to get this fixed.
  6. I think the article you linked to is fine. When you update the codex to point out that you get a trailing br on shortcodes, make sure to mention that while this *may* be fixed in the future, right now you can put the shortcodes on the same line as a workaround.
Version 0, edited 13 years ago by aaroncampbell (next)

#6 @aaroncampbell
13 years ago

  • Milestone changed from Future Release to 3.2

#7 @aaroncampbell
13 years ago

  • Keywords 3.2-early removed

#8 @aaroncampbell
13 years ago

Same patch, just refreshed for current trunk.

#9 @aaroncampbell
13 years ago

  • Keywords needs-unit-tests removed

Before patch:
Tests: 27, Assertions: 51, Failures: 3, Skipped: 4.

After patch:
Tests: 27, Assertions: 51, Failures: 2, Skipped: 4.

The two that are failing were failing when I first set everything up. I'm not sure if we need to look into them or if they were already like that (test_utf8_whitespace_1 and test_utf8_whitespace_2)

Last edited 13 years ago by aaroncampbell (previous) (diff)

#10 @aaroncampbell
13 years ago

  • Keywords 3.3-early added; needs-testing removed
  • Milestone changed from 3.2 to Future Release

Moving to 3.3. See also #15600

#11 @pauldewouters
12 years ago

  • Cc pauldewouters added

#12 @aaroncampbell
12 years ago

  • Keywords needs-refresh added; has-patch removed

#13 @MikeHansenMe
11 years ago

  • Cc mdhansen@… added

Is this still a problem? I have been trying to reproduce but do not get a <br/> anywhere in the returned shortcode or after the shortcode. Just wondering if the patch needs refreshed again or if the ticket is obsolete.

#14 follow-up: @SergeyBiryukov
11 years ago

shortcode_unautop() was rewritten in [18952], however the associated test still fails:

1) Tests_Shortcode::test_multiple_shortcode_unautop
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'[footag]
-[footag]
+'<p>[footag]<br />
+[footag]</p>
 '

#16 @aaroncampbell
11 years ago

  • Keywords 3.3-early removed
  • Milestone changed from Future Release to 3.7

#17 follow-up: @nacin
10 years ago

  • Milestone changed from 3.7 to Future Release

Still needs a patch after [18952].

#18 @lkraav
10 years ago

  • Cc leho@… added

#19 in reply to: ↑ 14 @lkraav
10 years ago

Replying to SergeyBiryukov:

shortcode_unautop() was rewritten in [18952], however the associated test still fails:

1) Tests_Shortcode::test_multiple_shortcode_unautop
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
-'[footag]
-[footag]
+'<p>[footag]<br />
+[footag]</p>
 '

This is really bugging a few of my projects, so I'm trying to attack the following. Not sure yet if the underlying problem is the same as for the quoted.

2) Tests_Shortcode::test_shortcode_unautop_html
Failed asserting that two strings are equal.
--- Expected
+++ Actual
@@ @@
 '<h1>Fun, safe, small group tours &amp; no roughing it</h1>
-
-[footag]
-
-Enjoy guided day walks in extraordinary natural areas. Return to comfortable accommodation and delicious meals each evening. No heavy backpack, no camping.
-
-[/footag]'
+[footag]</p>
+<p>Enjoy guided day walks in extraordinary natural areas. Return to comfortable accommodation and delicious meals each evening. No heavy backpack, no camping.</p>
+<p>[/footag]
+'

EDIT phpunit --group 14050 | colordiff helps.

Last edited 10 years ago by lkraav (previous) (diff)

#20 @lkraav
10 years ago

Here's a very nice online regex tool, that lets you permalink the whole setup. My test from comment:19 is at http://www.phpliveregex.com/p/29E

EDIT I found it challenging trying to get that shortcode_unautop regex string back into one piece. Is there some best practice for that, or do we just temporarily say echo $pattern for copy-pasting?

EDIT2 Whoever is working on this (type of stuff), get RegexBuddy. It really helps getting a grip on this monster.

Last edited 10 years ago by lkraav (previous) (diff)

#21 in reply to: ↑ 17 @lkraav
10 years ago

Replying to nacin:

Still needs a patch after [18952].

New regex from [18952] doesn't match shortcodes with attributes at all, while the previous short one does.

<p>[hypertag class="foo"]Bar[/hypertag]</p>

Is this a regression then?

#22 @lkraav
10 years ago

Above is a squash of the following 4 commits.

* e18aa9f - (HEAD, master) Formatting: shortcode_unautop() should support shortcode attributes, see #14050 (17 minutes ago) <Leho Kraav>
* f05a03e - Formatting: shortcode_unautop() clarifying comment, see #14050 (17 minutes ago) <Leho Kraav>
* 2bddf79 - Formatting: shortcode_unautop() should support shortcodes on separate lines, see #14050 (17 minutes ago) <Leho Kraav>
* 2ac5bcd - Formatting: shortcode_unautop should support [footag /], see #14050 (18 minutes ago) <Leho Kraav>
* 6927ba5 - (origin/master, origin/HEAD) Fix JSHint errors in widgets.js. (16 hours ago) <Sergey Biryukov>
...

Unit tests are separate from SVN because I haven't found a a git repo for them yet, and didn't feel like cloning it myself.

Putting a near-full working day into this made me understand better what exactly is going on here. Regex from [18952] is missing many things that are standard operation in WordPress for a long time and even seems to regress on a couple of features. Seems kind of important to me, because HTML output will be broken if the user makes the slightest "mistake" in his shortcode entry process. I'm not even sure what the "safe" way to enter shortcodes is right now.

To me, double-linebreak

[foo]

content

[/foo]

[bar]

[proton]

seems the cleanest and wpautop() is also agreeing with this statement so far (#24085).

Funny thing is, I really don't know what to do with that original test case from comment:14. If someone could state the rules / algorithm for it in plain English, I could probably patch the regex up for that, too.

@lkraav
10 years ago

I'd really like to know what your toolset is, whoever you are deciding on whether to merge the work. Maybe this RegexBuddy setup demo I did my patches on is helpful to someone.

@lkraav
10 years ago

Drop-in plugin v1 to demonstrate and test current patchwork

#23 follow-up: @lkraav
10 years ago

Just wondering, for this to ever move forward, who is the core guy/girl capable of leading it?

I'm activating my hotfix plugin on all my new sites in development and it is performing without mistakes. Obviously I'd like to get off that maintenance train eventually..

Talking about my own patchwork, this could also be moved forward one patch at a time to minimize breakage surface. I built the improvements independent of each other to simply support additional different ways of entering shortcodes.

Wondering if in the meanwhile I should upload the drop-in to Plugins repo and try to get a bigger testing crowd going.

#24 in reply to: ↑ 23 @samuelsidler
10 years ago

Replying to lkraav:

Just wondering, for this to ever move forward, who is the core guy/girl capable of leading it?

Probably @aaroncampell or @nacin.

#25 @nacin
10 years ago

  • Keywords wpautop added

#26 @szepe.viktor
10 years ago

@lkraav Thank you for your work!
I use it as a "mu plugin".

#27 @aaroncampbell
9 years ago

With the latest patch, the second unit test in shortcode_unautop-unit-tests.diff fails.

#28 @lkraav
9 years ago

apparently svn -> github sync is down and won't be fixed until weekend (source: slack). would've taken a look at this already. or do you have the failure covered on your own @aaroncampbell?

@jorbin
9 years ago

Unit test removed as a part of #30284

#29 @aaroncampbell
9 years ago

Now that the unit test is removed, I kind of like the idea of reworking it to trim the content before comparing anyway. If the function is ever reworked to not leave behind a newline, the test would fail, but since we're dealing with HTML we really shouldn't care if there's an extra newline.

This ticket was mentioned in Slack in #core by wonderboymusic. View the logs.


9 years ago

#31 @wonderboymusic
9 years ago

  • Keywords dev-feedback removed
  • Owner set to aaroncampbell
  • Status changed from new to assigned

#32 @aaroncampbell
9 years ago

So, I didn't forget about this ticket and I haven't been ignoring it. It's just that the issues here are somewhat more complex that I remembered them being. I'm working up some more extensive tests to show what I mean, but here are the basics:

  • wpautop() is complex (and pees all over the place), but is blissfully unaware of shortcodes
  • It's bliss, unfortunately, causes inconsistencies in how it handles them.
  • shortcode_unautop() uses mostly the logic of the shortcode processor (very similar regex) to try to undo what wpautop() does, but it's definitely not succeeding

Here are the three main scenarios, one of which (likely the most common usecase) works:

  • When a shortcode has an empty line or the start or end of the post before or after it (so two new lines before or after), it is wrapped in a <p> tag. When shortcode_unautop() is run, the <p> tag is removed.
  • A shortcode that is immediately followed by another shortcode (no new line) results in both shortcodes being wrapped in a <p> tag (<p>[shortcode1][shortcode2]</p>). When shortcode_unautop() is run, the <p> tag is not removed.
  • A shortcode that is followed by another shortcode on the next line (single new line) results in both shortcodes being wrapped in a single <p> tag, with a <br> between them (<p>[shortcode1]<br />[shortcode2]</p>). When shortcode_unautop() is run, neither the <p> tag nor the <br> are removed.

This ticket was mentioned in Slack in #core by aaroncampbell. View the logs.


9 years ago

#34 follow-up: @squarestar
9 years ago

Edit: I had orignally commented that szepe.viktor's mu-plugin regex worked for me without realising that it was Ikraav's work and had already been tested. Don't mind me.

Last edited 9 years ago by squarestar (previous) (diff)

#35 in reply to: ↑ 34 @lkraav
9 years ago

Replying to squarestar:

Edit: I had orignally commented that szepe.viktor's mu-plugin regex worked for me without realising that it was Ikraav's work and had already been tested. Don't mind me.

Yeah so it seems.

I'm not sure where this ticket is going so far. Comments have been general in nature and I haven't seen anything on the specific patchwork I provided. No idea why any piece of it hasn't been going towards landing, despite solid tests provided. My guess is @aaroncampbell has found holes in or is trying to optimize on what I have already done. I'd be interested in knowing what improvements fit inbetween current trunk and my step by step incrementals. Are we rebuilding the whole thing from scratch (seems unlikely) or maybe @aaroncampbell is still working on wrapping his mind around the regex monster.. I know it took me a while to attack it, too.

#36 @aaroncampbell
9 years ago

I put together a few tests in 14050-tests.diff to showcase the various issues. Without the patch from @lkraav (t14050-shortcode-unautop-upgrade.diff) three of the four tests fail, although the fourth is very minor and the test could probably be amended to consider it a pass (but these are set to "best case scenario). With the patch, the same three tests fail in the same ways.

Are there other situations, not covered by the tests I just added here, that the patch covers?

#37 @szepe.viktor
9 years ago

1

Block-level HTML element generating shortcodes are differently unautop-d from inline element generating ones.

2

In case of block-level element generating shortcodes you can use them in each other on one line or on new lines.

[block1][block2]
content
[/block2][/block1]
[block1]
[block2]
content
[/block2]
[/block1]

3

https://gist.github.com/szepeviktor/5df48166063ecb66bdc0
becomes
https://gist.github.com/szepeviktor/4d105d8efbf5f035dc39

I usually get stray </p>-s:

[block class="bigpink"]
<ol>
<li>Mivel
<div class="bigpink"></p>
<ol>
<li>Mivel

using shortcode-unautop.php mu-plugin.

Last edited 9 years ago by szepe.viktor (previous) (diff)
Note: See TracTickets for help on using tickets.