Make WordPress Core

Opened 20 years ago

Closed 18 years ago

#178 closed defect (bug) (fixed)

WP doesn't close tags when breaking a page with < !-- more -- >

Reported by: rq's profile rq Owned by:
Milestone: Priority: normal
Severity: trivial Version: 2.0.3
Component: General Keywords:
Focuses: Cc:

Description

If I have a situation like following:

< p > text < /p >
< ul >

< li > item1 < /li >
< li > item2 < /li >
< !-- more -- >
< li > item3 < /li >
.....

< /ul >

wordpress then doesn't close the < ul > tag when displaying a stripped message. XHTML becomes invalid then.

I haven't tested, but I suspect that same things happen when you break a post into pages.

Attachments (2)

0000178-balance-pre-more.diff (463 bytes) - added by rq 20 years ago.
0000178-balance-pre-more2.diff (412 bytes) - added by rq 20 years ago.

Download all attachments as: .zip

Change History (19)

#2 @majelbstoat
20 years ago

This is actually a bit more difficult than a tweak I think. The following link takes you to a preliminary solution that closes tags before the 'more' tag. It involves iterating through the first half of the content looking for unclosed tags and then closing them just before the more... link.

A few notes:

  1. It only works with well-formed XHTML, though that shouldn't be a problem. There is basic checking for well-formed ness but nothing significant.
  2. It handles self closing tags.
  3. It must be processed at display-time rather than post-time.
  4. Works against the latest CVS available.
  5. wpautop seems to screw up when 'more' is used (I think this was documented elsewhere)

And a question; Why is the invisible 'more' anchor currently inserted if the entire text is being displayed? This breaks certain structures irretrievably (ie ordered list, which even if sanitised will have to start again from 1. in the second half of the content).

Anyway, the patch as it stands at the moment is at:

http://www.jamietalbot.com/wp-hacks/template-functions-post-more-patch.phps

Comments and further discussion invited!

#3 @chuyskywalker
20 years ago

Should this be the responsibility of WP?
Perhaps a bit easier than parsing the syntax would be:

< ul >
< li >item< /li >
< li >item< /li >

< !--more:section-- >
< /ul >
< !--more:section-- >

< li >item< /li >
< li >item< /li >
< li >item< /li >
< /ul >

This would be a lot easier, faster, and more useful. I mean, it'd be a heck of a lot less server load, and thus quicker. In addition you could use it to do more than just closing tags - you could have shifting content - something I've seen requested.

edited on: 08-02-04 09:40
(OK, screw this...i can't get this damn thing to display the code right.)

edited on: 08-02-04 09:41

#4 @majelbstoat
20 years ago

I would say that this should be the responsibility of WP. After all, WP provides the 'more' tag, and the second listed feature of WP is 'full standards compliance.' I don't think it would be right to offer features on a 'you can't have it both ways' basis.

You're definitely right about it being slower, but the server load really shouldn't be all that intense. As far as I can see a lot of processing is performed just as the content is displayed, - one more function that maintains standards compliance would seem to be a good trade off.

One solution is to only apply the function if the "Correct incorrectly nested tags" box is checked. Maybe changing its name to something like "Attempt To Correct Your XHTML" (or something better) would work?

Another solution would be to calculate the required closing tags at post time rather than run time and then store them somewhere for instant retrieval. That way the overhead only occurs once, and at a time when the user can expect a slight delay (pings, tag balancing etc...) The downside of this of course is that the extra tags must be stored somewhere, which will increase the db size.

As for the more-section suggestion, I think this is a good idea. But then, I'm a programmer and I don't mind checking for closed tags, and applying (albeit rudimentary) flow control. If you showed your suggestion to a non-programmer, I'm not sure if they'd understand what it was supposed to do. WP should be all about easiness in my opinion (look at the 'friendly' install routine for instance).

A wider problem is that the HTML nesting code doesn' reliably work. I think a substantial rethink and rewrite of code correction is required to get an elegant solution. What do you think? All in all, it needs a bit more discussion I think.

Another question... What is content shifting? It sounds interesting...

edited on: 08-02-04 14:49

#5 @coffee2code
20 years ago

I've attached a diff file, balance-pre-more.diff, that addresses the one-line addition to the function get_the_content() in wp-includes/template-functions-post.php to handle balancing pre-'more' text. Simply need to send the pre-'more' text to balanceTags(). The call only occurs when a post does in fact have the 'more' split and only when the more link would be displayed. It's recommended that the fixed balanceTags() be used as the older balanceTags() would be more likely to improperly balance if the 'more' splits certain HTML tags just so.

My further explanation of the code change: http://www.coffee2code.com/archives/2004/08/03/patch-balancing-pre-more-tags

-Scott
http://www.coffee2code.com

#6 @anonymousbugger
20 years ago

I've just applied your patch (well, did it manually, huh)... However, it leaves out ONE error on my page: adds an unneeded < /p > tag. Furthermore, i'm not sure if that's related, but i was somehow getting an unneeded < / > tag in the SOURCE of my post just between the last < /li > and < /ul >. And I seemed to be unable to erase it.. Weird... Had to go and change it in a database manually... And still i'm getting that "invalid < /p >" error... ;/

http://validator.w3.org/check?uri=http%3A%2F%2Frq.online.lt%2Farchives%2F2004%2F06%2F24%2F

edited on: 09-02-04 12:12

#7 @anonymousbugger
20 years ago

This patch is best used in conjunction with the patch I attached to bug 0000053, http://mosquito.wordpress.org/bug_view_page.php?bug_id=0000053

It is the balanceTags() function that is causing the problems you described. The patch for bug 0000053 should fix those.

-Scott
http://www.coffee2code

edited on: 09-02-04 12:58

#8 @rq
20 years ago

Thanks for that, i've patched my functions-formatting.php now. However, I'm still getting that invalid </p> tag when displaying http://rq.online.lt/archives/2004/06/24/.

and, furthermore, when i open the post http://rq.online.lt/archives/2004/06/24/siek-tiek-fotkiu/ itself, i'm getting even more invalidity, because the WP replaces the < !-- more --> comment with this:

< p>< a id="more-31">< /a>< /p>

which shows up in the middle of the < ul>< /ul> where it's illegal.

Btw, i'm getting quite upset because it seems like developers do NOT apply patches from mosquito. In fact, i'm starting to think they don't look at it at all... :(

edited on: 09-02-04 13:47

#9 @coffee2code
20 years ago

I attached a very slightly modified version of the .diff. The one-line fix for this bug report needed to also have a newline after the potentially balanced pre-'more' text so that wpautop doesn't misbehave and insert a < /p > without a < p >. This problem was noticed by rq.

However, rq, I don't think WP will be able to do much about the full post having the more's anchor inserted within the < ul > you split with < !--more-- >. If anything, you could try to split the post differently:

< p > text < /p >
< ul >

< li > item1 < /li >
< li > item2 < !-- more -- >< /li >
< li > item3 < /li >
.....

< /ul >

The tag balancer will add the < /li > and the < /ul > to balance the pre-'more'. When the full text is shown, the anchor will now be safely located within an < li >.

#10 @rq
20 years ago

coffee2code: i'm not sure, but is an < a> tag illegal inside a < ul>?
i mean, would it still not validate, if we would simply skip the < p> and < /p> tags?

#11 @majelbstoat
20 years ago

This doesn't exactly solve the problem of the unbalanced < !-- more > attributes, but a plugin I wrote does correctly balance a lot of XHTML errors, and attributes. This includes an 'a' inside a 'ul' tag which, you are right, is invalid according to the specs.

Have a look at X-Valid here: http://www.jamietalbot.com/wp-hacks/ if you're interested...

#12 @coffee2code
20 years ago

rq: With my suggestion, the result would look like this on the permalink page:

< p > text < /p >
< ul >

< li > item1 < /li >
< li > item2 < a id="more-31" >< /a>< /li >
< li > item3 < /li >
.....

< /ul >

which validates

#13 @anonymousbugger
20 years ago

yes, but that's a workaround and not a solution :/
Well, thanks, anyways. I'll use that suggestion in future.

#14 @dmclark
20 years ago

This patch was fine for me (note it is now in template-functions-post.php).
thanks

#15 @coffee2code
20 years ago

  • Patch set to Yes

#16 @markjaquith
19 years ago

  • Resolution set to wontfix
  • Status changed from new to closed

There are very differing opinions on this, but there seems to be a slight lean towards the idea that people shouldn't expect to be able to split a page in the middle of an open tag and have it validate. Re-open if you have a good argument to the contrary and we'll resume discussion.

#17 @mat8iou
18 years ago

  • Resolution wontfix deleted
  • Status changed from closed to reopened
  • Version changed from 1.2 to 2.0.3

I don't understand why this is set to wontfix. Depending on the theme used, it does not necessarily cause minor validation errors, but can completely alter the look of all following posts on the front page. For instance, many of my posts consist mainly of a long blockquote. If the tag is not closed properly after more (within the blockquote) then all items following on the page are indented as part of the quote.
My current reason for coming back to this issue though, is that I previously been using the patch detailed above with no problems. In 2.0.x however, it appears to work, but when the patch is applied, the more link text appears at the end of the full post, as well at the end of the abridged version of the post.

#18 @mdawaffe
18 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed
Note: See TracTickets for help on using tickets.