Opened 6 years ago

Last modified 3 years ago

#4298 accepted defect (bug)

wpautop bugs

Reported by: Denis-de-Bernardy Owned by: markjaquith
Priority: low Milestone: Future Release
Component: Formatting Version: 2.7
Severity: minor Keywords: wpautop needs-patch early dev-feedback
Cc:

Description

wpautop should at least ignore multiline html tags, and possibly ignore any area enclosed within html.

<table
 style="width: 120px; height: 92px; text-align: left; margin-left: auto; margin-right: auto;"
 border="1" cellpadding="7" cellspacing="2">
  <tbody>
    <tr>
      <td><small><small style="font-family: Arial;">Once
I saw it here, I instantly knew what I wanted. I love my new woven wood
shades.</small></small><br>
      <small><small>- Ginny Good</small></small></td>
    </tr>
  </tbody>
</table>

gets formatted as:

<table<br />
 style="width: 120px; height: 92px; text-align: left; margin-left: auto; margin-right: auto;"<br />
 border="1" cellpadding="7" cellspacing="2"><br />
<tbody>
<tr>
<td><small><small style="font-family: Arial;">Once<br />
I saw it here, I instantly knew what I wanted. I love my new woven wood<br />
shades.</small></small><br /><br />
      <small><small>- Ginny Good</small></small></td>

</tr>
</tbody>
</table>

Attachments (2)

4298.diff (1.2 KB) - added by Denis-de-Bernardy 4 years ago.
4298.2.diff (1.5 KB) - added by Denis-de-Bernardy 4 years ago.
new patch on pre_ filters

Download all attachments as: .zip

Change History (32)

  • Milestone changed from 2.2.1 to 2.4
  • Component changed from Administration to Template

Reformatting tags like this before auto-formatting might work:

$str = preg_replace("/

(<[>]+)
\n
([
>]*>)
/x", , $str);

D.

  • Priority changed from normal to low
  • Severity changed from normal to minor

It should occur after escaping stuff we don't want to autop, though, i.e. scripts, objects, etc.

  • Milestone changed from 2.5 to 2.6
  • Keywords has-patch tested dev-feedback added
  • Milestone changed from 2.9 to 2.8
  • Version changed from 2.2 to 2.7

comment:8   ryan4 years ago

  • Component changed from Template to Formatting
  • Owner anonymous deleted
  • Keywords needs-patch added; has-patch tested dev-feedback removed
  • Milestone changed from 2.8 to Future Release

better approach would be to apply this before saving the post's contents and excerpts.

code I use in my plugin:

		add_filter('content_save_pre', array('sem_fixes', 'fix_wpautop'), 0);
		add_filter('excerpt_save_pre', array('sem_fixes', 'fix_wpautop'), 0);
		add_filter('pre_term_description', array('sem_fixes', 'fix_wpautop'), 0);
		add_filter('pre_user_description', array('sem_fixes', 'fix_wpautop'), 0);
		add_filter('pre_link_description', array('sem_fixes', 'fix_wpautop'), 0);

	#
	# fix_wpautop()
	# http://core.trac.wordpress.org/ticket/4298
	#
	
	function fix_wpautop($content) {
		$content = str_replace(array("\r\n", "\r"), "\n", $content);
		
		while ( preg_match("/<[^<>]*\n/", $content) ) {
			$content = preg_replace("/(<[^<>]*)\n+/", "$1", $content);
		}
		
		return $content;
	} # fix_wpautop()

new patch on pre_ filters

  • Keywords has-patch tested commit added; needs-patch removed
  • Milestone changed from Future Release to 2.8
  • Owner set to Denis-de-Bernardy
  • Status changed from new to accepted

Before we carry on modifying autop I would really like to see a set of test cases written using the wordpress-tests setup so that we can really get a defined behaviour for this function and iron out all the bugs ;-)

I'm not sure we should make any changes to it's behaviour without these tests.

  • Milestone changed from 2.8 to Future Release

agreed to a large extent. but I won't be writing them (I currently have a miles long list of things to do already).

punting this to future pending the unit tests.

  • Milestone changed from Future Release to 2.8

doh, wait. I thought this was another ticket.

it's not a wpautop ticket, or rather it is, but it fixes a genuine bug in it. you don't need a unit test for this one. the stuff in the patch is done by the latest tinymce already. it's the php version that has a problem.

it merely turns stuff like:

<p
  class="foo">

... which gets wrecked by wpautop, into:

<p class="foo">

it has been working successfully for over a year on thousands of sites, and bugs would have been reported long ago if any were present.

the while loop stands for: until there is no open tag (<) followed by a newline char, turn the newline char into a space.

until there is no open tag (<) followed by a newline char before it's closed, turn the newline char into a space. even.

the ticket that needs unit tests is #3833. this one needs absolutely none of it. it's a bug, and the suggested patch is absolutely harmless.

  • Keywords early added
  • Milestone changed from 2.8 to 2.9

can wait until 2.9

still applies clean

  • Owner Denis-de-Bernardy deleted
  • Status changed from accepted to assigned

Another long-player in the formatting bug league, +1 for getting a
definition, unit-tests and merging with related ones.

Versus: #3833, #4152

  • Milestone changed from 2.9 to 3.0
  • Owner set to markjaquith
  • Status changed from assigned to accepted

Early 3.0, for sure. Just hadn't seen this until now.

  • Keywords needs-patch added; has-patch tested commit removed

A quick note on this one, because an issue was brought to my attention a few weeks ago. My patch actually breaks things such as:

<?php
$foo = $bar;
?>

What I'm currently doing, in case it inspires a better solution:

	/**
	 * fix_wpautop()
	 *
	 * @param string $content
	 * @return string $content
	 **/

	function fix_wpautop($content) {
		$content = str_replace(array("\r\n", "\r"), "\n", $content);
		
		if ( !preg_match("/<[a-z][^<>]*\n/i", $content) )
			return $content;
		
		global $sem_fixes_escape;
		$sem_fixes_escape = array();
		
		$content = preg_replace_callback("/<\?php.+\?>/is", array('sem_fixes_admin', 'escape_php_callback'), $content);
		
		while ( preg_match("/<[a-z][^<>]*\n/i", $content) ) {
			$content = preg_replace("/(<[a-z][^<>]*)\n+/i", "$1", $content);
		}
		
		if ( $sem_fixes_escape )
			$content = str_replace(array_keys($sem_fixes_escape), array_values($sem_fixes_escape), $content);
		
		return $content;
	} # fix_wpautop()
	
	
	/**
	 * escape_php_callback()
	 *
	 * @param array $match
	 * @return string $out
	 **/

	function escape_php_callback($match) {
		global $sem_fixes_escape;
		
		$tag_id = "----sem_fixes_escape:" . md5($match[0]) . "----";
		$sem_fixes_escape[$tag_id] = $match[0];
		
		return $tag_id;
	} # escape_php_callback()

# http://core.trac.wordpress.org/ticket/4298
add_filter('content_save_pre', array('sem_fixes_admin', 'fix_wpautop'), 0);
add_filter('excerpt_save_pre', array('sem_fixes_admin', 'fix_wpautop'), 0);
add_filter('pre_term_description', array('sem_fixes_admin', 'fix_wpautop'), 0);
add_filter('pre_user_description', array('sem_fixes_admin', 'fix_wpautop'), 0);
add_filter('pre_link_description', array('sem_fixes_admin', 'fix_wpautop'), 0);

comment:25 follow-up: ↓ 26   azaozz3 years ago

Think this is mostly fixed now. It can only happen when using the HTML editor. By default the visual editor removes all line breaks (except in <pre> tag) as they are insignificant and the JS version of autop removes any breaks inside tags when switching HTML to visual. What remains is perhaps post by email, atompub, etc.

comment:26 in reply to: ↑ 25   Denis-de-Bernardy3 years ago

Replying to azaozz:

Think this is mostly fixed now. It can only happen when using the HTML editor. By default the visual editor removes all line breaks (except in <pre> tag) as they are insignificant and the JS version of autop removes any breaks inside tags when switching HTML to visual. What remains is perhaps post by email, atompub, etc.

The user who reported it was pasting HTML code from Dreamweaver, I think, into the HTML editor. It was a real annoyance (for him anyway). You're welcome to close if you're not interested in fixing it; only a handful of users reported the issue to me, and I've a plugin that adds the above-mentioned filters.

@andrew: shall we close, or should I massage a patch?

  • Keywords dev-feedback added

@andrew/azzozz: please decide.

There are several tickets about autop and HTML block elements that can contain other blocks. Think we should handle them together but first will need well made test cases as @westi suggests. Without them don't think we will be able to do this properly.

  • Keywords wpautop added
  • Milestone changed from 3.0 to Future Release

Moving to future, we need test cases and maybe we can take care of the whole group of wpautop tickets.

Note: See TracTickets for help on using tickets.