Make WordPress Core

Opened 10 years ago

Closed 10 years ago

#28937 closed defect (bug) (duplicate)

Post content randomly disappears in editor after save

Reported by: chaoix
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 3.9.1
Component: Formatting
Keywords: wpautop
Focuses:
Cc:

Description

Running WordPress on PHP 5.3 and IIS 7.5.

Content randomly disappears after saving posts. I wrote a plugin as a workaround for the issue until this is fixed in core.

<?php
/*
Plugin Name: TinyMCE and wpautop Fix
Description: Fixing a bug in the wpeditor and wpautop that causes content to randomly disappear.
Version: 1.0.0
Author: Matthew Sigley
License: GPL2
*/

// Prevent direct access.
if ( ! defined( 'WPINC' ) ) {
        header( 'HTTP/1.1 403' );
        exit;
}

// TinyMCE fix
if( is_admin() ) {
        add_filter('the_editor_content', 'tinymce_fix_filters', 1);
        function tinymce_fix_filters($content) {
                remove_filter('the_editor_content', 'wp_richedit_pre');
                remove_filter('the_editor_content', 'wp_htmledit_pre');
                
                // 'html' is used for the "Text" editor tab.
                if ( 'html' === wp_default_editor() )
                        add_filter('the_editor_content', 'wp_htmledit_pre_fixed');
                else
                        add_filter('the_editor_content', 'wp_richedit_pre_fixed');
                
                return $content;
        }
        
        function wp_richedit_pre_fixed($text) {
                if( empty($text) ) return $text;
                
                $output = convert_chars($text);
                $output = wpautop($output);
                $output = htmlspecialchars($output, ENT_NOQUOTES, get_option( 'blog_charset' ) );
                // NOTE: the original $text is returned; $output above is computed but unused.
                return $text;
        }
        
        function wp_htmledit_pre_fixed($output) {
                if ( !empty($output) )
                        $output = htmlspecialchars($output, ENT_NOQUOTES, get_option( 'blog_charset' ) ); // convert only < > &
                
                return $output;
        }
}
//wpautop fix
add_action('wp_loaded', 'wpautop_fix_filters');
function wpautop_fix_filters() {
        $filters = array('term_description' => 10,
                                        'the_content' => 10,
                                        'the_excerpt' => 10,
                                        'comment_text' => 30);
        foreach( $filters as $filter => $priority ) {
                remove_filter($filter, 'wpautop', $priority);
                add_filter($filter, 'wpautop_fixed', $priority);
        }
}

function wpautop_fixed($pee, $br = true) {
        $pre_tags = array();

        if ( trim($pee) === '' )
                return '';

        $pee = $pee . "\n"; // just to make things a little easier, pad the end

        if ( strpos($pee, '<pre') !== false ) {
                $pee_parts = explode( '</pre>', $pee );
                $last_pee = array_pop($pee_parts);
                $pee = '';
                $i = 0;

                foreach ( $pee_parts as $pee_part ) {
                        $start = strpos($pee_part, '<pre');

                        // Malformed html?
                        if ( $start === false ) {
                                $pee .= $pee_part;
                                continue;
                        }

                        $name = "<pre wp-pre-tag-$i></pre>";
                        $pre_tags[$name] = substr( $pee_part, $start ) . '</pre>';

                        $pee .= substr( $pee_part, 0, $start ) . $name;
                        $i++;
                }

                $pee .= $last_pee;
        }
        
        $pee = preg_replace('|<br />\s*<br />|', "\n\n", $pee);
        // Space things out a little
        $allblocks = '(?:table|thead|tfoot|caption|col|colgroup|tbody|tr|td|th|div|dl|dd|dt|ul|ol|li|pre|form|map|area|blockquote|address|math|style|p|h[1-6]|hr|fieldset|noscript|legend|section|article|aside|hgroup|header|footer|nav|figure|details|menu|summary)';
        $pee = preg_replace('!(<' . $allblocks . '[^>]*>)!', "\n$1", $pee);
        $pee = preg_replace('!(</' . $allblocks . '>)!', "$1\n\n", $pee);
        $pee = str_replace(array("\r\n", "\r"), "\n", $pee); // cross-platform newlines
        
        if ( strpos( $pee, '</object>' ) !== false ) {
                // no P/BR around param and embed
                $pee = preg_replace( '|(<object[^>]*>)\s*|', '$1', $pee );
                $pee = preg_replace( '|\s*</object>|', '</object>', $pee );
                $pee = preg_replace( '%\s*(</?(?:param|embed)[^>]*>)\s*%', '$1', $pee );
        }

        if ( strpos( $pee, '<source' ) !== false || strpos( $pee, '<track' ) !== false ) {
                // no P/BR around source and track
                $pee = preg_replace( '%([<\[](?:audio|video)[^>\]]*[>\]])\s*%', '$1', $pee );
                $pee = preg_replace( '%\s*([<\[]/(?:audio|video)[>\]])%', '$1', $pee );
                $pee = preg_replace( '%\s*(<(?:source|track)[^>]*>)\s*%', '$1', $pee );
        }

        $pee = preg_replace("/\n\n+/", "\n\n", $pee); // take care of duplicates
        // make paragraphs, including one at the end
        $pees = preg_split('/\n\s*\n/', $pee, -1, PREG_SPLIT_NO_EMPTY);
        $pee = '';

        foreach ( $pees as $tinkle ) {
                $pee .= '<p>' . trim($tinkle, "\n") . "</p>\n";
        }
        
        $pee = preg_replace('|<p>\s*</p>|', '', $pee); // under certain strange conditions it could create a P of entirely whitespace
        $pee = preg_replace('!<p>([^<]+)</(div|address|form)>!', "<p>$1</p></$2>", $pee);
        $pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee); // don't pee all over a tag
        $pee = preg_replace("|<p>(<li.+?)</p>|", "$1", $pee); // problem with nested lists
        $pee = preg_replace('|<p><blockquote([^>]*)>|i', "<blockquote$1><p>", $pee);
        $pee = str_replace('</blockquote></p>', '</p></blockquote>', $pee);
        $pee = preg_replace('!<p>\s*(</?' . $allblocks . '[^>]*>)!', "$1", $pee);
        $pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*</p>!', "$1", $pee);
        
        if ( $br ) {
                $pee = preg_replace_callback('/<(script|style).*?<\/\\1>/s', '_autop_newline_preservation_helper', $pee);
                //Added u flag to the regex in the next line
                $pee = preg_replace('|(?<!<br />)\s*\n|u', "<br />\n", $pee); // optionally make line breaks
                $pee = str_replace('<WPPreserveNewline />', "\n", $pee);
        }

        $pee = preg_replace('!(</?' . $allblocks . '[^>]*>)\s*<br />!', "$1", $pee);
        $pee = preg_replace('!<br />(\s*</?(?:p|li|div|dl|dd|dt|th|pre|td|ul|ol)[^>]*>)!', '$1', $pee);
        $pee = preg_replace( "|\n</p>$|", '</p>', $pee );

        if ( !empty($pre_tags) )
                $pee = str_replace(array_keys($pre_tags), array_values($pre_tags), $pee);
        
        return $pee;
}

Change History (6)

#1 @SergeyBiryukov
10 years ago

  • Keywords reporter-feedback added

Content randomly disappears after saving posts. I wrote a plugin as a workaround for the issue until this is fixed in core.

Could you please describe what causes the issue, how to reproduce it on a clean install, and what exactly your plugin does to fix it?

Last edited 10 years ago by SergeyBiryukov

#2 @chaoix
10 years ago

For wpautop, the issue appears to be related to character set encoding. Adding a 'u' flag to the regex in this line fixed the function:

//Added u flag to the regex in the next line
$pee = preg_replace('|(?<!<br />)\s*\n|u', "<br />\n", $pee); // optionally make line breaks

The bigger problem in the wpautop function is that it makes many preg_replace calls and never checks for a NULL return from any of them, so if any one regex errors out, all of the content comes back as NULL.
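
The missing NULL check described above can be illustrated with a small wrapper. This is only a minimal sketch, not code from WordPress core: safe_preg_replace is a hypothetical helper name, and falling back to the unmodified input is an assumption about the desired behavior. The sample subject with a stray 0xA0 byte is also made up for illustration.

```php
<?php
// Hypothetical helper (not part of WordPress): run preg_replace() and fall
// back to the original subject when PCRE reports an error by returning NULL.
function safe_preg_replace( $pattern, $replacement, $subject ) {
    $result = preg_replace( $pattern, $replacement, $subject );

    if ( null === $result ) {
        // preg_last_error() tells why the call failed, for example
        // PREG_BAD_UTF8_ERROR for invalid UTF-8 under the u modifier.
        return $subject;
    }

    return $result;
}

// Illustrative input: a lone 0xA0 byte is not a valid UTF-8 sequence.
$subject = "Line one\xA0\nLine two\n";

// Unchecked call: with the u modifier PCRE rejects the subject, so the
// whole string is replaced by NULL.
$unchecked = preg_replace( '|(?<!<br />)\s*\n|u', "<br />\n", $subject );

// Checked call: the original text survives instead of disappearing.
$checked = safe_preg_replace( '|(?<!<br />)\s*\n|u', "<br />\n", $subject );
```

With a guard like this, a failing regex leaves the text unformatted instead of emptying it. Whether silently skipping a step is acceptable depends on the caller.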

This may be related to running it under IIS, although I am not positive. The easiest way to break TinyMCE was to add extra line breaks between text and images. We are using Otto's TinyMCE Advanced plugin to preserve the whitespace in our HTML, which made this issue easier to reproduce.

My fix for TinyMCE was to edit out the 'richedit_pre' and 'htmledit_pre' filters because the content wasn't coming back from them. There is definitely a better solution, but this was my workaround.

#3 @iseulde
10 years ago

  • Component changed from Editor to Formatting
  • Keywords reporter-feedback removed

#4 @miqrogroove
10 years ago

  • Keywords wpautop added

I'd guess this is related to the \s bug in PCRE.

#5 @miqrogroove
10 years ago

See also #27733 and note we can't fix this with the u flag due to the desire to remain compatible with other character encodings.
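
The encoding concern can be seen in a short, self-contained sketch. It assumes a site whose content is stored as ISO-8859-1, and the sample string is made up for illustration. The pattern is the line-break regex from wpautop quoted earlier in this ticket.

```php
<?php
// "é" in ISO-8859-1 is the single byte 0xE9, which is not valid UTF-8.
$latin1 = "caf\xE9\nmenu\n";

// Without the u modifier, PCRE treats the subject as raw bytes and succeeds.
$bytes = preg_replace( '|(?<!<br />)\s*\n|', "<br />\n", $latin1 );

// With the u modifier, PCRE validates the subject as UTF-8 first and fails:
// preg_replace() returns NULL and the error code is PREG_BAD_UTF8_ERROR.
$utf8  = preg_replace( '|(?<!<br />)\s*\n|u', "<br />\n", $latin1 );
$error = preg_last_error();
```

So the same flag that helps on a UTF-8 site would make the function return NULL for content in another encoding.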

#6 @miqrogroove
10 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to duplicate
  • Status changed from new to closed

Duplicate of #27733.

Description was too vague to find any new bugs here.
