﻿id	summary	reporter	owner	description	type	status	priority	milestone	component	version	severity	resolution	keywords	cc
5678	Respectfully strip newlines in some importers	jdub	hansengel	"Filing this as an enhancement because it could do with some discussion and insight from wiser and more experienced heads before being labelled ""defect"". :-)

I noticed while helping some users import their blogs that importers of HTML content (such as the RSS importer) don't tidy up superfluous newlines in the import format, which results in unnecessary {{{<br/>}}} elements after {{{wpautop()}}} filtering for display. They turn up in the editor too, which reinforces the problem.

I've adapted one of the filter functions to strip superfluous newlines, and changed my RSS importer to use it. The results have been warmly welcomed by users, who no longer have to clean up their imported blog content. ;-)

{{{strip_newlines()}}} should probably go into {{{wp-includes/formatting.php}}}, if there isn't already a function that already serves this purpose. I couldn't find one, so I adapted this. 

Given that similar HTML block/inline-savvy string-replacement code exists in other formatting functions, perhaps there's an opportunity for some refactoring here? I feel kind of silly proposing a function that is almost entirely duplicated from other code in the core.

I've used it immediately before the ""Clean up content"" section in {{{wp-admin/import/rss.php}}}'s {{{get_posts()}}}, and in an Advogato importer that I've written (which also uses HTML as the content format).

{{{
function strip_newlines($text) {
	// Respectfully strip unnecessary newlines
	$textarr = preg_split(""/(<[^>]+>)/Us"", $text, -1, PREG_SPLIT_DELIM_CAPTURE);
	$stop = count($textarr); $skip = false; $output = ''; // loop stuff
	for ($ci = 0; $ci < $stop; $ci++) {
		$curl = $textarr[$ci];
		if (! $skip && isset($curl{0}) && '<' != $curl{0}) { // If it's not a tag
			$curl = preg_replace('/[\n\r]+/', ' ', $curl);
		} elseif (strpos($curl, '<code') !== false || strpos($curl, '<pre') !== false || strpos($curl, '<kbd') !== false || strpos($curl, '<style') !== false || strpos($curl, '<script') !== false) {
			$next = false;
		} else {
			$next = true;
		}
		$output .= $curl;
	}
	return $output;
}
}}}

Thoughts?"	enhancement	accepted	normal	WordPress.org	Import	2.5	normal		needs-patch has-docs	
