#10376 closed enhancement (fixed)
White space should be collapsed after tags are stripped in wp_trim_excerpt
Reported by: | sojweb | Owned by: | |
---|---|---|---|
Milestone: | 3.0 | Priority: | normal |
Severity: | normal | Version: | 2.8.1 |
Component: | Formatting | Keywords: | has-patch wp_trim_excerpt |
Focuses: | Cc: |
Description
I've a minor edit I'd like to propose: When wp_trim_excerpt strips tags from the post content, it leaves the white space surrounding those tags, and then pulls the first 55 characters, which could be blank. So, for a lot of my posts that had a big set of HTML tags at the beginning, the excerpt would be empty. The work around is easy: collapse the white space (which the browser would do anyway). I hope it seems logical. Diff file attached.
Attachments (3)
Change History (28)
#3
@
15 years ago
- Cc scribu@… added
- Milestone changed from Future Release to 2.8.2
Wouldn't a simple trim() do the job?
#4
@
15 years ago
I agree, trim() would be much simpler and probably more efficient. Make sure it's after the strip_tags(), otherwise there could be whitespace in the middle being left there.
#6
@
15 years ago
No. a trim would not work.
the issue here, is that
$in = "A<test> <test> <test> B"; $in = strip_tags($in); echo $in; //is now 3 spaces between a and b
when you explode by spaced, you end up with spaces being words, ie.
$in = " "; $in = explode(' ', $in); var_dump($in); --- array 0 => string '' (length=0) 1 => string '' (length=0) 2 => string '' (length=0) 3 => string '' (length=0)
This is what i'd use myself to get the words:
$a = " This is a test"; preg_match_all('|\b([^\W]+?)\b|', $a, $words); $words = $words[1]; result: array 0 => string 'This' (length=4) 1 => string 'is' (length=2) 2 => string 'a' (length=1) 3 => string 'test' (length=4)
#7
@
15 years ago
Considering that strip_tags is run after wpautop() and the other filters on 'the_content'
, the pseudo-excerpt will always be one line text. If we want to avoid normalizing white space for the whole post, perhaps can use preg_split:
$words = preg_split('/\s+/', $text, $excerpt_length + 1, PREG_SPLIT_NO_EMPTY);
#8
@
15 years ago
Probably shouldn't count on wpautotop() running, as a lot of people (myself included) have it disabled, but preg_split would be just as good, just don't forget to strip tags from $text; otherwise you will probably end up with broken HTML. I've attached a diff file that uses it.
#9
@
15 years ago
- Milestone changed from 2.8.2 to 2.9
@sojweb: why the move of the strip_tags call in the patch?
#10
follow-up:
↓ 11
@
15 years ago
@Denis-de-Bernardy: You mean why remove it from its own line? It makes for one less variable assignment... I suppose I did it more out of habit than anything else. It doesn't make any real difference.
#11
in reply to:
↑ 10
@
15 years ago
Replying to sojweb:
@Denis-de-Bernardy: You mean why remove it from its own line? It makes for one less variable assignment... I suppose I did it more out of habit than anything else. It doesn't make any real difference.
It makes things less readable? :-)
D.
#13
follow-up:
↓ 14
@
15 years ago
Yayaya, and "If it was hard for me to write it, it should be hard for them to understand it!" Still... :-)
#14
in reply to:
↑ 13
@
15 years ago
Replying to Denis-de-Bernardy:
Yayaya, and "If it was hard for me to write it, it should be hard for them to understand it!" Still... :-)
Ha, exactly!
No, I'm totally with you... readability is worth the extra line.
#17
@
15 years ago
alternative implementation:
http://svn.wp-plugins.org/sem-fancy-excerpt/trunk/sem-fancy-excerpt.php
#18
@
15 years ago
- Keywords needs-patch added; has-patch removed
btw, the current patch is not utf8-safe.
#19
@
15 years ago
see http://core.trac.wordpress.org/ticket/11669#comment:17 for the above-mentioned utf8 problem.
#21
@
15 years ago
- Milestone changed from 3.1 to 3.0
I've added a utf8-safe version of the suggested patch, in case anyone's willing to test this.
*That should read: "first 55 words" in the description