Make WordPress Core

Opened 19 years ago

Closed 19 years ago

Last modified 18 years ago

#1268 closed enhancement (fixed)

new excerpt function suggestion

Reported by: Denis de Bernardy Owned by:
Milestone: Priority: normal
Severity: normal Version: 1.5.1
Component: Template Keywords:
Focuses: Cc:

Description

/*
 * sem_fancy_excerpt()
 * -------------------
 * Creates a fancy default excerpt
 */

function sem_fancy_excerpt( $excerpt )
{
    global $post;

    $excerpt_length = 30;

    if ( $excerpt == '' )
    {
        $excerpt = $post->post_content;
        $excerpt = apply_filters( 'the_content', $excerpt );
        $excerpt = strip_tags( $excerpt );
        $excerpt = preg_replace( "/(\w+(\W+\w+){".($excerpt_length-1)."}[^\.]*\.)[^$]+/", "$1 (...)", $excerpt );
    }
    return $excerpt;
} // end sem_fancy_excerpt()

remove_filter('get_the_excerpt', 'wp_trim_excerpt');
add_filter('get_the_excerpt', 'sem_fancy_excerpt');

Attachments (1)

sem-smart-excerpt.php (3.9 KB) - added by Denis de Bernardy 19 years ago.


Change History (13)

#1 @Denis de Bernardy
19 years ago

  • Patch set to No

#2 @Denis de Bernardy
19 years ago

the above works a bit like wp_trim_excerpt, but returns full sentences.

#3 @Denis de Bernardy
19 years ago

preg_replace( "/\W*(\w+(\W+\w+){".($excerpt_length-1)."}[^\.]*\.)[^$]+/", "$1 (...)", $excerpt );

(just in case)

#4 @Denis de Bernardy
19 years ago

I added the file to the alpha version of my plugin. I left the test code at the end of it, should you want to play with it.

The backlink_excerpt method is extremely forgiving to html errors and will eat all but the worst coded html pages.

I think there is room for a few more enhancements. In particular, if you have:

  • upper text
  • <a ...>blah blah</a>
  • lower text

The excerpt will be 'blah blah', whereas it would make more sense to have the upper and lower text. But presumably, it remains good enough for now.

#5 @Denis de Bernardy
19 years ago

a few dozen test cases later, it is:

$excerpt = preg_replace( "/\W*(\w+(\W+\w+){".($excerpt_length-1)."}((?!(\.|\n|\r)).)*(\.|\n|\r))(\w|\W)+/", "$1 (...)", $excerpt );

rather than /(\w+(\W+\w+){".($excerpt_length-1)."}[^\.]*\.)[^$]+/

#6 @Denis de Bernardy
19 years ago

/\W*(\w+(\W+\w+){".($excerpt_length-1)."}((?!(!|\?|\.|\n|\r)).)*(!|\?|\.|\n|\r))(\w|\W)+/

even. ;)

edited on: 04-20-05 21:11
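
For readers who want to try the pattern from comment #6 outside WordPress, here is a minimal standalone sketch. The function name demo_sentence_excerpt() and the sample text are hypothetical and not part of the ticket or the plugin; only the regular expression is taken from the comment above. It keeps the first N words, then extends the match to the next sentence end (!, ?, ., or a line break).

```php
<?php
// Hypothetical demo helper: not the plugin's code, only the regex comes from the ticket.
function demo_sentence_excerpt( $text, $excerpt_length = 30 )
{
    // N words, then everything up to the next sentence terminator,
    // then the remainder of the text (which gets replaced by " (...)").
    $pattern = '/\W*(\w+(\W+\w+){' . ( $excerpt_length - 1 ) . '}'
             . '((?!(!|\?|\.|\n|\r)).)*(!|\?|\.|\n|\r))(\w|\W)+/';

    $result = preg_replace( $pattern, '$1 (...)', $text );

    // preg_replace() returns null on error (for example when a PCRE limit
    // is hit), so fall back to the untouched text in that case.
    return $result === null ? $text : $result;
}

$sample = 'One two three. Four five six seven. Eight nine ten.';
echo demo_sentence_excerpt( $sample, 4 ), "\n";
// prints: One two three. Four five six seven. (...)
```

Text with fewer than N words, or with nothing after the closing sentence, does not match and comes back unchanged. The trailing (\w|\W)+ group may be expensive on long posts, which is why the sketch checks for a null result.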

#7 @MC_incubus
19 years ago

Probably not something to be considered for WP 1.5.1 (as it's a feature, and a regex feature, at that!)

Isn't this essentially in plugin form anyway?

#8 @Denis de Bernardy
19 years ago

the fancy excerpt is a plugin, yes. however, the attached file contains the following function, that Matt requested the other day in the hackers list:

function sem_backlink_excerpt( $text, $link )
{
    // clean up the text

    $text = preg_replace( "/<!DOC/", "<DOC", $text ); // strip_tags bug
    $text = preg_replace( "/[\s\r\n\t]+/", " ", $text ); // normalize spaces
    $text = preg_replace( "/ <(h1|h2|h3|h4|h5|h6|p|th|td|li|dt|dd|pre|caption|input|textarea|button|body)[^>]*>/", "\n\n", $text );
    $text = strip_tags( $text, "<title><a>" ); // just keep the tags we need

    // echo Markdown( $text );

    // split into paragraphs

    $p = explode( "\n\n", $text );

    // fetch the title

    $title = preg_replace( "/[^<]*<[^>]*title[^>]*>([^<]*)<\/[^>]*title[^>]*>(\w|\W)*/", "$1", $p[0] );

    // fetch the first paragraph with the link

    // escape regex metacharacters in the link
    $sem_regexp_pb = "/(\\/|\\\\|\\*|\\?|\\+|\\.|\\^|\\$|\\(|\\)|\\[|\\]|\\||\\{|\\})/";
    $sem_regexp_fix = "\\\\$1";
    $link = preg_replace( $sem_regexp_pb, $sem_regexp_fix, $link );

    for ( $i = 0; $p[$i] && !$excerpt; $i++ )
    {
        if ( preg_match( "/<a[^>]+".$link."[^>]*>/", $p[$i] ) )
        {
            $context = preg_replace( "/.*<a[^>]+".$link."[^>]*>([^>]+)<\/a>.*/", "$1", $p[$i] );
            $excerpt = trim( strip_tags( $p[$i] ) );
        }
    }

    // return the result
    return array( $title, $context, $excerpt );
} // end sem_backlink_excerpt()

edited on: 04-21-05 09:58
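
The clean-up and paragraph-splitting steps at the top of the function can be tried in isolation with a sketch like the one below. The helper name demo_split_paragraphs(), the shortened tag list, and the sample HTML are assumptions made for illustration; the approach (collapse whitespace, turn block-level opening tags into blank lines, strip everything except title and a tags, split on blank lines) follows the function above.

```php
<?php
// Hypothetical demo helper: mirrors the first steps of sem_backlink_excerpt()
// with a shortened tag list, and is not the code that was committed.
function demo_split_paragraphs( $html )
{
    // Collapse every run of whitespace into a single space.
    $text = preg_replace( '/\s+/', ' ', $html );

    // Replace block-level opening tags with a paragraph break.
    $text = preg_replace( '/ ?<(h[1-6]|p|li|td|th)[^>]*>/i', "\n\n", $text );

    // Keep only the tags needed later: the title and the links.
    $text = strip_tags( $text, '<title><a>' );

    // Split on paragraph breaks and drop empty chunks.
    $parts = array_map( 'trim', explode( "\n\n", $text ) );
    return array_values( array_filter( $parts, 'strlen' ) );
}

$html = "<html><head><title>My Page</title></head>\n"
      . "<body><p>First  paragraph.</p>\n"
      . "<p>Second with <a href=\"http://example.com/\">a link</a>.</p></body></html>";

print_r( demo_split_paragraphs( $html ) );
```

With the sample input this yields three chunks: the title element, the first paragraph as plain text, and the second paragraph with its link still intact.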

#9 @Denis de Bernardy
19 years ago

sem_backlink_excerpt() dines on an html page and a url. it returns an array with, in this order:

  • the html page's title
  • the context = the link text
  • the excerpt = the first paragraph that contains the link (rather than a more or less random string containing it)

at the moment the context is the link's text. in the future, i'll likely change this to be the full sentence. likewise, i might change the function to return the paragraphs before and after, when the paragraph is a mere sentence.
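
As an illustration of the lookup described above, here is a minimal sketch that finds the first paragraph containing a given link and returns the link text as context. It swaps the hand-written escaping regex for PHP's built-in preg_quote(), which is a substitution and not what the attached file does. The helper name demo_find_link_paragraph() and the sample data are hypothetical.

```php
<?php
// Hypothetical demo helper: uses preg_quote() in place of the ticket's
// own escaping regex, so treat it as a sketch and not the plugin's method.
function demo_find_link_paragraph( array $paragraphs, $link )
{
    // Escape regex metacharacters in the URL; '/' is the delimiter used below.
    $quoted = preg_quote( $link, '/' );

    foreach ( $paragraphs as $paragraph ) {
        // Match an <a> tag whose attributes contain the link, capturing its text.
        if ( preg_match( '/<a[^>]+' . $quoted . '[^>]*>([^<]*)<\/a>/', $paragraph, $m ) ) {
            return array(
                'context' => $m[1],
                'excerpt' => trim( strip_tags( $paragraph ) ),
            );
        }
    }

    return null; // the link was not found in any paragraph
}

$paragraphs = array(
    'Intro with <a href="http://other.example/">another site</a>.',
    'See <a href="http://example.com/post?id=1">this post</a> for details.',
);

print_r( demo_find_link_paragraph( $paragraphs, 'http://example.com/post?id=1' ) );
```

Returning null when nothing matches makes the "link not found" case explicit, where the original function would return unset variables.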

from visiting wp source, i think using the function requires rewriting some areas of code related to fetching pingback excerpts, and that you do not necessarily require the $context variable.

i do need it, however, for two plugins i have in the pipeline. thus, it would be nice if there were some hook i could catch to bypass it in the future.

#10 @matt
19 years ago

  • Resolution set to fixed
  • Status changed from new to closed

Rolled in some of this code and it seems to be working well. Thanks! :)

#11 @matt
19 years ago

  • Milestone set to 1.5.2

#12 @(none)
18 years ago

  • Milestone 2.0 deleted