Opened 18 years ago
Closed 17 years ago
#3495 closed defect (bug) (fixed)
Pingback excerpt-fetching code looks for a link to the SOURCE site instead of the TARGET site.
Reported by: | markjaquith | Owned by: | markjaquith |
---|---|---|---|
Milestone: | 2.1 | Priority: | high |
Severity: | normal | Version: | 2.0.5 |
Component: | XML-RPC | Keywords: | has-patch |
Focuses: | Cc: |
Description
xmlrpc.php:1216-1218
$sem_regexp_pb = "/(\\/|\\\|\*|\?|\+|\.|\^|\\$|\(|\)|\[|\]|\||\{|\})/"; $sem_regexp_fix = "\\\\$1"; $link = preg_replace( $sem_regexp_pb, $sem_regexp_fix, $pagelinkedfrom );
So now $link is a regex-safe version of the SOURCE URL.
xmlrpc.php:1221-1232
foreach ( $p as $para ) { if ( $finished ) continue; if ( strstr( $para, $pagelinkedto ) ) { $context = preg_replace( "/.*<a[^>]+".$link."[^>]*>([^>]+)<\/a>.*/", "$1", $para ); $excerpt = strip_tags( $para ); $excerpt = trim( $excerpt ); $use = preg_quote( $context ); $excerpt = preg_replace("|.*?\s(.{0,100}$use.{0,100})\s|s", "$1", $excerpt); $finished = true; } }
The SOURCE URL's paragraphs are iterated. Once one is found that contains the TARGET URL, it (mistakenly) looks for a link to the SOURCE URL and uses that as context for the excerpt. It doesn't find it, of course. But it doesn't really matter, because even if the context regex used the TARGET URL as it should, the excerpt regex matches the whole paragraph.
I thought I was going crazy when I saw this code... was pretty sure that I was missing something.
This dates back all the way to [2619]
Patch coming.
Attachments (2)
Change History (17)
#1
@
18 years ago
- Milestone changed from 2.0.6 to 2.1
- Owner changed from anonymous to markjaquith
- Status changed from new to assigned
Uploaded: pingback_excerpt.001.diff
- use
preg_quote()
instead of wacky manual escaping - use the target URL for context
- mark the link text to prevent more than one occurrence of that string from interfering
- make the context-finding regex lazy
- trim link text if it is more than 100 characters long
There. Now that should be functioning like it was intended to function. You get a nice excerpt with an absolute upper limit of 300 chars instead of the 1000+ char monstrosities that frequently showed up with the broken code.
I'll let this sit a day or two for auditing.
#5
@
18 years ago
- Milestone changed from 2.1 to 2.0.7
Too late for 2.0.6, but definitely a 2.0.7 candidate.
#6
@
18 years ago
- Keywords bg|has-patch added
- Milestone changed from 2.0.7 to 2.0.8
- Resolution fixed deleted
- Status changed from closed to reopened
Reopening so we could maybe get this into 2.0.8
#7
@
18 years ago
- Milestone changed from 2.0.8 to 2.0.9
Want to see what 2.1 users think of this before getting it into 2.0.x
#8
@
18 years ago
- Keywords has-patch added; bg|has-patch removed
- Milestone changed from 2.0.9 to 2.0.10
2.0.9 is out, so bumping the milestone to 2.0.10.
#10
follow-up:
↓ 11
@
18 years ago
@foolswisdom:
Instead of bumping these tickets after each security release. Wouldn't it make more sense if we milestone them as "2.0.eventually" and set the final correct milestone when they're commited? It would save us some work. :)
#11
in reply to:
↑ 10
@
18 years ago
Replying to Nazgul:
Wouldn't it make more sense if we milestone them as "2.0.eventually"
True enough. Next time, or maybe Mark J will do that ;-)
#12
@
18 years ago
Note that when a milestone is marked as completed, trac offers to move all open bugs to another milestone. It's not difficult to bulk move items from one milestone to another.
Patch for trunk