WordPress.org

Make WordPress Core

Opened 11 years ago

Closed 11 years ago

#11830 closed defect (bug) (fixed)

Link Moderation is Broken

Reported by: miqrogroove Owned by:
Milestone: 3.0 Priority: high
Severity: major Version: 2.7
Component: Comments Keywords: has-patch tested featured
Focuses: Cc:

Description

It's unclear if the patch for #8627 worked on the 2.7 branch. I was only able to test 2.8.4 and 2.9.1 for now. Moderation of links in comments is broken in both branches due to quote slashing. It never works unless I include a bare URL like www.google.com

Attachments (2)

link-moderation.patch (763 bytes) - added by miqrogroove 11 years ago.
Allow slashes before quotes in link moderation (and require space after element name)
11830.diff (740 bytes) - added by Denis-de-Bernardy 11 years ago.

Download all attachments as: .zip

Change History (27)

@miqrogroove
11 years ago

Allow slashes before quotes in link moderation (and require space after element name)

#1 @miqrogroove
11 years ago

  • Keywords has-patch needs-testing added

#2 @miqrogroove
11 years ago

  • Summary changed from Link Moderation Broken by #8627? to Link Moderation is Broken

#3 @miqrogroove
11 years ago

I think it would be just as effective to use stripslash, but I will wait for feedback on the first patch.

#4 @Denis-de-Bernardy
11 years ago

there are issues in the regexp. see #11833, for starters. this garbage, in particular:

<
a	
<a	
	href	=	
	http://foo:bar@foo.com#?#

<a
	href	=	http://foo.com?\"'

>test</a>

test6

test5

results in the following after getting stripslashed with filters applied:

<
a	
<a	
	href	=	
	http://foo:bar@foo.com#?#

<a>test</a>

test6

test5

<p><<br />
a<br />
<a<br />
	href	=<br />
	<a  href="http://foo:bar@foo.com#?#" rel="nofollow">http://foo:bar@foo.com#?#</a></p>
<p><a>test</a></p>
<p>test6</p>
<p>test5</p>

... so imo, we really ought to harden the beasty a little bit.

as an aside, I'd like to suggest:

  1. we start by stripslashing the mess
  2. we remove_filter whatever code annoys us (I'm looking at you, smileys)
  3. we apply filters
  4. and *then* we regexp the mess with something robust

#5 @Denis-de-Bernardy
11 years ago

err, I'm sorry, that actually is without filters, and then with filters. but I still think the suggestion has its value. ;-)

#6 @miqrogroove
11 years ago

Denis that seems like a good separate bug. Have you found any problem with the moderation patch itself? :P I mean, if your tests are being link-moderated correctly then that would mean the patch works for this ticket.

#7 @Denis-de-Bernardy
11 years ago

I still haven't broken it, no. But I'll try a bit harder tomorrow. It's getting a bit too late for me to think about creative ways to break kses.

#8 @miqrogroove
11 years ago

I just installed and tested a copy of version 2.7.1. Link moderation is also broken in that branch now.

#9 @miqrogroove
11 years ago

  • Version changed from 2.7.1 to 2.7

Setting version to 2.7 since the original patch never worked.

#10 @miqrogroove
11 years ago

  • Keywords tested added; needs-testing removed

Patch has now been tested on versions 2.7.1, 2.8.4, and 2.9.1, for cases with bare URLs, XHTML A elements, and no URLs.

#11 @miqrogroove
11 years ago

  • Keywords bug-hunt added

Denis mentioned he was looking for architectural problems. I think trying to spam filter ambiguously slashed inputs qualifies for that. :)

#12 @Denis-de-Bernardy
11 years ago

  • Keywords featured added; bug-hunt removed

#13 @nacin
11 years ago

  • Milestone changed from 2.9.2 to 3.0

#14 @nacin
11 years ago

(In [13150]) Adjust regex for counting links in a comment when checking if it needs to be held for moderation. Also fix notice in wp_new_comment(). see #11830

#15 @nacin
11 years ago

After talking with miqrogroove, I've committed a simplified regex. We just need to account for anchors that have an href attribute, and anything more and we're grasping at straws.

It looks like the problems pointed out here are not directly related. Another ticket?

#16 follow-up: @miqrogroove
11 years ago

I liked a[^>]+ more than a [^>]*

Let's be sure to check what's going on with unfiltered_html.

#17 in reply to: ↑ 16 @nacin
11 years ago

Replying to miqrogroove:

I liked a[^>]+ more than a [^>]*

Both were improvements in that they require something after the tag name. The latter simply makes sure it's a space right after the tag name.

The former would hypothetically match, say, an <abbr> tag with an href attribute. Not that you'd see that (other than xhtml2).

#18 @miqrogroove
11 years ago

Okay, unfiltered_html users don't run check_comment() at all, so the space is not a concern. kses always provides a space AFAIK.

Problem solved.

Did you still want to keep this open? Any thoughts on #11684?

#19 @nacin
11 years ago

  • Resolution set to fixed
  • Status changed from new to closed

#20 @Denis-de-Bernardy
11 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

with the change, we're no longer catching:

<a
href="foo">

the attached patch fixes that.

#21 @Denis-de-Bernardy
11 years ago

based on the discussion above, kses *should* sanitize the spaces, but I think we ought to make sure the regexp works even if kses doesn't sanitize the thingy.

#22 @nacin
11 years ago

(In [13801]) Fix notice without breaking comment notifications. See r13150, see #11830

#23 @dd32
11 years ago

Denis's patch here does fix the use-case provided. (However only \b is needed,
b is not)

I'd like to see the testdata that this function has been run against however (before/after), its hard to tell what changes are needed here to retain backwards compat.

Take this for example:

<a	
	href	=	
	http://foo:bar@foo.com#?#

<a
	href	=	http://foo.com?\"'

>test</a>

thats detected as one link by the current code. However, if we change it to be non-greedy: '/<a\b[^>]*?href/i' (addition of ? after *) it detects it as 2 links.

#24 @nacin
11 years ago

Looks like this last part is also mentioned in #11833. Alternative would be [<>] instead of [>]. There's a few ways to skin this cat.

This did exist in the old regex, I lifted [>] from there. Is this safe to close as fixed, and handle the rest in the other ticket?

#25 @nacin
11 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed
Note: See TracTickets for help on using tickets.