WordPress.org

Make WordPress Core

Opened 7 years ago

Closed 6 years ago

#4570 closed defect (bug) (fixed)

Comment link cannot contain IRIs

Reported by: link92 Owned by: nbachiyski
Milestone: 2.7 Priority: high
Severity: critical Version: 2.0.10
Component: General Keywords: has-patch tested
Focuses: Cc:

Description

If you try and create a comment with a link such as "http://www.詹姆斯.com/" it is rewritten to "http://www..com/", which isn't overly useful.

Attachments (1)

4570.diff (1.5 KB) - added by guillep2k 6 years ago.

Download all attachments as: .zip

Change History (20)

comment:1 Nazgul7 years ago

  • Milestone set to 2.4 (future)

comment:2 josephscott7 years ago

  • Cc josephscott added

I've confirmed in -trunk that if you put the example link above in the URL field of the comment form that strips the domain out. I'll see what I can do to track it down.

comment:3 foolswisdom7 years ago

  • Priority changed from normal to high
  • Severity changed from normal to critical

comment:4 ryan7 years ago

clean_url() doesn't handle IRIs.

comment:5 josephscott7 years ago

The function that is eating this is clean_url() in wp-includes/formatting.php. It doesn't support characters outside of US-ASCII (as Ryan mentioned). I didn't see and obvious/easy/well tested solution to this issue and given the potential risks (we have to filter URLs) this will have stay a 2.4 target at this point I think.

comment:6 ryan7 years ago

I'll add a filter to clean_url() so that some someone can write a plugin as a stop-gap while we figure out the best approach.

comment:7 ryan7 years ago

(In [5952]) Add clean_url filter. see #4570

comment:8 ryan7 years ago

Oops, accidentally committed a taxonomy change along with that. Ignore that, formatting.php has the change relevant to this ticket.

comment:9 ryan7 years ago

Added clean_url filter that accepts the cleaned url and the original url as args. A plugin can take the original url, filter it in an IRI friendly way, and pass it back.

comment:10 darkfate6 years ago

  • Resolution set to fixed
  • Status changed from new to closed

comment:11 westi6 years ago

  • Resolution fixed deleted
  • Status changed from closed to reopened

Not sure this is currently fixed - just filterable leaving open for now.

comment:12 yogipress6 years ago

did we resolve this?

if not,
do we want to create a function is_iri( ) to check before we take original url then filter it?

comment:13 nbachiyski6 years ago

The problem is not only with IRIs, the current scheme doesn't allow other widely used characters, like parens (#5668). IMHO we should allow any characters and just escape them accordingly -- #5663.

comment:14 lloydbudd6 years ago

  • Keywords needs-patch added
  • Milestone changed from 2.5 to 2.6
  • Owner changed from anonymous to nbachiyski
  • Status changed from reopened to new

comment:15 guillep2k6 years ago

By RFC 3987 (http://tools.ietf.org/html/rfc3987#section-2.1), the unicode characters outside the US-ASCII repertoire are not reserved, and therefore cannot be used for syntactical purposes. This should make them suitable for any URL and shouldn't pose a security problem (at least I hope so!).

I don't see how Wordpress can process these characters if it's not set up for using UTF-8, but anyway here's a patch. It should be thoroughly tested, though, because I had to modify the make_clickable() presentation filter as well (used to filter comment_text when displaying comments).

guillep2k6 years ago

comment:16 guillep2k6 years ago

  • Cc guillep2k added
  • Keywords has-patch added; needs-patch removed

comment:17 guillep2k6 years ago

  • Keywords needs-testing added

comment:18 nbachiyski6 years ago

  • Keywords tested added; needs-testing removed
  • Milestone changed from 2.9 to 2.7

I wrote a couple of tests (search for test_iri) and the proposed patch passes them and all the previous make_clickable() tests pass too.

comment:19 ryan6 years ago

  • Resolution set to fixed
  • Status changed from new to closed

(In [8525]) Allow IRIs. Props guillep2k. fixes #4570

Note: See TracTickets for help on using tickets.