Opened 6 years ago
Last modified 6 years ago
#45702 new defect (bug)
make_clickable() doesn't handle linked text being a URL with spaces within it
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Future Release | Priority: | low |
Severity: | trivial | Version: | |
Component: | Formatting | Keywords: | needs-unit-tests needs-patch |
Focuses: | Cc: |
Description
As reported in https://meta.trac.wordpress.org/ticket/3998
<a href="https://codex.wordpress.org/Roles and Capabilities">https://codex.wordpress.org/Roles and Capabilities</a>turns into:
<a href="https://codex.wordpress.org/Roles and Capabilities" rel="nofollow"></a><a href="https://codex.wordpress.org/Roles" rel="nofollow">https://codex.wordpress.org/Roles</a> and CapabilitiesLooks like either
make_clickable()
orbbp_make_clickable()
is trying to make the URL clickable without taking the existing<a>
tag into account.
As I've commented on the ticket:
This is a problem in bbp_make_clickable()
, but is also present in make_clickable()
.
Given the input string of <a href="https://codex.wordpress.org/Roles and Capabilities">https://codex.wordpress.org/Roles and Capabilities</a>
both will return the invalid output.
Both contain the following to correct it:
return preg_replace( '#(<a([ \r\n\t]+[^>]+?>|>))<a [^>]+?>([^>]+?)</a></a>#i', "$1$3</a>", $r );
But as the resulting HTML is the following it's not matched (due to the </a></a>
- which assumes that neither the linked text never has spaces or isn't a URL)
<a href="https://codex.wordpress.org/Roles and Capabilities"><a href="https://codex.wordpress.org/Roles" rel="nofollow">https://codex.wordpress.org/Roles</a> and Capabilities</a>
Adjusting the regular expression to the following does work however for this specific case:
return preg_replace( '#(<a([ \r\n\t]+[^>]+?>|>))<a [^>]+?>([^>]+?)</a>([^<]*)</a>#i', "$1$3$4</a>", $r );
As a work around, you can remove the spaces from the linked text, which avoids it:
<a href="https://codex.wordpress.org/Roles and Capabilities">https://codex.wordpress.org/Roles and Capabilities</a>
.
Also created https://bbpress.trac.wordpress.org/ticket/3237
Change History (4)
#3
@
6 years ago
Just wanted to note this comment of mine from the bbPress ticket, as it applies equally as well to WordPress's implementation (which has diverged from what is used in bbPress it appears)
Just want to note that my proposed solution will break if there's a URL and HTML in the linked text, but that's even more edge-casey than the reported problem:
<a href="https://testing/path/">http://testing/<strong>path</strong>/</a>
becomes
<a href="https://testing/path/"><a href="http://testing/" rel="nofollow">http://testing/</a><strong>path</strong>/</a>
The regex in question isn't designed to prevent every case, just the most obvious ones IMHO.
A better fix, could be to skip linking inside of <a>
tags through the same logic applied for <script, style, code, and pre>
tags above it: https://bbpress.trac.wordpress.org/browser/trunk/src/includes/common/formatting.php?rev=6885&marks=365-374#L344
#4
@
6 years ago
- Milestone changed from Awaiting Review to Future Release
Here's another variation on the problem I've run across:
wp> make_clickable( '<p><a href="https://search.google.com/structured-data/testing-tool#url=">https://search.google.com/s...</a>' ); => string(249) "<p><a href="https://search.google.com/structured-data/testing-tool#url="><a href="https://search.google.com/s" rel="nofollow">https://search.google.com/s</a>...</a>"
Replacing ...
with …
fixes it:
wp> make_clickable( '<p><a href="https://search.google.com/structured-data/testing-tool#url=">https://search.google.com/s…</a>' ); => string(192) "<p><a href="https://search.google.com/structured-data/testing-tool#url=">https://search.google.com/s…</a>"
Fixed in bbPress (for 2.6) here: https://bbpress.trac.wordpress.org/changeset/6885
Props @dd32!