Opened 5 years ago
Last modified 5 years ago
#46886 new defect (bug)
wp_targeted_link_rel adds the rel attribute when the link has data-target=""
Reported by: | jakeparis | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | 5.1 |
Component: | Formatting | Keywords: | has-unit-tests |
Focuses: | Cc: |
Description
To reproduce:
Create a link element (easiest to see bug in classic editor):
<a href="https://example.com" data-target="thisandthat">click here</a>
Save the post, and the link will have the rel="noopener noreferrer" attribute.
Attachments (2)
Change History (8)
#3
follow-up:
↓ 4
@
5 years ago
46886-2.diff contains the unit-test in 46886.diff and the modification mentioned above, that makes sure the space prefixed target attribute: ' target'
, is contained in the string before passing it into the regex callback.
#4
in reply to:
↑ 3
@
5 years ago
Replying to birgire:
46886-2.diff contains the unit-test in 46886.diff and the modification mentioned above, that makes sure the space prefixed target attribute:
' target'
, is contained in the string before passing it into the regex callback.
This can still pass through false positives so we could look at the regex closer.
#5
@
5 years ago
For the background story, this was added in #43187.
By pre-checking for ' target'
in 46886-2.diff, the following will not go to the regex callback:
<a href="https://example.com" data-target="thisandthat">click here</a>
but this would:
<a href="https://example.com" data-target="This is targetting all">click here</a>
One could additionally adjust the regex from:
|<a\s([^>]*target\s*=[^>]*)>|i
to:
|<a\s+((?:[^>]*?\s+)?target\s*=[^>]*)>|i
to account for the white space in front of the target
attribute. Here (?:)
means a non-capturing group. This answer by plalx was helpful. Tested \s
instead of \s+
in the changed regex for few cases without problem.
But this will not stop edge cases, where ' target='
is part of the attribute value, like:
<a href="https://example.com" data-target="This is target=One">click here</a>
from getting the rel
part added.
We note that target=_blank
, without "
or '
, is matched by the current regex, so it could make solving the edge case more involved.
So at minimum, I think we could consider adding the extra space in the pre-check, to reduce the false positives.
Changing an existing regex should be approached with care, but if we go that route it would most likely require some extra unit-tests.
One could also look further into checks within the callback.
#6
@
5 years ago
I noticed a shortcoming in 46886-2.diff that the pre-check of ' target'
will not pick up tabs and other non-space whitespaces in front of the target
attribute.
The current pre-check of '<a '
has similar shortcomings, as it only assumes a space after the <a
part.
A workaround for that could be to precheck 'target'
and '<a'
and capture the first whitespace group with:
|<a(\s+)((?:[^>]*?\s+)?target\s*=[^>]*)>|i
and adjust the regex callback accordingly.
Thanks for the ticket @jakeparis
46886.diff adds a failing unit test for this case.
Skimming through the regex
'|<a\s([^>]*target\s*=[^>]*)>|i'
, withinwp_targeted_link_rel()
:https://core.trac.wordpress.org/browser/tags/5.1.1/src/wp-includes/formatting.php#L3041
it seems to allow the
target
attribute to be part of another string, thus picking updata-target
as a false positive.But at the moment it seems tempting to just replace:
stripos( $text, 'target' ) !== false
with:
stripos( $text, ' target' ) !== false
within
wp_targeted_link_rel()
here:https://core.trac.wordpress.org/browser/tags/5.1.1/src/wp-includes/formatting.php#L3040
At least the unit tests in
Tests_Targeted_Link_Rel
seems to like it :-)