Opened 7 years ago
Last modified 7 years ago
#46886 new defect (bug)
wp_targeted_link_rel adds the rel attribute when the link has data-target=""
| Reported by: |
|
Owned by: | |
|---|---|---|---|
| Milestone: | Awaiting Review | Priority: | normal |
| Severity: | normal | Version: | 5.1 |
| Component: | Formatting | Keywords: | has-unit-tests |
| Focuses: | Cc: |
Description
To reproduce:
Create a link element (easiest to see bug in classic editor):
<a href="https://example.com" data-target="thisandthat">click here</a>
Save the post, and the link will have the rel="noopener noreferrer" attribute.
Attachments (2)
Change History (8)
#3
follow-up:
↓ 4
@
7 years ago
46886-2.diff contains the unit-test in 46886.diff and the modification mentioned above, that makes sure the space prefixed target attribute: ' target', is contained in the string before passing it into the regex callback.
#4
in reply to:
↑ 3
@
7 years ago
Replying to birgire:
46886-2.diff contains the unit-test in 46886.diff and the modification mentioned above, that makes sure the space prefixed target attribute:
' target', is contained in the string before passing it into the regex callback.
This can still pass through false positives so we could look at the regex closer.
#5
@
7 years ago
For the background story, this was added in #43187.
By pre-checking for ' target' in 46886-2.diff, the following will not go to the regex callback:
<a href="https://example.com" data-target="thisandthat">click here</a>
but this would:
<a href="https://example.com" data-target="This is targetting all">click here</a>
One could additionally adjust the regex from:
|<a\s([^>]*target\s*=[^>]*)>|i
to:
|<a\s+((?:[^>]*?\s+)?target\s*=[^>]*)>|i
to account for the white space in front of the target attribute. Here (?:) means a non-capturing group. This answer by plalx was helpful. Tested \s instead of \s+ in the changed regex for few cases without problem.
But this will not stop edge cases, where ' target=' is part of the attribute value, like:
<a href="https://example.com" data-target="This is target=One">click here</a>
from getting the rel part added.
We note that target=_blank, without " or ', is matched by the current regex, so it could make solving the edge case more involved.
So at minimum, I think we could consider adding the extra space in the pre-check, to reduce the false positives.
Changing an existing regex should be approached with care, but if we go that route it would most likely require some extra unit-tests.
One could also look further into checks within the callback.
#6
@
7 years ago
I noticed a shortcoming in 46886-2.diff that the pre-check of ' target' will not pick up tabs and other non-space whitespaces in front of the target attribute.
The current pre-check of '<a ' has similar shortcomings, as it only assumes a space after the <a part.
A workaround for that could be to precheck 'target' and '<a' and capture the first whitespace group with:
|<a(\s+)((?:[^>]*?\s+)?target\s*=[^>]*)>|i
and adjust the regex callback accordingly.
Thanks for the ticket @jakeparis
46886.diff adds a failing unit test for this case.
Skimming through the regex
'|<a\s([^>]*target\s*=[^>]*)>|i', withinwp_targeted_link_rel():https://core.trac.wordpress.org/browser/tags/5.1.1/src/wp-includes/formatting.php#L3041
it seems to allow the
targetattribute to be part of another string, thus picking updata-targetas a false positive.But at the moment it seems tempting to just replace:
stripos( $text, 'target' ) !== falsewith:
stripos( $text, ' target' ) !== falsewithin
wp_targeted_link_rel()here:https://core.trac.wordpress.org/browser/tags/5.1.1/src/wp-includes/formatting.php#L3040
At least the unit tests in
Tests_Targeted_Link_Relseems to like it :-)