#53290 closed defect (bug) (fixed)
Prevent tagging certain links in comments with rel="nofollow ugc"
Reported by: | galbaras | Owned by: | jorbin |
---|---|---|---|
Milestone: | 6.2 | Priority: | normal |
Severity: | normal | Version: | 5.7.2 |
Component: | Comments | Keywords: | |
Focuses: | Cc: |
Description
Assuming that user generated content links should not be followed is generally reasonable. However, there are some exceptions, some of which can be overcome by filters, and some, for which there are none.
- The function
get_comment_author_link()
shouldn't modify the author link indiscriminately. Although its output can be filtered, it would be better to prevent link tagging for specific authors. This can be achieved by adding the following code on line 223 of wp-includes/comment-template.php:
if ( empty( $url ) || 'http://' === $url || in_array( get_comment_author( $comment_ID ), apply_filters( 'comment_authors_link_allowed', [] ) ) {
return = $author;
}
- The function
wp_rel_ugc()
shouldn't modify the author link indiscriminately either. It should get the comment ID and use thecomment_link_allowed_authors
filter as well to prevent some authors' links from being modified. The top of the function then becomes:
function wp_rel_ugc( $text, $comment_ID ) {
if ( in_array( get_comment_author( $comment_ID ), apply_filters( 'comment_authors_link_allowed', [] ) ) {
return $text;
}
- The function
wp_rel_callback()
should accept a list of allowed domains, which should include the local domain by default. Line 3075 of wp-includes/formatting.php should become:
if ( ! empty( $atts['href'] ) && ! in_array( $atts['href'], apply_filters( 'comment_link_allowed_domains', [ get_option( 'home' ) ) ) ) {
These changes will provide flexibility that's currently missing in the comment link tagging functionality.
Change History (10)
#2
@
23 months ago
While working on #56444 I opened a PR that I think addresses issues raised here as well. See https://core.trac.wordpress.org/ticket/56444#comment:8
#3
@
22 months ago
I believe I've tackled points 2 & 3 in conjuction with #56444.
I'm rereading this and trying to wrap my head around point 1:
The function get_comment_author_link() shouldn't modify the author link indiscriminately. Although its output can be filtered, it would be better to prevent link tagging for specific authors. This can be achieved by adding the following code on line 223 of wp-includes/comment-template.php
The suggested added code wouldn't actually do much to allow filtering of the rel attribute.
As the code stands today, comment author links are always allowed (if they're supplied) and they're always rendered with rel="external nofollow ugc"
. I believe the spirit of this issue would be to create a filterable allowlist of comment authors. Comment authors present in that last would NOT have ugc added to their comment author links.
Additionally, the "external" and "ugc" attributes should not be added by default, instead they should only be added when applicable.
I've added some new code to address this in the way I understand it, see https://github.com/WordPress/wordpress-develop/pull/3825
After rewriting I started to get to work on some tests but I don't see any existing unit tests in place for this function. I assume it's "safe" to add a new file at tests/phpunit/tests/comment/getCommentAuthorLink.php
where new tests for this function may be written. I've started baseline tests but ran into a semi-related bug. I'm not sure if I should open a new issue for this or not or simply address the issue:
If you call get_comment_author_link()
and pass no $comment_id
(and there's no $GLOBALS['comment']
set) the filter on the function's return errors since $comment
isn't an object, $comment->comment_ID
can't be accessed.
I think it would make sense to have an early return immediately after the first line of the function (if get_comment()
returns null). But I'm not sure this is the "standard" convention for a function like this to return empty with no filter in this kind of a scenario. Or maybe this is "known" and "expected" and should be ignored.
In any event, if my solution to point 1 seems sound I'll get some tests written in the next few days.
Thanks
#4
@
22 months ago
I think your approach should work well.
One point of efficiency may be to use a static variable for the internal hosts list, rather than applying a filter to a function within a function every time wp_internal_hosts()
is called.
Optimising even further, the code from wp_internal_hosts()
can be used directly in wp_is_internal_link()
to set a static $internal_hosts
, instead of applying the array_map()
on every call.
So something like this:
<?php function wp_is_internal_link( $link ) { static $internal_hosts; $link = strtolower( $link ); if ( in_array( wp_parse_url( $link, PHP_URL_SCHEME ), wp_allowed_protocols(), true ) ) { if ( empty( $internal_hosts ) { $internal_hosts = array_map( 'strtolower', apply_filters( 'wp_internal_hosts', array( wp_parse_url( home_url(), PHP_URL_HOST ) ) ) ); } return in_array( wp_parse_url( $link, PHP_URL_HOST ), $internal_hosts, true ); } return false; }
Looking at wp_allowed_protocols()
, that's exactly how protocols are stored, so this should be consistent, as well as efficient.
#5
@
22 months ago
Thanks @galbaras this makes sense to me. I'll update per your feedback. I noticed a few issues while working on this so I paused to open #57615 and I'll return to this shortly.
#6
@
21 months ago
- Milestone changed from Awaiting Review to 6.3
- Owner set to jorbin
- Status changed from new to assigned
This will be coming in the same commit as #56444 so moving to 6.3
Just leaving a comment to support these changes. Internal links shouldn't be nofollow, even if left by a user.
Think of it as me as the author replying to a user in the comments, and pointing them towards a different page on the website. This is an internal link, posted by the author - shouldn't be "nofollow ugs"
Second scenario - a user replies to another user with a link that leads to a different page on the website. Even though this is user generated content, it's still an internal link and shouldn't be nofollow.