Opened 5 months ago

Last modified 5 months ago

#22876 new defect (bug)

Wrong robots meta output

Reported by: joostdevalk
Owned by:
Priority: high
Milestone: Awaiting Review
Component: Template
Version: 3.5
Severity: major
Keywords:
Cc:

Description (last modified by joostdevalk)

In default-filters.php, we add a new action:

if ( isset( $_GET['replytocom'] ) )
	add_action( 'wp_head', 'wp_no_robots' );

The issue is that wp_no_robots outputs noindex, nofollow. The noindex is fine, but the nofollow stops any link equity from flowing from that URL, so that's actually a bad idea. Noindex alone would be fine. Setting priority to high and severity to major, as this is basically a regression.

Change History (9)

  • Description modified (diff)
  • Description modified (diff)

And actually, rel="canonical" should fix this altogether, so there's really no need to do this at all.

comment:4 follow-up: ↓ 5   ashfame 5 months ago

+1 for adding rel="canonical" to fix this.

comment:5 in reply to: ↑ 4   joostdevalk 5 months ago

Replying to ashfame:

+1 for adding rel="canonical" to fix this.

No need to add it, it's already there.

This isn't new to 3.5: the wp_no_robots() call here went in with 3.3. Happy to make any adjustments in 3.6.

This thread from Google:

http://productforums.google.com/forum/#!msg/webmasters/0sqRrolO_Ss/igOdQIjwKdEJ

shows they'd prefer just a canonical. I also checked with Bing, and Duane Forrester said the pure canonical approach, without noindex, would be better for sites too:

https://twitter.com/DuaneForrester/status/278871145946701825

in reply to:

https://twitter.com/yoast/status/278867897726693377

So let's do that regardless of all else. I do also think we need to discuss whether the replytocom variable should really be in core at all, as it leads to a tremendous amount of crawling on high comment volume blogs without adding any value. Some logs from larger WordPress installs showing how this parameter impacts crawl behavior would be helpful in that regard.

I'm convinced on removing the nofollow directive. I'm unconvinced that removing noindex is a good idea. I don't want these URLs showing up in search engines, period. So I suggest we make a wp_robots_noindex() or similar function and have the replytocom URLs use that.

But based on a discussion I had with joostdevalk, I'd like to separately consider dropping these no-JS comment reply URLs altogether. You can discuss that on #22889.
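
For reference, a minimal sketch of what the suggested wp_robots_noindex() might look like. The function name comes from the suggestion above and is hypothetical, not an existing core function; the exact markup and the function_exists() guard are assumptions made so the snippet also loads outside WordPress.

```php
<?php
// Hypothetical helper, named after the suggestion in this ticket.
// It is not an existing core function.
function wp_robots_noindex() {
	// noindex only: keep the URL out of the index, but leave out nofollow
	// so links on the page can still be followed.
	echo "<meta name='robots' content='noindex' />\n";
}

// Hook it only when running inside WordPress and a replytocom URL was
// requested, mirroring the existing check in default-filters.php.
if ( function_exists( 'add_action' ) && isset( $_GET['replytocom'] ) ) {
	add_action( 'wp_head', 'wp_robots_noindex' );
}
```

This would replace the wp_no_robots hook for replytocom requests only; whether other callers of wp_no_robots() should keep the nofollow is a separate question.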
