WordPress.org

Make WordPress Core

Opened 9 months ago

Closed 8 months ago

#52713 closed defect (bug) (fixed)

Reverse logic in wp_robots function and filter

Reported by: Cybr Owned by: flixos90
Milestone: 5.7.1 Priority: normal
Severity: normal Version: 5.7
Component: General Keywords: has-patch has-unit-tests commit fixed-major dev-reviewed
Focuses: Cc:

Description

From https://developers.google.com/search/docs/advanced/crawling/special-tags:

In the case of conflicting robots (or googlebot) meta tags, the more restrictive tag applies. For example, if a page has both max-snippet:50 and nosnippet tags, the nosnippet tag will apply.

In WP 5.7 RC2, in wp_robots(), the following code does the complete opposite, where now the less restrictive tag applies:

// Don't allow mutually exclusive directives.
if ( ! empty( $robots['follow'] ) ) {
	unset( $robots['nofollow'] );
}
if ( ! empty( $robots['nofollow'] ) ) {
	unset( $robots['follow'] );
}

I consider this harmful, and I urge swapping the statements. One plugin might not want a page have its links followed, whereas another could allow it at a lower priority still.

I find it curious the archive and follow tags are considered for the overriding, but index is not...?

Moreover, the same Google Developers-page states:

The default values are index, follow and don't need to be specified.

But now, since wp_robots() permits and encourages those non-directives, the header will become filled with redundant information. I suggest not acknowledging index/follow/archive/all el al. at all.

Perhaps it would be best to remove the "Don't allow mutually exclusive directives"-block entirely.

Change History (10)

#1 @peterwilsoncc
9 months ago

  • Milestone changed from Awaiting Review to 5.7.1

I agree it makes sense to defer to the least permissive option if conflicting values are specified. @flixos90 do you have thoughts/objections around this?

I don't think it's a blocker for release so putting this on to the 5.7.1 milestone for now.

As for displaying default values such as index, I lean toward keeping them as it indicates the robots tag is been included as expected and accounts for search engines/archivists that have different defaults than Google.

#2 @flixos90
9 months ago

  • Keywords needs-patch added
  • Owner set to flixos90
  • Status changed from new to assigned

@Cybr @peterwilsoncc I agree the implementation here is flawed, the ticket raises valid points.

Perhaps it would be best to remove the "Don't allow mutually exclusive directives"-block entirely.

I think this suggestion makes most sense. Adding specific precedences and directives here increases the maintenance burden for little benefit, and I also agree with the assessment that it might (unintentionally) be interpreted as some kind of guidance.

So I'm leaning towards removing this block entirely and leaving responsibility to the individual filter callbacks, since it is infeasible to centrally control that completely anyway.

I don't think it's a blocker for release so putting this on to the 5.7.1 milestone for now.

Agreed, while it's worth fixing, it is not critical. Let's fix it for 5.7.1.

This ticket was mentioned in PR #1075 on WordPress/wordpress-develop by felixarntz.


9 months ago

  • Keywords has-patch has-unit-tests added; needs-patch removed

#4 @peterwilsoncc
9 months ago

  • Keywords needs-refresh added

Adding refresh keyword following notes on pull request, they're only minor changes.

#5 @prbot
9 months ago

felixarntz commented on PR #1075:

@peterwilsoncc Thanks, updated!

#6 @peterwilsoncc
8 months ago

  • Keywords commit added; needs-refresh removed

The linked pull request all looks dandy, ready for commit.

#7 @peterwilsoncc
8 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 50566:

Robots: Remove contradictory directive check in wp_robots().

Removes the mutually exclusive directives check in wp_robots(), ie allow both follow and nofollow to be specified and for archive and noarchive to be specified.

This fixes a bug in which WordPress would defer to the most permissive over the least permissive. When contradictory instructions are included, WordPress will defer to the search engine's or archivist's resolution policy: generally this is to observe the least, not most permissive.

Props Cybr, flixos90.
Fixes #52713.

#8 @peterwilsoncc
8 months ago

  • Keywords fixed-major dev-feedback added
  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening to for committing to 5.7 branch upon second committers sign-off.

#9 @SergeyBiryukov
8 months ago

  • Keywords dev-reviewed added; dev-feedback removed

[50566] looks good to backport.

#10 @peterwilsoncc
8 months ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 50587:

Robots: Remove contradictory directive check in wp_robots().

Removes the mutually exclusive directives check in wp_robots(), ie allow both follow and nofollow to be specified and for archive and noarchive to be specified.

This fixes a bug in which WordPress would defer to the most permissive over the least permissive. When contradictory instructions are included, WordPress will defer to the search engine's or archivist's resolution policy: generally this is to observe the least, not most permissive.

Props Cybr, flixos90, SergeyBiryukov.
Merges [50566] to the 5.7 branch.
Fixes #52713.

Note: See TracTickets for help on using tickets.