#51511 closed feature request (fixed)
Introduce Robots API and Media Search Engine Visibility setting
Reported by: | flixos90 | Owned by: | flixos90 |
---|---|---|---|
Milestone: | 5.7 | Priority: | normal |
Severity: | normal | Version: | |
Component: | General | Keywords: | has-patch has-unit-tests commit has-dev-note |
Focuses: | Cc: |
Description
As proposed in the "Enhancing image preview: core proposal" announcement post, this ticket aims at introducing the following:
- A simple filter-based Robots API to centrally manage content of the `robots` meta tag injected into the page.
- A setting to toggle whether search engines are allowed to display large media from the site.
- A `max-image-preview:large` robots directive which will be injected into the `robots` meta tag based on the new setting.
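To make the first bullet more concrete, here is a minimal sketch of how a directive array could be turned into the `robots` meta tag. The function name `sketch_render_robots_tag()` is hypothetical, and `htmlspecialchars()` stands in for WordPress's escaping so the snippet runs without WordPress. The array shape (directive name mapped to `true` or to a string value) follows the filter examples used later in this ticket. It is an illustration, not the committed implementation.

```php
<?php
/**
 * Hypothetical helper: build a robots meta tag from a directive array.
 *
 * A non-empty string value prints "name:value", any other truthy value
 * prints the bare directive name, and falsy values are skipped.
 */
function sketch_render_robots_tag( array $robots ): string {
	$parts = array();
	foreach ( $robots as $directive => $value ) {
		if ( is_string( $value ) && '' !== $value ) {
			$parts[] = "{$directive}:{$value}";
		} elseif ( $value ) {
			$parts[] = $directive;
		}
	}

	// No directives means no tag is printed at all.
	if ( empty( $parts ) ) {
		return '';
	}

	// Assumption: htmlspecialchars() replaces WordPress's esc_attr() here.
	$content = htmlspecialchars( implode( ', ', $parts ), ENT_QUOTES );
	return "<meta name='robots' content='{$content}' />\n";
}

echo sketch_render_robots_tag( array( 'max-image-preview' => 'large' ) );
```

With an empty array the helper returns an empty string, which corresponds to the requirement that no tag is printed when no directive is set.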
There are a couple of extra requirements:
- The new Robots API should by default not include any directives (i.e. no `robots` meta tag would be printed). All WP core directives should be injected via their own filter callback functions.
- The default behavior of which directives core injects should mirror core's behavior of today (with the only exception being the new conditional `max-image-preview:large` directive). More technically speaking, today's `wp_head` action callbacks to render `robots` meta tags should become `wp_robots` filter callbacks instead.
- The setting that toggles the new directive should be exposed as a checkbox in Settings > Reading, together with the existing checkbox to control search engine visibility.
- The setting should be enabled by default. However, in addition to relying on the setting, the `max-image-preview:large` directive should only be injected if the site is also allowing search engine indexing. More technically speaking, `blog_public` takes precedence over the new setting.
- An admin pointer should inform users about the new setting, its default and what this means for WordPress behavior.
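The precedence rule in the requirements above can be sketched as follows. The function name `sketch_add_max_image_preview()` is hypothetical, and the two settings are passed in as plain booleans so the snippet runs standalone. In WordPress they would presumably be read from the `blog_public` and `media_search_engine_visibility` options.

```php
<?php
/**
 * Hypothetical filter callback body: add max-image-preview:large only when
 * indexing is allowed AND the media setting is enabled.
 */
function sketch_add_max_image_preview( array $robots, bool $blog_public, bool $media_visibility ): array {
	// blog_public takes precedence: a discouraged site never gets the directive.
	if ( ! $blog_public ) {
		return $robots;
	}

	if ( $media_visibility ) {
		$robots['max-image-preview'] = 'large';
	}

	return $robots;
}

var_dump( sketch_add_max_image_preview( array(), true, true ) );  // Directive added.
var_dump( sketch_add_max_image_preview( array(), false, true ) ); // Array left untouched.
```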
Side note: The filter-based Robots API this ticket aims to introduce should furthermore address #20037, which also requests `robots` customization, just a bit less comprehensively.
Attachments (2)
Change History (25)
This ticket was mentioned in PR #595 on WordPress/wordpress-develop by felixarntz.
4 years ago
#1
- Keywords has-patch has-unit-tests added; needs-patch needs-unit-tests removed
#2
@
4 years ago
- Keywords needs-copy-review added
Above you see two screenshots with the relevant UI this feature exposes. It would be great to get feedback for the copy.
adamsilverstein commented on PR #595:
4 years ago
#3
@felixarntz - Overall looks really good! The approach makes sense and a filter-based API fits in nicely with how other things work in core. I do have one concern about the UI and what happens when both checkboxes are checked (below)...
### Testing:
When I tested this code I noticed the new robots tagging by default:
Then I tried editing the site visibility settings under Settings->Reading:
When I checked the first checkbox, `Discourage search engines from indexing this site`, the tag changed as expected.
When I checked only the second box, `Discourage search engines from displaying large previews of this site’s media`, no robots tag was added, meaning the default rule applies and larger media sizes are not shared.
When I checked both options, I still got the `<meta name='robots' content='noindex, nofollow' />` tag (same as with only the first box checked). I assume this is expected?
If the first box overwrites the second, maybe the second box should be disabled if the first box is checked?
--
I tested using the `wp_robots` filter to adjust the max preview size and that worked as expected:
{{{php
// Override the max-image-preview directive via the wp_robots filter.
add_filter(
	'wp_robots',
	function ( $args ) {
		$args['max-image-preview'] = 'medium';
		return $args;
	}
);
}}}
felixarntz commented on PR #595:
4 years ago
#4
@adamsilverstein
When I checked both options, I still got the <meta name='robots' content='noindex, nofollow' /> tag (same as only the first box checked. I assume this is expected?
Yes indeed, my thinking is that if search engines are discouraged, it doesn't make sense to tell search engines at the same time that they can use large images.
If the first box overwrites the second, maybe the second box should be disabled if the first box is checked?
That makes sense. Do you think we can just add some simple inline JS to toggle visibility? How is that handled in other cases? We'd also need to account for JS disabled probably, so I was thinking this might add too much complexity?
felixarntz commented on PR #595:
4 years ago
#5
@adamsilverstein In https://github.com/WordPress/wordpress-develop/pull/595/commits/8cdc18dac06b7919e3c245b934a41819a15c515f, I added simple JS logic to disable the checkbox when it's not applicable. This is similar to how the dropdowns for "Homepage" and "Posts page" are handled on the same admin screen - and I agree, conditionally disabling instead of hiding is less visually impactful and avoids layout shifting.
#6
@
4 years ago
I'm all for a core approach to control robots meta tags. I do have some concerns about the introduction of a secondary UI checkbox for an individual search engine aspect though, as it is a bit vague on what this box truly does, and it feels a bit counter to the core philosophy. That may just be my interpretation here, so I'm mentioning it for the sake of completeness.
From the perspective of a slightly technical individual, two very likely user scenarios come to mind, although there are likely others:
- I've allowed search engines with one checkbox being unticked, then I tick this other one because I don't want anyone taking my full size images.
- I've ticked this new box, because I only want thumbnails to show in search engine image searches.
How can we make this less vague without overburdening users, if we are to retain this new UI element? I must admit I've got no good answer right now, but I think this is a discussion that should be had to avoid any confusion or misunderstandings.
#7
@
4 years ago
I don't have a fully formed opinion about the robots API side of things but as for media use in search engines, I am not convinced of the utility of turning this on for the majority of sites. It's making a lot of assumptions about how people use media and technical implementation/spec aside it seems odd to say that this has core importance but not Open Graph tags for social media cards. I can see that it is not the same thing from an output and standards perspective, but with the added UI for all users this is exposing something that is fairly insider and a significant portion of our user base see those as similar items, if not the same.
Another question I have is - why would a user ever want to discourage large media usage for search engines? What would the problem be with large media that they would want to turn that off?
#8
@
4 years ago
- Keywords needs-dev-note added
- Milestone changed from Awaiting Review to 5.7
@helen @Clorith I'm happy to revise whether we'd need this checkbox and (if we do) how to make its purpose more clear. I agree that this is a control for something quite specific that we generally try to avoid in core. Similarly though, there was a strong push to expose a UI control for this as a follow-up on the original announcement post (which didn't initially include that part).
Another question I have is - why would a user ever want to discourage large media usage for search engines? What would the problem be with large media that they would want to turn that off?
I'd say that 99% of users would not want to discourage large media usage. But then there's a fraction (likely mostly larger publishers) that for legal reasons would discourage it (copyright on those images). The announcement post and this post linked from there provide some more context.
If we knew that it's indeed primarily larger publishers that would benefit from the option to discourage, I think it would be easy to argue that a filter alone is sufficient (assuming they have active development resources). But for smaller individually managed sites having the checkbox might be useful. Alternatively, we could avoid the checkbox and instead point to some one-liner plugin for those sites that prefer to discourage.
This ticket was mentioned in PR #702 on WordPress/wordpress-develop by felixarntz.
4 years ago
#9
- Based on https://github.com/WordPress/wordpress-develop/pull/595, but without the support for `max-image-preview:large` directive, and without the `media_search_engine_visibility` setting related to that.
- This reduces the PR to only focus on introducing the Robots API. The `max-image-preview:large` piece could then be added separately as a follow-up.
Trac ticket: https://core.trac.wordpress.org/ticket/51511
#10
@
4 years ago
I've opened a new PR (see above) based on the other one, which only focuses on the Robots API, so that we can review this part first and get it ready, allowing for a more focused conversation.
Both PRs have been refreshed against latest trunk.
#11
@
4 years ago
I've refreshed the PR for the Robots API to apply cleanly against latest `trunk`. With the reviews from the previous PR I think this should be good to go soon. Would be great to get some additional eyes on this though!
Once the Robots API PR is committed, we can continue discussing introduction of the `max-image-preview:large` directive, including a smaller PR to review for that.
#12
@
4 years ago
- Keywords commit added; needs-copy-review removed
@helen @francina Following up on the previous comments, it looks like it would be most beneficial to WordPress users to not include any UI to opt out of allowing search engines to use large image previews, for the following reasons:
- As long as a site should be surfaced in search results (via the existing checkbox), allowing for large image previews results in a better user experience.
- The only reason to opt out of large image previews by search engines would be for sites with special copyright requirements, e.g. sites that sell these images.
- In other words, any UI introduced for this would only benefit a fraction of WordPress sites, which is certainly lower (likely much lower) than 20%.
- For those sites that would like to opt out of large image previews in search engines, a filter is available to do so (see below).
- The Yoast SEO plugin, which is used by millions of sites, is following a similar approach, opting in to large image previews by default and allowing to opt out with a filter.
A one-line plugin could be used for sites that would like to opt out of large image previews for search engines:
<?php remove_filter( 'wp_robots', 'wp_robots_max_image_preview_large' );
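As a variation on the one-liner above, a site could lower the preview size rather than drop the directive. The sketch below is an illustration only. The callback name `sketch_standard_image_preview()` and the priority of 20 are assumptions, and `standard` is used because search engines document `none`, `standard` and `large` as values for `max-image-preview`. The registration is guarded so the snippet also runs outside WordPress.

```php
<?php
/**
 * Hypothetical callback: cap image previews at "standard" instead of
 * removing the directive entirely.
 */
function sketch_standard_image_preview( array $robots ): array {
	// Only override when the directive is present, so a discouraged site
	// (no max-image-preview directive) is left untouched.
	if ( isset( $robots['max-image-preview'] ) ) {
		$robots['max-image-preview'] = 'standard';
	}
	return $robots;
}

// Register only inside WordPress. Priority 20 is an assumption, chosen so
// the callback runs after callbacks registered at the default priority.
if ( function_exists( 'add_filter' ) ) {
	add_filter( 'wp_robots', 'sketch_standard_image_preview', 20 );
}
```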
I've updated the PR accordingly to remove the option and UI around it.
I've also refreshed the Robots API-only version of it, which has been reviewed multiple times and is good to go. I'll commit that one later today so that we can solely focus on the `max-image-preview:large` part separately after that.
felixarntz commented on PR #595:
4 years ago
#14
After https://github.com/WordPress/wordpress-develop/commit/176a1f53f04cde92e7297b5214c03beb9e2ba5c8, this PR is now refreshed against latest `trunk` so that it only includes the bits related to introducing the `max-image-preview:large` directive.
felixarntz commented on PR #595:
4 years ago
#17
Committed via https://core.trac.wordpress.org/changeset/50078.
#18
@
4 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
Reopening for dev note.
#19
@
4 years ago
- Resolution set to fixed
- Status changed from reopened to closed
Incorrectly reopened this because of `needs-dev-note` - that keyword is sufficient though.
#20
@
4 years ago
@hellofromtonya I added some Robots API QA testing points below:
New default Robots behavior
- Ensure that for a site where search engines are not discouraged a Robots meta tag with `max-image-preview:large` directive is present in the frontend.
New API usage
Test the following three cases individually:
- Activate a custom one-line plugin with `remove_all_filters( 'wp_robots' );`. Ensure that the frontend now does not display any Robots meta tag.
- Activate a custom one-line plugin with `remove_filter( 'wp_robots', 'wp_robots_max_image_preview_large' );`. Ensure that the frontend now does not display any Robots meta tag (since that is the only directive added by default).
- Activate a custom one-line plugin with `add_filter( 'wp_robots', function( $robots ) { $robots['follow'] = true; return $robots; } );`. Ensure that the frontend now includes not only the default `max-image-preview:large` Robots directive, but also `follow` within the same Robots meta tag.
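The expected outcomes of these cases can be reasoned about with a small standalone simulation. Everything here is a stand-in: `sketch_apply_robots_filters()` mimics a filter chain by running callbacks in order, `sketch_robots_content()` builds the tag's `content` value, and `$core_default` is a hypothetical substitute for core's default callback. None of these names exist in WordPress.

```php
<?php
// Stand-in for a filter chain: start from an empty array, apply each callback in order.
function sketch_apply_robots_filters( array $callbacks ): array {
	return array_reduce(
		$callbacks,
		function ( array $robots, callable $callback ): array {
			return $callback( $robots );
		},
		array()
	);
}

// Build the content attribute value: "name:value" for strings, bare name otherwise.
function sketch_robots_content( array $robots ): string {
	$parts = array();
	foreach ( array_filter( $robots ) as $directive => $value ) {
		$parts[] = is_string( $value ) ? "{$directive}:{$value}" : $directive;
	}
	return implode( ', ', $parts );
}

// Hypothetical substitute for the default core callback.
$core_default = function ( array $robots ): array {
	$robots['max-image-preview'] = 'large';
	return $robots;
};

// The custom callback from the third test case.
$add_follow = function ( array $robots ): array {
	$robots['follow'] = true;
	return $robots;
};

// Cases 1 and 2: no callbacks remain, so there is nothing to print.
var_dump( sketch_robots_content( sketch_apply_robots_filters( array() ) ) );

// Case 3: both directives end up in one content value.
var_dump( sketch_robots_content( sketch_apply_robots_filters( array( $core_default, $add_follow ) ) ) );
```

The first call yields an empty string and the second yields `max-image-preview:large, follow`, matching the expectations listed above.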
Prevent breakage
- Ensure that, when the checkbox to discourage the site from being indexed by search engines is enabled, the frontend includes a `noindex,nofollow` directive in the Robots meta tag like before.
- Ensure that, within the Customizer preview, the site includes a `noindex` directive in the Robots meta tag like before.
- Ensure that the WordPress login page (`wp-login.php`) includes a `noindex,noarchive` directive in the Robots meta tag, as well as a `<meta name='referrer' content='strict-origin-when-cross-origin' />` tag, like before.
- Multisite: Ensure that the site activation page (`wp-activate.php`), where a newly registered user can confirm their newly created site, includes a `noindex,noarchive` directive in the Robots meta tag, as well as a `<meta name='referrer' content='strict-origin-when-cross-origin' />` tag, like before.
- `wp_robots` filter.
- `wp_robots` filter callback functions mirroring existing core behavior.
- `wp_head` action hook callbacks related to "robots" meta tag with `wp_robots` filter hook callbacks.
- `wp_head` action hook callback functions related to "robots" meta tag, referencing the respective filter hook callback functions as replacement.
- `max-image-preview:large` robots directive, and another filter function to inject it based on whether both `blog_public` and `media_search_engine_visibility` settings are truthy.
- `media_search_engine_visibility` setting, with the setting having a default value of `1` (enabled).
- `blog_public` checkbox, it displays the inverse value, with the checkbox label being phrased as "Discourage search engines ...".

Trac ticket: https://core.trac.wordpress.org/ticket/51511