WordPress.org

Make WordPress Core

Opened 2 years ago

Closed 2 years ago

Last modified 22 months ago

#33156 closed enhancement (fixed)

Allow admin-ajax crawling

Reported by: joostdevalk Owned by: SergeyBiryukov
Milestone: 4.4 Priority: normal
Severity: normal Version:
Component: General Keywords: 2nd-opinion has-patch
Focuses: Cc:

Description

As plugins are using admin-ajax.php on the frontend, we should add

Allow: /wp-admin/admin-ajax.php

to the default robots.txt to prevent Google from sending out millions of emails; see this article: https://www.seroundtable.com/google-warning-googlebot-css-js-20665.html

Attachments (2)

33156.patch (544 bytes) - added by dmchale 2 years ago.
Add "Allow" for /wp-admin/admin-ajax.php to the end of the default generated robots.txt file
33156.diff (436 bytes) - added by markjaquith 2 years ago.


Change History (20)

This ticket was mentioned in Slack in #core by ocean90.


2 years ago

@dmchale
2 years ago

Add "Allow" for /wp-admin/admin-ajax.php to the end of the default generated robots.txt file

#2 follow-up: @dmchale
2 years ago

Joost, is there value in removing the Disallow of /wp-admin entirely? I know you've recommended that in the past, but do you think that would be preferable behavior for Core? Or should we just leave it like this, with an exception in place for admin-ajax?

#3 in reply to: ↑ 2 ; follow-up: @peterwilsoncc
2 years ago

For what it's worth, I'd rather allow everything in robots.txt and use noindex,nofollow meta tags for private sites & preventing indexing of wp-admin. Google recommends it as a more effective method for preventing indexing.

Replying to dmchale:
The WordPress coding standards call for tabs, not spaces, for indentation. Would you mind refreshing the patch?

@markjaquith
2 years ago

#4 in reply to: ↑ 3 @dmchale
2 years ago

Replying to peterwilsoncc:

For what it's worth, I'd rather allow everything in robots.txt and use noindex,nofollow meta tags for private sites & preventing indexing of wp-admin. Google recommends it as a more effective method for preventing indexing.

That was the solution I was alluding to in my comment to Joost above. Back in February, he recommended getting rid of the /wp-admin block entirely. But I didn't want to create that patch without a conversation happening first, either, since that wasn't his suggestion as the OP on this ticket. Would be very easy though, we'd just have to remove everything in the "else" side of the $public check. A default file would still be returned, albeit nearly blank, and we still have the ability to write the Disallow / if the site isn't in public mode.

Replying to peterwilsoncc:

The WordPress coding standards call for tabs, not spaces, for indentation. Would you mind refreshing the patch?

Thanks for the heads up. New install of PHPStorm on this pc, and I forgot to turn my whitespace highlighting on. Fixed now, shouldn't happen again. :) Since Mark already submitted one with tabs, I won't clutter things up with another copy.
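
The `$public` check discussed in this comment can be sketched in a few lines. This is a minimal Python illustration, not WordPress's actual PHP: the function name `build_robots_txt` and both of its parameters are hypothetical, and the output format is assumed from the rules quoted elsewhere in this ticket.

```python
def build_robots_txt(public: bool, allow_admin_ajax: bool = True) -> str:
    """Return a default robots.txt body (hypothetical helper, not core code)."""
    lines = ["User-agent: *"]
    if not public:
        # Site is not in public mode: ask crawlers to stay out entirely.
        lines.append("Disallow: /")
    else:
        # The "else" side of the check: block /wp-admin/ ...
        lines.append("Disallow: /wp-admin/")
        if allow_admin_ajax:
            # ... with the exception proposed in this ticket.
            lines.append("Allow: /wp-admin/admin-ajax.php")
    return "\n".join(lines) + "\n"


print(build_robots_txt(public=True))
```

Under these assumptions, removing the /wp-admin block entirely would amount to emptying the `else` branch, leaving only the `User-agent: *` line for public sites.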

#5 follow-up: @pavelevap
2 years ago

admin-ajax.php is a PHP file and Google notified about CSS and JS?

#6 in reply to: ↑ 5 @dmchale
2 years ago

Replying to pavelevap:

admin-ajax.php is a PHP file and Google notified about CSS and JS?

Google has a problem with it when theme or plugin authors do something like this... :)

<link rel='stylesheet' id='style-css' href='http://mydomain.com/wp-admin/admin-ajax.php?action=style' type='text/css' media='all' />

I'm sure there are other use cases where it's causing problems as well, but this one in particular has hit a number of my client sites that are using purchased themes.

Last edited 2 years ago by dmchale
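
To illustrate why a stylesheet served this way is treated as blocked, here is a small Python sketch. It assumes a crawler that does plain prefix matching against `Disallow: /wp-admin/`; the helper name `is_blocked` is hypothetical, and real crawlers apply more elaborate matching.

```python
from urllib.parse import urlsplit


def is_blocked(url: str, disallowed_prefixes: list[str]) -> bool:
    """Return True if the URL's path starts with any disallowed prefix."""
    path = urlsplit(url).path  # the query string (?action=style) is ignored here
    return any(path.startswith(prefix) for prefix in disallowed_prefixes)


stylesheet = "http://mydomain.com/wp-admin/admin-ajax.php?action=style"
print(is_blocked(stylesheet, ["/wp-admin/"]))  # prints True
```

Even though the response is CSS, its URL falls under /wp-admin/, so the crawler cannot fetch it.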

#7 follow-up: @knutsp
2 years ago

-1

I don't think this should be in core. Themes should not depend on, or access, /wp-admin. If they do, they should fix the "crawlability" of it through hooks. Core may offer an ajax endpoint outside /wp-admin, if necessary.

One day, for some, it should be possible to delete /wp-admin and install or use an alternative admin through WP REST API. In the meantime, find another solution to this problem.

#8 in reply to: ↑ 7 @dmchale
2 years ago

Replying to knutsp:

-1

I don't think this should be in core. Themes should not depend on, or access, /wp-admin. If they do, they should fix the "crawlability" of it through hooks. Core may offer an ajax endpoint outside /wp-admin, if necessary.

One day, for some, it should be possible to delete /wp-admin and install or use an alternative admin through WP REST API. In the meantime, find another solution to this problem.

Right now Core only offers ajax functionality through /wp-admin. Your proposal to CHANGE that fact is a much different discussion, IMO. https://codex.wordpress.org/AJAX_in_Plugins "Note 2: Both front-end and back-end Ajax requests use admin-ajax.php [...]"

#9 follow-up: @johnbillion
2 years ago

  • Keywords 2nd-opinion has-patch added; needs-patch removed

AJAX needs to go via wp-admin for authenticated requests. A front-end AJAX handler was attempted in #12400 but pulled out.

What might be the downside of allowing admin-ajax.php to be crawled? Any chance of unwanted content appearing in SERPs?

#10 in reply to: ↑ 9 @dmchale
2 years ago

Replying to johnbillion:
Any chance of unwanted content appearing in SERPs?

admin-ajax has @header( 'X-Robots-Tag: noindex' ); already, so no content found there should appear in any SERPs.
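
As a rough illustration of how a client could detect that header, the Python sketch below inspects a dictionary of response headers. The function name `has_noindex` is hypothetical, and the sketch assumes the simple comma-separated form of `X-Robots-Tag` without a user-agent prefix.

```python
def has_noindex(headers: dict[str, str]) -> bool:
    """Return True if an X-Robots-Tag header carries the noindex directive."""
    for name, value in headers.items():
        # Header names are case-insensitive.
        if name.lower() == "x-robots-tag":
            directives = [d.strip().lower() for d in value.split(",")]
            if "noindex" in directives:
                return True
    return False


print(has_noindex({"X-Robots-Tag": "noindex"}))  # prints True
```

A crawler that honors the header can fetch the response, for example to render a page, without listing it in search results.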

This ticket was mentioned in Slack in #core by dmchale.


2 years ago

#12 @SergeyBiryukov
2 years ago

  • Milestone changed from Awaiting Review to 4.4

This ticket was mentioned in Slack in #core by sergey. View the logs.


2 years ago

This ticket was mentioned in Slack in #core by sergey. View the logs.


2 years ago

#15 @SergeyBiryukov
2 years ago

  • Owner set to SergeyBiryukov
  • Status changed from new to assigned

#16 @SergeyBiryukov
2 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 34985:

In do_robots(), allow crawling for admin-ajax.php, since it's often used on front-end.

Props dmchale, joostdevalk.
Fixes #33156.

#17 follow-up: @Hube2
2 years ago

Didn't know if I should start a new ticket or not and couldn't find one that covers it. The order in which WP is outputting the allow/disallow rules

Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

may not be compatible with all crawlers. To be compatible with all crawlers, the order of these rules needs to be reversed.

See: https://en.wikipedia.org/wiki/Robots_exclusion_standard#Allow_directive.

I can't find any information that contradicts what is presented in the wiki article.
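
The ordering concern can be made concrete with a small Python sketch that compares two matching strategies: "first match wins", which is the behavior the linked article warns about, and "longest match wins", which is how Google documents its own handling. Both function names and the `(directive, prefix)` rule representation are assumptions made for illustration; neither is taken from any particular crawler.

```python
Rule = tuple[str, str]  # (directive, path prefix), e.g. ("Disallow", "/wp-admin/")


def first_match_allows(rules: list[Rule], path: str) -> bool:
    """Order-sensitive: the first rule whose prefix matches decides."""
    for directive, prefix in rules:
        if path.startswith(prefix):
            return directive == "Allow"
    return True  # no rule matched, so crawling is allowed


def longest_match_allows(rules: list[Rule], path: str) -> bool:
    """Order-insensitive: the most specific (longest) matching prefix decides."""
    matches = [
        (len(prefix), directive == "Allow")
        for directive, prefix in rules
        if path.startswith(prefix)
    ]
    if not matches:
        return True
    # Longest prefix wins; on a tie, Allow (True) sorts above Disallow (False).
    return max(matches)[1]


wp_order = [("Disallow", "/wp-admin/"), ("Allow", "/wp-admin/admin-ajax.php")]
reversed_order = list(reversed(wp_order))
path = "/wp-admin/admin-ajax.php"

print(first_match_allows(wp_order, path))        # False: Disallow matched first
print(first_match_allows(reversed_order, path))  # True: Allow matched first
print(longest_match_allows(wp_order, path))      # True in either order
```

Under these assumptions, putting the Allow line first satisfies both strategies, which is why reversing the order is the more conservative choice.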

#18 in reply to: ↑ 17 @rdela
22 months ago

Opened a new ticket about order.

Replying to Hube2:

Didn't know if I should start a new ticket or not and couldn't find one that covers it. The order in which WP is outputting the allow/disallow rules

Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

may not be compatible with all crawlers. To be compatible with all crawlers, the order of these rules needs to be reversed.

See: https://en.wikipedia.org/wiki/Robots_exclusion_standard#Allow_directive.

I can't find any information that contradicts what is presented in the wiki article.
