Make WordPress Core

Opened 3 years ago

Closed 12 months ago

Last modified 12 months ago

#56595 closed enhancement (fixed)

Add a Site Health check for a non-virtual robots.txt file

Reported by: zodiac1978
Owned by: jorbin
Milestone: 6.8
Priority: normal
Severity: normal
Version:
Component: Site Health
Keywords: has-patch
Focuses:
Cc:

Description

At WordCamp Nederland 2022 @joostdevalk held a talk about unnecessary bot traffic and how to prevent it.

One slide caught my interest:
https://docs.google.com/presentation/d/13Ngq-T2Qdbz1b8apUiioTCBmcsB5s411xBKcklmKyNQ/edit#slide=id.g152f65bfa26_0_87

Blocking those unneeded bots is easy in theory (there is a filter available to change the virtual robots.txt file), but it is not easy to build, because there are many use cases and edge cases to consider.

For high-traffic sites, it would be better to have a non-virtual robots.txt file, so that requests for it are not handled by PHP/WordPress.

But once a robots.txt file has been created, it is easy to miss that WordPress no longer handles it.

Therefore I suggest adding a Site Health check for a non-virtual robots.txt file in the root directory.

Maybe we could also add the content of this file in the info area and/or in the tools section of the plugin.
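
As a rough illustration of the proposed check, here is a minimal sketch in Python. It is not WordPress code (a real patch would be written in PHP), and the function name `read_static_robots_txt` and the `site_root` parameter are hypothetical names chosen for this example. It only shows the idea: look for a robots.txt file in the root directory and, if one exists, return its content so that it could be shown in the info area.

```python
from pathlib import Path
from typing import Optional


def read_static_robots_txt(site_root: str) -> Optional[str]:
    """Return the content of a non-virtual robots.txt in site_root.

    Returns None when no such file exists, which would mean that any
    robots.txt response has to be generated some other way.
    Hypothetical helper for illustration only.
    """
    robots_path = Path(site_root) / "robots.txt"
    if not robots_path.is_file():
        return None
    # Replace undecodable bytes instead of failing on odd encodings.
    return robots_path.read_text(encoding="utf-8", errors="replace")
```

A caller could treat a `None` result as "no non-virtual file found" and any string result as the content to display.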

Happy to work on a patch if this idea gets confirmation.

Attachments (5)

56595.diff (1.2 KB) - added by zodiac1978 2 years ago.
Adding a robots.txt check to the debug data in the Site Health feature
virtual-robotstxt.png (14.8 KB) - added by zodiac1978 2 years ago.
Debug info if there is no physical robots.txt file available
physical-robotstxt.png (15.8 KB) - added by zodiac1978 2 years ago.
Debug info if there is a physical robots.txt file available
debug-info.png (11.2 KB) - added by zodiac1978 2 years ago.
Debug info as shown in the clipboard data
Bildschirmfoto 2025-02-26 um 17.07.24.png (87.0 KB) - added by zodiac1978 12 months ago.
As the link to the slide is broken, here it is as image


Change History (20)

#1 @zodiac1978
3 years ago

  • Type changed from defect (bug) to enhancement

#2 @zodiac1978
2 years ago

  • Keywords has-patch added; needs-patch removed

Maybe we could also add the content of this file in the info area and/or in the tools section of the plugin.

Since version 1.7.0, a robots.txt viewer has been available in the Health Check plugin:
https://github.com/WordPress/health-check/commit/015347bc7cf1b5fb81281add0a7db5d4f7b5de66

Now I would like to add the corresponding debug data.

@zodiac1978
2 years ago

Adding a robots.txt check to the debug data in the Site Health feature

@zodiac1978
2 years ago

Debug info if there is no physical robots.txt file available

@zodiac1978
2 years ago

Debug info if there is a physical robots.txt file available

@zodiac1978
2 years ago

Debug info as shown in the clipboard data

This ticket was mentioned in Slack in #core-site-health by zodiac1978. View the logs.


15 months ago

This ticket was mentioned in Slack in #core by zodiac1978. View the logs.


12 months ago

#5 @audrasjb
12 months ago

  • Milestone changed from Awaiting Review to 6.8
  • Owner set to audrasjb
  • Status changed from new to accepted

Moving for 6.8 consideration

#7 @audrasjb
12 months ago

  • Keywords dev-feedback removed

I added a PR with your patch, as it no longer applied against trunk, @zodiac1978 :)

#8 @zodiac1978
12 months ago

Thanks @audrasjb !

I was just about to check whether a refresh was necessary. You beat me to it. Thanks again!

This ticket was mentioned in Slack in #core by zodiac1978. View the logs.


12 months ago

@zodiac1978
12 months ago

As the link to the slide is broken, here it is as image

#10 @jorbin
12 months ago

I'm not sure "physical" is the best way to refer to a digital file that exists on disk; otherwise this looks good to me. How about changing it to "static" for the files that live on the file system and "dynamic" for the WordPress-generated ones?

#11 @audrasjb
12 months ago

👆 The above wording proposal makes a lot of sense to me.

#12 @jorbin
12 months ago

  • Owner changed from audrasjb to jorbin
  • Status changed from accepted to assigned

I've updated the PR and intend to commit it tomorrow if there are no objections. As part of this, I also added a check to make sure that the rewrite rules allow it before saying that WP can serve the dynamic file.
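
The decision order described here (static file first, then the rewrite-rule check before claiming a dynamic file) can be sketched as follows. This is a Python illustration under stated assumptions, not the code from the PR, which is PHP and not reproduced in this ticket. `RobotsSource`, `classify_robots_txt`, and the `rewrite_rules_enabled` flag are hypothetical names; how WordPress actually determines whether its rewrite rules are in use is outside this sketch.

```python
from enum import Enum
from pathlib import Path


class RobotsSource(Enum):
    STATIC = "static"            # a file on the file system
    DYNAMIC = "dynamic"          # generated by WordPress on request
    UNAVAILABLE = "unavailable"  # no file and no rewrite rules to route the request


def classify_robots_txt(site_root: str, rewrite_rules_enabled: bool) -> RobotsSource:
    """Classify where a robots.txt response would come from.

    rewrite_rules_enabled is an assumed input: the caller must already
    know whether requests for /robots.txt can reach WordPress.
    """
    if (Path(site_root) / "robots.txt").is_file():
        return RobotsSource.STATIC
    if rewrite_rules_enabled:
        return RobotsSource.DYNAMIC
    return RobotsSource.UNAVAILABLE
```

Checking for the static file first reflects the concern in the description: once such a file exists, WordPress is no longer the one serving robots.txt.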

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


12 months ago

#14 @jorbin
12 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 59890:

Site Health: Add a robots.txt check to the server data.

Provide a bit of information about robots.txt to help people understand if the file is generated by WordPress.

Props zodiac1978, audrasjb, joostdevalk, jorbin.
Fixes #56595.

#15 @SergeyBiryukov
12 months ago

In 59894:

Site Health: Fix typo in the robots.txt check messages.

Follow-up to [59890].

See #56595.
