Opened 10 years ago
Closed 9 years ago
#35819 closed enhancement (invalid)
Robots.txt exclusion rule order reversal
| Reported by: | | Owned by: | |
|---|---|---|---|
| Milestone: | | Priority: | normal |
| Severity: | normal | Version: | 4.4.2 |
| Component: | General | Keywords: | |
| Focuses: | administration | Cc: | |
Description
This is a follow-up to #33156.
Replying to comment:17 Hube2:
Didn't know if I should start a new ticket or not and couldn't find one that covers it. The order in which WP outputs the allow/disallow rules

    Disallow: /wp-admin/
    Allow: /wp-admin/admin-ajax.php

may not be compatible with all crawlers. To be compatible with all crawlers, the order of these rules needs to be reversed.
See: Wikipedia – Robots exclusion standard: Allow directive
I can't find any information that contradicts what is presented in the wiki article.
Google’s example is in Allow then Disallow order.
Pattern-matching rules to streamline your robots.txt code

    User-agent: *
    Allow: /*?$
    Disallow: /*?

Google's robots.txt Tester confirms both orders work for Googlebot.
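
The order sensitivity at issue here can be observed with any parser that stops at the first matching rule. Below is a minimal sketch, assuming Python's standard-library `urllib.robotparser`, which to my knowledge evaluates rules in file order and returns on the first match. The helper name `can_fetch_ajax` is made up for illustration, and the result only shows how this one parser behaves, not how any particular crawler does.

```python
from urllib.robotparser import RobotFileParser


def can_fetch_ajax(rule_lines):
    """Return whether a '*' agent may fetch admin-ajax.php under the given rules.

    Hypothetical helper for illustration only.
    """
    parser = RobotFileParser()
    # parse() accepts an iterable of robots.txt lines; no network access needed.
    parser.parse(["User-agent: *"] + rule_lines)
    return parser.can_fetch("*", "/wp-admin/admin-ajax.php")


# The order WP currently outputs, and the reversed order proposed in this ticket.
wp_order = ["Disallow: /wp-admin/", "Allow: /wp-admin/admin-ajax.php"]
reversed_order = list(reversed(wp_order))

print("WP order:      ", can_fetch_ajax(wp_order))
print("Reversed order:", can_fetch_ajax(reversed_order))
```

If the parser is indeed first-match, the two calls give different answers for the same pair of rules, which is the incompatibility this ticket is about.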
Change History (2)
According to Google's robots.txt specifications page ... (emphasis mine)
Therefore Allow: /wp-admin/admin-ajax.php will trump Disallow: /wp-admin/ no matter which order they are listed in.
Personally, before making any changes, I would also like to see other evidence that corroborates the Wikipedia article's assertion that "by standard implementation the first matching robots.txt pattern always wins". There is no citation on that claim and I'd love to see something more concrete than one editor's choice of phrasing.
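
To make the two competing interpretations concrete, here is a rough sketch of both strategies side by side. It assumes that "first match" means the first rule whose path prefixes the URL wins, and that the Google-style behavior means the longest matching path wins, with Allow winning a tie. Both are simplifications of my own (wildcards such as `*` and `$` are not handled), and the function names are hypothetical rather than taken from any crawler.

```python
def first_match(rules, path):
    """First rule whose prefix matches decides; no match means allowed."""
    for allow, prefix in rules:
        if path.startswith(prefix):
            return allow
    return True


def longest_match(rules, path):
    """Longest matching prefix decides, regardless of rule order.

    On equal length, Allow wins (assumed tie-break). No match means allowed.
    """
    best_len, allowed = -1, True
    for allow, prefix in rules:
        if path.startswith(prefix):
            n = len(prefix)
            if n > best_len or (n == best_len and allow):
                best_len, allowed = n, allow
    return allowed


# Each rule is (is_allow, path_prefix), in the order WP outputs them.
WP_ORDER = [(False, "/wp-admin/"), (True, "/wp-admin/admin-ajax.php")]
REVERSED = list(reversed(WP_ORDER))
AJAX = "/wp-admin/admin-ajax.php"

for name, rules in (("WP order", WP_ORDER), ("reversed", REVERSED)):
    print(name, "first_match:", first_match(rules, AJAX),
          "longest_match:", longest_match(rules, AJAX))
```

Under these assumptions only the first-match strategy depends on rule order, so reversing the rules would matter solely for crawlers that actually implement it.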