Opened 7 years ago
Closed 7 years ago
#35819 closed enhancement (invalid)
Robots.txt exclusion rule order reversal
Reported by:
Owned by:
Milestone:
Priority: normal
Severity: normal
Version: 4.4.2
Component: General
Keywords:
Focuses: administration
Cc:
Description
This is a follow-up to #33156.
Replying to comment:17 Hube2:
Didn't know if I should start a new ticket or not and couldn't find one that covers it. The order in which WP outputs the allow/disallow rules
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php
may not be compatible with all crawlers. To be compatible with all crawlers, the order of these rules needs to be reversed.
See: Wikipedia – Robots exclusion standard: Allow directive
I can't find any information that contradicts what is presented in the wiki article.
Google’s example is in Allow then Disallow order.
Pattern-matching rules to streamline your robots.txt code
User-agent: *
Allow: /*?$
Disallow: /*?
Google's robots.txt Tester confirms both orders work for Googlebot.
Change History (2)
According to Google's robots.txt specifications page ... (emphasis mine)
Therefore
Allow: /wp-admin/admin-ajax.php
will trump
Disallow: /wp-admin/
no matter which order they are listed in.
Personally, before making any changes I would also like to see other evidence which corroborates the Wikipedia article's assertion that "by standard implementation the first matching robots.txt pattern always wins". There is no citation on that claim and I'd love to see something more concrete than one editor's choice of phrasing.
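To make the disagreement concrete, here is a minimal Python sketch of the two evaluation strategies being discussed. It is an illustration under stated assumptions, not any crawler's actual code: the function names are hypothetical, only plain path prefixes are handled (no `*` or `$` wildcards), and the "longest match" variant assumes the behaviour described in this comment, where the more specific rule wins regardless of order and a tie goes to Allow.

```python
from typing import List, Optional, Tuple

# A rule is (directive, path_prefix); directive is "allow" or "disallow".
Rule = Tuple[str, str]


def first_match_allows(rules: List[Rule], path: str) -> bool:
    """Hypothetical 'first matching pattern wins' evaluation."""
    for directive, prefix in rules:
        if path.startswith(prefix):
            return directive == "allow"
    return True  # no rule matched: crawling is allowed by default


def longest_match_allows(rules: List[Rule], path: str) -> bool:
    """Hypothetical 'most specific (longest) rule wins' evaluation."""
    best: Optional[Rule] = None
    for directive, prefix in rules:
        if not path.startswith(prefix):
            continue
        if (
            best is None
            or len(prefix) > len(best[1])
            # Assumption: on equal length, the less restrictive rule wins.
            or (len(prefix) == len(best[1]) and directive == "allow")
        ):
            best = (directive, prefix)
    return True if best is None else best[0] == "allow"


# The order WordPress currently outputs, and the reversed order.
wp_order: List[Rule] = [
    ("disallow", "/wp-admin/"),
    ("allow", "/wp-admin/admin-ajax.php"),
]
reversed_order: List[Rule] = list(reversed(wp_order))
target = "/wp-admin/admin-ajax.php"

for name, rules in (("WP order", wp_order), ("reversed", reversed_order)):
    print(name, first_match_allows(rules, target), longest_match_allows(rules, target))
```

Under these assumptions only a first-match crawler is sensitive to the ordering: it blocks admin-ajax.php with the current order and permits it with the reversed one, while the longest-match evaluation permits it either way. Whether any real crawler actually uses first-match is exactly the evidence still being asked for.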