WordPress.org

Make WordPress Core

Opened 3 months ago

Last modified 7 weeks ago

#51233 new enhancement

Remove "wp-" Prefix From /wp-sitemap.xml

Reported by: teamdnk
Owned by:
Milestone: Awaiting Review
Priority: normal
Severity: normal
Version: 5.5.1
Component: Sitemaps
Keywords: needs-patch close
Focuses:
Cc:

Description

The XML sitemap should be served at /sitemap.xml instead of /wp-sitemap.xml. I understand the original reason for prefixing it (potential plugin conflicts). However, plugin authors can simply disable the core sitemap, which is typical behavior throughout WordPress.

Considering that bots, crawlers, and some SEO tools still request /sitemap.xml by default, a core implementation should follow the common convention, much as it does for /robots.txt.

Change History (7)

#1 @Clorith
3 months ago

Hiya,

So the actual address of the sitemap isn't a problem. You will notice that the robots.txt file references the location of the sitemap (this is done for cases where the sitemap is not simply named sitemap.xml), and search engines check the robots file before any other action when indexing a site.

As such, I'm not sure this is something that really needs to change. Keeping the prefix also ensures backwards compatibility, for example on sites running an out-of-date plugin that generates sitemaps.
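For illustration, the discovery step described above can be sketched in Python with the standard library's robots.txt parser. This is a minimal sketch, not how any particular search engine works: the robots.txt body and the example.com host are placeholders, and `discover_sitemaps` is a hypothetical helper name.

```python
from typing import List
from urllib.robotparser import RobotFileParser

# Placeholder robots.txt body with a Sitemap directive pointing at the
# prefixed core sitemap; example.com is not a real site from this ticket.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Sitemap: https://example.com/wp-sitemap.xml
"""


def discover_sitemaps(robots_body: str) -> List[str]:
    """Return the sitemap URLs declared in a robots.txt body."""
    parser = RobotFileParser()
    parser.parse(robots_body.splitlines())
    # site_maps() (Python 3.8+) returns None when no Sitemap line exists.
    return parser.site_maps() or []


print(discover_sitemaps(ROBOTS_TXT))
```

A crawler that reads the directive this way never needs to guess the filename; the concern raised later in the thread is about clients that skip this step and request /sitemap.xml directly.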

#2 @teamdnk
3 months ago

Right, I understand that the location is in the robots.txt file, but we still receive many requests for /sitemap.xml even with our robots.txt specifying a different filename. As for backward compatibility, even with the prefixed filename there are still conflicts with existing SEO plugins or tools that use the sitemap query_var.

Prefixing the query_var makes sense, as it is more programmatic than pretty links. What I am proposing is that we keep the core sitemap at /sitemap.xml and use another method to stay backward compatible. I am looking into some possibilities, but even a check for whether the ^sitemap\.xml$ rewrite rule already exists may be enough to keep it backward compatible.
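The rule-existence check floated above could look roughly like the following Python sketch. It only models the idea: the rule tables are invented, `pick_sitemap_path` is a hypothetical name, and falling back to the prefixed path when the rule exists is an assumption about what "backward compatible" would mean here, not something the ticket specifies.

```python
# Hypothetical model of a site's persisted rewrite rules:
# a mapping of regex pattern -> internal query string.
SITEMAP_RULE = r"^sitemap\.xml$"


def pick_sitemap_path(existing_rules: dict) -> str:
    """Use sitemap.xml unless a rule for it is already registered."""
    if SITEMAP_RULE in existing_rules:
        # Something else (for example an SEO plugin) already claims the
        # path, so fall back to the prefixed name (assumed behavior).
        return "wp-sitemap.xml"
    return "sitemap.xml"


# Invented rule tables, for illustration only.
clean_site = {r"^robots\.txt$": "index.php?robots=1"}
plugin_site = {SITEMAP_RULE: "index.php?plugin_sitemap=1"}

print(pick_sitemap_path(clean_site))   # sitemap.xml
print(pick_sitemap_path(plugin_site))  # wp-sitemap.xml
```

An exact key lookup is only a first approximation: a plugin could register an equivalent pattern spelled differently, which this check would miss.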

Last edited 3 months ago by teamdnk

#3 follow-up: @dd32
3 months ago

but we still receive many requests for /sitemap.xml even with our robots.txt specifying a different filename

Would you be able to post some of the user-agents of those requests, so as to determine if it's widespread for legitimate search engine indexers to ignore the robots header?
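One way to gather the user-agents asked for here is to tally them from the web server's access log. The Python sketch below assumes the log uses the Combined Log Format; the sample lines are invented for illustration (only the bingbot string echoes one mentioned in this thread), and `sitemap_requesters` is a hypothetical helper name.

```python
import re
from collections import Counter

# Combined Log Format:
# host ident user [time] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"$'
)


def sitemap_requesters(lines, path="/sitemap.xml"):
    """Count user-agents that requested the given path."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        # Ignore any query string when comparing the requested path.
        if match and match.group("path").split("?")[0] == path:
            counts[match.group("agent")] += 1
    return counts


BING = "Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)"

# Invented sample lines; addresses, times, and statuses are placeholders.
SAMPLE_LOG = [
    f'203.0.113.5 - - [10/Sep/2020:12:00:00 +0000] "GET /sitemap.xml HTTP/1.1" 301 0 "-" "{BING}"',
    '203.0.113.9 - - [10/Sep/2020:12:00:05 +0000] "GET /wp-sitemap.xml HTTP/1.1" 200 1024 "-" "ExampleBot/1.0"',
    f'203.0.113.5 - - [10/Sep/2020:12:05:00 +0000] "GET /sitemap.xml HTTP/1.1" 301 0 "-" "{BING}"',
]

for agent, hits in sitemap_requesters(SAMPLE_LOG).most_common():
    print(hits, agent)
```

Running the same function with `path="/wp-sitemap.xml"` would show which clients do follow the location given in robots.txt, which makes the two groups easy to compare.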

#4 @dd32
3 months ago

  • Focuses coding-standards removed

#5 in reply to: ↑ 3 ; follow-up: @teamdnk
3 months ago

Would you be able to post some of the user-agents of those requests, so as to determine if it's widespread for legitimate search engine indexers to ignore the robots header?

Currently, we have our sitemap located at /sitemap.xml, so I don't have any recent logs, but bingbot/2.0; +http://www.bing.com/bingbot.htm is one that requested the sitemap from a different location than the one specified in robots.txt.

I would like to point out that the core XML Sitemaps component was originally developed using /sitemap.xml, which is the natural approach. It was changed after 128 was submitted, for an understandable reason. We always prefer extending core functionality over grafting anything onto it, so it was exciting to see XML sitemaps developed. It makes sense for core to use /sitemap.xml instead of bending around existing plugins (especially if there is another way to be backwards compatible).

Last edited 3 months ago by SergeyBiryukov

#6 in reply to: ↑ 5 @teamdnk
3 months ago

128 The file-name(s) should be also filterable, to help other plugins, that use the new core feature, to be consistent.

As the original proposal suggested, at the very least the prefix should be filterable so it can be either changed or removed. This is important for us, as our company focuses heavily on extending core rather than relying on third-party plugins. Likewise, I think it would be equally important to other developers who want to use or move to the core sitemap component.

#7 @swissspidy
7 weeks ago

  • Keywords close added

Personally, I don't really see a need to change the prefix just because some bots try to access that URL directly.

Note that WordPress automatically redirects /sitemap.xml to /wp-sitemap.xml anyway.

A filter for the URL / prefix is not really doable as rewrite rules are persisted in the database. But you can easily add your own rewrite rules, remove the existing ones, and use the wp_sitemaps_stylesheet_index_url filter to use your own custom URL.
