Make WordPress Core

Opened 4 years ago

Last modified 20 months ago

#51117 accepted defect (bug)

Sitemap & XSL requests execute main query for home page.

Reported by: peterwilsoncc's profile peterwilsoncc Owned by: pbiron's profile pbiron
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 5.5
Component: Sitemaps Keywords: needs-patch
Focuses: performance Cc:

Description (last modified by SergeyBiryukov)

When requesting a sitemap, the main query is run with the parameters for the is_home() request by default. The same is true for the stylesheet.

  • On a site without posts this can return a false 404 error for the sitemap
  • On a site without a persistent cache, this can result in 10 unneeded database queries.

Database log for requesting the first page of the posts sitemap is attached, other sitemap requests look similar up until the sitemap specific DB query is reached,

Proposal:

  • Prevent main query on sitemaps
  • Move stylesheet to a hard coded xsl file if possible

Attachments (1)

sitemaps.log (9.7 KB) - added by peterwilsoncc 4 years ago.

Download all attachments as: .zip

Change History (10)

#1 @SergeyBiryukov
4 years ago

  • Description modified (diff)
  • Summary changed from Sitemap & XLS requests execute main query for home page. to Sitemap & XSL requests execute main query for home page.

#2 @pbiron
4 years ago

  • Owner set to pbiron
  • Status changed from new to accepted

I've noticed this as well and thought "why is it bothering to run the main query when the results of that aren't used?".

But I'd point out that sitemaps aren't the only "rewrite rule-only" request that does this: https://example.com/robots.txt and https://example.com/feed run the main query as well.

While looking at this, I was happy to find that the REST API does not run the main query: it hooks into parse_request and short circuits the process when handling REST requests.

I haven't looked in detail yet, but am pretty sure sitemaps, robots.txt and feeds could do the same thing. I wonder whether it would be beneficial to introduce a general mechanism that all such "rewrite rule-only" requests could build on?

#3 follow-up: @peterwilsoncc
4 years ago

@pbiron I've added some notes to #51912 (as reported on a duplicate ticket) as this can cause false 404 status codes on valid sitemap pages.

I'll leave you to decide if this should be closed as a duplicate of the other ticket.

#4 in reply to: ↑ 3 @pbiron
4 years ago

Replying to peterwilsoncc:

@pbiron I've added some notes to #51912 (as reported on a duplicate ticket) as this can cause false 404 status codes on valid sitemap pages.

Thanx for the ping...I had forgotten about this ticket.

Not sure when I'll have time to look into this more, but I'll do my best to at least investigate how the REST API does the short circuit and see how easy it will be to at least do it for sitemaps (and leave robots and feeds for later).

#5 @tigerfinch
4 years ago

I've added a potential fix at https://core.trac.wordpress.org/ticket/51912#comment:7, if that's any help?

#6 @RavanH
2 years ago

This plugin should resolve the issue https://wordpress.org/plugins/xml-sitemaps-manager/ which applies the solution proposed by @Tkama here https://core.trac.wordpress.org/ticket/51912#comment:9

Please let me know if there are still problems after applying the fix.

#7 follow-up: @Cybr
20 months ago

I expect class WP_Query to have an $is_sitemap property just like it has $is_robots and $is_favicon.

That parameter needs to be tested against before setting is_home to true in method WP_Query::parse_query().

With this ticket resolved, we can undo the patch [53548].

#8 in reply to: ↑ 7 @SergeyBiryukov
20 months ago

  • Keywords needs-patch added

Replying to Cybr:

I expect class WP_Query to have an $is_sitemap property just like it has $is_robots and $is_favicon.

That parameter needs to be tested against before setting is_home to true in method WP_Query::parse_query().

Makes sense to me 👍

#9 @RavanH
20 months ago

Currently using the following as a patch (via plugin) solving all issues. The functions is_sitemap() and is_sitemap_stylesheet() are not technically needed in this context but seem logical to have available (and use)...

Note that hooking wp_sitemaps_loaded() into parse_request will render the sitemap (or stylesheet) early and then exit, thereby skipping the unneeded main query altogether.

add_action( 'parse_request', 'wp_sitemaps_loaded' );

if ( ! function_exists( 'wp_sitemaps_loaded' ) ) :
	/**
	 * Loads the WordPress XML Sitemap Server
	 *
	 * @see https://core.trac.wordpress.org/ticket/51912
	 *
	 * @since 1.0
	 *
	 * @param  WP       $wp       Current WordPress environment instance.
	 * @global WP_Query $wp_query WordPress Query.
	 * @return void
	 */
	function wp_sitemaps_loaded( $wp ) {
		global $wp_query;

		/**
		 * Whether this is a Sitemap Request.
		 *
		 * @see https://core.trac.wordpress.org/ticket/51543
		 * @since 1.0
		 * @var bool
		 */
		$wp_query->is_sitemap = ! empty( $wp->query_vars['sitemap'] );

		/**
		 * Whether this is a Sitemap Stylesheet Request.
		 *
		 * @since 1.0
		 * @var bool
		 */
		$wp_query->is_sitemap_stylesheet = ! empty( $wp->query_vars['sitemap-stylesheet'] );

		if ( ! is_sitemap() && ! is_sitemap_stylesheet() ) {
			return;
		}

		// Prepare query variables.
		$query_vars = $wp_query->query_vars;
		$wp_query->query_vars = $wp->query_vars;

		// Render the sitemap.
		wp_sitemaps_get_server()->render_sitemaps();

		// Still here? Then it was an invalid sitemap request after all. Undo everything and carry on...
		$wp_query->is_sitemap = false;
		$wp_query->is_sitemap_stylesheet = false;
		$wp_query->query_vars = $query_vars;
	}
endif;

if ( ! function_exists( 'is_sitemap' ) ) :
	/**
	 * Determines whether the query is for the sitemap.
	 *
	 * For more information on this and similar theme functions, check out
	 * the {@link https://developer.wordpress.org/themes/basics/conditional-tags/
	 * Conditional Tags} article in the Theme Developer Handbook.
	 *
	 * @see https://core.trac.wordpress.org/ticket/51543
	 *
	 * @since 1.0
	 *
	 * @global WP_Query $wp_query WordPress Query object.
	 * @return bool Whether the query is for the sitemap.
	 */
	function is_sitemap() {
		global $wp_query;

		if ( ! isset( $wp_query ) ) {
			_doing_it_wrong( __FUNCTION__, translate( 'Conditional query tags do not work before the query is run. Before then, they always return false.' ), '3.1.0' );
			return false;
		}

		return property_exists( $wp_query, 'is_sitemap' ) ? $wp_query->is_sitemap : false;
	}
endif;

if ( ! function_exists( 'is_sitemap_stylesheet' ) ) :
	/**
	 * Determines whether the query is for the sitemap stylesheet.
	 *
	 * For more information on this and similar theme functions, check out
	 * the {@link https://developer.wordpress.org/themes/basics/conditional-tags/
	 * Conditional Tags} article in the Theme Developer Handbook.
	 *
	 * @since 1.0
	 *
	 * @global WP_Query $wp_query WordPress Query object.
	 * @return bool Whether the query is for the sitemap stylesheet.
	 */
	function is_sitemap_stylesheet() {
		global $wp_query;

		if ( ! isset( $wp_query ) ) {
			_doing_it_wrong( __FUNCTION__, translate( 'Conditional query tags do not work before the query is run. Before then, they always return false.' ), '3.1.0' );
			return false;
		}

		return property_exists( $wp_query, 'is_sitemap_stylesheet' ) ? $wp_query->is_sitemap_stylesheet : false;
	}
endif;

Last edited 20 months ago by RavanH (previous) (diff)
Note: See TracTickets for help on using tickets.