Opened 10 months ago

Last modified 3 months ago

#51912 reopened defect (bug)

Sitemap pages 404 with more than one page

Reported by: loranrendel
Owned by:
Milestone: Awaiting Review
Priority: normal
Severity: normal
Version: 5.5
Component: Sitemaps
Keywords: has-patch
Focuses:
Cc:

Description

When a sitemap has more than one page, the later pages may be served with the correct content but a 404 status code.

For example, on a fresh test WordPress installation with 9 posts, I reduce the maximum URL count per sitemap page from the default of 2000 to 2:

<?php
add_filter('wp_sitemaps_max_urls', function () {
    return 2;
});

https://testwp.xpor.org/wp-sitemap-posts-post-1.xml 200
https://testwp.xpor.org/wp-sitemap-posts-post-2.xml 404

WP_Query dump:

WP_Query Object
(
    [query] => Array
        (
            [paged] => 2
            [sitemap] => posts
            [sitemap-subtype] => post
        )

    [query_vars] => Array
        (
            [paged] => 2
            [sitemap] => posts
            [sitemap-subtype] => post
            [error] => 
            [m] => 
            [p] => 0
            [post_parent] => 
            [subpost] => 
            [subpost_id] => 
            [attachment] => 
            [attachment_id] => 0
            [name] => 
            [pagename] => 
            [page_id] => 0
            [second] => 
            [minute] => 
            [hour] => 
            [day] => 0
            [monthnum] => 0
            [year] => 0
            [w] => 0
            [category_name] => 
            [tag] => 
            [cat] => 
            [tag_id] => 
            [author] => 
            [author_name] => 
            [feed] => 
            [tb] => 
            [meta_key] => 
            [meta_value] => 
            [preview] => 
            [s] => 
            [sentence] => 
            [title] => 
            [fields] => 
            [menu_order] => 
            [embed] => 
            [category__in] => Array
                (
                )

            [category__not_in] => Array
                (
                )

            [category__and] => Array
                (
                )

            [post__in] => Array
                (
                )

            [post__not_in] => Array
                (
                )

            [post_name__in] => Array
                (
                )

            [tag__in] => Array
                (
                )

            [tag__not_in] => Array
                (
                )

            [tag__and] => Array
                (
                )

            [tag_slug__in] => Array
                (
                )

            [tag_slug__and] => Array
                (
                )

            [post_parent__in] => Array
                (
                )

            [post_parent__not_in] => Array
                (
                )

            [author__in] => Array
                (
                )

            [author__not_in] => Array
                (
                )

            [ignore_sticky_posts] => 
            [suppress_filters] => 
            [cache_results] => 1
            [update_post_term_cache] => 1
            [lazy_load_term_meta] => 1
            [update_post_meta_cache] => 1
            [post_type] => 
            [posts_per_page] => 10
            [nopaging] => 
            [comments_per_page] => 50
            [no_found_rows] => 
            [order] => DESC
        )

    [tax_query] => WP_Tax_Query Object
        (
            [queries] => Array
                (
                )

            [relation] => AND
            [table_aliases:protected] => Array
                (
                )

            [queried_terms] => Array
                (
                )

            [primary_table] => wp_posts
            [primary_id_column] => ID
        )

    [meta_query] => WP_Meta_Query Object
        (
            [queries] => Array
                (
                )

            [relation] => 
            [meta_table] => 
            [meta_id_column] => 
            [primary_table] => 
            [primary_id_column] => 
            [table_aliases:protected] => Array
                (
                )

            [clauses:protected] => Array
                (
                )

            [has_or_relation:protected] => 
        )

    [date_query] => 
    [request] => SELECT SQL_CALC_FOUND_ROWS  wp_posts.ID FROM wp_posts  WHERE 1=1  AND wp_posts.post_type = 'post' AND (wp_posts.post_status = 'publish' OR wp_posts.post_status = 'private')  ORDER BY wp_posts.post_date DESC LIMIT 10, 10
    [posts] => Array
        (
        )

    [post_count] => 0
    [current_post] => -1
    [in_the_loop] => 
    [comment_count] => 0
    [current_comment] => -1
    [found_posts] => 0
    [max_num_pages] => 0
    [max_num_comment_pages] => 0
    [is_single] => 
    [is_preview] => 
    [is_page] => 
    [is_archive] => 
    [is_date] => 
    [is_year] => 
    [is_month] => 
    [is_day] => 
    [is_time] => 
    [is_author] => 
    [is_category] => 
    [is_tag] => 
    [is_tax] => 
    [is_search] => 
    [is_feed] => 
    [is_comment_feed] => 
    [is_trackback] => 
    [is_home] => 
    [is_privacy_policy] => 
    [is_404] => 1
    [is_embed] => 
    [is_paged] => 
    [is_admin] => 
    [is_attachment] => 
    [is_singular] => 
    [is_robots] => 
    [is_favicon] => 
    [is_posts_page] => 
    [is_post_type_archive] => 
    [query_vars_hash:WP_Query:private] => bcf5fd65d0a7962d637cd5cb9d865508
    [query_vars_changed:WP_Query:private] => 
    [thumbnails_cached] => 
    [stopwords:WP_Query:private] => 
    [compat_fields:WP_Query:private] => Array
        (
            [0] => query_vars_hash
            [1] => query_vars_changed
        )

    [compat_methods:WP_Query:private] => Array
        (
            [0] => init_query_flags
            [1] => parse_tax_query
        )

)

Change History (9)

#1 @SergeyBiryukov
10 months ago

  • Component changed from General to Sitemaps
  • Summary changed from Sitemap pages 404 to Sitemap pages 404 with more than one page

#2 @peterwilsoncc
10 months ago

  • Version changed from 5.5.3 to 5.5

Thank you for your report.

I am able to reproduce this from version 5.5. I'll bring this to the attention of the sitemap maintainers for their review.

This ticket was mentioned in Slack in #core-sitemaps by peterwilsoncc. View the logs.


10 months ago

This ticket was mentioned in Slack in #core-sitemaps by peterwilsoncc. View the logs.


5 months ago

#5 @peterwilsoncc
5 months ago

I think the root cause of this is #51117.

The report in #53095 suggests this is a more serious problem for sites with a custom post type that contains more post objects than there are native WordPress posts.

Because sitemap requests still run the main (is_home()) query, that query is used to determine the page's status. On sites with multiple pages in a sitemap, the paged parameter is passed to the is_home() query, causing a 404 (file not found) error if the front end does not have at least that many pages.

Consider the following site:

  • 9 post objects
  • 3500 custom post type objects

Page two of the CPT sitemap will 404 as there is only a single page of posts according to the posts per page setting in the dashboard.

The same applies to user and taxonomy sitemaps if there are more authors or terms than posts.
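
The arithmetic behind this can be sketched in plain PHP. This is a simplified model rather than core code: the helper name main_query_is_404() is hypothetical, and it assumes the 404 decision depends only on whether the requested page of the main posts query is empty, as the WP_Query dump in the description suggests (LIMIT 10, 10 against 9 posts).

```php
<?php
// Simplified model (an assumption, not WordPress core code): a paged request
// is treated as a 404 when the main posts query has no rows for that page.
function main_query_is_404(int $post_count, int $posts_per_page, int $paged): bool
{
    // Number of pages the front end would have for the blog posts.
    $max_num_pages = (int) ceil($post_count / $posts_per_page);

    return $paged > 1 && $paged > $max_num_pages;
}

// 9 posts at 10 per page give a single front-end page, so a request for
// page 2 is flagged as a 404 regardless of how many sitemap pages exist.
var_dump(main_query_is_404(9, 10, 2));  // bool(true)

// 25 posts at 10 per page give 3 front-end pages, so page 2 is not flagged.
var_dump(main_query_is_404(25, 10, 2)); // bool(false)
```

Under this model the number of custom post type objects, users, or terms never enters the calculation, which matches the behaviour described above.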

#6 @peterwilsoncc
5 months ago

#53095 was marked as a duplicate.

#7 @tigerfinch
4 months ago

  • Keywords has-patch added
  • Resolution set to invalid
  • Status changed from new to closed

(Apologies if I'm tagging this wrong... I've tagged has-patch as I've got a potential resolution)

The solution is to force the main query to use the post_type of the sitemap whenever a sitemap is being generated.

As an interim solution for users, I found this worked in theme/plugin code:

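  // Interim workaround: make the main query use the sitemap's post type.
  add_action('pre_get_posts', function ($query) {
    // Only touch the main query of a post sitemap request.
    if ($query->is_main_query() && $query->get('sitemap') === 'posts') {
      $query->set('post_type', $query->get('sitemap-subtype'));
    }
  });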

As a core fix for this, we could alter the function register_rewrites() in class-wp-sitemaps.php to pass the requested custom post type to the main query, changing:

// Register routes for providers.
add_rewrite_rule(
  '^wp-sitemap-([a-z]+?)-([a-z\d_-]+?)-(\d+?)\.xml$',
  'index.php?sitemap=$matches[1]&sitemap-subtype=$matches[2]&paged=$matches[3]',
  'top'
);

to

// Register routes for providers.
add_rewrite_rule(
  '^wp-sitemap-([a-z]+?)-([a-z\d_-]+?)-(\d+?)\.xml$',
  'index.php?sitemap=$matches[1]&sitemap-subtype=$matches[2]&post_type=$matches[2]&paged=$matches[3]',
  'top'
);
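
To see what the proposed rule would hand to the main query, the pattern can be tried outside WordPress with preg_match(). This is only an illustration: resolve_sitemap_query() is a hypothetical helper, the '#' delimiters are added for preg_match(), and the string substitution is a stand-in for what the rewrite system does with $matches.

```php
<?php
// Pattern copied from the rewrite rule above; '#' delimiters added for preg_match().
$pattern = '#^wp-sitemap-([a-z]+?)-([a-z\d_-]+?)-(\d+?)\.xml$#';

// Hypothetical helper: expands the proposed rewrite target for a request path.
function resolve_sitemap_query(string $pattern, string $path): ?string
{
    if (preg_match($pattern, $path, $matches) !== 1) {
        return null; // Not a sitemap URL with a subtype.
    }

    return "index.php?sitemap={$matches[1]}&sitemap-subtype={$matches[2]}"
        . "&post_type={$matches[2]}&paged={$matches[3]}";
}

echo resolve_sitemap_query($pattern, 'wp-sitemap-posts-post-2.xml'), PHP_EOL;
// index.php?sitemap=posts&sitemap-subtype=post&post_type=post&paged=2
```

The same pattern also matches paths such as wp-sitemap-taxonomies-category-1.xml, where the second capture is a taxonomy name rather than a post type. The ticket does not say how that case should be handled.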

#8 @tigerfinch
4 months ago

  • Resolution invalid deleted
  • Status changed from closed to reopened

Whoops, didn't mean to close the issue, apologies

#9 @Tkama
3 months ago

I described the problem and a temporary solution here: https://wp-kama.com/handbook/sitemap/bag-404-pagination

The core solution is to move the sitemap init from the template_redirect hook to the parse_request hook, as is done for the REST API.
