WordPress.org

Make WordPress Core

Opened 6 months ago

Last modified 6 months ago

#52998 new enhancement

Add a filter to paginate_links to allow bypassing merging query params

Reported by: jonoaldersonwp Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version:
Component: General Keywords: seo
Focuses: performance Cc:

Description (last modified by jonoaldersonwp)

The paginate_links function constructs the HTML markup used in archive pagination.

As part of this, any parameters present in the requesting URL are merged into the pagination links (ref).

This is a safeguard to ensure that any logic which relies on $_GET parameters in a paginated series still functions as expected as the user navigates through paginated states. E.g., a plugin may use query parameters to describe a date range in an archive; such as /archive/page/2/?date_from=2001-12-20&date_to=2020-05-14.

I'd like for theme/plugin authors to be able to disable/bypass this 'merging' process.

The problem(s)

The current behaviour makes it possible to request (paginated) archive templates with arbitrary, meaningless query parameters (e.g., https://wordpress.org/support/topic-tag/iframe/page/5/?cats=yes&dogs=no&cake=please&utm_tracking=broken). Those parameters persist through all subsequent paginated requests.

This bypasses caches, breaks digital analytics, and inflates crawling from third party systems (search engines, social media, etc), as it 'creates' many additional URLs.

Furthermore, because WordPress doesn't natively output canonical URL tags on paginated archives, this can be extremely destructive from an SEO perspective; particularly on large sites, where the 'existence' of many such URLs compounds the problems above. And whilst the presence of a canonical tag would mitigate some of that damage, it wouldn't address those other problems.

A potential solution

I propose that we should introduce a new filter into paginate_links which allows theme/plugin authors to bypass the code block which merges in query parameters.

This could be used by sites/themes/plugins when they're confident that they don't need to persist queries.

The filter should accept a simple boolean value, which acts as a switch to skip the merging logic.

Alternatively:

  • The filter could define a list of allowed/expected query params, or;
  • The filter could restrict valid params to only include registered query vars.

A note on the existing paginate_links filter

Note that a paginate_links filter already exists at the end of the function, but this only allows for manipulating the href attribute of the link. Whilst this could technically be used to strip parameters via brute force, this is cumbersome, and requires knowledge of the correct canonical URL (which WordPress doesn't natively/reliably have).

Change History (2)

#1 @jonoaldersonwp
6 months ago

  • Description modified (diff)

#2 @jonoaldersonwp
6 months ago

  • Description modified (diff)
Note: See TracTickets for help on using tickets.