Opened 5 years ago
Last modified 4 years ago
#48223 new defect (bug)
parse_request(): When request has multiple matching rewrite rules, and matched rule returns 404 - iterate to next rewrite rule
Reported by: | apedog | Owned by: | |
---|---|---|---|
Milestone: | Awaiting Review | Priority: | normal |
Severity: | normal | Version: | |
Component: | Rewrite Rules | Keywords: | has-patch dev-feedback |
Focuses: | Cc: |
Description
A request might have multiple matching rewrite rules.
parse_request()
will take the top rule and attempt to parse it. If that rule results in an empty query, is_404()
is set.
In the case of a 404 error on the first matched rule, parse_request()
should iterate to the next matched rule and attempt the next query.
Example:
A request might match all these rules=>queries:
(.?.+?)(?:/([0-9]+))?/?$
=>pagename=$matches[1]&page=$matches[2]
([^/]+)(?:/([0-9]+))?/?$
=>name=$matches[1]&page=$matches[2]
If first query pagename=$matches[1]&page=$matches[2]
returns 404, parse_request()
should attempt the second query before returning a 404 error.
Attachments (4)
Change History (10)
#3
@
5 years ago
- Keywords has-patch dev-feedback added
I've written a rough patch that attempts to fix this issue.
The following changes have been made to class-wp.php
:
Added variable $all_matched_rules
- array of all possible rewrite rule matches. This is unused but persistent.
Added variable $matched_rules
- array of matched rules to loop over.
Function parse_request()
creates array of all possible rewrite rule matches and pops first rule off $this->matched_rules
array, populating $this->matched_rule
.
On filter 'pre_handle_404'
added a function maybe_try_another_matched_rule()
that tests if other possible rewrite rules exist for this request. If so, function parses request and queries posts again for next matched rule available. It does so recursively until $matched_rules
array is empty. In which case it defaults to default 404 handling.
This patch is more of a proof-of-concept. And would obviously need some iterations if accepted. The code is convoluted and is redundant in some areas. This can be fixed, with zero damage to logic, by making some additional changes to parse_request()
.
@SergeyBiryukov @johnbillion
#4
@
4 years ago
So in attempting to write a patch for this ticket I ended up having to re-factor class WP. So the attached patch is a little "bigger" than just a patch for this ticket and might justify opening a separate ticket for.
I've added a patch (48223.wp-refactor.patch) that does multiple things.
- Refactors class WP
- Introduced a new class Request_Environment
- Added code to allow parsing multiple rules if the first one returns empty/404
Request_Environment
Request_Environment is the same as the old WP environment class but is decoupled from $_SERVER
, $_GET
, $_POST
, global $wp_the_query
etc.
It does not send headers. It does not set globals.
Request_Environment can ideally be used within a virtual WordPress environment, to parse any request string, without setting globals, without sending headers and using global $wp_query.
It does not accesses $_SERVER
and global $wp_query
directly, but instead uses instance variables $this->server
, $this->wp_query
etc.
It introduces a new method setup_request_environment()
that accepts $_SERVER
(or a similarly formatted array), $_GET
, $_POST
(or equivalent array), a WP_Query
object and a WP_Rewrite
instance. And stores them to local/instance variables.
Refactor class WP
WP is now an extension of Request_Environment.
It uses the new setup_request_environment()
method to populate instance variables with actual globals $_SERVER
, $_GET
, global $wp_query
, $wp_rewrites
.
It also sends headers and sets globals after parsing is done as before.
All methods that were available before are still available now. Either in class WP
or in class Request_Environment
.
All hooks that were available before are still available now. Except some minor changes detailed below.
parse_request()
In order to better work with the WP environment object and parse_request()
method, I've broken it up into 3 separate chunks/methods - detailed in the section below.
Multiple Rules Matching
In order to accomodate multiple rule matching (the main impetus behind this exercise), I've had to break up parse_request()
into smaller more managable chunks:
_match_rewrite_rules()
: gets all rules that match request. Does not break after first match and keeps them in an array._match_single_rule()
: same algorithm as before._parse_request()
: actual parsing of the current rule being matched. same as before.
I've added a hook 'match_multiple_rules'
and a new function _reparse_request().
If match_multiple_rules
is set to true
, and the first query_posts()
returns empty or 404, _reparse_request()
will clear the WP_Query object, will parse the next matching rule and query_posts again.
Notable Changes
I've changed the order of query_posts() and send_headers(). So headers are only sent after query_posts ( and reparse_request ) have run. This means some hooks will now fire in a slightly different order. This might cause problems in some rare edge-cases. But I believe this is a minor change that should not break any existing installaiton or plugin.
But if testing proves this assumption wrong - reverting it shouldn't be too hard to do. It will just require slightly repetitive code.
unset($_GET['error'])
now executes inside setup_globals()
- after _reparse_request()
finishes - and not during parse_request()
as before.
This might need some further iterations. But it generally works. I'd love to get some feedback on this. And am willing to shepherd this ticket and make changes if necessary. I know this might be/seem to be a big change so any and all feedback is appreciated.
@SergeyBiryukov I know you've made some changes to class-wp recently. This applies cleanly to that AFAICT. But obviously this needs scrutiny.
@johnbillion - You've expressed some interest in this type of functionality. This is what I came up with. I'd greatly appreciate your input as well.
Case in point:
A database was created before custom post types existed, has the following hierarchical pages published:
recipes
andrecipes/chocolate-cake
At a later point a
Recipe
custom post type was registered with archive/front slug'recipes'
and used for all new recipes. However the old pages were not converted to the new custom post types, and are still regular WP pages.As it stands, the published page
recipes/chocolate-cake
cannot be reached, neither through the permalink nor using?p=123
notation. Instead returning a 404 page.