Opened 14 years ago

Closed 13 years ago

Last modified 13 years ago

#12935 closed enhancement (wontfix)

Evolve the URL routing system

Reported by: mikeschinkel
Owned by: ryan
Milestone:
Priority: normal
Severity: normal
Version: 3.0
Component: Permalinks
Keywords:
Focuses:
Cc:

Description

As WordPress grows into a proper CMS with MultiSite and Custom Post Types and Custom Taxonomies there are frequent opportunities for URL conflict that didn't occur as often with just Posts and Pages.

Here's a simple example. Let's say a company uses WordPress for their website and they offer both development services and training. They would want URLs like these:

http://example.com/services/vmware/
http://example.com/training/vmware/

Currently it's not possible for an end user to get that result; the best they can do is this:

http://example.com/services/vmware/
http://example.com/training/vmware-2/

One key problem is that WordPress treats both post_names and URLs as globally unique. Another is that the current system of full-path regexes makes it very hard to discover paths from the routes; it's only possible to test paths against the routes.

I'd like to propose we add a backward compatible layer that allows WordPress and its theme and plugin developers to transition to a system where post_names are no longer required to be unique and URL routing would be specified on a path segment basis. This would mean that WordPress would maintain an internal tree structure of URL routes and it would traverse the tree instead of scanning a global list of URL routes.

The overhead for smaller sites should be trivial, but for larger sites that need to scan many URL routes it could make it easier to scale in size and complexity. It would also make special cases much easier to implement robustly without as much fear of breaking something that already exists. This proposed solution would make it easier to discover and test routes. It would also make it possible to display a URL sitemap and to enable plugins to easily let users edit the site map and thus define URL routings. It would also open up lots of flexibility in areas of URL routing where before it was just too difficult to address.

I'm envisioning this URL routing to have a lower priority than existing URL routing, and by nature the community would slowly transition to this new system. WordPress would first check the older URL routing and, if there are no matches, the new URL routing would be used. Core could transition all its routes to the new system sometime in 3.x and then plugins could start transitioning over as well until, over time, most plugins and themes support the new system. It would also be nice to have an advanced option (maybe via a define()d constant) that would disable the older URL routing system for a given site so that users, themers and plugin developers could ensure their plugins are using the new system only.

This is potentially a big change, so while I'm very willing to tackle this with community support I don't want to write code yet, but I do want to start the discussion. However, here is some pseudo-code that illustrates the structure conceptually. Note the example includes a custom post type and also a custom URL of the following structure to illustrate flexibility:

http://example.com/2010/top10/

/*
This code demonstrates ONE potential structure for visualization and NOT how most route code would be built.
*/
function set_routes() {
  global $wp_routes;

  $post_routes = array('%postname%'=> array('type'=>'post'));

  $year_routes = array(
    'top10'   => array('type'=>'query','query'=>'post_type=post&taxonomy=post_tag&term=top10'),
    // Months run from 01 to 12.
    '%month%' => array('type'=>'query','query'=>'post_type=post&year=%_parent%&month=%month%','match'=>'/(0[1-9]|1[0-2])/','routes'=>$post_routes),
  );
  $category_routes = array(
    '%category%'=> array('type'=>'query','query'=>'post_type=post&taxonomy=category&term=%category%'),
  );
  $tag_routes = array(
    '%tag%'=> array('type'=>'query','query'=>'post_type=post&taxonomy=post_tag&term=%tag%'),
  );
  $wp_routes = array(
    'about' => array('type'=>'page'),
    '%year%'  => array('type'=>'query','query'=>'post_type=post&year=%year%','match'=>'/[0-9]{4}/','routes'=>$year_routes),
    'products' => array('type'=>'post_list','post_type'=>'product'),
    'category' => array('type'=>'tax_list','taxonomy'=>'category','routes'=>$category_routes),
    'tag' => array('type'=>'tax_list','taxonomy'=>'post_tag','routes'=>$tag_routes),
  );
}

And here is what it looks like when using print_r():

Array
(
  [about] => Array
    (
      [type] => page
    )

  [%year%] => Array
    (
      [type] => query
      [query] => post_type=post&year=%year%
      [routes] => Array
        (
          [top10] => Array
            (
              [type] => query
              [query] => post_type=post&taxonomy=post_tag&term=top10
            )

          [%month%] => Array
            (
              [type] => query
              [query] => post_type=post&year=%_parent%&month=%month%
              [routes] => Array
                (
                  [%postname%] => Array
                    (
                      [type] => post
                    )

                )

            )

        )

    )

  [products] => Array
    (
      [type] => post_list
      [post_type] => product
    )

  [category] => Array
    (
      [type] => tax_list
      [taxonomy] => category
      [routes] => Array
        (
          [%category%] => Array
            (
              [type] => query
              [query] => post_type=post&taxonomy=category&term=%category%
            )

        )

    )

  [tag] => Array
    (
      [type] => tax_list
      [taxonomy] => post_tag
      [routes] => Array
        (
          [%tag%] => Array
            (
              [type] => query
              [query] => post_type=post&taxonomy=post_tag&term=%tag%
            )

        )

    )

)
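To make the traversal idea a little more concrete, here is a minimal sketch of how a resolver might walk a structure like the one above, one path segment at a time. The function name resolve_route() and the matching order (literal keys first, then %tag% keys tested against their 'match' pattern) are assumptions made for illustration; they are not part of the proposal or of WordPress:

```php
<?php
// Hypothetical sketch: walk a nested route array one path segment at a time.
// resolve_route() is an illustrative name, not a WordPress function.
function resolve_route(array $routes, string $path): ?array {
    $segments = array_values(array_filter(explode('/', trim($path, '/')), 'strlen'));
    $vars  = array();   // values captured by %tag% segments
    $route = null;
    foreach ($segments as $segment) {
        $next = null;
        if (isset($routes[$segment])) {
            // Literal match: a plain array-key lookup, no regex needed.
            $next = $routes[$segment];
        } else {
            foreach ($routes as $key => $candidate) {
                if (!preg_match('/^%(\w+)%$/', (string) $key, $m)) {
                    continue; // not a rewrite tag
                }
                // Tags without a 'match' pattern accept any non-empty segment.
                $pattern = isset($candidate['match']) ? $candidate['match'] : '/^.+$/';
                if (preg_match($pattern, $segment)) {
                    $vars[$m[1]] = $segment;
                    $next = $candidate;
                    break;
                }
            }
        }
        if ($next === null) {
            return null; // no branch for this segment
        }
        $route  = $next;
        $routes = isset($next['routes']) ? $next['routes'] : array();
    }
    return $route === null ? null : array('route' => $route, 'vars' => $vars);
}

// A trimmed-down copy of the example tree, with anchored patterns.
$routes = array(
    'about'  => array('type' => 'page'),
    '%year%' => array(
        'type'   => 'query',
        'query'  => 'post_type=post&year=%year%',
        'match'  => '/^[0-9]{4}$/',
        'routes' => array(
            'top10' => array('type' => 'query', 'query' => 'post_type=post&taxonomy=post_tag&term=top10'),
        ),
    ),
);

$result = resolve_route($routes, '/2010/top10/');
```

For /2010/top10/ the walk takes two steps: the %year% tag captures 2010, then the literal key top10 is found with a plain isset(). Substituting the captured values into the query string is left out of the sketch.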

BTW, these tickets are probably somewhat related and I believe this approach would make it much easier to resolve these issues:

  • #12884 - Taxonomy Pages aren't properly redirected on main site when using Multi Site Subdirectories
  • #12002 - WPMU should not lock the root blog into using a /blog prefix on permalinks
  • #11279 - Support for putting all "blog" content under /blog/

And these requests/issues:

Attachments (1)

controller.php (25.7 KB) - added by jacobsantos 14 years ago.
Prototype for router.

Change History (69)

#1 @mikeschinkel
14 years ago

I forgot to emphasize that this could help large, complex sites scale, and I forgot to mention that this would also provide a convenient way to expose cleaner context metadata about the current URL being requested (i.e. page, post, tax_list, post_list, (post)query, etc.). It would make it easier to develop robust theme components for serving pages for those patterns, and it would be easier to identify and formalize new patterns (e.g. a URL that loads a "calendar" type).

#2 follow-up: @scribu
14 years ago

Why do some routes have %% and some don't? i.e. 'top10' vs '%month%'

#3 in reply to: ↑ 2 @mikeschinkel
14 years ago

Replying to scribu:

Why do some routes have %% and some don't? i.e. 'top10' vs '%month%'

Hi @scribu, really glad to have you comment. I remember you were the one to express interest in this concept when I mentioned it briefly on wp-hackers a few months ago.

The routes with %% are rewrite tags vs. literal matches; i.e. 'top10' is literal whereas '%month%' can accept any value that matches its regex. For the following URL, %year% matches '2010' and 'top10' matches 'top10', of course. So this:

http://example.com/%year%/top10/

Matches:

http://example.com/2010/top10/

That is equivalent to the following rewrite rule

[([0-9]{4})/top10/?$] => index.php?post_type=post&year=$matches[1]&taxonomy=post_tag&term=top10

This would be similar to this (assuming I did this right; I still have trouble with WordPress' rewrite system even though I've used it numerous times):

add_rewrite_tag('%year%','([0-9]{4})');
add_permastruct('top10', '%year%/top10');
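For readers less familiar with how a rule like the one above gets applied, here is a small standalone sketch (plain PHP, no WordPress loaded). The helper name apply_rewrite_rule() is hypothetical; it only imitates the substitution of $matches[n] placeholders and is not how core implements it:

```php
<?php
// Hypothetical helper: apply one rewrite rule (regex => query template) to a request path.
function apply_rewrite_rule(string $regex, string $query, string $request): ?string {
    if (!preg_match('#^' . $regex . '#', $request, $matches)) {
        return null; // rule does not apply to this request
    }
    // Replace each $matches[n] placeholder with the corresponding capture.
    return preg_replace_callback(
        '/\$matches\[(\d+)\]/',
        function ($m) use ($matches) {
            $index = (int) $m[1];
            return isset($matches[$index]) ? urlencode($matches[$index]) : '';
        },
        $query
    );
}

$query = apply_rewrite_rule(
    '([0-9]{4})/top10/?$',
    'index.php?post_type=post&year=$matches[1]&taxonomy=post_tag&term=top10',
    '2010/top10/'
);
echo $query, "\n"; // index.php?post_type=post&year=2010&taxonomy=post_tag&term=top10
```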

I believe we can do it in a way that get_option('rewrite_rules') are tested first for backward compatibility and then the routes are inspected if the old rewrite rules don't apply. However, we could make it so any calls to add_rewrite_tag(), add_rewrite_rule() and add_permastruct() build the route structure instead of rewrite rules and thus convert most rewriting to routing in a very short period of time.

Additional benefits of this approach would include:

  • Simplicity: I think we'll find it will take a lot less code for this than is currently in rewrite.php (~2000 lines.)
  • Understandability: I think it will be much easier for people to understand this tree structure than the global flat mapping of the rewrite rules.
  • Flexibility: We'll be able to use different options down different path segments. If some paths want to disable fe
  • Eliminate Clutter: Many of the rewrite rules are not needed but it was easier to code it that way, i.e. most pages and custom post types don't need feeds.
  • Robustness: Since all routes would be path-segment local, we can add new routes along a path segment while being much less concerned about conflicting rewrite tags.
  • Reuse Routes: With this structure it would be very easy to define feed, pages, comments, trackbacks, attachments, etc. in one place and reference it anywhere needed as opposed to the current structure which has that information duplicated for every rewrite. In a system I'm working on with 5 custom post types there are 60 rewrite rules dealing with attachments. I think attachments can be defined as five routes and referenced only where needed.
  • Performance: This approach would allow for the literal tags to be stored as array keys and thus will be faster to match against than running 100+ reasonably complex regexes in order to find the last regex that matches the "about" page URL. See the pseudo-code showing the difference for loading a root page with routes vs. with rewrites (yes the full route code will be much more complicated):

With routes:

if (isset($wp_routes[$request])) {  // Once...
  load_page($request);
}

With rewrites:

foreach ( $rewrite as $match => $query ) {   // 100 times or more...
  if (preg_match("#^$match#", $request, $matches)) {
    load_page($request);
    break;  // stop at the first rule that matches
  }
}


Hopefully this explains?

P.S. Just to be comprehensive, the rewrite tags would still be global because they would map onto a single URL. But ensuring rewrite tags are unique is easier (and more flexible) than requiring URL path segments to be unique across all URLs that are not part of the same path segment branch. Also, routes would be processed inside $wp->parse_request() where rewrites are currently being processed. $wp->query_posts() would execute exactly as it already does, thus when I say backward compatible I'm referring to creating a drop-in replacement for $wp->parse_request().

#4 follow-up: @scribu
14 years ago

You should really bring this up (in summarised form) in the dev chat, when scope for 3.1 is set.

#5 in reply to: ↑ 4 @mikeschinkel
14 years ago

Replying to scribu:

You should really bring this up (in summarised form) in the dev chat, when scope for 3.1 is set.

Cool. Any idea when that will be? (And will you be there to support it? :)

#6 @scribu
14 years ago

No and yes, respectively.

#7 follow-up: @scribu
14 years ago

I found this post helpful in understanding the current routing system:

http://ottopress.com/2010/category-in-permalinks-considered-harmful/

#8 in reply to: ↑ 7 @mikeschinkel
14 years ago

Replying to scribu:

I found this post helpful in understanding the current routing system:

http://ottopress.com/2010/category-in-permalinks-considered-harmful/

Thanks for the article. It helps me understand some of the issues I didn't understand before. It also helps me recognize how using a tree structure for URL routing could add even more benefits than I first realized. Think I need to build a prototype...

#9 @mikeschinkel
14 years ago

Here's a thread on wp-hackers that presents another use-case for this:

http://lists.automattic.com/pipermail/wp-hackers/2010-April/031582.html

#10 @johnonolan
14 years ago

That thread is also relevant to #12974

#12 @ryanpc
14 years ago

  • Cc ryanpc@… added

#14 @mikeschinkel
14 years ago

Another request for improving the URL system:

http://wordpress.org/support/topic/397831

#15 @hakre
14 years ago

  • Keywords dev-feedback reporter-feedback added

URL resolution is hardcoded into WP. It has grown over the years, and whenever it came up for a change, the answer was that it is kept for backwards compatibility.

So do not expect to change this.

What you can do instead: Create your own frontend controller:

  • Intercept the HTTP request before it gets passed into WordPress (e.g. index.php)
  • Store the original request and parse it as needed.
  • Transpose the request into something WP can deal with (e.g. changing query variables in $_REQUEST, $_SERVER, $_POST, $_GET etc.)
  • Give your plugins / extensions access to the original request.
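A rough sketch of those steps, assuming a hypothetical my_transpose_request() helper and a made-up /training/<slug>/ URL scheme that is mapped to a 'training' post type. None of these names exist in WordPress; they only illustrate the idea:

```php
<?php
// Hypothetical front-controller step, meant to run before WordPress parses the request.
// Returns the original request plus the query variables WordPress should see.
function my_transpose_request(array $server, array $get): array {
    $original = isset($server['REQUEST_URI']) ? $server['REQUEST_URI'] : '/';
    $path     = trim((string) parse_url($original, PHP_URL_PATH), '/');

    // Made-up mapping: /training/<slug>/ becomes ?post_type=training&name=<slug>
    if (preg_match('#^training/([^/]+)$#', $path, $m)) {
        $get['post_type'] = 'training';
        $get['name']      = $m[1];
    }
    return array('original' => $original, 'get' => $get);
}

$result = my_transpose_request(
    array('REQUEST_URI' => '/training/vmware/?ref=home'),
    array('ref' => 'home')
);
```

In a real setup the returned 'get' array would be copied into $_GET before WordPress loads, and 'original' would be kept somewhere plugins can read it.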

I suggest to close this as wontfix because it's highly certain that this issue won't get fixed.

#16 @hakre
14 years ago

  • Summary changed from Evolve the URL routing system to Change the URL routing system

#17 @johnonolan
14 years ago

  • Keywords dev-feedback reporter-feedback removed
  • Milestone changed from Unassigned to 3.1
  • Summary changed from Change the URL routing system to Evolve the URL routing system

Mike is fairly familiar with the URL system and why it works the way it does, he wouldn't have started this ticket if he wasn't. Regardless, this is going to get a full review in a 3.1 dev meeting so there's no need to do anything with it now.

#18 @mikeschinkel
14 years ago

Hakre - I understand the URL routing system reasonably well (I just spent a full Saturday working around it for a client.) This proposal presumes (and hopefully I'm correct) that the parse_query() can be superseded without sacrificing any backward compatibility.

John, thanks for the support.

I'm currently working on three projects that all need better URL routing and I'm working 12-hour days on these projects, so I've not had time to go off and spend the 3 days this will probably take, but practically every day I work I realize the need. When I have the ability to take the time I plan on building a proof of concept. At least I hope I can get the 3+ days it will take to do this sometime soon.

BTW, this is just one potential way to address the need for more flexible URL routing; I'm pretty sure we are going to need it once custom post types get fully into the wild.

#19 follow-up: @hakre
14 years ago

Well I actually made a suggestion based on what I personally assume will happen with this ticket. And I won't fight over a single word. My suggestion was from a practical point of view with the project's history in mind. I might be wrong and in fact for this ticket, I really hope I am.

It's this way: If you ask me for my personal opinion about this, we might have more in common than you thought:

  • The current Routing system has immense deficiencies
  • The Permalink Implementation is a mess
  • The Frontend-Controller Implementation is a mess
  • WP_Query needs a refactoring for sure

So please just fix it and all the best by that. And please do not let yourself get de-motivated by design issues, missing specs, backwards compatibility blockers and all the fun we have over here. Oh for this one you'll get an immense needs-unit-tests for sure, so better be prepared. Good luck!

Keep in mind that some URLs get reflected over HTTP redirects so you might want to take a look in the redirect_canonical() function as well.

I know this can be pretty hard w/o any specs, so I personally welcome any input, especially if it's specs and docs. I would love to see this already in 3.1!

One thing I would love to see in a new implementation is to have better query var support like

http://example.com/permalink-of-a-post/print

instead of

http://example.com/permalink-of-a-post/print=1

I think you know what I mean.

Here is a custom post type example (that feature is often promoted as CMS, but well, w/o proper URLs I'm totally with you):

http://lists.automattic.com/pipermail/wp-testers/2010-May/012980.html

#20 in reply to: ↑ 19 ; follow-up: @scribu
14 years ago

Replying to hakre:

One thing I would love to see in a new implementation is to have better query var support like

http://example.com/permalink-of-a-post/print

instead of

http://example.com/permalink-of-a-post/print=1

Those are called endpoints and there are plans to make them easier to use in WP 3.1, independent of this ticket.

#21 in reply to: ↑ 20 @hakre
14 years ago

Replying to scribu:

Those are called endpoints and there are plans to make them easier to use in WP 3.1, independent of this ticket.

Great feedback! Why not do both together? I'm sure a lot will profit from an evolvement of how URLs are resolved.

#22 @mikeschinkel
14 years ago

So I'm working on this even though I have projects with intense deadlines waiting, because I want to move it forward and because it is a guilty pleasure for me. Figuring out where to start has not been easy, but I decided to start by writing down some goals that I think are important and by deconstructing the rewrite rules from a vanilla install of WordPress 3.0 beta. (Note I write the goals here so they can be open to discussion; my views are not as important as the views of the core dev team and the collective views of the WP developer community, but at least my views are a starting point for discussion.)

Here are the goals I wrote down. I will probably think of more but at least these are a start:

NAMING

  • The existing system is referred to as the "Rewrite" system.
  • The new system is referred to as the "Routes" system.
  • These goals define "Compatible" as rewrite first, routes second.
  • These goals define "Transitional" as routes first, rewrite second.

GOALS

  • Compatibility
    • Drop-in replacement for $wp->parse_request().
    • Support first implemented via plugin thus potentially requiring new hooks.
    • No potential compatibility issues except potential edge cases with rewrite hooks in transitional mode.
    • Introduces as few new concepts and structures as absolutely necessary.
    • Supports query_vars as before.
    • As much as possible routes should feel familiar to WordPress devs and themers.
    • As compatible as possible with all plugins with selectable modes to change priority.
    • Matching existing external URL structure behavior a high priority
    • Priority of matching internal rewrite hook behavior selectable by mode
    • Enable plugins to register support for routes vs. rewrite otherwise rewrite assumed.
    • Mode to support internal usage of rewrites as primary, routes as secondary.
    • If possible, "sniff out" requirement to support rewrite mode based on hooks in use.
    • "No compatibility" mode should enable development of an easy-to-use URL configuration admin module.
    • Aim to deprecate rewrite after sufficient calendar time and sufficient plugins convert to routes.
    • Default compatibility mode changes over releases, from rewrite, to compatible, to transition, to routes.
    • Near-term releases of WordPress delivered in "most compatible" mode.
  • Functionality
    • Easy to understand for the coder.
    • Reasonably easy to code advanced routes.
    • Minimal new class/structure/syntax required.
    • Able to define most routes using existing URL template formats.
    • Internally views URL paths as collections of slash-separated segments.
    • Primarily focuses on URL path segments but can address partial segments and multi-segments.
    • Maps URLs to query_vars, just like rewrite.
    • Mimics existing WordPress design patterns (register_*() functions,etc.)
    • Fully flexible regarding URL layout.
    • Logically recursive; Allows any path tree to be a branch of any path segment.
    • Post_type agnostic; allow any post type to come in any order.
    • Heterogeneous parent-child paths; i.e. a page can be a parent of a custom post type.
    • Explicitly stated: full support for paths like /%category%/%post%/.
    • Advanced matching; allows matching by literal, pattern(regex), or function.
    • Full control of path segment order; supports fine grain matching in any combination.
    • Group common endpoints (path segment branches) to be applied to branches in the path tree.
    • Allows assignment of literal matches post_types and/or taxonomies.
    • As performant as rewrites in standard case; acceptably performant in 99th percentile.
    • Optimizable and tunable; certain routes can be preloaded by design.
    • Self-tuning on cron task; most commonly used routes pre-routed.
    • Easily supports basic structure of slashes and segments, i.e. /segment/segment/.../
    • Enables support for structure beyond slashes and segments, albeit more complex.
    • Order of URL path segment assignment mostly unimportant (except for regex & functions).
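The mode ordering described above (rewrite only, compatible, transitional, routes only) could be sketched roughly as follows. The function name resolve_request() and the mode strings are assumptions for illustration; both resolvers are passed in as callables so the sketch runs without WordPress:

```php
<?php
// Hypothetical dispatcher for the modes named above. Both resolvers are callables that
// take a request path and return query vars (array) or null when nothing matches.
function resolve_request(string $request, string $mode, callable $rewrite, callable $routes): ?array {
    switch ($mode) {
        case 'rewrite':      $order = array($rewrite); break;
        case 'compatible':   $order = array($rewrite, $routes); break; // rewrite first, routes second
        case 'transitional': $order = array($routes, $rewrite); break; // routes first, rewrite second
        case 'routes':       $order = array($routes); break;
        default: throw new InvalidArgumentException("Unknown mode: $mode");
    }
    foreach ($order as $resolver) {
        $query_vars = $resolver($request);
        if ($query_vars !== null) {
            return $query_vars; // the first resolver that matches wins
        }
    }
    return null;
}

// Stand-in resolvers for the demonstration only.
$rewrite = function ($request) {
    return $request === 'about' ? array('pagename' => 'about', 'via' => 'rewrite') : null;
};
$routes = function ($request) {
    return array('pagename' => $request, 'via' => 'routes');
};

$compatible   = resolve_request('about', 'compatible', $rewrite, $routes);
$transitional = resolve_request('about', 'transitional', $rewrite, $routes);
```

With these stand-ins the same request is answered by the rewrite resolver in compatible mode and by the routes resolver in transitional mode, which is the only difference between the two modes.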

I'll post the deconstruction of URLs next.

#23 @mikeschinkel
14 years ago

Here is the deconstruction of the URLs I came up with.

I've designed a pseudo-code where %%var%% {...} is a macro for one or more URL path branches that can be applied to a path branch, and they can be optional. Each line that isn't a macro is a potential URL path and each line within a macro is a potential URL path branch. These lines assume the regex definitions of query_vars that are already defined in WordPress so these lines don't concern themselves with query var definitions.

Doing it this way allowed me to deconstruct them so as to have no duplication. I'm sure these are almost completely correct but they strike me as a tad inconsistent (i.e. some URLs get the ssort parameter and others that seem like they should don't; some get support for trackbacks, etc.)

I also noticed that %day% was not matched for some post and attachment URLs hence the %dummy% var. Also the %*% is a wildcard to match anything that comes before it.

As an aside, being able to view WordPress' URL rewrite/routing system in this manner makes it a lot easier for me to grasp and digest in one sitting. I'm thinking if the implementation is similarly straightforward it will be comparatively easy for developers and even themers to get their heads around:

%%feed%% {
  feed/%feed%
  %feed%
}
%%page%% {
  page/%page%
}
%%feed_page%%(optional) {
  %%feed%%
  %%page%%
}  
%%sort_feed_page%%(optional) {
  sort/%ssort%/%%feed_page%%
}
%%comment_page%% {
  comment-page-%cpage% 
}
%%paged%% {
  page/%paged% 
  %paged%
}
%%trackback_feed_comment_page%%(optional) {
  trackback
  %%feed%%
  %%comment_page%% 
}
%%trackback_feed_paged_comment_page%%(optional) {
  %%trackback_feed_comment_page%%
  %%paged%%
}
%%attachment%% {
  attachment/%attachment%
}
%%attachment_trackback_feed_comment_page%% {
  %%attachment%%/%%trackback_feed_comment_page%%
}
    
%year%/%month%/%day%/%%sort_feed_page%%
%year%/%month%/%%sort_feed_page%%
%year%/%%sort_feed_page%%
author/%author%/%%sort_feed_page%%
tag/%tag%/%%sort_feed_page%%
category/%category%/%%sort_feed_page%%
%%sort_feed_page%%

robots.txt
%*%wp-%feed%.php
%*%wp-commentsrss2.php
%%feed_page%%

comments/%%feed_page%%
search/%s/%%feed_page%%

%year%/%month%/%dummy%/%%attachment_trackback_feed_comment_page%%

%year%/%month%/%dummy%/%post%/%%trackback_feed_paged_comment_page%%
%year%/%month%/%dummy%/%%trackback_feed_paged_comment_page%%
%year%/%month%/%%comment_page%%
%year%/%%comment_page%%

%*%/%%attachment_trackback_feed_comment_page%%
%page%/%%trackback_feed_paged_comment_page%%

#24 @jacobsantos
14 years ago

I'm very interested in this and plan on discussing this more. I do think it is time something was done to refactor the current implementation of the Rewrite system to more of a Controller system.

I have argued multiple times for working toward a solution. I think perhaps a transitional layer could be implemented to show how it would work on top of the current system, to ease into a full Routes system in the near future. If people can see even a basic routes system, then I think there would be more support and people might be willing to work on a full Routes implementation.

#25 @mikeschinkel
14 years ago

So I spent all day Wed the 26th working on this. My goal for this first step was to come up with code that would allow me to specify routes in a new way but also generate a list of regular expressions that would match with 100% fidelity the existing list of regular expressions created by a vanilla install of WP 3.0. At this stage I'm not worried about code compatibility yet; getting the output to be compatible is enough of a first hurdle.

After a day of coding I had created a WP_Routes class and was able to get close to 100% fidelity, but the code is still very messy (it's recursive and as such hard to get just right, as anyone who has done complex recursion knows). I think I may have been able to get the code to match URLs with almost 100% fidelity, but I'm struggling to get it to match the output regular expressions with 100% fidelity. The next step will likely be to set up code that can compare the rewrite rules' regular expressions with the regular expressions output by WP_Routes (instead of doing it by hand) and see if I can work through the remaining incompatibilities.

More important I want to set up an exhaustive set of unit test cases that would test $_SERVER['REQUEST_URI'] using both the rewrite system and my new route system and compare the queries generated by both as those are more important than matching the actual regular expressions. After all, one of the problems with regular expressions is that they are almost impossible to robustly inspect with code to decipher and to recombine. By focusing on path segments we'll get a lot more flexibility and a lot greater ability to control routing via hooks than is currently possible with the rewrite system. For example, here is an example of the fragile code I had to write to do something that to the client appeared to be a relatively simple change to the URL. I fear it will break when they add a plugin as they are likely to do:

// 'rewrite_rules_array' is a filter, so it is registered with add_filter().
add_filter('rewrite_rules_array', 'tyc_rewrite_rules_array');
function tyc_rewrite_rules_array($rules) {
	// Drop every rule whose query targets the restaurant-dummy query var,
	// with or without a trailing paged/feed capture.
	$pattern = '#^index\.php\?restaurant-dummy=\$matches\[1\](\&(paged|feed)=\$matches\[2\])?$#';
	foreach($rules as $key => $rule) {
		if (preg_match($pattern, $rule)) {
			unset($rules[$key]);
		}
	}
	return $rules;
}

BTW, the goal of my efforts is not to generate the same regular expressions as the existing system but instead to load the same pages based on the same URLs. The plan is to inspect URL path segments rather than match the entire URL path, but if I can get my code to generate a 100% fidelity match for the existing rewrite rules then I'll know I'm on the right track related to what is required to specify the routes.

If anyone wants to help me generate that list of test cases that maps $_SERVER['REQUEST_URI'] to resultant WordPress URL-encoded queries it will be greatly appreciated.
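As a starting point for that kind of test list, a comparison harness might look something like the sketch below. The name compare_resolvers() and the two stand-in resolvers are hypothetical; real ones would wrap the rewrite system and the new routes code respectively:

```php
<?php
// Hypothetical harness: run the same request URIs through two resolvers and collect
// every URI for which the resulting query vars differ.
function compare_resolvers(array $uris, callable $old, callable $new): array {
    $mismatches = array();
    foreach ($uris as $uri) {
        $expected = $old($uri);
        $actual   = $new($uri);
        // Sort by key so that ordering differences are not reported as failures.
        if (is_array($expected)) { ksort($expected); }
        if (is_array($actual))   { ksort($actual); }
        if ($expected !== $actual) {
            $mismatches[$uri] = array('rewrite' => $expected, 'routes' => $actual);
        }
    }
    return $mismatches;
}

// Stand-in resolvers backed by lookup tables, for the demonstration only.
$old = function ($uri) {
    $table = array('/2010/' => array('year' => '2010'), '/about/' => array('pagename' => 'about'));
    return isset($table[$uri]) ? $table[$uri] : null;
};
$new = function ($uri) {
    $table = array('/2010/' => array('year' => '2010'), '/about/' => array('page_id' => '2'));
    return isset($table[$uri]) ? $table[$uri] : null;
};

$mismatches = compare_resolvers(array('/2010/', '/about/', '/missing/'), $old, $new);
```

Only /about/ is reported here, because the two stand-ins disagree on it; /missing/ resolves to null in both and so counts as agreement.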

FYI, I'm hosting this conference[1] so I may simply not have enough time to do any more work on this until July but if I can I will.

[1] http://www.thebusinessof.net/wordpress/

Anyway, here is the current code I have for specifying (almost) compatible vanilla WordPress 3.0 URL routing. It will need to change somewhat but currently this is what I have. I envision it will be one of several ways to specify routes, and it will also be able to be used behind-the-scenes for a transitional compatibility layer; i.e. functions like add_rewrite_tag() and add_permastruct() could end up calling these in a transitional compatibility mode. Note that I would envision (prefer?) to see a legacy compatibility mode (which code similar to this could support) and then an optional set of routes that could be used to clean up the complexity of the routing that was required because of how the rewrites had to be implemented.

register_query_var('%*%',array('pattern'=>'.*'));
register_query_var('%?%',array('pattern'=>'.+?'));
register_query_var('%robots%',array('literal'=>true));
register_query_var('%attachment%');
register_query_var('%tb%',array('literal'=>true));
register_query_var('%withcomments%',array('literal'=>true,'pattern'=>'.*wp-commentsrss2.php'));
register_query_var('%feed%', array('pattern'=>'(feed|rdf|rss|rss2|atom)'));
register_query_var('%cpage%', array('pattern'=>'([0-9]{1,})'));
register_query_var('%s%', array('pattern'=>'(.+)','expand'=>true));
register_query_var('%author_name%');
register_query_var('%tag%', array('expand'=>true));
register_query_var('%ssort%', array('expand'=>true));
register_query_var('%pagename%', array('post_type'=>'page','expand'=>true));
register_query_var('%category_name%', array('expand'=>true));
register_query_var('%year%', array('pattern'=>'([0-9]{4})'));
register_query_var('%monthnum%', array('pattern'=>'([0-9]{1,2})'));
register_query_var('%day%', array('pattern'=>'([0-9]{1,2})'));
register_query_var('%paged%', array('pattern'=>'([0-9]{1,})'));
register_query_var('%page%', array('pattern'=>'([0-9]+)?'));
register_query_var('%dummy%', array('pattern'=>'[^/]+'));
register_query_var('%name%', array('post_type'=>'post'));

register_query_literal('robots.txt', array('append'=>'robots=1'));
register_query_literal('trackback', array('append'=>'tb=1'));
register_query_literal('comments', array('append'=>'withcomments=1'));
register_query_literal('wp-atom.php',array('append'=>'feed=atom'));
register_query_literal('wp-commentsrss2.php',array('append'=>'feed=rss2&withcomments=1'));
register_query_literal('wp-feed.php',array('append'=>'feed=feed'));
register_query_literal('wp-rdf.php',array('append'=>'feed=rdf'));
register_query_literal('wp-rss.php',array('append'=>'feed=rss'));
register_query_literal('wp-rss2.php',array('append'=>'feed=rss2'));

register_route_group('%%feed%%',array(
	'feed/%feed%',
	'%feed%'));

register_route_group('%%page%%',array(
  '%page%'));

register_route_group('%%paged%%',array(
  'page/%paged%'));

register_route_group('%%feed_paged%%',array(
  '%%feed%%',
  '%%paged%%'),true);

register_route_group('%%sort_feed_paged%%',array(
  'sort/%ssort%/%%feed_paged%%'),true);

register_route_group('%%comment_page%%',array(
  'comment-page-%cpage%'));

register_route_group('%%trackback_feed_comment_page%%',array(
	'trackback',
	'%%feed%%',
	'%%comment_page%%'),true);

register_route_group('%%trackback_feed_paged_comment_page%%',array(
	'%%trackback_feed_comment_page%%',
	'%%paged%%'),true);

register_route_group('%%trackback_feed_page_paged_comment_page%%',array(
	'%%trackback_feed_comment_page%%',
	'%%paged%%',
	'%%page%%'),true);

register_route_group('%%attachment%%',array(
	'attachment/%attachment%'));

register_route_group('%%attachment_trackback_feed_comment_page%%',array(
	'%%attachment%%/%%trackback_feed_comment_page%%'));

register_route_path('%year%/%monthnum%/%day%/%%feed_paged%%');
register_route_path('%year%/%monthnum%/%day%/%%sort_feed_paged%%');
register_route_path('%year%/%monthnum%/%%sort_feed_paged%%');
register_route_path('%year%/%%sort_feed_paged%%');
register_route_path('author/%author_name%/%%sort_feed_paged%%');
register_route_path('author/%author_name%/%%feed_paged%%');
register_route_path('tag/%tag%/%%sort_feed_paged%%');
register_route_path('category/%category_name%/%%sort_feed_paged%%');
register_route_path('%%sort_feed_paged%%');
register_route_path('robots.txt');
register_route_path('%*%wp-atom.php');
register_route_path('%*%wp-commentsrss2.php');
register_route_path('%*%wp-feed.php');
register_route_path('%*%wp-rdf.php');
register_route_path('%*%wp-rss.php');
register_route_path('%*%wp-rss2.php');
register_route_path('%%feed_paged%%');
register_route_path('comments/%%feed_paged%%');
register_route_path('search/%s%/%%feed_paged%%');
register_route_path('%year%/%monthnum%/%dummy%/%%attachment_trackback_feed_comment_page%%');
register_route_path('%year%/%monthnum%/%dummy%/%name%/%%trackback_feed_paged_comment_page%%');
register_route_path('%year%/%monthnum%/%dummy%/%%trackback_feed_paged_comment_page%%');
register_route_path('%year%/%monthnum%/%%comment_page%%');
register_route_path('%year%/%%comment_page%%');
register_route_path('%?%/%%attachment_trackback_feed_comment_page%%');
register_route_path('%pagename%/%%trackback_feed_page_paged_comment_page%%');

One thing I have yet to figure out is how best to specify optional path segments. I'm loath to introduce new special characters to the URL template, but I am considering using square brackets like so, and would like to get others' input on this:

register_route_path('%pagename%/[%%trackback_feed_page_paged_comment_page%%]');

BTW, in case it is not obvious, anything surrounded by doubled percent signs (e.g. %%foo%%) is a macro that expands to one or more URL path suffixes (which I am currently calling a "route group", but I am open to a better name).

For example, these:

register_route_group('%%page_suffix%%',array(
	'feed/%feed%',
	'%feed%',
	'page/%paged%',
	'page%paged%',
	));

register_route_path('pages/%pagename%/%%page_suffix%%');

would expand to:

register_route_path('pages/%pagename%/feed/%feed%');
register_route_path('pages/%pagename%/%feed%');
register_route_path('pages/%pagename%/page/%paged%');
register_route_path('pages/%pagename%/page%paged%');

Hopefully you can see why it would be helpful to have an optional specifier, i.e.

register_route_path('pages/%pagename%/[%%page_suffix%%]');

could then expand to:

register_route_path('pages/%pagename%/feed/%feed%');
register_route_path('pages/%pagename%/%feed%');
register_route_path('pages/%pagename%/page/%paged%');
register_route_path('pages/%pagename%/page%paged%');
register_route_path('pages/%pagename%/');
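
A minimal sketch of the expansion described above, for illustration only. The WP_Routes code is not posted in this thread, so the function name sketch_expand_route_path() and the plain-array representation of route groups are assumptions. It handles a single trailing macro and does not expand groups nested inside other groups:

```php
<?php
// Hypothetical sketch: not the WP_Routes implementation, which is not shown in this thread.

/**
 * Expand a route path whose last segment is a %%group%% macro,
 * optionally wrapped in [ ] to mark the whole suffix as optional.
 */
function sketch_expand_route_path( $path, array $groups ) {
	// Match "<prefix>/%%name%%" or "<prefix>/[%%name%%]" at the end of the path.
	if ( ! preg_match( '#^(.*/)(\[?)(%%\w+%%)(\]?)$#', $path, $m ) ) {
		return array( $path ); // No trailing macro: nothing to expand.
	}
	list( , $prefix, $open, $macro, $close ) = $m;
	if ( ! isset( $groups[ $macro ] ) ) {
		return array( $path ); // Unknown macro: leave the path untouched.
	}
	$expanded = array();
	foreach ( $groups[ $macro ] as $suffix ) {
		$expanded[] = $prefix . $suffix;
	}
	// Square brackets mean the suffix may be absent altogether.
	if ( '[' === $open && ']' === $close ) {
		$expanded[] = $prefix;
	}
	return $expanded;
}

$groups = array(
	'%%page_suffix%%' => array( 'feed/%feed%', '%feed%', 'page/%paged%', 'page%paged%' ),
);
$paths = sketch_expand_route_path( 'pages/%pagename%/[%%page_suffix%%]', $groups );
```

With the bracketed form, $paths holds the four suffixed paths followed by the bare 'pages/%pagename%/' path, in the same order as the list above.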

#26 @mikeschinkel
14 years ago

BTW, I'm not sharing the code for the WP_Routes class yet because I want to refactor it significantly before I expose it to public scrutiny.

#27 follow-up: @jacobsantos
14 years ago

I think after looking at this, I'm going to go another route that is simpler, in my opinion. This all appears complex, but I do think that it offers a better system than what is in place.

How do you envision the system setting a callback and then calling it? I realize that most of what you are setting up is the routes part and not the full Controller implementation. I do think that specifying the callback as part of the Routes for when the correct one is found works more in line with the other controller implementations.

#28 in reply to: ↑ 27 @mikeschinkel
14 years ago

Replying to jacobsantos:

I think after looking at this, I'm going to go another route that is simpler, in my opinion. This all appears complex, but I do think that it offers a better system than what is in place.

That's fair, but then we are not starting from scratch. This approach would enable almost full flexibility with URL routing while changing as little as possible of what already exists in WordPress. While in a vacuum I would love to see a different system, I think the most pragmatic approach is to honor the structure that already exists and only change what absolutely must be changed. I think there is a much greater chance of getting a workable solution integrated in core if we take such an evolutionary vs. revolutionary approach.

How do you envision the system setting a callback and then calling it? I realize that most of what you are setting up is the routes part and not the full Controller implementation. I do think that specifying the callback as part of the Routes for when the correct one is found works more in line with the other controller implementations.

Yes, you are only seeing the setup component so, respectfully, I don't think you can fully judge it yet, right? :-)

I am aware of your advocacy of an MVC approach similar to Django and what I assume CakePHP, CodeIgniter and Rails use, but I think that would change WordPress so significantly that it is not viable, at least not in one revolutionary step. The approach I'm taking is to maintain the concept of mapping URLs to query vars and letting query vars drive the loading of content.

Yes, I agree that specifying the callback as part of the Routes might be preferable if we didn't have existing themes and plugins that expect the query system to work as is but I think this is the best approach considering where we are right now. Of course I'd love to hear from others on this issue like Andrew Nacin, scribu, hakre, Johnonolan etc. to see their opinion.

BTW, I don't think there is anything that would keep us from extending what I'm doing to enable associating routes with callbacks at a later date, but I'd like to focus on working within the confines of the existing query system first.

One point of note: the innovation of this approach compared to the existing rewrite system is that it inspects path segments instead of full paths. That has the potential to reduce the number of paths that need to be inspected for each page load, as each path segment inspected further reduces the number of additional path segments to inspect, much like a b-tree database index speeds database access compared to sequential record access. Plus, the order of inspection for each segment becomes much less important since accidental matching is less likely, which makes plugins that hook paths more likely to be robust.

Additionally inspecting path segments allows us to mix-and-match at each level, i.e. these paths could exist in performant harmony:

/about/  (post_type="page")
/products/ (category="products")
/about/team/ (list of post_type="person")
/georgia/ (taxonomy of state, term="georgia")
/my-new-car/ (post_type="post")
/atlanta/ (post_type of "geo-region")
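
To make the segment-by-segment idea concrete, here is a small sketch of a tree lookup over some of the paths listed above. It is an illustration under stated assumptions, not the proposed implementation: the function name sketch_match_path(), the 'query' and 'children' keys, and the query vars stored at each node are all hypothetical, and only literal segments are handled:

```php
<?php
// Hypothetical sketch of segment-by-segment lookup; literal segments only.

/**
 * Walk $tree one path segment at a time and return the query vars stored
 * at the final node, or null if any segment has no matching child.
 */
function sketch_match_path( array $tree, $path ) {
	$node = array( 'children' => $tree, 'query' => null );
	foreach ( explode( '/', trim( $path, '/' ) ) as $segment ) {
		if ( ! isset( $node['children'][ $segment ] ) ) {
			return null; // Dead end: only this node's children were examined.
		}
		$node = $node['children'][ $segment ];
	}
	return $node['query'];
}

$tree = array(
	'about'    => array(
		'query'    => array( 'pagename' => 'about' ),
		'children' => array(
			'team' => array(
				'query'    => array( 'post_type' => 'person' ),
				'children' => array(),
			),
		),
	),
	'products' => array(
		'query'    => array( 'category_name' => 'products' ),
		'children' => array(),
	),
);

$team = sketch_match_path( $tree, '/about/team/' );
```

Each step looks only at the children of the node reached so far, so a request for /about/team/ never compares against anything registered under /products/.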

Inspecting path segments also allows us to match via a variety of methods that could be optimized per specific use-case (via Memcached, for example):

  • Regex
  • Array keys or values, i.e. $segments[$segment] or in_array($segment,$segments)
  • Functions, i.e. MyMatchFunc($segment)
  • Expressions, e.g. (is_numeric($segment) and $segment >=1980 and $segment <=2099)
  • Database lookups, e.g. count(new WP_Query("post_type=product&name=ipad"))
  • And so on...
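
The matcher types in the list above could share one entry point. The following is only a sketch: sketch_segment_matches() and its conventions (a string starting with "#" is a regex, an array is a list of allowed values, anything else callable is a function) are assumptions made for illustration, and the database lookup case is left out because it needs a running WordPress install:

```php
<?php
// Hypothetical sketch of per-segment matchers; none of these names come from WordPress.

/**
 * Return true when $segment satisfies $matcher. A matcher is one of:
 *  - a string starting with "#": treated as a regular expression
 *  - an array: the segment must be one of its values
 *    (so array-style callables are not supported in this sketch)
 *  - any other callable: called with the segment, result cast to bool
 */
function sketch_segment_matches( $segment, $matcher ) {
	if ( is_string( $matcher ) && '' !== $matcher && '#' === $matcher[0] ) {
		return 1 === preg_match( $matcher, $segment );
	}
	if ( is_array( $matcher ) ) {
		return in_array( $segment, $matcher, true );
	}
	if ( is_callable( $matcher ) ) {
		return (bool) call_user_func( $matcher, $segment );
	}
	return false;
}

// Expression-style matcher, mirroring the year check from the list above.
$is_year = function ( $segment ) {
	return is_numeric( $segment ) && $segment >= 1980 && $segment <= 2099;
};
// Array-style matcher: a fixed set of taxonomy terms.
$states = array( 'georgia', 'florida' );

$year_ok = sketch_segment_matches( '2010', $is_year );
```

Because each matcher only sees one segment, a cheap test (array lookup) can be used where it suffices and an expensive one (regex, database) only where needed.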

#29 @jacobsantos
14 years ago

I wouldn't worry about the callback at this point, I only wanted to see what your thoughts were on it and if you were going to have it as part of your plan. I think having a better Routes would allow for a controller callback system to be built on top of it for a controller system.

Right now, I guess it is possible to specify a controller component on top of the current system, similar to the way the custom post types work. However, I think all of the current code is spread out into 3 different classes.

Are you working to consolidate the functionality into separate classes and then use the new code back in those classes? Or create new classes from new code?

#30 follow-up: @scribu
14 years ago

BTW, I'm not sharing the code for the WP_Routes class yet because I want to refactor it significantly before I expose it to public scrutiny.

Release early, release often. :)

Mapping routes to query_vars

Yes, I agree that specifying the callback as part of the Routes might be preferable if we didn't have existing themes and plugins that expect the query system to work as is but I think this is the best approach considering where we are right now.

Agreed.

Optional path segments

One thing I have yet to figure out is how best to specify optional path segments. I'm loath to introduce new special characters to the URL template, but I am considering using square brackets like so, and would like to get others' input on this:

register_route_path('%pagename%/[%%trackback_feed_page_paged_comment_page%%]');

That's an acceptable notation, except it gets confusing if you want to specify multiple optional segments:

register_route_path('%a%/[%b%]/[%c%]');

register_route_path('%a%/[%b%][/%c%]');

register_route_path('%a%/[%b%/][%c%]');

Since we're not inspecting full paths anymore, maybe we could register an array instead of a string:

register_route_path(array('%a%', array('%b%', '%c%')));

The first level elements are mandatory, while the second level elements are optional.
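
One possible reading of the array notation, sketched for illustration. The comment does not say whether optional segments are independent or cumulative, so this sketch assumes the cumulative reading (%c% can only appear after %b%); sketch_paths_from_array() is a hypothetical helper, not a proposed API:

```php
<?php
// Hypothetical sketch: assumes optional segments are cumulative, left to right.

/**
 * List the concrete paths accepted by a route given in array notation:
 * top-level strings are mandatory, strings in nested arrays are optional.
 */
function sketch_paths_from_array( array $route ) {
	$mandatory = array();
	$optional  = array();
	foreach ( $route as $element ) {
		if ( is_array( $element ) ) {
			$optional = array_merge( $optional, $element );
		} else {
			$mandatory[] = $element;
		}
	}
	// The shortest accepted path contains only the mandatory segments.
	$paths   = array( implode( '/', $mandatory ) );
	$current = $mandatory;
	// Each optional segment extends the previous path by one level.
	foreach ( $optional as $segment ) {
		$current[] = $segment;
		$paths[]   = implode( '/', $current );
	}
	return $paths;
}

$paths = sketch_paths_from_array( array( '%a%', array( '%b%', '%c%' ) ) );
```

Under that assumption the array form avoids the question of where the slashes and brackets go, since the nesting alone carries the information.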

#31 follow-up: @jacobsantos
14 years ago

Replying to mikeschinkel:

I am aware of your advocacy of an MVC approach similar to Django and what I assume CakePHP, CodeIgniter and Rails use, but I think that would change WordPress so significantly that it is not viable, at least not in one revolutionary step. The approach I'm taking is to maintain the concept of mapping URLs to query vars and letting query vars drive the loading of content.

Not really, the systems like that take a directory controller approach, where you have a single class within a directory. I don't believe WordPress would ever adopt that system, nor do I fully believe that is the best way to do it. Easier yes, but applicable to every system? Probably not.

What I envision for WordPress is more like Zend Framework and what I kind of assumed you were going to implement. Really, I mean the Routes and URI segment code is similar enough that most libraries probably duplicate a lot of the same code. The differences are in the way the controller is loaded and executed.

I'm thinking it might look similar to:

<?php
wp_register_controller($route, $callback);
?>

This might look up the route and add the callback or register the route with the callback.

I do think we are looking at two different problems. My problem is that I know the controller implementation is difficult and needs a lot of work. Your problem is that the routes themselves need a better implementation. Our goals are not conflicting. My goals can be accomplished on top of your implementation and I believe I will most likely focus on supporting both the current system and yours when you develop it.

Yes, I agree that specifying the callback as part of the Routes might be preferable if we didn't have existing themes and plugins that expect the query system to work as is but I think this is the best approach considering where we are right now. Of course I'd love to hear from others on this issue like Andrew Nacin, scribu, hakre, Johnonolan etc. to see their opinion.

I think many implementations piggyback onto the Routes implementation for loading the controller. That doesn't need to be the case here. They could be kept separate, so that the controller simply checks the route and then loads the correct callback. This could be done now; it is just easier when the routes are created for that specific purpose, instead of the way it is now, with WordPress's own routes the main focus and extra routes tacked on as supported, but not quite fully.

One point of note: the innovation of this approach compared to the existing rewrite system is that it inspects path segments instead of full paths.

This will work much like the others, which will lead it towards a controller implementation that makes sense to those who work with the others.

#32 @mikeschinkel
14 years ago

Replying to jacobsantos:

I wouldn't worry about the callback at this point, I only wanted to see what your thoughts were on it and if you were going to have it as part of your plan. I think having a better Routes would allow for a controller callback system to be built on top of it for a controller system.

Cool. And agreed.

Right now, I guess it is possible to specify a controller component on top of the current system, similar to the way the custom post types work. However, I think all of the current code is spread out into 3 different classes.

Yeah, kind of.

Which 3 specific classes are you referring to?

Are you working to consolidate the functionality into separate classes and then use the new code back in those classes? Or create new classes from new code?

Not sure if I understand the question 100%, but I will try to answer it anyway.

My plan for this ticket is to (hopefully) make a drop-in replacement for $wp->parse_request() using the path segment based routes instead of full path regular expression rewrites. I envision doing my best not to affect any other part of WordPress with this route system.

I would like to implement it as a plugin (at least initially), which means I will likely be requesting one or more hooks to enable this. I want to make it as compatible as possible with existing plugins and hooks, and of course fully compatible with core.

I want to enable full fidelity with the external view of a website; IOW a site should be able to switch completely from rewrites to the routing system without changing or having to 301 redirect a single URL. (Of course I'd like to give the site owner a constant definable option to use a cleaner set of URLs without the messy edge cases for those not concerned with backward compatibility, but that's a nice-to-have, not a must-have.)

I think it will be possible to switch to a "compatibility mode" where core could use this system instead of rewrites and behave 100% identically. This could be made possible by having core either back-end the existing functions add_rewrite_tag(), add_permastruct() and others or to bypass them and use methods of specifying URLs like (similar to) those shown above. I anticipate enabling both of these options, most likely selectable via defining a constant in wp-config.php.

I also believe it will be possible in this compatibility mode for most plugins that interact with rewrites to use the rewrite system and thus still transparently support the route system; i.e. if plugins use functions like add_rewrite_tag() and add_permastruct() this should be possible. However, for those functions that do string search-and-replace on the existing rewrite rules, like the hook I showed above, I think it will likely not be possible to have them transparently support the route system.

I'm also envisioning dual usage where rewrites can continue to be used, but if no rewrite matches then the routes system will take over. I see this as how compatibility with most older plugins could be maintained while core and newer/updated plugins use the route system. Additionally, the priority could be reversed, so that the route system is tried first and falls back to the rewrite system, if the user asks for that by defining a constant in wp-config.php.
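
The dual-usage fallback could look roughly like the sketch below. Everything in it is hypothetical: sketch_resolve_request() is not a WordPress function, the two closures merely stand in for the rewrite and route systems, and the $routes_first flag stands in for the wp-config.php constant mentioned above:

```php
<?php
// Hypothetical sketch; nothing here exists in WordPress core.

/**
 * Try two resolvers in priority order and return the first query vars found.
 * Each resolver takes a request path and returns an array of query vars, or null.
 */
function sketch_resolve_request( $path, callable $rewrites, callable $routes, $routes_first = false ) {
	$order = $routes_first ? array( $routes, $rewrites ) : array( $rewrites, $routes );
	foreach ( $order as $resolver ) {
		$query_vars = call_user_func( $resolver, $path );
		if ( null !== $query_vars ) {
			return $query_vars;
		}
	}
	return null; // Neither system matched; the caller would treat this as a 404.
}

// Stand-in for the existing rewrite system: knows one path.
$rewrites = function ( $path ) {
	return 'about' === $path ? array( 'pagename' => 'about' ) : null;
};
// Stand-in for the route system: knows the same path plus one more.
$routes = function ( $path ) {
	$known = array(
		'about'   => array( 'pagename' => 'about', 'via' => 'routes' ),
		'georgia' => array( 'state' => 'georgia' ),
	);
	return isset( $known[ $path ] ) ? $known[ $path ] : null;
};

$compat = sketch_resolve_request( 'about', $rewrites, $routes );
```

In the default order a path known to both systems is answered by rewrites, and the route system only sees what rewrites could not match; passing true as the last argument reverses that.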

Assuming this routes system were adopted into core for v3.1 (for example) I envision that the most compatible mode would initially be used on both new installs and upgrades. Then maybe by v3.3 after most currently maintained plugins have switched to using routes WordPress could enable a "transitional mode" for new installs where the route system takes priority. OTOH upgrades to existing installs would continue to use compatibility mode for v3.3. Later, say maybe v3.5 WordPress could allow upgrades to use routes if upgraded from a version that already supports routes. Eventually, maybe v4.0, rewrites would be gone completely (or maybe not.)

Definable constant selectable modifications to this behavior could include having WordPress inspect plugins for likely incompatibility and offer to switch to transitional compatibility mode if no likely incompatibilities are found. Further there could be a "no compatibility mode" that advanced users could select to ensure that their plugins and themes only use routes and not rewrites. And I'm sure there might even be other "modes" to address concerns I've yet to recognize.

Finally I envision the ability for plugins to register support for routes which could also include plugins registering that they do not use rewrites even if they don't affect URL routing so that routes could be used during an upgrade as soon as all plugins in use support routes.

Does that answer the question?

#33 in reply to: ↑ 30 @mikeschinkel
14 years ago

Replying to scribu:

BTW, I'm not sharing the code for the WP_Routes class yet because I want to refactor it significantly before I expose it to public scrutiny.

Release early, release often. :)

Heh. Point taken.

I will as soon as I think I'll have time to follow up (right now I don't have time to code anything more for a little while, at least.)

Optional path segments

One thing I have yet to figure out is how best to specify optional path segments. I'm loath to introduce new special characters to the URL template, but I am considering using square brackets like so, and would like to get others' input on this:

register_route_path('%pagename%/[%%trackback_feed_page_paged_comment_page%%]');

That's an acceptable notation, except it gets confusing if you want to specify multiple optional segments:

register_route_path('%a%/[%b%]/[%c%]');

register_route_path('%a%/[%b%][/%c%]');

register_route_path('%a%/[%b%/][%c%]');

Since we're not inspecting full paths anymore, maybe we could register an array instead of a string:

register_route_path(array('%a%', array('%b%', '%c%')));

The first level elements are mandatory, while the second level elements are optional.

I like that as an idea, but I would like to allow for specifying via a string too. IOW, it would be conceptually similar to how both of these work the same:

$drafts_query = new WP_Query( "post_type=post&post_status=draft&posts_per_page=1&orderby=modified&order=DESC&author=" . $GLOBALS['current_user']->ID );

$drafts_query = new WP_Query( array(
	'post_type' => 'post',
	'post_status' => 'draft',
	'posts_per_page' => 1,
	'orderby' => 'modified',
	'order' => 'DESC',
	'author' => $GLOBALS['current_user']->ID,
) );

BTW, to state it explicitly, I'm thinking that register_route_path() would define the last path segment and allow the other path segments to be used internally as selectors, i.e. the following:

register_route_path('%foo%/%bar%/[%baz%]');

register_route_path('%foo%/%bar%/%baz%',array(
	'optional' => true,
) );

would both be appropriate ways to indicate that "%baz%" is an optional child of the path "%foo%/%bar%". Internally it would explode() on slashes, navigate and/or build the path tree, and then set (or modify) the properties for the last path segment.

We could even potentially specify it this way as an alternate to the above (i.e. we could support both approaches):

register_route_segment('%baz%',array(
  'parent_path' => array('%foo%','%bar%'),
  'optional' => true,
));
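
The explode-and-build step described above for register_route_path() can be sketched as follows. This is not the WP_Routes code; sketch_register_route_path(), the explicit $tree argument and the 'args' and 'children' keys are assumptions chosen to keep the example self-contained:

```php
<?php
// Hypothetical sketch; the WP_Routes class itself is not shown in this thread.

/**
 * Register $path in $tree, applying $args to the last segment only.
 * A last segment wrapped in [ ] is shorthand for 'optional' => true.
 */
function sketch_register_route_path( array &$tree, $path, array $args = array() ) {
	$segments = explode( '/', trim( $path, '/' ) );
	$last     = count( $segments ) - 1;
	if ( preg_match( '#^\[(.+)\]$#', $segments[ $last ], $m ) ) {
		$segments[ $last ] = $m[1];
		$args['optional']  = true;
	}
	$node = &$tree;
	foreach ( $segments as $i => $segment ) {
		// Navigate to the segment's node, creating it when it does not exist yet.
		if ( ! isset( $node[ $segment ] ) ) {
			$node[ $segment ] = array( 'args' => array(), 'children' => array() );
		}
		// Only the final segment receives (or has modified) its properties.
		if ( $i === $last ) {
			$node[ $segment ]['args'] = array_merge( $node[ $segment ]['args'], $args );
		}
		$node = &$node[ $segment ]['children'];
	}
	unset( $node );
}

$tree_a = array();
sketch_register_route_path( $tree_a, '%foo%/%bar%/[%baz%]' );

$tree_b = array();
sketch_register_route_path( $tree_b, '%foo%/%bar%/%baz%', array( 'optional' => true ) );
```

Both calls produce the same tree: %foo% and %bar% are created as plain intermediate nodes, and only %baz% carries the 'optional' property.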


BTW, I'm wondering if "optional" or "is_optional" is a better name for the property?

Also, if a segment is partially optional (i.e. "foo/%foo%[-%foo_num%]"), is it better to add an "is_partially_optional" property, or to have both "is_var" and "is_optional" set to true to indicate that it is partially optional? (I'm presuming that when routes are pre-processed during plugin activation, details like this can be pre-evaluated so that all that is needed for the normal case is to unserialize() the route tree.)

One more important note: AFTER WORKING ON THIS MUCH OF THE DETAILS OF THE OPENING WRITEUP FOR THIS TICKET HAVE CHANGED AS I HAVE LEARNED WHAT'S ACTUALLY REQUIRED. SO IT IS IMPORTANT IF THIS TICKET INTERESTS YOU FOR YOU TO READ IT ALL.

#34 in reply to: ↑ 31 @mikeschinkel
14 years ago

Replying to jacobsantos:

Replying to mikeschinkel:

I am aware of your advocacy of an MVC approach similar to Django and what I assume CakePHP, CodeIgniter and Rails use, but I think that would change WordPress so significantly that it is not viable, at least not in one revolutionary step. The approach I'm taking is to maintain the concept of mapping URLs to query vars and letting query vars drive the loading of content.

Not really, the systems like that take a directory controller approach, where you have a single class within a directory. I don't believe WordPress would ever adopt that system, nor do I fully believe that is the best way to do it. Easier yes, but applicable to every system? Probably not.

What I envision for WordPress is more like Zend Framework and what I kind of assumed you were going to implement. Really, I mean the Routes and URI segment code is similar enough that most libraries probably duplicate a lot of the same code. The differences are in the way the controller is loaded and executed.

Sounds like you are more familiar with the specifics of those systems than I am. I admittedly don't know anything about how Zend goes about its routing, though I'm sure it would be useful for me to study it just so I'll know. I'll add it to my (ever growing) reading list. :)

I'm thinking it might look similar to:

<?php
wp_register_controller($route, $callback);
?>

This might look up the route and add the callback or register the route with the callback.

While I'm not familiar with the structures of $route and $callback, I presume they are similar in concept to what we are discussing.

It would seem with the structure I'm working on it should be possible to extend to something like this:

register_route_segment('%baz%',array(
  'parent_path' => array('%foo%','%bar%'),
  'optional' => true,
  'callback' => 'baz_with_the_foo_and_the_bar',
));

While I don't think it would be a good idea to pursue this direction for v3.1 I do think we should ensure it is absolutely possible for you to extend it in that manner via hooks or other extensibility options.

I do think we are looking at two different problems. My problem is that I know the controller implementation is difficult and needs a lot of work. Your problem is that the routes themselves need a better implementation. Our goals are not conflicting. My goals can be accomplished on top of your implementation and I believe I will most likely focus on supporting both the current system and yours when you develop it.

I think many implementations piggyback onto the Routes implementation for loading the controller. That doesn't need to be the case here. They could be kept separate, so that the controller simply checks the route and then loads the correct callback. This could be done now; it is just easier when the routes are created for that specific purpose, instead of the way it is now, with WordPress's own routes the main focus and extra routes tacked on as supported, but not quite fully.

Yes, I would agree with that, and glad to hear that you think my work could be something you could build on rather than bypass.

One point of note: the innovation of this approach compared to the existing rewrite system is that it inspects path segments instead of full paths.

This will work much like the others, which will lead it towards a controller implementation that makes sense to those who work with the others.

Awesome!

Glad I'm going in the right direction for a change! (that's a general comment, not aimed at anyone in particular. :)

#35 follow-up: @F J Kaiser
14 years ago

  • Cc 24-7@… added

+1

Sounds like something important is moving on planet wordpress. I'm glad that so many good coders are interested in this (excluding myself from "good coders").

Another benefit i can see (if i fully understood everything) for theme-developers: This would reduce the list of define('PATH'..) in functions.php and identify path-segments (for combining them with $example = get_bloginfo('url') . get_part_of_path "/bla.php"; - note that get_part_of_path is just a custom function i just thought of) on the fly.

So: Thanks a lot for this ticket! I keep reading.

#36 in reply to: ↑ 35 ; follow-up: @mikeschinkel
14 years ago

Replying to F J Kaiser:

+1

Sounds like something important is moving on planet wordpress. I'm glad that so many good coders are interested in this (excluding myself from "good coders").

Thanks so much for the positive comments. It really helps to keep us motivated when we get positive feedback.

Another benefit i can see (if i fully understood everything) for theme-developers: This would reduce the list of define('PATH'..) in functions.php and identify path-segments (for combining them with $example = get_bloginfo('url') . get_part_of_path "/bla.php"; - note that get_part_of_path is just a custom function i just thought of) on the fly.

This sounds interesting but I don't fully understand what you are envisioning. Can you please elaborate with maybe several specific use-case examples?

#37 @mikeschinkel
14 years ago

Here's a link to a wp-hackers discussion that covers more use-cases this ticket is in part aiming to solve.

http://lists.automattic.com/pipermail/wp-hackers/2010-February/030333.html

#39 in reply to: ↑ 36 ; follow-up: @F J Kaiser
14 years ago

Another benefit i can see (if i fully understood everything) for theme-developers: This would reduce the list of define('PATH'..) in functions.php and identify path-segments (for combining them with $example = get_bloginfo('url') . get_part_of_path "/bla.php"; - note that get_part_of_path is just a custom function i just thought of) on the fly.

This sounds interesting but I don't fully understand what you are envisioning. Can you please elaborate with maybe several specific use-case examples?

Just for a short example (i had no time to search my themes completely): A test if $is_login is true:

$url_parts = parse_url( $_SERVER['REQUEST_URI'] );
$path_parts = pathinfo( $url_parts['path'] );
$dir_parts = explode( "/", $path_parts['dirname'] );
$dirname = end( $dir_parts );
$filename = $path_parts['basename'];

$is_login = 'wp-login.php' == $filename;

I hope that helps to give some insights of what i meant. Thanks.

#40 in reply to: ↑ 39 @mikeschinkel
14 years ago

Replying to F J Kaiser:

Ah, thanks. It would actually be easy to create some standard functions like this without replacing the current system, but yes, if we focus on path segments then functions that inspect the path segments would be more obvious. I'll remember that when I work on the functionality (after I complete some projects that are becoming overdue. :)

#41 @aesqe
14 years ago

  • Cc aesqe@… added

#43 @jacobsantos
14 years ago

I think the fundamental issue is that the controller implementation is wrong, and building upon it would be trying to make a turd not a turd, or beautifying the turd to look less like what it is.

I did some work on this, and while I was unable to get the WordPress compatibility working completely, it isn't far off. Looking at it again, the dispatcher needs to have the callback as part of the route in order to be able to call the controller when the route is found.

Check Routes -> Route is Found -> Get Controller -> Dispatch controller.
Or Router -> Dispatcher -> Controller.
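
As an illustration of that flow, with the callback stored as part of the route, here is a minimal sketch. It is not the prototype attached to this ticket; the function names, the regex-based route patterns and the explicit $routes array are assumptions made for the example:

```php
<?php
// Hypothetical sketch of Router -> Dispatcher -> Controller; not the attached prototype.

/** Router side: store a pattern together with the controller callback. */
function sketch_register_controller( array &$routes, $pattern, callable $callback ) {
	$routes[] = array( 'pattern' => $pattern, 'callback' => $callback );
}

/**
 * Dispatcher side: check the routes in order and, when one is found,
 * call its controller with the regex matches. Returns null if none match.
 */
function sketch_dispatch( array $routes, $path ) {
	foreach ( $routes as $route ) {
		if ( preg_match( $route['pattern'], $path, $matches ) ) {
			return call_user_func( $route['callback'], $matches );
		}
	}
	return null;
}

$routes = array();
sketch_register_controller( $routes, '#^training/([^/]+)/?$#', function ( $m ) {
	return 'training:' . $m[1]; // A real controller would load and render content.
} );

$result = sketch_dispatch( $routes, 'training/vmware/' );
```

Because the callback travels with the route, the dispatcher needs no second lookup once a route is found, which is the coupling described above.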

I was hoping to disconnect the router from the dispatcher, but after developing a router implementation I think that would be extremely roundabout and difficult. I will attempt to get the initial draft up within the next few days so you can see what I mean. I should have the dispatcher and WordPress compatibility working in that time.

#44 @jacobsantos
14 years ago

  • Cc jacobsantos added

#45 follow-up: @nacin
14 years ago

  • Milestone changed from 3.1 to Future Release

Until a massive task such as this one is slated for a milestone in a scope meeting, it should carry the Future Release milestone.

It can certainly be suggested at the 3.1 scope meeting, though I doubt we will have the stomach for it, given the possibility that the 3.1 dev cycle will be a unique one.

#46 in reply to: ↑ 45 ; follow-up: @mikeschinkel
14 years ago

Replying to nacin:

Until a massive task such as this one is slated for a milestone in a scope meeting, it should carry the Future Release milestone.

It can certainly be suggested at the 3.1 scope meeting, though I doubt we will have the stomach for it, given the possibility that the 3.1 dev cycle will be a unique one.

Agreed on the timing. This is a long term thing that needs to be gotten right, not rushed through to hit a release cycle -- just look at menus for what happens when the latter is done. :-(

#47 in reply to: ↑ 46 ; follow-up: @nacin
14 years ago

Replying to mikeschinkel:

Agreed on the timing. This is a long term thing that needs to be gotten right, not rushed through to hit a release cycle -- just look at menus for what happens when the latter is done. :-(

This isn't the forum, but I strongly disagree. We were aiming for v1 when we set the scope, and we ended up lapping that three times over. All features need time to fully develop and mature. Custom taxonomies weren't supported until 2.5 even though the schema was in place in 2.3; likewise, we didn't support any UI for them until 2.8, and it was only for non-hierarchical taxonomies. That doesn't mean 2.3 or 2.5 were rushed.

We got menus right. And we left room for expansion and enhancement. That's what we aimed for, and that's how it's done.

#48 in reply to: ↑ 47 @mikeschinkel
14 years ago

Replying to nacin:

This isn't the forum, but I strongly disagree. ... We got menus right. And we left room for expansion and enhancement. That's what we aimed for, and that's how it's done.

Guess I hit a hot button, sorry for that.

But I have to disagree that you got them right. They are good, but not great. I've been doing a lot of work with them and I see a lot of rough edges that may cause plugin compatibility issues later and some design decisions that may paint WordPress into corners it cannot get out of.

But you are correct, this is not the forum. But I do hope that when the right forum appears I can provide detailed constructive feedback on menus and that it will considered so that my providing the feedback won't have been a waste of time.

Please note, I do really want to help, it's not my desire to throw stones.

#49 follow-up: @hakre
14 years ago

Let's just formulate it the other way round: Let's get this feature done in such a way that it doesn't produce more tickets than it was created to close.

I like the idea to start to look for places where to put in filters so to be able to prototype ideas and test them.

#50 in reply to: ↑ 49 @jacobsantos
14 years ago

Replying to hakre:

Let's just formulate it the other way round: Let's get this feature done in such a way that it doesn't produce more tickets than it was created to close.

I like the idea to start to look for places where to put in filters so to be able to prototype ideas and test them.

I don't think that is reasonable. The HTTP API was tested and worked fine for those that tested it, but it wasn't until it got out into the wild that many, many finer points of PHP and PHP bugs came up. Sure, dealing with routing bugs is different from dealing with PHP language bugs and safe mode bugs. It is the general nature of things that it won't be 100% for everyone.

That said, I agree, and I believe I commented on wpdevel, that we could take this in steps: implement non-destructive additions to WordPress first, and then, once plugin developers and themers have tested it and seen that it works, start ripping out parts of WordPress and replacing them with more of the routing system.

For me personally, I believe there are already hooks in place. Well, for my stuff at least, Mike's stuff might need more hooks to function properly.

@jacobsantos
14 years ago

Prototype for router.

#51 follow-up: @jacobsantos
14 years ago

Mike, I'm not going to be working on this. I do wish you luck in your implementation. I was reading through some old wp-hackers discussions on this from last year. I find it difficult to believe that it has been almost a year since everyone discussed it. I believe that when something is continuously brought up, that someone should step up and attempt to correct it.

Well, good luck. I hope you don't become demotivated from working on it. I'm bowing out from this.

If someone can remove my uploaded file, I would like that.

#52 in reply to: ↑ 51 @mikeschinkel
14 years ago

Replying to jacobsantos:

Mike, I'm not going to be working on this. I do wish you luck in your implementation. ... I hope you don't become demotivated from working on it. I'm bowing out from this.

Thanks. I won't get demotivated, just anxious to find a break in my client projects to work on it (I just spent the entire US holiday weekend coding WordPress plugins for clients and still not done; I need a vacation!)

#53 @mikeschinkel
14 years ago

Note that Austin Matzko proposed something very similar to the current status of this ticket back in Jan 2009 on wp-hackers:

http://comox.textdrive.com/pipermail/wp-testers/2009-January/011113.html

#54 follow-up: @hakre
14 years ago

Ref: #14201 - a good example where the problems in a current implementation can lie. Generating of Links, Permalinks-Structure, Permalinks-Parser, WP (the class) and even the canonical redirects are out of synch.

So Router and Link generation are a highly connected pair and must be built on a specification they share.
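
A minimal sketch of what "a specification they share" could mean in practice, assuming a single template string with %name% placeholders from which both the matching regex and the generated link are derived. The function names `spec_to_regex()` and `spec_to_link()` are hypothetical and not part of WordPress:

```php
<?php
// Hypothetical shared specification: one template, used by both sides.
$spec = 'tag/%tag%/page/%paged%';

// Router side: derive a regex with named groups from the spec.
function spec_to_regex( $spec ) {
    $quoted = preg_quote( $spec, '#' );
    $regex  = preg_replace( '/%(\w+)%/', '(?P<$1>[^/]+)', $quoted );
    return '#^' . $regex . '/?$#';
}

// Link side: fill the same placeholders with concrete values.
function spec_to_link( $spec, array $vars ) {
    return preg_replace_callback(
        '/%(\w+)%/',
        function ( $m ) use ( $vars ) {
            return rawurlencode( (string) $vars[ $m[1] ] );
        },
        $spec
    );
}

// Round trip: a generated link is matched by the regex built from the same spec.
$link = spec_to_link( $spec, array( 'tag' => 'php', 'paged' => 2 ) ); // 'tag/php/page/2'
preg_match( spec_to_regex( $spec ), $link, $matches );
```

Because both functions read the same template, a change to the URL structure cannot leave the parser and the link generator out of sync.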

#55 in reply to: ↑ 54 @mikeschinkel
14 years ago

Replying to hakre:

Ref: #14201 - a good example where the problems in a current implementation can lie. Generating of Links, Permalinks-Structure, Permalinks-Parser, WP (the class) and even the canonical redirects are out of synch.

Very interesting, thanks for sharing.

So Router and Link generation are a highly connected pair and must be built on a specification they share.

So how do you think this should affect the direction I was proposing in the latter parts of this ticket?

#57 @sorich87
13 years ago

  • Cc sorich87@… added

#58 @BjornW
13 years ago

  • Cc mailings@… added

#59 @jamie.richard
13 years ago

  • Cc jamie.richard added

#60 follow-up: @scribu
13 years ago

After working with WP_Rewrite even more, I think this ticket's focus is on the wrong thing.

Ordered URL regex patterns are used in many other CMSs/frameworks and are a pretty concise and powerful way of defining rewrite rules, with limitations of course.

But the perceived lack of flexibility of WP_Rewrite isn't due to regexes, but due to the way these patterns are generated and stored.

It's awkward to re-order rules, for example. Also, they're not grouped in any way after they're generated: you can't easily filter out only the taxonomy rewrite rules, or just the rules for a particular post type etc.

So, although there certainly are use cases for parsing individual path segments - this can already be done (one example) - it shouldn't be the main way of defining rewrite paths.

We should focus instead on making it easier to manipulate the existing regex rules.
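
To illustrate the re-ordering problem on a plain "regex => query" array (rules are tried in order and the first match wins), here is a minimal sketch. `promote_rule()` is a hypothetical helper, not an existing WordPress function; in WordPress something like it could presumably be applied through the `rewrite_rules_array` filter:

```php
<?php
// Hypothetical helper: moves one rule to the front of an ordered
// "regex => query" array so that it is tried before all others.
function promote_rule( array $rules, $pattern ) {
    if ( ! isset( $rules[ $pattern ] ) ) {
        return $rules; // Nothing to move.
    }
    $promoted = array( $pattern => $rules[ $pattern ] );
    unset( $rules[ $pattern ] );
    return $promoted + $rules; // Array union keeps the promoted rule first.
}

// The catch-all page rule comes first here and would shadow the tag rule.
$rules = array(
    '(.+?)/?$'       => 'index.php?pagename=$matches[1]',
    'tag/([^/]+)/?$' => 'index.php?tag=$matches[1]',
);

$rules = promote_rule( $rules, 'tag/([^/]+)/?$' );
```

The helper has to know the exact regex string of the rule it moves, which is the awkwardness described above: the rules carry no name or group that could be used to address them.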

#61 @scribu
13 years ago

I'm thinking about an intermediary representation in between permastructs and the final rules array.

A rule would look like this:

array(
  'pattern' => 'regex string',
  'args' => array( 'query_var' => preg_index_nr )
)

Example:

array(
  'pattern' => 'tag/([^/]+)/page/?([0-9]{1,})/?$',
  'args' => array( 'tag' => 1, 'paged' => 2 )
)

which would lead to:

'tag/([^/]+)/page/?([0-9]{1,})/?$' => index.php?tag=$matches[1]&paged=$matches[2]

This doesn't solve the re-ordering problem, but I think it's a good start.
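
A minimal sketch of the conversion step, assuming the rule layout shown above. `compile_rule()` is a hypothetical name used only for illustration:

```php
<?php
// Hypothetical helper: turns one intermediate rule into the final
// "regex => query string" pair used in the rules array.
function compile_rule( array $rule ) {
    $parts = array();
    foreach ( $rule['args'] as $query_var => $index ) {
        // Single quotes keep "$matches[...]" as literal text.
        $parts[] = $query_var . '=$matches[' . $index . ']';
    }
    return array( $rule['pattern'] => 'index.php?' . implode( '&', $parts ) );
}

$rule = array(
    'pattern' => 'tag/([^/]+)/page/?([0-9]{1,})/?$',
    'args'    => array( 'tag' => 1, 'paged' => 2 ),
);

$compiled = compile_rule( $rule );
```

Keeping the query vars as structured data until this last step is what would make it possible to filter or group rules by query var before they are flattened into strings.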

#62 in reply to: ↑ 60 ; follow-up: @mikeschinkel
13 years ago

Replying to scribu:

But the perceived lack of flexibility of WP_Rewrite isn't due to regexes, but due to the way these patterns are generated and stored.

You misquote me. It's not a lack of flexibility of regular expressions; it's the requirement to match each URL in its entirety that makes it very difficult for most people to set URL routes for anything beyond trivial.

Remember the old saw:

"I had a programming problem and decided to use regular expressions. Then I had two problems."

So, although there certainly are use cases for parsing individual path segments - this can already be done (one example) - it shouldn't be the main way of defining rewrite paths.

We should focus instead on making it easier to manipulate the existing regex rules.

Can you please explain the logic behind your conclusion and why it makes sense to continue to flatten a tree structure for matching? Also please address the subsequent duplication required to match similar leaf nodes (like permastructs) rather than to match a tree structure using a tree structure, as proposed?

Also, can you help me understand why you'd advocate maintaining a system that generates and maps upwards of 1000 regular expressions for a complex CMS when with a tree structure the matching would be 1 to 2 orders of magnitude less per page load?

And please address why retaining the complexity of the regular expressions as used in the rewrite system makes sense vs. much more straightforward simple match rules that mere mortals are much more likely to understand (with fallback to RegEx, of course?)
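
A minimal sketch of matching a tree structure with a tree structure, using the services/training example from the ticket description. The node layout ('children', 'wildcard', 'var'), the query var names and `match_path()` are all assumptions made for illustration, not a proposed API:

```php
<?php
// Hypothetical node layout (not a WordPress API):
//   'children' => literal path segment => child node
//   'wildcard' => child node accepting any segment, captured under 'var'
$tree = array(
    'children' => array(
        'services' => array(
            'wildcard' => array( 'var' => 'service' ),
        ),
        'training' => array(
            'wildcard' => array( 'var' => 'course' ),
        ),
    ),
);

// Walks the tree one path segment at a time.
// Returns the captured query vars, or null when no branch matches.
function match_path( array $tree, $path ) {
    $node       = $tree;
    $query_vars = array();
    $trimmed    = trim( $path, '/' );
    $segments   = '' === $trimmed ? array() : explode( '/', $trimmed );

    foreach ( $segments as $segment ) {
        if ( isset( $node['children'][ $segment ] ) ) {
            $node = $node['children'][ $segment ];      // Literal match.
        } elseif ( isset( $node['wildcard'] ) ) {
            $node = $node['wildcard'];                  // Wildcard match.
            $query_vars[ $node['var'] ] = $segment;
        } else {
            return null;                                // No route.
        }
    }
    return $query_vars;
}

$service = match_path( $tree, '/services/vmware/' );
$course  = match_path( $tree, '/training/vmware/' );
```

Each request only visits one node per path segment, so the same "vmware" slug can resolve differently under /services/ and /training/ without either branch knowing about the other.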

#63 in reply to: ↑ 62 ; follow-up: @scribu
13 years ago

Replying to mikeschinkel:

Replying to scribu:

But the perceived lack of flexibility of WP_Rewrite isn't due to regexes, but due to the way these patterns are generated and stored.

You misquote me. It's not a lack of flexibility of regular expressions; it's the requirement to match each URL in its entirety that makes it very difficult for most people to set URL routes for anything beyond trivial.

I didn't quote you to begin with.


Also, can you help me understand why you'd advocate maintaining a system that generates and maps upwards of 1000 regular expressions for a complex CMS when with a tree structure the matching would be 1 to 2 orders of magnitude less per page load?

I advocate maintaining this system for normal use. Complex CMSs can do tree structure matching, as needed.


And please address why retaining the complexity of the regular expressions as used in the rewrite system makes sense vs. much more straightforward simple match rules that mere mortals are much more likely to understand (with fallback to RegEx, of course?)

The system you propose is not actually that simple at all. You just replace the regex engine with PHP code, replacing one common skillset with a very specific one (knowing the tree structure API and algorithm).

Last edited 13 years ago by scribu

#64 @scribu
13 years ago

The evolution has begun: #16687

#65 in reply to: ↑ 63 @mikeschinkel
13 years ago

  • Resolution set to wontfix
  • Status changed from new to closed

Replying to scribu:

I didn't quote you to begin with.

If not, my apologies.

I advocate maintaining this system for normal use. Complex CMSs can do tree structure matching, as needed.

Okay...

The system you propose is not actually that simple at all.

Actually, this ticket needs to be closed, which I will do. It was an exploration and its original emphasis has evolved greatly, so viewed from the beginning it will likely confuse people more than not.

You just replace the regex engine with PHP code, replacing one common skillset with a very specific one (knowing the tree structure API and algorithm).

I disagree that knowledge of regex is a common skillset among the vast majority of WordPress professionals (who are mostly designers.) Yes you know regex well, but you are in a very tiny minority.

Worse, the way regexes are used with WordPress for URL rewriting requires consideration of the entire URL space for a site when matching instead of just considering a path segment. Yes, what I propose would add more complexity into the guts of WordPress, but it would remove complexity for the average person designing sites.

Maybe the problem is that you simply can't visualize what I have in mind, and if so then that's my fault for not communicating it well enough yet. But with this as a base to start from I'll be creating a plugin that will implement what I am visualizing so that you'll be able to evaluate it better than you currently can.

#66 @scribu
13 years ago

  • Milestone Future Release deleted

Ok, I agree that this ticket has become a mess.

#67 @mikeschinkel
13 years ago

Related follow up: #16692
