#5305 closed defect (bug) (fixed)
permalinks broken when article name is numeric
Reported by: | thomask | Owned by: | valendesigns |
---|---|---|---|
Milestone: | 4.3 | Priority: | normal |
Severity: | major | Version: | 2.3.1 |
Component: | Permalinks | Keywords: | has-patch commit |
Focuses: | Cc: |
Description
if you create numeric-only post name, the generated slug is this number - this conflicts with article ID, so it returns different article or 404 page, never the article. It can be then solved by generating manual slug with some char in it, but i think it would be better to include some char in that case, e.g. underscore, like _123
Also if someone will try to solve this, it would be nice to solve other problem - if post slug is begining with the slug of the category, than the category page returns that post, not the category
Attachments (24)
Change History (140)
#2
@
17 years ago
I don't think he's using permalinks. I think he's saying, like if his post slug is 123 and his title is 123, it breaks. It looks like 2.4 trunk doesn't break like that though. Unless it is a specific permalink structure.
#3
@
17 years ago
My structure is /archief/%postname%/, and I see the same problem. I've simply re-slugged the offending posts to the number in words, so 30 would become "thirty" (or in my case, "dertig", as I'm Dutch).
I didn't have this problem when my structure was /archief/year/month/day/postname/ though.
#4
@
17 years ago
Actually, I am able to verify that this does happen. If you have a custom permalink, where you have something like /archives/%postname%/ or /archives/post/%postname%/ and you publish a post with a numeric title such as 2008, when you click to view the post, you get a 404 error. In all my testing, this only occurs with this type of a permalink where you have a static word followed by a forward slash and then the dynamic postname.
If you try /archive/%author%/%postname/ or put any of the other dynamic options between the static word and the dynamic postname, the link will work fine.
I verified this in 2.3.1 and in 2.4 trunk.
#5
@
17 years ago
Also, if you have a custom permalink that is just /%postname%/ you get a 404 error on a post with a numeric title.
As for the other problem that was mentioned concerning if a post slug is beginning with the slug of the category, than the category page returns that post, not the category. I have been unable to recreate it.
#6
@
17 years ago
Just a note, Ryan in comment 25 on ticket #3614 (http://trac.wordpress.org/ticket/3614#comment:25) explains why the 404 errors occur with permalinks like those that I tried (I don't personally use them myself) or that others use. I have yet to test the fix that they have come up with on that ticket to see if it solves the problem here(although, it appears as though you would still need to modify your permalinks structure to not be like those above). When I get a chance I'll test this ticket based on ticket #3614.
#8
follow-up:
↓ 9
@
16 years ago
I've just a had a similar ticket closed as a duplicate, although I'm not 100% sure from the comments here that people have grasped what the problem is.
If you need examples, check the info on http://trac.wordpress.org/ticket/6997
To summarise, I create a page (lets call it "a"). The slug is also "a". I then create another page, give it a slug of "2008" and set "a" to be its parent. When I try to navigate to /a/2008/, it simply shows me the content of "a", rather than "2008". If I add anything else onto the end (ie a/2008/a/) it redirects me to a post written in 2008.
I just seems to be the case that as soon as it sees 2008, it tries to guess what I want, rather than just checking to see if there is actually a page called 2008.
#9
in reply to:
↑ 8
@
16 years ago
By the look of it, Its reconising the 2008 as a page value(ie. The 2008th page):
object(WP_Query)[3] public 'query_vars' => & array 'page' => string '/2008' (length=5) 'pagename' => string 'a' (length=1)
(Which in itself seems to be a bug that /2008 is reconised as the page number, it shouldnt have the slash in there)
The rule which is being matched is:
[(.+?)(/[0-9]+)?/?$] => index.php?pagename=$matches[1]&page=$matches[2]
I can see 2 ways to resolve it:
- Revert to the old method of actually having *every* page slug being checked for, which is slow, messy, and really isnt going to happen
- Limit pages to only have a certain ammount of pages (Why remove functionality?)
- Prefix extra pages with /page/
More on that last one: Most common permalink structures in WordPress allready use that method (.*/page/[0-9]+).
Changing the current permalink settings will break links on pages at present (ie. /a/2/ for page 2) in search results, But would allow for numeric page slugs on subpages.
#11
@
16 years ago
I also had a ticket about this closed and I had no idea it existed already, sorry; as I had no such problems in any version prior to 2.5.1. I'm confused now. How come that the issue occurs in different versions of WP for each person?
#13
@
16 years ago
For anyone looking to do this meanwhile and is willing to give up the paging ability, here's a plugin I wrote:
<?php /* Plugin Name: Allow Numeric Stubs Description: Allows children Pages to have a stub that is only a number. Sacrifices the <code><!--nextpage--></code> ability in Pages. Version: 2008.07.23 Author: Viper007Bond Author URI: http://www.viper007bond.com/ */ // Register plugin hooks register_activation_hook( __FILE__, 'allow_numeric_stubs_activate' ); add_filter( 'page_rewrite_rules', 'allow_numeric_stubs' ); // Force a flush of the rewrite rules when this plugin is activated function allow_numeric_stubs_activate() { global $wp_rewrite; $wp_rewrite->flush_rules(); } // Remove the offending rule and replace it with something else function allow_numeric_stubs( $rules ) { unset( $rules['(.+?)(/[0-9]+)?/?$'] ); $rules['(.+?)?/?$'] = 'index.php?pagename=$matches[1]'; return $rules; } ?>
#14
@
16 years ago
This never happened to me in 2.3.2, but happens in 2.5. I'm having the same problem described by chrismou in comment 8.
Duplicated in #7444.
#15
@
16 years ago
Above code released as a plugin BTW: http://wordpress.org/extend/plugins/allow-numeric-stubs/
#19
@
15 years ago
- Component changed from General to Permalinks
- Owner changed from anonymous to ryan
#22
follow-up:
↓ 23
@
13 years ago
Just stumbled upon this. Steps to reproduce:
- Create a post with slug
1
. - Set permalink structure to
/%year%/%monthnum%/%postname%/
. - Then we have request:
2008/12/1
- Matched rule:
year=2008&monthnum=12&day=1
Other permalink structures (Day and name, Numeric, even /%postname%/
mentioned above) work fine.
#23
in reply to:
↑ 22
;
follow-up:
↓ 24
@
13 years ago
Replying to SergeyBiryukov:
- Create a post with slug
1
.
I don't believe that's possible. WordPress will currently change that to 1-2
as far as I know.
#24
in reply to:
↑ 23
@
13 years ago
Replying to Viper007Bond:
I don't believe that's possible. WordPress will currently change that to
1-2
as far as I know.
That's for pages. It's still possible to create a post with a numeric-only slug.
#28
@
12 years ago
This is the date rewrite rules being generated regardless of the post_permastruct. Because the post permastruct Sergey listed doesn't match any form of endian, this date permastruct wins:
if ( empty($date_endian) ) $date_endian = '%year%/%monthnum%/%day%';
Because of that, day archives are generated which wipe out post permastruct when the slug matches a 1-2 digit integer. Here's an abbreviated list of the rules that are generated:
array ( '([0-9]{4})/([0-9]{1,2})/([0-9]{1,2})/page/?([0-9]{1,})/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]&day=$matches[3]&paged=$matches[4]', '([0-9]{4})/([0-9]{1,2})/([0-9]{1,2})/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]&day=$matches[3]', '([0-9]{4})/([0-9]{1,2})/page/?([0-9]{1,})/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]&paged=$matches[3]', '([0-9]{4})/([0-9]{1,2})/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]', '([0-9]{4})/page/?([0-9]{1,})/?$' => 'index.php?year=$matches[1]&paged=$matches[2]', '([0-9]{4})/?$' => 'index.php?year=$matches[1]', '([0-9]{4})/([0-9]{1,2})/([^/]+)(/[0-9]+)?/?$' => 'index.php?year=$matches[1]&monthnum=$matches[2]&name=$matches[3]&page=$matches[4]', )
Similar issue: #13701
This ticket was mentioned in IRC in #wordpress-dev by SergeyBiryukov. View the logs.
11 years ago
#36
@
11 years ago
Posting the entire #26974 ticket's description for reference:
To reproduce, create a new blog, select that permalink structure, and publish an untitled post. View it. Expected: the post shows up. Actual: the day shows up empty — unless, of course, the post's day matches the post's id.
Minor, but annoying. Temporary fix to disable day links entirely:
add_filter('date_rewrite_rules', function($date_rewrite_rules) { if (get_option('permalink_structure') == '/%year%/%monthnum%/%postname%/') { $date_rewrite_rules = array_filter($date_rewrite_rules, function($rule) { return strpos($rule, '&day=') === false; }); } return $date_rewrite_rules; });
Methinks disabling day links should be the default behavior in WP when that structure is chosen.
#40
@
10 years ago
- Keywords has-patch added; needs-patch removed
Seems to me that the only real solution to this problem is going to be to prefix those numeric slugs with some alpha-dash-text of some sort; otherwise, it will always get confused with any numeric value expected in the permalink, and thus interpreted incorrectly. Prefixing it with something as nondescript as '_' seems like the wrong solution, if for no other reason than _123 just looks weird in a URL.
My solution simply prefixes a numeric slug with 'number-' ( http://example.com/number-123/ ). It is longer, and one could argue that it is equally nondescript, but at least you know it is a number and it does not get confused with expected numeric values in the permalink. This is a blanket patch to the slug generation function, so it applies to all post_types. As requested, this prefix allows any post type to work in any of the standard permalink structures. As a party favor to all those devs out there who love their filters (like me), I also have a filter on 'number-' that only gets called when it is needed. As a fallback to all of this, if 'number-123' is not a nice permalink for a specific person, we obviously still have the ability to manually override the permalink, just not with anything numeric.
The original question also included a secondary issue, where apparently post_name values that were prefixes to a category slug used to force the category page to load instead of the post page. Based on the current permalink class, this sounds like it is already solved and not even possible any more. Even the old category slug - parent page slug collision problem seems solved. This is because categories and tags are all prefixed with a slug now. Matching clashes with that specified slug seems unnecessary, so I assume we are just dropping that second request.
Thus I will attach the patch.
#41
follow-up:
↓ 42
@
10 years ago
number
must be translatable, and then you end up with something like číslo
. That would probably raise new issues. I don’t think this is a robust solution.
#42
in reply to:
↑ 41
@
10 years ago
Replying to toscho:
number
must be translatable, and then you end up with something likečíslo
. That would probably raise new issues. I don’t think this is a robust solution.
You make a good point that number
needs to be translatable. Thus, I have updated the patch to include that in the solution. The second part of your objection is moot however. The slug sanitization process already handles sanitization of translations. If that is really an objection, then the sanitization functions need redesign, though I think you will find they don't. As for the 'robustness' of the solution, it is meant to be simple. If there is something I missed as far as the solution is concerned in the ways of 'robustness', feel free to suggest them, and I will try to work them in. Thanks again for the comment. Anything to make it better.
This ticket was mentioned in IRC in #wordpress-dev by SergeyBiryukov. View the logs.
10 years ago
This ticket was mentioned in Slack in #forums by radices. View the logs.
10 years ago
This ticket was mentioned in Slack in #forums by radices. View the logs.
10 years ago
#46
@
10 years ago
- Keywords needs-patch good-first-bug added; has-patch removed
- Milestone changed from Future Release to 4.2
Let's just prepend or append the number with an abstract character (maybe -
?) and avoid translations.
Bringing it into 4.2 for consideration - needs a patch soon though.
#47
@
10 years ago
- Keywords has-patch added; needs-patch removed
Replying to ericlewis:
Let's just prepend or append the number with an abstract character (maybe
-
?) and avoid translations.
Bringing it into 4.2 for consideration - needs a patch soon though.
The problem with prepending a hyphen alone is that it's technically still numeric and will run a second time when saving which then prepends a double hyphen --
.
I've updated the patch to prepend the value of $post_type
together with a hyphen to avoid translations. So you get something like post-123
or page-123
. As well, I've added proper documentation to the new filter.
The one caveat is we are going to be changing slugs without any notice to the end user. So I created a second patch to allow this feature to be filtered off. That way if you want your slugs unmolested because you use a compatible permalink structure you're not completely without options.
#48
follow-up:
↓ 49
@
10 years ago
I don't like this last patch, as it introduces potential localization problems.
Prepending or appending a hyphen might technically be numeric, but it doesn't get caught by the rewrite regex, and that's the only thing that counts. Imho, it's a better option that trying to introduce some arbitrary string that screams "translate me".
Along similar lines, we could prepend and/or append a colon :
to the post's slug instead of a hyphen, in order to get urls like /2015/01/:23/
or /2015/01/:23:/
. (This is assuming, of course, that sanitize_title wouldn't balk at the input. Whereas it actually might...)
Anyway, my concern is these potential solutions don't really fix the problem. They merely sweep things under the rug in the hopes of not allowing the bug in the future while not addressing the multitudes of sites that exhibit the issue with various duct tape fixes already. That and the other solutions discussed here merely stick to sweeping it under the rug so as to try to avoid it from occurring.
If the latest patch or a variation thereof gets applied, I'm left wondering what to do with the existing posts on my site that led me to report a bug, and I'd gather other reporters in the thread are left asking for more as well.
Be it #22592, #26974, #22592 or #24294, #20185, #26641, the issue seems to invariably breaking down to one of:
/yyyy/mm/slug/
conflicting with/yyyy/mm/dd/
/yyyy/slug/
conflicting with/yyyy/mm/
/slug/
conflicting with/yyyy/
On my own site, I disabled day links. It's not a one-size fits all solution, since for all I know I'd like these day links to get recognized in the event a user tries an arbitrary day link, but it was the right thing to do.
It seems to me that the better solution would be something like this:
- On permalink save (and on the next WP upgrade), test for permalink conflict potential. Suppose a two or four digit slug. Does a day rule catch it? The same for month and year rules. If yes, flag the structure as potentially leading to conflicts (as in year, month, day, or none).
- On the appropriate date archive type where we know conflicts may occur (year, month or day), try treating the offending parameter as if it was a post ID. This is a nearly free query -- we're looking for a single post. If a post is found, serve the expected post. If not, either fallback to serving the date archive as usual, or redirect to something appropriate (e.g. day url to month url, month url to year url, year url to blog url).
The upside of this suggestion is it "just works" from the end-user's point of view. I may be wrong, but I can't imagine a site owner caring the slightest bit about offending date archives having a slightly non-standard behavior if he opted for a permalink structure that basically invites potential conflicts.
#49
in reply to:
↑ 48
;
follow-up:
↓ 54
@
10 years ago
Replying to Denis-de-Bernardy:
On the appropriate date archive type where we know conflicts may occur (year, month or day), try treating the offending parameter as if it was a post ID. This is a nearly free query -- we're looking for a single post. If a post is found, serve the expected post. If not, either fallback to serving the date archive as usual, or redirect to something appropriate (e.g. day url to month url, month url to year url, year url to blog url).
Makes the most sense to me.
#50
@
10 years ago
Yeah. Just to clarify, I meant "as if it was a post slug", rather than as a post ID.
#51
follow-up:
↓ 52
@
10 years ago
How about padding the slug with a leading zero?
I found another bug, not sure if there is a ticket for it.
- Set the permalink structure to
%postname%
. - Create a post with slug
foo-bar
. - Create a page with the same slug
foo-bar
.
Now the URL example.com/foo-bar/ loads the page created in step 2.
Problem lies in wp_unique_post_slug()
which checks for uniqueness only within the post type.
#54
in reply to:
↑ 49
@
10 years ago
- Keywords needs-patch needs-unit-tests added; good-first-bug has-patch removed
Replying to SergeyBiryukov:
Replying to Denis-de-Bernardy:
On the appropriate date archive type where we know conflicts may occur (year, month or day), try treating the offending parameter as if it was a post ID. This is a nearly free query -- we're looking for a single post. If a post is found, serve the expected post. If not, either fallback to serving the date archive as usual, or redirect to something appropriate (e.g. day url to month url, month url to year url, year url to blog url).
Makes the most sense to me.
I'll start looking into a better long term solution, which doesn't force a prepend or append on a numeric slug.
#55
@
10 years ago
- Keywords needs-unit-tests removed
Disregard the last patch it doesn't have full coverage, but is on the right path.
#56
@
10 years ago
- Keywords has-patch added; needs-patch removed
OK, so patch 5305.3.diff should solve this issue. After trying a couple different things I stumbled into WP_Rewrite::get_date_permastruct
where the solution was staring me in the face. We already have some code in place that was preventing %post_id%
collisions. All I did was extend the current implementation to include fixes for the offending %postname%
permalink structures.
#57
@
10 years ago
Patch 5305.4.diff adds a unit test that fails before the fix and passes once it's applied.
#58
@
10 years ago
- Keywords has-patch removed
I stepped away for a minute and quickly realized there were backwards compatibility concerns due to redirect implications with this fix. I also ran all the phpunit tests, not just the post group, and got back failing tests. I'm going to keep at it, but I think we're at least moving in the right direction. I'll keep you guys posted. Sorry for all the chatter. I should have done more testing before uploading a patch.
#59
@
10 years ago
Hooray, I'm confident I have a solution!
However, I want to test it rigorously before I upload it this time. The previous fix was OK in theory, but it didn't work in practice. This time around I created a pre_get_posts
filter that solves the issue for this and all the other associated tickets without changing the permalinks, so no backwards compatibility concerns or broken tests.
I'll do a bit more testing and upload the patch in the next day or so. In the meantime I need to know how to create a unit test post that has an ID above 1k, so I can test for one of the known year
permalink conflicts. Anyone have suggestions or tricks that might be useful in artificially inflating the post ID without actually creating that many posts?
#60
@
10 years ago
- Keywords has-patch needs-testing added
Patch 5305.5.diff is ready for testing! I've included 6 new unit tests and I feel confident that the issue is now solved. I couldn't account for every possible permalink structure, but I feel we now have coverage on the majority of them. As well, this fixes posts without titles, which has also been a long standing issue.
I did have to introduce a new filter, as much as I didn't want to, but we needed a way to extend this functionality to custom post types without having to use the functions I created. Please let me know if there is an outlier permalink structure that you know of that wasn't mentioned in the comments that needs to be accounted for, or you think I did something in a way that doesn't make sense so I can fix it.
Thanks Guys! And thank you @DrewAPicture for making me the owner.
This ticket was mentioned in Slack in #core by valendesigns. View the logs.
10 years ago
#62
@
10 years ago
- Keywords dev-feedback added
It's been a week without any response to the latest 5305.6.diff patch, so I figure it's time to ping this thread and get the conversation going again. Has anyone had time to test the patch or look over the code?
#63
@
10 years ago
valendesigns - Thanks for your work on this. Broadly speaking, I think the strategy here is good: when a date query is detected and the permalink structure contains %postname%
, give precedence to a post matching the postname, rather than the corresponding date archive.
I have some implementation suggestions, some large and some small. See [5305.7.diff].
First, a 'pre_get_posts' callback doesn't seem like the right place to do this. For one thing, while you've given it a priority of 1, it's still possible for plugins to intervene before it runs, which could cause various problems. It's also not clear to me why we would want the proposed to be unhookable. It seems to me that the proper place is in WP_Query::parse_query()
, right after the existing date checks.
I've also simplified (and, I think, generalized) your $token_index
trick. What we really care about is the location of %postname%
, and whether it's preceded directly by %year%
or %month%
or whether it's in the 0th position. My patch tests this more directly.
I also combined the two functions into a single method. By returning from the method whenever we know there's nothing to do, we prevent the nesting that I'm assuming you were trying to avoid.
I simplified <!--nextpage-->
support as well. Most of the necessary work - like the nextpage count - is done for us later on in setup_postdata()
. No need to repeat it here. The only thing required is to translate the date query var into 'page'.
I've left a @todo
comment on the zeroise()
block. There is still a bug here. Month and day URL chunks are parsed properly with or without the leading zero (([0-9]{4})/([0-9]{1,2})/([0-9]{1,2})/page/?([0-9]{1,})/?$'
- note the {1,2}). This means that
2015/2 is a potential clash when post_name=2, in addition to
2015/02 and post_name=02. Does this mean we have to do two calls to
get_page_by_path()`?
#64
@
10 years ago
Forgot to mention in the above that I left out the post_type filter, and hardcoded 'post'. WP doesn't create date-based archives for CPTs out of the box. So I'm not sure that there's a natural situation where you'd have the kind of slug conflict covered in this post. I suppose it'd be possible in cases where a plugin registers its own rewrite rules for these kinds of archives. But in that case, we should probably be doing something more automatic to detect that these post types are subject to the clash in question - maybe examining the rewrite rules or something? Any thoughts are welcome.
#65
@
10 years ago
- Keywords dev-feedback removed
@boonebgorges Thank you for testing my patch. Also, I really like your iteration, it's very clean and concise.
To address using pre_get_posts
, my thinking was doing it that way first would make it easier to test and iterate on. I just wasn't sure of the best place to jump in and take over. You seemed to have found it though.
As for <!--nextpage-->
, this is something that does not act in the same manner as the last patch. For example, say I have a /%year%/%postname%/
permalink structure and I create a post named '02' and I try and view any day archive in February, the post takes over. The way I had it setup was that it took over only if <!--nextpage-->
was being used. And then, only for the pages it uses. So that means if I have 2 pages for the '02' post, then /2015/02/1/
& /2015/02/2/
would match a page, but /2015/02/3/
would pass and the date archive would be shown instead. So even though setup_postdata()
is where the current page count happens, we need to do our own logic to establish the paging boundaries and avoid introducing a regression.
I used zeroise()
in an attempt to normalize input so it returned in a date format with a leading zero. However, after second thought, that logic may be somewhat flawed due to the reasons you mentioned above with '2' & '02' being the same as a date, but not as a post. The real issue is that the 0
is not passed in the query var, so it doesn't match the '02' post name format. I guess the question should be why is the leading zero being stripped away from the query var? We might want to start looking in that direction to avoid a second query and/or using zeroise()
. If that can't be changed, then yes we'll likely need to do a second get_page_by_path()
.
Something to note, is that get_page_by_path()
cannot find posts without titles. It will always return null
in that situation. So in order to resolve that, we need to have the second check in place or update the logic in get_page_by_path()
to allow for checks against ID
and not just post_name
. In either case, we should find a way to address resolving posts without titles where the ID collides with a date.
I realized yesterday that the post_type
filter was not needed when I tested the patch further by creating a very simple CPT. I quickly found out that it was a pointless filter and I had every intention of removing it, but got sidetracked on something else. It would seem you beat me to it. ;)
I only briefly tested your new patch, but I'll put it through its paces and lob back an update on Monday. Overall though, I think we're getting pretty close to having a final solution to this long standing issue.
Thanks again for taking the time to rework the code and give feedback. Cheers!
#66
@
10 years ago
Sorry, I take my comment back about get_page_by_path()
. After further investigation and testing, I was very much wrong on that one. I'm trying to remember why I thought that it would return null
in that situation, I know there was a reason for the extra get_post
call in the resolve function, but for the life of me I can't remember what I was thinking when I wrote it. Disregard that part of my previous response. Though if I remember what edge case I was trying to account for I'll let you know.
#67
@
10 years ago
I've added three new tests and a couple fixes in patch 5305.8.diff. Two of those tests are to demonstrate the post_name=2
vs. post_name=02
issue, and that it's now been fixed. The third test demonstrates the pagination issue I mentioned above. Basically, the test would return true
before the patch is applied and is expected to return true
after the patch is applied. Without the latest changes, is_day
would return false
, which is not the expected behavior. I would have left this all alone until Monday, but I got pulled in when I started testing.
#68
follow-up:
↓ 69
@
10 years ago
Thanks, valendesigns. You're right about nextpage - I hadn't been thinking of the case of invalid page numbers. On that note, I've added a few more tests, and tweaked the logic a bit more. Take the following two cases:
- A post called
02
with two instances of<!--nextpage-->
.example.com/2015/02/05
is not a valid page, so should resolve to the day archive. (Your patch did this.) - A post called
02
with *no* instances of<!--nextpage-->
.example.com/2015/02/05
is not a valid page, so should resolve to the day archive. (Your patch did not do this, but 5305.9.diff does.)
I also reorganized the comments to narrate the flow a bit better. Let me know what you think.
#69
in reply to:
↑ 68
@
10 years ago
@boonebgorges This looks great! Nice catch on the page 1
bug, as well. I think the only thing left to do is have more people test it out. So I'll summarize the patch for those interested in testing.
- Added a total of twelve new tests. Two of those tests should pass and ten should fail before the patch is applied. The two passing tests are:
test_date_slug_collision_should_distinguish_too_high_pagination_from_date()
test_date_slug_collision_should_be_ignored_when_pagination_var_is_present_but_post_does_not_have_multiple_pages()
- Resolved the following date premalink structures for numeric slugs; this includes posts without titles.
/post_name/
/post_name/page/
/year/post_name/
/year/post_name/page/
/year/month/post_name/
- Accounted for
day
andmonthnum
values that are cast to integers, which strips leading zeroes. This allows us to resolve variations inpost_name
, like/2015/2/
and/2015/02/
.
- Added the
page
query var when post pagination is present and valid.
#70
@
10 years ago
- Keywords needs-patch added; has-patch needs-testing removed
Coming back to review this, I think there's a problem with the current approach. By intervening in WP_Query::parse_query()
, we make it impossible to create a WP_Query
object whose date params clash with the permalink of a post with a numeric slug. See test_numeric_slug_permalink_conflicts_should_only_be_resolved_for_the_main_query()
in 5305.10.diff. What we are trying to resolve here is improperly parsed URLs, but solving the problem in WP_Query
means that we're "fixing" queries that didn't originate in URLs at all.
I see two possible ways forward:
- Move our proposed disambiguation logic to somewhere around the 'request' filter or 'parse_request' action. The internals will have to change a bit.
- Keep it in
WP_Query
, but only run it ifis_main_query()
.
2 is easier, but I don't think it'll be reliable. It won't fix url_to_postid()
, for instance. It looks like 1 could possibly fit into parse_request()
after non-public post_types are filtered out - the disambiguation is a similar kind of sanity check.
valendesigns, do you have thoughts about this?
#71
@
10 years ago
@boonebgorges I originally had is_main_query()
in the code about 5 patches ago, but I removed it before uploading and after finding that is_date
worked reasonably well to keep our code out of other queries. Though, as you've pointed out we are now blocking a valid date query altogether. So yes, I agree we need to move it or rethink the implementation. I'm working on some ideas to see if it's possible to keep it where it is, but I'm not all that optimistic.
#72
@
10 years ago
- Keywords has-patch added; needs-patch removed
@boonebgorges I've moved the function, and how & where it's called. It now directly modifies the query vars before WP_Query
is called. As well, I've added additional tests. Please take a look at 5305.11.diff and let me know what you think.
#73
@
10 years ago
Getting closer :)
After some conversation with valendesigns, I'm going to propose that we take a two-pronged strategy. See 5305.12.diff.
- During
parse_request()
, if we detect that a URL looks like a date query, look for conflicts with posts that have numeric slugs. (This is what 11.diff already did.) - Simply doing (1) is a potentially incomplete solution, because it fixes the original bug - not being able to access a post '28' located at
/2015/02/28/
- at the expense of creating another, lesser bug - not being able to access the date archive at/2015/02/28/
. So we should also attempt to prevent the creation of posts with this kind of slug. 12.diff attempts a solution that is more modest, in two ways, than what has been suggested in previous discussion on this ticket.- It only affects the auto-generated slug in
get_sample_permalink()
. If you want to override this by manually entering a conflicting slug (or by creating posts programatically), you may. - It only enforces uniqueness (using the
-2
etc formula used inwp_unique_post_slug()
) if it's detected that the suggested post would actually create a date conflict. We look at the current permalink structure, and verify that the autogeneratedpost_name
would be a valid value for year/monthnum/day, as appropriate. So the suffixing will only happen in very rare cases, instead of every time thepost_name
is numeric.
- It only affects the auto-generated slug in
Feedback welcome. I'd especially like to know what people think about the get_sample_permalink()
strategy - is it too lax? Should we be enforcing this in wp_unique_post_slug()
?
#74
@
10 years ago
@boonebgorges Sorry for the delay, I was out of town the last couple of days. This looks great! Though, I haven't tested it yet. I will tonight and get back to you. As far as the get_sample_permalink()
strategy, my first thoughts are that it's good to be somewhere in the middle when enforcing unique slugs. Especially since we have a solid two prong approach that covers both sides of the fence without being overly aggressive.
This ticket was mentioned in Slack in #core by boone. View the logs.
10 years ago
This ticket was mentioned in Slack in #core by drew. View the logs.
10 years ago
#77
follow-up:
↓ 78
@
10 years ago
- Keywords 4.3-early added
- Milestone changed from 4.2 to Future Release
Given the late date, I don't feel comfortable pulling the trigger on this for 4.2. Let's do it first thing for 4.3.
#78
in reply to:
↑ 77
;
follow-up:
↓ 79
@
10 years ago
Replying to boonebgorges:
Given the late date, I don't feel comfortable pulling the trigger on this for 4.2. Let's do it first thing for 4.3.
That's really unfortunate to hear. I assumed all that was left to do was get additional dev feedback. The last patch looked complete, and with all the unit tests this seemed like it would make it into 4.2 for sure. :(
#79
in reply to:
↑ 78
@
10 years ago
Replying to valendesigns:
Replying to boonebgorges:
Given the late date, I don't feel comfortable pulling the trigger on this for 4.2. Let's do it first thing for 4.3.
That's really unfortunate to hear. I assumed all that was left to do was get additional dev feedback. The last patch looked complete, and with all the unit tests this seemed like it would make it into 4.2 for sure. :(
The hard work will not be in vain. It's just that this ticket has been open for many years, with many of the Great Minds of WordPress trying unsuccessfully to solve the problem, so I want to be sure we have adequate time for feedback pre- and post-commit. Thanks for your patience.
This ticket was mentioned in Slack in #core by valendesigns. View the logs.
9 years ago
#83
@
9 years ago
valendesigns - I'm glad we let this percolate for 4.3, because in revisiting it, I found some problems :) I've run out of time to work on the patch today, so I wanted to throw it up here for you to chew on.
5305.15.diff proposes a couple of changes:
- Force 'wp_unique_post_slug_is_bad_[hierarchical|flat]_slug' in
get_sample_permalink()
. This is necessary, because when the existing$post
is an auto-draft, thepost_name
in the database will be empty, which means that the slug won't be suffixed. I'm not a huge fan of the add/remove filter trick, but it's done elsewhere in core, and it's far simpler than any other solution I could come up with.
- I changed the "resolve" function name to avoid the underscore prefix.
Unfortunately, I still don't think we're ready to move on this. A couple outstanding issues that I want to throw out for consideration:
- The
get_sample_permalink()
suffixing is too aggressive. On a typical WP installation, it's only 'post' objects that will have URLs that clash with date archives. (Objects from hierarchical post types are prevented from having numeric slugs; see #11917 [13424].) So perhaps we should only be massaging the suggested permalink if the post_type is 'post'. But are there cases where a CPT might have a conflicting slug? I don't think so, but it needs some thought.
- I'm having second thoughts about the
get_sample_permalink()
strategy. Hierarchical post type objects are prevented from having numeric slugs at all; see #11917 [13424]. I don't think we have to go that far, but perhaps we should put our date-clash enforcement at the level ofwp_unique_post_slug()
, if only for consistency. This will mean rewriting a number of our tests, since it'll no longer be possible to create fixtures with date-clash slugs through the unit test factory.
#84
@
9 years ago
@boonebgorges Sorry for not responding back sooner, I'll have some dedicated time this weekend to focus on this issue. But I wanted to comment that I do not think there is an issue here with CPT as a conflict cannot exists with these permalink structures. Correct me if I'm wrong here, but to me there is only an issue with posts because without the post_type
query var it cannot route to a CPT. So a quick check for post
sounds like a good solution to me.
#86
@
9 years ago
Back to the drawing board with 5305.16.diff. I've moved the check into wp_unique_post_slug()
. This avoids weird issues with post drafts, and more closely parallels the similar pagination checks in [13424].
16.diff does not contain the rewrite/query parsing. I'm of two minds about this. On the one hand, forcing uniqueness in wp_unique_post_slug()
means that slug conflicts won't happen under normal circumstances, so it's not worth the additional overhead that wp_resolve_numeric_slug_conflicts()
introduces. On the other hand, post/archive URL conflicts may still arise in the following cases: (a) legacy content, (b) changed permalink settings, (c) manual manipulation (direct SQL, filtering 'wp_unique_post_slug'
, etc). I could go either way here. valendesigns, what do you think?
#87
@
9 years ago
Well 16 does remove a lot of potential overhead, so I am with you on that. Though, as you mentioned it has drawbacks with legacy slugs. Which I think is potentially the biggest issue we face, as the point of this ticket is really to fix legacy slugs.
Yes, 16 fixes future collisions, but if we go this route we're only half fixing the issue. I feel like wp_resolve_numeric_slug_conflicts
should remain intact and be ran when there is a 404 or something along those lines. There needs to be a way to prevent the collision for legacy slugs for this really to be a complete fix. We need to limit the use of the function to specific scenarios so we don't introduce unnecessary overhead. Is that even possible or are we chasing a unicorn here?
#88
@
9 years ago
Which I think is potentially the biggest issue we face, as the point of this ticket is really to fix legacy slugs.
I disagree - the point of the ticket is to fix numeric slug conflicts. At any given time, all existing conflicts are "legacy", but preventing future instances of the problem is still a Good Thing.
Anyway, I was really just throwing 16 out there to see what you thought. Let's go with something like 5305.17.diff, which includes the rewrite-level conflict resolution. That patch also includes some enhancements to the unit tests so that they work properly with the new wp_unique_post_slug()
changes, as well as new logic that prevents a URL like /2013/05/25
from resolving to post_name=25
if that post was not published in May 2013 (phew).
We need to limit the use of the function to specific scenarios so we don't introduce unnecessary overhead.
I think 17.diff is as focused as we're going to get. We can't run on 404, because the problem here is not 404s - the problem is that certain URLs resolve to date archives when they should resolve to single posts. So we only run the additional regex and database query when WP has parsed the request as a date archive. Anything more focused than this is probably a unicorn.
I think this is about ready to go, but I'd appreciate a last look.
#91
follow-up:
↓ 94
@
9 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
I tested it by creating a post without title which supposed to get an id=3862.
Permalink setting was %postname%
It converted the permalink to example.com/3862-2, but url redirects to 404 still.
#92
follow-up:
↓ 93
@
9 years ago
Shouldn't a permalink structure have slashes? Like, /%postname%/
for example.
#93
in reply to:
↑ 92
@
9 years ago
Replying to valendesigns:
Shouldn't a permalink structure have slashes? Like,
/%postname%/
for example.
Yes, I meant /%postname%/ in the first place. Sorry for sloppy writing.
5305.17.diff is not working on "creating a post without title" case while using /%postname%/ structure. Other information that may help: I tested 5305.9.diff, that was working.
#94
in reply to:
↑ 91
@
9 years ago
Replying to hkucukali:
I tested it by creating a post without title which supposed to get an id=3862.
Permalink setting was %postname%
It converted the permalink to example.com/3862-2, but url redirects to 404 still.
Just tested on trunk (rev 33152):
- Dashboard > Settings > Permalinks - changed permalink structure to
/%postname%/
and hit Save Changes. - Posts > Add New
- Created a post with no title and hit Publish
- A post is created with id 67.
- "Permalink" is of the form example.com/67-2/.
- Visiting example.com/67-2/ *shows the post as expected*, not a 404.
hkucukali - Can you verify that these are the steps you've taken? And can you also verify that you are testing against trunk, and not just a patched version of 4.2.x? It's possible that you made a mistake when backporting.
Can anyone else verify one way or another whether posts with no titles are working correctly?
#95
@
9 years ago
Exactly same steps, but at last it goes 404.
I did not test against trunk, just patched 4.2.2. Sorry for inconvenience, I wanted to notify if it helps little bit.
#96
@
9 years ago
hkucukali - Thanks very much for the update. valendesigns, would you mind firing up trunk to test the steps listed above? If you can confirm it's working, we'll chalk hkucukali's problems up to a bad backport to 4.2.
#97
follow-up:
↓ 99
@
9 years ago
- Keywords needs-patch added; has-patch removed
I get a 404 now too without the title. Though, all my old test posts that don't have titles and the -2
in them still work without returning a 404. Whats worse is that my slugs for the old posts are being forced updated to -2
structure if I save, which would break incoming links. I kind of wish we had gone with an earlier more backwards compatible patch that allowed the URL to be whatever we want them to be and not append junk to the end of the slug.
#99
in reply to:
↑ 97
@
9 years ago
Replying to valendesigns:
I kind of wish we had gone with an earlier more backwards compatible patch that allowed the URL to be whatever we want them to be and not append junk to the end of the slug.
Oh pshaw. If it were easy, it would've been done years ago :)
I did manage to reproduce the issue. Two things in 5305.18.diff:
- When no post_name is provided to
wp_insert_post()
, it falls back onsanitize_title( $post_title, $post_ID )
. The$post_ID
fallback hasn't needed a unique check in the past, since post IDs are unique. But now thatwp_unique_post_slug()
is doing a bit more work, we need to run the fallback title through it. (I didn't test, but it's likely that there is an existing bug here that's unrelated to numeric slugs: if you have a existing post with slug 'foo', and then try to create a post like thiswp_insert_post( array( 'post_name' => '', 'post_title' => 'Foo' ) )
, you'd probably get some weirdness. The unique check proposed here will fix that bonus bug.) - As valendesigns notes,
wp_unique_post_slug()
has no business suggesting a new unique slug for an existing post under normal circumstances. I've added a check towp_unique_post_slug()
that ensures that the numeric-slug check is only performed when the slug being checked is different from the one that the post already has.
It looks like this fixes the workflows that we've discussed in the last couple comments. valendesigns and hkucukali, could you put 5305.18.diff through its paces?
#101
@
9 years ago
- Keywords has-patch commit added; needs-patch removed
Looks good! No more 404's on my end, all the old test posts work, and new posts only get the extra bits when they are going to collide with a date.
#107
@
9 years ago
- Resolution fixed deleted
- Status changed from closed to reopened
Reopening as r33262 breaks bbPress, bbPress replies are no longer created with a unique GUID:
Related: #bbPress2844
This ticket was mentioned in Slack in #core by obenland. View the logs.
9 years ago
#109
@
9 years ago
The bbPress ticket cited above is not strictly caused by the changes in [33262], but [33262] did expose an ancient bug in wp_insert_post()
.
In cases where a post has an empty post_name
- generally this happens when post_title
is empty - wp_insert_post()
generates a post_name
from the post's ID
. (See eg [33261].) The function then directly updates the database using $wpdb->update()
. https://core.trac.wordpress.org/browser/tags/4.2.4/src/wp-includes/post.php#L3407
A few lines later, wp_insert_post()
generates a guid for new posts, using get_permalink()
. https://core.trac.wordpress.org/browser/tags/4.2.4/src/wp-includes/post.php?marks=3438#L3434
The problem is that get_permalink()
generates its return value based on a *cached* post, if available. Prior to [33262], the cache for the newly created post would always be empty at this point, which means that the fresh data would always be pulled from the database. But the get_post()
call introduced in [33262] means that the cache is populated before the guid is determined, meaning that get_permalink()
is looking at stale data. (wp_insert_post()
does clear the post cache, but only later in the function.)
The solution is to clear the post cache immediately after the database update. See 5305.20.diff.
#110
@
9 years ago
Boone, thanks for this, confirming 5305.20.diff fixes the issue for bbPress :)
#111
@
9 years ago
5305.20.diff looks good to me.
thomask, can you include your Permalink Structure? And provide specific examples? This could help duplicate the problem.