Ticket #15397 (new enhancement)

Opened 19 months ago

Last modified 16 months ago

redirect_guess_404_permalink() purposely doesn't guess posts with updated dates

Reported by: archon810
Owned by:
Priority: lowest
Milestone: Future Release
Component: Canonical
Version:
Severity: normal
Keywords: has-patch dev-feedback
Cc: admin@…

Description

Problem

Here's my post path scheme: http://site.com/YEAR/MONTH/DAY/SLUG. Whenever I have writers working on a post for a while and saving drafts (we're all using Windows Live Writer), they often publish with the date on which the last draft was saved, i.e. several days in the past. They then quickly correct the date, but the previously tweeted/shared link now returns a 404 because the date in the URL has changed.

I've looked into the source of redirect_guess_404_permalink(), and it purposely narrows the query down to the requested date whenever the URL contains one. If I understand correctly, this is done to minimize accidental redirects to the wrong post, but it has the side effect of not guessing the new link when only the date was changed.

A workaround of removing these lines:

  if ( get_query_var('year') )
    $where .= $wpdb->prepare(" AND YEAR(post_date) = %d", get_query_var('year'));
  if ( get_query_var('monthnum') )
    $where .= $wpdb->prepare(" AND MONTH(post_date) = %d", get_query_var('monthnum'));
  if ( get_query_var('day') )
    $where .= $wpdb->prepare(" AND DAYOFMONTH(post_date) = %d", get_query_var('day'));

fixes the problem for me.

Can this case be solved in the trunk and the code above removed, or logic improved? My .htaccess file is filled with 301 redirects to correct wrong dates.

Thank you.

Attachments

15397.diff (2.5 KB) - added by solarissmoke 16 months ago.

Change History

  • Cc admin@… added
  • Keywords 2nd-opinion added
  • Priority changed from normal to lowest
  • Type changed from defect (bug) to enhancement
  • Milestone changed from Awaiting Review to Future Release

Seems to me we'll have to store date changes as well. Otherwise we'll be making far too liberal redirects, which this is designed to prevent.

Sounds like an acceptable workaround, similar to how _wp_old_slug works for slugs. Alternatively, maybe add a hook into this function, or provide an option we can set in order to disable date checking.
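To make the "store date changes" idea concrete, here is a minimal sketch in plain PHP. It is not core code and not the attached patch: the `$posts` array, the `old_dates` field, and the function name `find_post_by_old_date()` are all hypothetical stand-ins for whatever storage (for example a post meta entry analogous to _wp_old_slug) an implementation might choose.

```php
<?php
// Hypothetical in-memory stand-in for post records. In WordPress this data
// would live in the database; 'old_dates' is an assumed field holding the
// dates a post was previously published under.
$posts = [
    ['ID' => 1, 'post_name' => 'my-post', 'post_date' => '2010-05-07',
     'old_dates' => ['2010-05-04']],
    ['ID' => 2, 'post_name' => 'my-post', 'post_date' => '2004-01-15',
     'old_dates' => []],
];

/**
 * Return the ID of the post whose slug matches and whose current or
 * previously recorded date equals the requested date, or null if none does.
 */
function find_post_by_old_date(array $posts, string $slug, string $date): ?int
{
    foreach ($posts as $post) {
        if ($post['post_name'] !== $slug) {
            continue;
        }
        if ($post['post_date'] === $date
            || in_array($date, $post['old_dates'], true)) {
            return $post['ID'];
        }
    }
    return null;
}
```

With this approach a request for 2010/05/04/my-post would resolve to post 1 only because that exact date was recorded for it, so an unrelated post that happens to share the slug is never picked up.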

Right now, my core is edited (and it's worked really well), but I hate to see my core files with modifications.

Thanks.

I would love to see this function instead hooked in. I was asked a few weeks ago if there was a way to disable exactly this (but keep the rest of canonical), and it required a goofy hack.

comment:5 follow-up: ↓ 6   dd32 17 months ago

Removing the specific months/days from the query has been mentioned before, and IMO it'd be nice if it could catch cases like that as well.

But it gets to a point where it may make the redirection more liberal than preferred. As soon as I saw this ticket, I thought that perhaps we should be returning a Search Results page listing the possible locations which may have been intended.

Another option is to attempt redirection with the date, and progressively remove refinements until a single post is returned. (i.e. 2010/05/04/postname doesn't return anything: does 2010 + 05 + 04 + postname return? OK, how about 2010 + 05 + postname, 2010 + postname? etc.)
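The progressive approach could be sketched as follows in plain PHP. This is only an illustration under assumptions: `$posts` is a hypothetical list of records with `ID`, `post_name`, and a `post_date` in `Y-m-d` form, and `guess_by_relaxing_date()` is a made-up name. Stopping as soon as a level is ambiguous is a choice made for this sketch, since dropping further date parts can only add matches.

```php
<?php
/**
 * Try the strictest date match first, then drop the day, the month, and the
 * year in turn, returning the post ID at the first level that yields exactly
 * one post. Returns null when nothing matches or the match is ambiguous.
 */
function guess_by_relaxing_date(array $posts, string $slug, int $year, int $month, int $day): ?int
{
    $levels = [
        sprintf('%04d-%02d-%02d', $year, $month, $day), // year + month + day
        sprintf('%04d-%02d', $year, $month),            // year + month
        sprintf('%04d', $year),                         // year only
        '',                                             // slug only
    ];

    foreach ($levels as $prefix) {
        $matches = array_filter($posts, function ($post) use ($slug, $prefix) {
            return $post['post_name'] === $slug
                && ($prefix === '' || strpos($post['post_date'], $prefix) === 0);
        });

        if (count($matches) === 1) {
            return reset($matches)['ID'];
        }
        if (count($matches) > 1) {
            // Ambiguous: a looser filter would match at least these posts too.
            return null;
        }
    }

    return null;
}
```

An ambiguous result is where the Search Results page suggested above could take over instead of a redirect.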

comment:6 in reply to: ↑ 5   archon810 17 months ago

Replying to dd32:

> Removing the specific months/days from the query has been mentioned before, and IMO it'd be nice if it could catch cases like that as well.
>
> But it gets to a point where it may make the redirection more liberal than preferred. As soon as I saw this ticket, I thought that perhaps we should be returning a Search Results page listing the possible locations which may have been intended.
>
> Another option is to attempt redirection with the date, and progressively remove refinements until a single post is returned. (i.e. 2010/05/04/postname doesn't return anything: does 2010 + 05 + 04 + postname return? OK, how about 2010 + 05 + postname, 2010 + postname? etc.)

A search page for when the # of results is > 1 sounds acceptable, and so does the latter solution, although my concern is that it might result in more queries (which, in this case, is actually not bad - it would only result in 1 more query if there are no results by the time you get to the date filter).

  • Keywords has-patch dev-feedback added; 2nd-opinion removed

Attached is a patch that tries to make the function more lenient without being too liberal:

If only one match for post name is found, it returns that without querying dates. Otherwise it tries to find the best match out of several by factoring in dates, incrementally by year, month, and day. There has to be a clear winner on the date match otherwise it returns false.
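One way to read that description is sketched below in plain PHP. This is an interpretation, not the contents of 15397.diff: the `$posts` record layout and the names `date_match_score()` and `guess_clear_winner()` are assumptions made for illustration.

```php
<?php
/**
 * Score a candidate by how many leading date parts agree with the request:
 * 0 = year differs, 1 = year matches, 2 = year and month, 3 = full date.
 */
function date_match_score(string $postDate, int $year, int $month, int $day): int
{
    [$y, $m, $d] = array_map('intval', explode('-', $postDate));
    if ($y !== $year) {
        return 0;
    }
    if ($m !== $month) {
        return 1;
    }
    if ($d !== $day) {
        return 2;
    }
    return 3;
}

/**
 * Return the ID of the single post with this slug, or of the candidate whose
 * date is strictly the best match; return false when there is no clear winner.
 */
function guess_clear_winner(array $posts, string $slug, int $year, int $month, int $day)
{
    $candidates = array_values(array_filter($posts, function ($post) use ($slug) {
        return $post['post_name'] === $slug;
    }));

    if (count($candidates) === 0) {
        return false;
    }
    if (count($candidates) === 1) {
        return $candidates[0]['ID']; // unique slug: dates are not consulted
    }

    $best = -1;
    $bestId = false;
    $tied = false;
    foreach ($candidates as $post) {
        $score = date_match_score($post['post_date'], $year, $month, $day);
        if ($score > $best) {
            $best = $score;
            $bestId = $post['ID'];
            $tied = false;
        } elseif ($score === $best) {
            $tied = true; // another candidate matches equally well
        }
    }

    return $tied ? false : $bestId;
}
```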

comment:8 follow-up: ↓ 9   markjaquith 16 months ago

Good start. What about limiting the number of days that it can drift? I mean, what happens if you delete a post from 2004 with a slug of "my-post" and then subsequently you create a post in 2011 with a slug of "my-post" — you wouldn't want the 2004 URL to now redirect to the 2011 URL. When people change dates on posts, I'd wager that 90% of them change it plus or minus one day. We could restrict the changes to plus or minus 5 days (or so) and probably cover the majority of legitimate changes, without risking false redirection scenarios.
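A drift limit like this could be checked with PHP's built-in date classes. The sketch below is an assumption-laden illustration, not proposed core code: `within_drift()` is a hypothetical helper, and the default of 5 days simply mirrors the figure floated above.

```php
<?php
/**
 * Return true when the requested date and the post's actual date are no more
 * than $maxDays apart, in either direction. Dates are 'Y-m-d' strings.
 */
function within_drift(string $requestedDate, string $postDate, int $maxDays = 5): bool
{
    $requested = new DateTimeImmutable($requestedDate);
    $actual    = new DateTimeImmutable($postDate);

    // DateInterval::$days is the absolute number of whole days between the two.
    return $requested->diff($actual)->days <= $maxDays;
}
```

In the 2004 versus 2011 scenario the two dates are years apart, so the check fails and no redirect would be attempted, while a post moved from the 4th to the 7th of the same month would still be found.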

comment:9 in reply to: ↑ 8   solarissmoke 16 months ago

Replying to markjaquith:

> Good start. What about limiting the number of days that it can drift?

It would only be possible when the permalink structure is /%year%/%monthnum%/%day%/%postname%/, which is only one of many possible structures. So if the structure is /%year%/%postname%/ then we can only match the year.
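To illustrate the point, a small plain-PHP sketch can report which date parts a given structure exposes. `date_tags_in_structure()` is a hypothetical helper written for this comment; only the structure tags themselves come from WordPress.

```php
<?php
/**
 * Return the date-related structure tags present in a permalink structure,
 * in year, month, day order. Only these parts can be compared against a post.
 */
function date_tags_in_structure(string $structure): array
{
    $available = [];
    foreach (['%year%', '%monthnum%', '%day%'] as $tag) {
        if (strpos($structure, $tag) !== false) {
            $available[] = $tag;
        }
    }
    return $available;
}
```

A day-based drift limit only makes sense when all three tags are present; with /%year%/%postname%/ the coarsest and only comparison available is the year.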

@mark Unless you make the time period configurable, your solution is highly subjective. Writers can be working on a piece for days, weeks, or even months and then the final URL published by accident. Similarly, a post may be updated and its date changed so that it jumps back onto the front page - thus the date difference could be anything.
