Make WordPress Core

Opened 13 years ago

Closed 4 years ago

Last modified 4 years ago

#16557 closed defect (bug) (fixed)

Ability to disable redirect_guess_404_permalink()

Reported by: msafi's profile msafi Owned by: audrasjb's profile audrasjb
Milestone: 5.5 Priority: normal
Severity: normal Version: 3.1
Component: Canonical Keywords: has-patch has-unit-tests commit has-dev-note
Focuses: Cc:

Description

Can you make redirect_guess_404_permalink() pluggable or have its return value pass-through a filter so that developers can override it?

I know I can remove_filter('template_redirect', 'redirect_canonical') but redirect_canonical is too useful to be disabled. Only disabling URL guessing would be great.

Thanks a lot,

MK

Attachments (6)

filters to tweak permalink guessing.diff (1.8 KB) - added by simonwheatley 13 years ago.
One possibility to allow filtering the where clauses for posts and pages
new filter to replace redirect_guess_404_permalink.diff (705 bytes) - added by simonwheatley 13 years ago.
New filter to replace redirect_guess_404_permalink
16557.diff (1.6 KB) - added by simonwheatley 9 years ago.
Refresh patch, rename filter, add test
16557.2.diff (2.4 KB) - added by DrewAPicture 9 years ago.
Refresh + hook docs
16557.short-circuit-and-strict-filters.diff (1.9 KB) - added by apedog 4 years ago.
New patch off current master
16557.3.diff (2.0 KB) - added by audrasjb 4 years ago.
Canonical: Introduce hooks to filter redirect_guess_404_permalink() function’s behavior

Download all attachments as: .zip

Change History (67)

#1 @msafi
13 years ago

  • Cc msafi added

#2 follow-up: @nacin
13 years ago

  • Keywords needs-patch added; URL guessing URL redirect removed
  • Milestone changed from Awaiting Review to Future Release

+1. In fact I thought I already created this ticket. It should probably be attached into redirect_canonical() through a filter.

Whipped this up not long ago, as a workaround:

function remove_redirect_guess_404_permalink( $redirect_url ) {
    if ( is_404() )
        return false;
    return $redirect_url;
}
add_filter( 'redirect_canonical', 'remove_redirect_guess_404_permalink' );

#3 @msafi
13 years ago

Oh, this is a good idea, thanks!

@simonwheatley
13 years ago

One possibility to allow filtering the where clauses for posts and pages

#4 @simonwheatley
13 years ago

  • Cc simon@… added
  • Keywords has-patch dev-feedback added; needs-patch removed
  • Version changed from 3.0.5 to 3.1

Added a possible patch which separates the where clauses into page related and post related, and allows plugin devs to filter either snippet of SQL.

Patch created against 3.1.1.

#5 @nacin
13 years ago

What I'd like to do is simply make redirect_guess_404_permalink() actually hook into redirect_canonical(), I guess as a filter since it returns data. Then the filter can be removed.

So instead of:

if ( ! $redirect_url )
	$redirect_url = redirect_guess_404_permalink();

Instead do:

$redirect_url = apply_filters( 'redirect_guess_404_permalink', $redirect_url );

And hook the function into there. Then it may be removed. Also, the filter name is lame -- perhaps can be more generic considering the positioning.

@simonwheatley
13 years ago

New filter to replace redirect_guess_404_permalink

#6 @simonwheatley
13 years ago

@nacin - Thanks for the steer. Is the attached diff what you were looking for?

#7 follow-ups: @westi
13 years ago

Replying to nacin:

What I'd like to do is simply make redirect_guess_404_permalink() actually hook into redirect_canonical(), I guess as a filter since it returns data. Then the filter can be removed.

So instead of:

if ( ! $redirect_url )
	$redirect_url = redirect_guess_404_permalink();

Instead do:

$redirect_url = apply_filters( 'redirect_guess_404_permalink', $redirect_url );

And hook the function into there. Then it may be removed. Also, the filter name is lame -- perhaps can be more generic considering the positioning.

Can we not just hook redirect_guess_404_permalink() onto redirect_canonical and be done with it - no new filters needed?

Last edited 13 years ago by westi (previous) (diff)

#8 in reply to: ↑ 7 @simonwheatley
13 years ago

Replying to westi:

Can we not just hook redirect_guess_404_permalink() onto redirect_canonical and be done with it - no new filters needed?

That would involve moving the point that the redirect guessing is done, which (from limited testing just now) means the current guessing behaviour doesn't happen. I favour the additional filter as per my most recent patch.

#9 @mboynes
11 years ago

  • Cc mboynes@… added

I'd love to see this finally make its way into core. Is there anything in particular holding it up?

#10 @joostdevalk
10 years ago

I see no reason why the first, very simple, patch could not just be committed and make everybody happy....

#11 @Lex_Robinson
10 years ago

Could this be merged please? I need it.

#12 @MikeSchinkel
10 years ago

In preparing for my WordCamp Atlanta talk about URL routing I've run into this function causing all kind of headache for customized URLs.

There's a reason for the HTTP 404 status code, incorrect URLs are incorrect and the client searching for them should be told so! :)

The problem with this is it can return one resource today and another resource in the future after the user added a higher priority "guess."

My vote is (for my own sites at least) to fully disable this functionality Please address this so we can use plugins to fully control our URLs.

Thanks!

P.S. Related #19693

#13 @SergeyBiryukov
10 years ago

  • Keywords 4.0-early added

#14 in reply to: ↑ 2 @haukep
10 years ago

I stumbled upon this feature request while researching a solution for the issue outlined here: http://wordpress.stackexchange.com/questions/144937/disable-only-url-auto-complete-not-the-whole-canonical-url-system

Replying to nacin:

Whipped this up not long ago, as a workaround:

function remove_redirect_guess_404_permalink( $redirect_url ) {
    if ( is_404() )
        return false;
    return $redirect_url;
}
add_filter( 'redirect_canonical', 'remove_redirect_guess_404_permalink' );

So for all people eagerly waiting for better control over how the redirect guess is performed, I took nacin's code above and turned it into a small plugin. The plugin still waits for its review but should hopefully be available here at some point: https://wordpress.org/plugins/disable-url-autocorrect-guessing/

#15 @paulschreiber
10 years ago

I, too, would find this helpful.

#16 @SergeyBiryukov
10 years ago

#29598 was marked as a duplicate.

@simonwheatley
9 years ago

Refresh patch, rename filter, add test

#17 @simonwheatley
9 years ago

New patch:

  • Renames the new filter to 404_redirect
  • Adds a test in canonical.php

#18 @DrewAPicture
9 years ago

  • Keywords needs-docs added

The new filter will need hook documentation.

#19 @simonwheatley
9 years ago

Darnit I had hook docs on that, must have got lost in some revert. I blame my head cold. Good spot, will do!

#20 @DrewAPicture
9 years ago

  • Keywords commit added; 4.0-early needs-docs removed
  • Milestone changed from Future Release to 4.2

16557.2.diff refreshes the patch following [31168] to move the add_filter() call to default-filters.php. Also adds hook docs. Moving to 4.2 for consideration.

@DrewAPicture
9 years ago

Refresh + hook docs

This ticket was mentioned in Slack in #core by drew. View the logs.


9 years ago

#22 @DrewAPicture
9 years ago

16557.2.diff still applies and unit tests pass.

This ticket was mentioned in Slack in #core by drew. View the logs.


9 years ago

This ticket was mentioned in Slack in #core by drew. View the logs.


9 years ago

#25 @DrewAPicture
9 years ago

  • Milestone changed from 4.2 to Future Release

Seems like this could use a little more consideration. The hook name could be improved and @ocean90 expressed reluctance about the efficacy of the unit test. Let's try this in a future release.

#26 @wonderboymusic
9 years ago

  • Keywords commit removed

#27 @ravenswd
8 years ago

What I'd really like to see is just a checkbox on one of the Settings screens (probably Permalinks): "Allow URL Guessing"

In the meantime, the plugin by haukep and Nacin works great! Thank you.

Caveat: After activating the plugin, carefully check your website for broken links. You may have had redirects you weren't aware of!

Last edited 8 years ago by ravenswd (previous) (diff)

#28 follow-up: @esemlabel
8 years ago

The if statement should contain exclusion of $_GET 'p' parametr for link rel='shortlink' to work properly.

function remove_redirect_guess_404_permalink( $redirect_url ) {
	if ( is_404() && !isset($_GET['p']) )
		return false;
	return $redirect_url;
}
add_filter( 'redirect_canonical', 'remove_redirect_guess_404_permalink' );

#29 in reply to: ↑ 7 @nacin
7 years ago

Replying to westi:

Can we not just hook redirect_guess_404_permalink() onto redirect_canonical and be done with it - no new filters needed?

If anyone ever picks this ticket up, I recommend trying to pursue this approach.

#30 @stevegibson12
7 years ago

I'd like to propose making this a higher priority as it is adversely affecting SEO results. I discovered Google is returning the shorter, inaccurate and incomplete URL rather than the real one.

https://webmasters.stackexchange.com/questions/108185/google-is-including-a-wrong-misleading-url-in-search-results-for-my-wordpress

The URL example.com/book is returned in Google search, when that's actually to a blog post for a site called Bookeo.

People clicking on /book think they're about to book an appointment or similar. This is totally misleading.

http://i.imgur.com/4KD49jJ.png

Will try some of the workarounds posted, and thanks for your consideration.

#31 in reply to: ↑ 28 @anonymized_14978628
6 years ago

Replying to esemlabel:

The if statement should contain exclusion of $_GET 'p' parametr for link rel='shortlink' to work properly.

function remove_redirect_guess_404_permalink( $redirect_url ) {
	if ( is_404() && !isset($_GET['p']) )
		return false;
	return $redirect_url;
}
add_filter( 'redirect_canonical', 'remove_redirect_guess_404_permalink' );

Could you provide the corrected code?

I agree with the others, this auto complete url needs to be sorted out.

#32 @DrLightman
6 years ago

Could you please, after 8 years, fullfill this request? Guessing to save a 404 by providing another resource is not right, as others have stated. Also it adds up a database overload by full scanning of the wp_posts table thanks to the post_name LIKE %s query that may fail 90% of the times.

Even a bool filter before like this would do:

$do_the_guess = apply_filters('do_redirect_guess_404_permalink', true);

if ( ! $redirect_url && $do_the_guess ) {
    if ( $redirect_url = redirect_guess_404_permalink() ) {
        ...

Thank you

This ticket was mentioned in Slack in #core by killua99. View the logs.


5 years ago

#34 @jivanpal
5 years ago

I'd like to give my +1 on this; returning 404s for invalid URLs ought to be the actual behaviour, not this guessing nonsense. Like @ravenswd, I'll be using @haukep's plugin for now (cheers btw), but I whole-heartedly expect a toggle for this guessing behaviour on the settings page in a future release of Core, hopefully with it disabled by default for new installations.

Last edited 5 years ago by jivanpal (previous) (diff)

#35 @neonkowy
5 years ago

  • Severity changed from minor to normal
  • Type changed from enhancement to defect (bug)

This is not enhancement. Such behaviour is not compliant with HTTP (see https://en.wikipedia.org/wiki/HTTP_404) in the first place. And it kills SEO.

#36 @laternastudio
5 years ago

I also agree that the current behavior seems pretty inconsistent with standards (i.e. if a page does not exist at a URL, that URL should return a 404 unless a redirect has been explicitly created).

Would be great to have a native solution to this!

#37 @desrosj
5 years ago

#38275 was marked as a duplicate.

#38 @Presskopp
5 years ago

#46796 was marked as a duplicate.

#39 @SergeyBiryukov
4 years ago

#48216 was marked as a duplicate.

#40 @apedog
4 years ago

I think a new short-circuit filter 'do_redirect_guess_404_permalink' at the start of redirect_guess_404_permalink() is the way to go.
The filter should return false (diable auto-guess), true (continue with core implementation) or string (short-circuit core implementation and return custom redirect string).

This would allow for implementing more robust and/or more performant algorithms (bypass the costly post_name LIKE %s query mentined by @DrLightman)
This would also allow for disabling the 404 redirect altogether for strict compliance with HTTP (@neonkowy, @laternastudio)

One could still implement a 404 redirect mechanism - say one that redirects /slug to story/slug but does not auto-correct to /slug-2.
Or a mechanism that only redirects to posts and pages but not to attachments (a personal pet-peeve).

+1 for having a checkbox in the permalinks settings page to disable auto-guessing.

+1 for having auto-guess disabled by default. It's confusing. It's unpredictable. It's a nightmare. More hassle than it's worth, really.

--

Taking a broader look, I think redirect_canonical is overdue for a re-write ( I mean,?p=123 => ?page_id=123 really? why? why?? ?p=123 is literally the canonical shortlink provided by WordPress itself - wp_get_shortlink. Why would redirect_canonical change that after the query has been made?)

Adding this filter seems like a good first iteration on this long overdue change. It will allow for some new (perhaps better, more customizable) implementations of the auto-guess in the wild. And it's safe. It will not break BC. And it might pave the way for a more robust modern routing system in future WordPress releases.

@apedog
4 years ago

New patch off current master

#41 @apedog
4 years ago

I've written a clean fresh patch that I believe addresses all the issues raised in this ticket.
I've added 3 new filters to redirect_guess_404_permalink():

  • 'do_redirect_guess_404_permalink' - Allows disabling 404 guessing altogether.
    add_filter('do_redirect_guess_404_permalink', '__return_false');
    
  • 'pre_redirect_guess_404_permalink' - Allows short-circuit of function and returning a custom guess from a callback. This is a much needed functionality IMO. Especially with current WordPress handling. A very easy way for plugins to hook in a custom guessing function while not breaking anything else.
    add_filter('pre_redirect_guess_404_permalink', 'my_custom_guess');
    function my_custom_guess($pre){
        return home_url('did-i-guess-right');
    }
    
  • 'strict_redirect_guess_404_permalink' - Allows strict guesses using post_name = %s SQL query instead of expensive (and overly broad) post_name LIKE %s} query. So slug might redirect to parent/slug but not to sluggish-day. Default is the current non-strict LIKE query.
    add_filter('strict_redirect_guess_404_permalink', '__return_true');
    

All filters are nested within single function. So code and logic are all neatly tucked in one place. Nice for cohesion.

I've tested these filters locally and they work.


  • Filters 'do_redirect_guess_404_permalink' and 'pre_redirect_guess_404_permalink' can be merged into a single filter. But semantically, disabling and short-circuiting are separate concerns so I figured two filters would be clearer.
  • Filter 'strict_redirect_guess_404_permalink' touches the actual logic. And might merit discussion. So maybe it could be considered separately from the short-circuit filters.

@markjaquith

#42 @jivanpal
4 years ago

@apedog Looks good, nice to see progress on this! I like the current separation of do_redirect_guess_404_permalink and pre_redirect_guess_404_permalink, since this allows filters for do_redirect_guess_404_permalink to always take precedence over those for pre_redirect_guess_404_permalink, rather than needing plugin authors to mess around with priority numbers for certain use cases.

#43 @audrasjb
4 years ago

  • Milestone changed from Future Release to 5.5
  • Owner set to audrasjb
  • Status changed from new to accepted

Hi there,

I think having this hook in WP 5.5 would be a nice goal.
Let's move forward on this. I'll send a refreshed patch in few hours.

@audrasjb
4 years ago

Canonical: Introduce hooks to filter redirect_guess_404_permalink() function’s behavior

#44 @audrasjb
4 years ago

  • Keywords needs-testing needs-unit-tests added; dev-feedback removed

Patch refreshed for 5.5.0 + few coding standards fixes.
Everyone is welcome to help testing this patch.

It would also probably deserves some unit tests. It would be wonderful if someone could to try to write them :)

This ticket was mentioned in Slack in #core by audrasjb. View the logs.


4 years ago

#46 @davidbaumwald
4 years ago

  • Keywords needs-dev-note added

With the added filters, this will need a call-out on the Misc Dev Note.

#47 @Confridin
4 years ago

Last patch works as expected, for both filters :

// Kills guess_404_function
add_filter( 'do_redirect_guess_404_permalink', '__return_false' );

// Makes strict guess instead of a loose one
add_filter( 'strict_redirect_guess_404_permalink', '__return_true' );

#49 follow-up: @donmhico
4 years ago

  • Keywords has-unit-tests added; needs-unit-tests removed

Thanks for all the contributions in this ticket. I added some unit tests. Hopefully we can merge this.
@audrasjb

This ticket was mentioned in PR #283 on WordPress/wordpress-develop by ryotsun.


4 years ago
#50

Added unit test in case guess permalink is disabled.
Trac ticket: https://core.trac.wordpress.org/ticket/16557

#51 in reply to: ↑ 49 @ryotsun
4 years ago

Replying to donmhico:

Thanks for all the contributions in this ticket. I added some unit tests. Hopefully we can merge this.
@audrasjb

OMG... I've just added unit test too, but you already added... lol

#52 @audrasjb
4 years ago

  • Keywords commit added; needs-testing removed

Thanks for the unit tests 👌
PR 282 looks good to go on my side.
https://github.com/WordPress/wordpress-develop/pull/282

Last edited 4 years ago by audrasjb (previous) (diff)

#53 @whyisjake
4 years ago

  • Resolution set to fixed
  • Status changed from accepted to closed

In 47878:

Canonical: Add the ability to disable redirect_guess_404_permalink().

This also adds a few more filters to make adding redirects easier. Notably:

  1. do_redirect_guess_404_permalink
  2. pre_redirect_guess_404_permalink
  3. strict_redirect_guess_404_permalink

Fixes: #16557.
Props: msafi, nacin, simonwheatley, westi, mboynes, joostdevalk, Lex_Robinson, MikeSchinkel, haukep, paulschreiber, DrewAPicture, ravenswd, esemlabel, stevegibson12, martychc23, DrLightman, jivanpal, neonkowy, laternastudio, apedog, audrasjb, davidbaumwald, Confridin, donmhico, ryotsun.

whyisjake commented on PR #282:


4 years ago
#54

Code was merged [here](https://core.trac.wordpress.org/changeset/47878), closing this out. Thanks @donmhico.

#55 @SergeyBiryukov
4 years ago

In 47885:

Docs: Correct @param type for pre_redirect_guess_404_permalink filter.

Improve filter documentation for consistency with other similar short-circuit filters in core.

Follow-up to [47878].

See #16557.

This ticket was mentioned in Slack in #docs by joyously. View the logs.


4 years ago

#57 @desrosj
4 years ago

In 48182:

Docs: Reword inline docs for better readability and clarity.

This improves the wording of the inline documentation for redirect_guess_404_permalink() and the related filters introduced in [47878].

Previously [47878,47885]
See #16557.

#58 follow-up: @johnbillion
4 years ago

@SergeyBiryukov In [47885] you switched the docs from saying Returning a false value from the filter to saying Passing a false value to the filter.

There's a mixture in core of using returning from versus passing to for the docs for short-circuit (and similar) filters.

My preference is to go with Returning a false value from the filter, because the filter is the callback function added with add_filter(), therefore the hook passes a value to the filter, and the filter returns a value to change its behaviour. The documentation is referring to the latter.

I think this makes more sense, what do you think? Happy to discuss here or in a separate ticket.

#59 in reply to: ↑ 58 @SergeyBiryukov
4 years ago

Replying to johnbillion:

My preference is to go with Returning a false value from the filter, because the filter is the callback function added with add_filter(), therefore the hook passes a value to the filter, and the filter returns a value to change its behaviour. The documentation is referring to the latter.

I think this makes more sense, what do you think? Happy to discuss here or in a separate ticket.

Thanks for catching that! I went back and forth for a bit and chose what seemed more consistent at the time.

I agree, happy to switch to your suggestion. Does 49572.returning-from-filter.diff:ticket:49572 look good?

#60 @SergeyBiryukov
4 years ago

In 48185:

Docs: Standardize on "Returning a value from the filter" vs. "Passing a value to the filter".

The filter is the callback function added with add_filter(), therefore the hook passes a value to the filter, and the filter returns a value to change its behaviour.

The documentation is referring to the latter.

Props johnbillion.
See #49572, #16557.

Note: See TracTickets for help on using tickets.