Ticket #18660 (assigned enhancement)

Opened 5 months ago

Last modified 8 weeks ago

Enhance rel_canonical function, add filter

Reported by: nathanrice Owned by: joostdevalk
Priority: normal Milestone: Awaiting Review
Component: Canonical Version: 3.3
Severity: normal Keywords: has-patch needs-testing 2nd-opinion
Cc: sirzooro, joost@…

Description

I think it's a bit shortsighted to think that only singular pages need the canonical tag output in the <head>. Considering the fact that just about any page on your site can be accessed with a malformed URL, I think it's time to enhance this function.

The attached patch is just a first pass. But I think it gets us started in the right direction. There's also a filter before output, so themes and plugins can further enhance the output of this plugin (related #14458).

Patch is against [438126].

Attachments

link-template.php.diff Download (2.2 KB) - added by nathanrice 5 months ago.
Enhance the rel_canonical() function
post-template.php.diff Download (516 bytes) - added by nathanrice 5 months ago.
link-template.php.2.diff Download (1.3 KB) - added by nathanrice 5 months ago.
canonical.patch Download (2.1 KB) - added by joostdevalk 5 months ago.
Canonical Patch
canonical.2.patch Download (2.1 KB) - added by joostdevalk 5 months ago.
Canonical Patch v2
canonical.3.patch Download (2.1 KB) - added by joostdevalk 5 months ago.
Canonical Patch v3
canonical.4.patch Download (2.3 KB) - added by joostdevalk 5 months ago.
Canonical Patch v4
canonical.5.patch Download (2.4 KB) - added by joostdevalk 5 months ago.
Canonical Patch v5
18660.diff Download (2.6 KB) - added by markjaquith 5 months ago.
trunk.diff Download (9.2 KB) - added by joostdevalk 5 months ago.
New approach, fixes several tickets at once.
canonical.6.patch Download (9.8 KB) - added by joostdevalk 5 months ago.
Canonical Patch v7

Change History

Enhance the rel_canonical() function

comment:1 follow-up: ↓ 2   nacin5 months ago

Note that 438126 would be Akismet external's SVN revision number. Current trunk is [18667].

Also note something about canonical in  http://yoast.com/wp-theme/genesis/, published earlier this week.

comment:2 in reply to: ↑ 1   nathanrice5 months ago

Replying to nacin:

Note that 438126 would be Akismet external's SVN revision number. Current trunk is [18667].

Yes, of course. My mistake.

Replying to nacin:

Also note something about canonical in  http://yoast.com/wp-theme/genesis/, published earlier this week.

For the official rel_canonical() function, we definitely shouldn't canonicalize back to page 1 of an archive. That suggestion was to benefit our users at the recommendation of a very well respected SEO expert, Greg Boser. Yoast said he and Greg will have to fight about that the next time they see each other :-)

So yes, my patch as is won't work. But it's a start. When I get some time, I'll updated the canonical generation to account for paging.

comment:3 follow-up: ↓ 4   scribu5 months ago

Why is this needed, when we also have canonical redirects?

comment:4 in reply to: ↑ 3   nathanrice5 months ago

Replying to scribu:

Why is this needed, when we also have canonical redirects?

Because canonical redirects don't correct all malformed URLs. Consider something like this:

http://example.com/?referrer=wordpress.org

No redirect. The canonical tag helps search engines know that when people link to the homepage with a malformed URL, you want pagerank to pass to http://example.com/ and not the other URL (seen by Google as two different pages).

Sorry about link-template.php.diff and post-template.php.diff. Different patches for different tickets. My bad.

  • Owner set to joostdevalk
  • Status changed from new to assigned

+1 on fixing this.

I've attached a better patch, based on the code in my WordPress SEO plugin. The pagination canonical Genesis uses is at best agressive optimization, and not based on the canonical standards, so we should indeed not include that.

This patch makes sure WordPress outputs a canonical on:

  • singular posts, pages, etc.
  • the homepage & frontpage
  • taxonomy, category & tag archives
  • post type archives
  • dated archives

It also deals with pagination, both for paged posts and pages and all paginated archives.

Canonical Patch

Slight bug in first version of that patch, replaced.

comment:8 follow-up: ↓ 10   nathanrice5 months ago

@joostdevalk Looks excellent to me.

Maybe this is just paranoia, but since $link can be filtered, I'd like to see it escaped before output.

Also, use home_url() in lieu of get_bloginfo('url').

The only thing I don't get is the distinction between front page and home (if show_on_front is page). Seems to me, both of those would be home_url().

Otherwise, patch looks good to me.

Nevermind my home/front confusion. I think you're trying to target the blog posts page, right?

Last edited 5 months ago by nathanrice (previous) (diff)

Canonical Patch v2

comment:10 in reply to: ↑ 8 ; follow-up: ↓ 11   joostdevalk5 months ago

Replying to nathanrice:

Maybe this is just paranoia, but since $link can be filtered, I'd like to see it escaped before output.

Agreed, fixed.

Also, use home_url() in lieu of get_bloginfo('url').

In fact, home_url( '/' ); to make it even shorter :)

The only thing I don't get is the distinction between front page and home (if show_on_front is page). Seems to me, both of those would be home_url().

Your second comment was correct, it's the posts page. And the logic for that is annoying indeed, we should probably have a convenience function for that too.

comment:11 in reply to: ↑ 10   nathanrice5 months ago

Replying to joostdevalk:

Your second comment was correct, it's the posts page. And the logic for that is annoying indeed, we should probably have a convenience function for that too.

Indeed. is_home() has become quite useless lately.

comment:12 follow-up: ↓ 13   nathanrice5 months ago

Should esc_url() be used instead of esc_attr()?

Canonical Patch v3

comment:13 in reply to: ↑ 12   joostdevalk5 months ago

Replying to nathanrice:

Should esc_url() be used instead of esc_attr()?

Yes indeed, changed.

Also fixed check for paginated posts and pages, that threw a notice in v2.

Canonical Patch v4

Thanks to feedback from johnbillionh on IRC:

  • added canonical for author archives
  • fixed canonical for paged archives when pretty permalinks are NOT enabled

Canonical Patch v5

Pagination in single links fixed too now.

Pass in the context to user_trailingslashit(), use esc_url() in the output, clean up some whitespace issues.

When I used esc_url it output &#38; instead of &amp;, any way to prevent that?

  • Cc sirzooro added

joostdevalk, Please take a look on my plugin,  Meta SEO Pack - in particular MetaSeoPack:::get_canonical() method. It generates canonical URL for all WP pages, including multi-paged content (this is a bit tricky area). I haven't updated if for a while so some updates will be needed there (in particular use get_search_link() for search results page and add support for custom post types if needed). It should give you few hints how to improve your patch.

New approach, fixes several tickets at once.

This new patch not only fixes canonical, but also implements rel=prev and rel=next, ticket #18672.

Canonical Patch v7

Another pass, fixes:

  • context passed to user_trailingslashit everywhere
  • removed use of setup_postdata for a simple count
  • fixed links for paginated pages on the frontpage (they need to include /page/)
  • passed author_name to get_author_posts_url to simplify the process of getting that URL

Naming of functions probably needs to be discussed....

comment:21 follow-up: ↓ 22   sirzooro5 months ago

You can use global variable $numpages instead calculating page count again.

comment:22 in reply to: ↑ 21 ; follow-up: ↓ 23   joostdevalk5 months ago

Replying to sirzooro:

You can use global variable $numpages instead calculating page count again.

Actually can't because that variable hasn't been set at the point yet.

comment:23 in reply to: ↑ 22   sirzooro5 months ago

Replying to joostdevalk:

Replying to sirzooro:

You can use global variable $numpages instead calculating page count again.

Actually can't because that variable hasn't been set at the point yet.

OK, I see. So your code may need extra change when #9911 will be commited.

What's the status on this ticket? It was submitted and patched prior to the feature freeze for 3.3, but the milestone hasn't been changed. Do we need a certain number of definitive "patch works for me" messages before it can be committed?

  • Cc joost@… added

So, let's make sure this makes it into 3.4, what needs to be done to get that going?

Note: See TracTickets for help on using tickets.