WordPress.org

Make WordPress Core

Opened 5 years ago

Closed 5 years ago

#8297 closed defect (bug) (fixed)

Comment Paging leads to duplicated URLs for the same content

Reported by: wnorris
Owned by:
Milestone: 2.7
Priority: normal
Severity: major
Version: 2.7
Component: Comments
Keywords: has-patch needs-testing
Focuses:
Cc:

Description

In order to optimize for search engines, each unique resource on the web should have a single canonical URL for that resource, and any others should redirect to the canonical URL. This concept was added in WordPress core some time ago. The new comment paging feature in 2.7 causes problems with this to varying degrees.

With comment paging turned on:

  • the main post URL and the highest-numbered comment page URL return exactly the same data. Perhaps the comment page should redirect to the main post URL?
  • comment page numbers can go infinitely high, resulting in the post being displayed with no comments. Perhaps numbers that are too high should redirect to the highest legitimate comment page?

With comment paging turned off:

  • comment page numbers can always be appended to the URL, which displays the post with all comments (the same thing as if you hadn't specified a comment page number at all). When paging is turned off, any request that specifies a page number should *always* redirect to the canonical URL for the post (would this cause unforeseen problems?).
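
The redirect rules proposed in the two lists above could be sketched roughly as follows. This is an illustration in Python, not WordPress code: the function name `canonical_comment_url`, the `comment-page-N` path segment, the `last_page` parameter, and the assumption that post permalinks end in a slash are all hypothetical choices made for the example.

```python
import re
from typing import Optional

# Assumed URL shape for a comment page: <post permalink>/comment-page-<N>/
_COMMENT_PAGE = re.compile(r"^(?P<post>.*?)/comment-page-(?P<page>\d+)/?$")


def canonical_comment_url(
    url: str, paging_enabled: bool, last_page: Optional[int] = None
) -> Optional[str]:
    """Return the URL to redirect to, or None if no redirect is needed."""
    match = _COMMENT_PAGE.match(url)
    if match is None:
        return None  # no comment page number in the URL

    post_url = match.group("post") + "/"  # assumes trailing-slash permalinks
    page = int(match.group("page"))

    if not paging_enabled:
        # Paging off: any page number shows the same content as the post URL.
        return post_url
    if last_page is None:
        # The number of comment pages is unknown, so nothing can be decided.
        return None
    if page == last_page:
        # The highest page duplicates the main post URL.
        return post_url
    if page > last_page:
        # Too high: send the request to the highest legitimate page.
        return f"{post_url}comment-page-{last_page}/"
    return None
```

Note that the rules for paging turned on depend on knowing `last_page` at the time the redirect is decided; when that value is not available, the sketch simply performs no redirect.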

The root of this problem is the fact that comment paging is accomplished by modifying the URL, rather than adding a query parameter. I'm sure there was some reasoning for this (perhaps dealing with caching?), but it results in multiple URLs which all contain (at least much of) the same content.

I haven't dug into the code for this yet, as I wanted to get some feedback first.

Attachments (1)

comment-pages.diff (2.5 KB) - added by wnorris 5 years ago.
fix trailing slash with "index.php/"


Change History (4)

comment:1 wnorris 5 years ago

It looks like the same problem exists for standard post paging (with /page/X).

comment:2 wnorris 5 years ago

  • Keywords has-patch needs-testing added; dev-feedback removed

So it looks like the main problem was simply the trailing slash not getting applied correctly. The fix was just a couple of lines in wp-includes/canonical.php (attached).

I've accepted the fact that there isn't an easy way to redirect to the last comment page when you manually enter a page number that's too high. At the time of URL canonicalization, we simply don't know how many comments are going to be displayed on a page, so we don't know how many page numbers there will be.

The patch file also includes one other small change to wp-includes/link-template.php... the "Older Comments" and "Newer Comments" links weren't respecting whether the user wants trailing slashes on their permalinks. This was causing an additional redirect when returning to the first comment page, thereby losing the "#comments" fragment on the request.
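
To illustrate the trailing-slash issue with the comment navigation links, here is a rough Python sketch (the attached diff itself is not reproduced here). The helper names `apply_trailing_slash` and `comment_page_link`, the `comment-page-N` segment, and the choice to link page 1 to the post permalink are assumptions for the example, not the patch's actual code.

```python
def apply_trailing_slash(url: str, use_trailing_slash: bool) -> str:
    """Add or strip the final slash according to the permalink preference."""
    stripped = url.rstrip("/")
    return stripped + "/" if use_trailing_slash else stripped


def comment_page_link(permalink: str, page: int, use_trailing_slash: bool) -> str:
    """Build a comment page link that already matches the canonical form."""
    base = permalink.rstrip("/")
    if page > 1:
        # Assumed path segment for comment pages beyond the first.
        base = f"{base}/comment-page-{page}"
    # Applying the preference before the fragment means the link needs no
    # extra redirect, so "#comments" survives the request.
    return apply_trailing_slash(base, use_trailing_slash) + "#comments"
```

If a link is generated in the wrong form, the canonical redirect corrects it, and the fragment is lost along the way, which is the behavior described above.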

wnorris 5 years ago

fix trailing slash with "index.php/"

comment:3 markjaquith 5 years ago

  • Resolution set to fixed
  • Status changed from new to closed

(In [9831]) Comment Page URL fixes by wnorris. fixes #8297
