Make WordPress Core

Opened 14 years ago

Closed 14 years ago

#9069 closed defect (bug) (invalid)

Function get_comment_link returns duplicate URLs for comments available at permalink

Reported by: gregmulhauser's profile GregMulhauser Owned by:
Milestone: Priority: normal
Severity: normal Version: 2.7.1
Component: Comments Keywords:
Focuses: Cc:


When paging is active, get_comment_link always returns URLs with "comment-page-", even when the comment is available at the post's permalink, creating two different URLs for reaching the same comment. Ideally (IMHO), all parts of WordPress should always use the same URL for reaching any given comment.

For example, the first comment of a post may be available here:

But get_comment_link returns:

It is partly the fault of get_page_of_comment that get_comment_link is doing this (depends on ascending vs. descending order), but unfortunately other functions, plugins, etc., may already be relying on get_page_of_comment continuing to work the way it does now. Since this behaviour can't easily be fixed in get_comment_link itself without also altering get_page_of_comment, I'd like to suggest that the value returned by get_comment_link simply be wrapped in apply_filters.

That way, users (or plugin authors) who would like to get rid of extraneous "comment-page-1" and similar duplicate comment URLs can 'fix' the output of the function to provide unique comment URLs, while those who don't consider it broken in the first place (and are happy to have multiple URLs for the same comment) can keep the existing behaviour.

All the best,

Change History (12)

#1 @GregMulhauser
14 years ago

  • Version changed from 2.7 to 2.7.1

Not sure whether this submission from 9 days ago might have fallen off the radar to my failure to list the relevant version as 2.7.1 rather than 2.7 -- updating that now...

#2 @Viper007Bond
14 years ago

  • Milestone changed from 2.7.2 to Future Release

Remember that comment-page-1 is not always the comment page that is shown by default, it depends on the option.

Anyway, get_page_of_comment() is fine, but get_comment_link() should be smart enough to not add the comment page to the URL if the returned value of get_page_of_comment() is the same as the default.

#3 @Viper007Bond
14 years ago

  • Owner set to Viper007Bond

#4 @Viper007Bond
14 years ago

I've been thinking about this more and I think that always having the page number in the URL is a good thing. What if you decide you want the latest comments to be shown first rather than the oldest? Then /example-post/#comment-1 wouldn't work.

A filter on the output would be a good idea though.

#5 @GregMulhauser
14 years ago

  • Milestone changed from Future Release to 2.7.2

Yes, I suppose you could either say 1) get_page_of_comment() should be clever enough not to assume that a comment should be on a specific page number when paging is not needed at all to reach that comment, or 2) get_comment_link() should be clever enough not to listen to get_page_of_comment() when the latter fails to distinguish between page 1 and paging not being needed at all. Either way, a whole new page of comments gets created even when one does not need to exist, which is a minor nuisance but nonetheless a nuisance. (There are other severe problems in the new comments code, eg. #8841; this one is relatively minor.)

The question of #comment-1 "not working" is a red herring: when comments are ordered differently, we're no longer concerned with duplicate URLs for comment 1, we're concerned with duplicate URLs for comment 1234, or some other later comment. (It's just that that makes for a harder example to illustrate the problem than just looking at comment 1.)

I've already written and tested the plugin to fix this, and it can be done in around 20 lines of code, provided that get_comment_link is modified for filtering. It's actually terribly inefficient for get_comment_link() to go back and fix the incorrect assumption on the part of get_page_of_comment(), but as I said in the original post, other bits of code are already unfortunately relying on the existing behaviour of get_page_of_comment(). If the latter could distinguish between page 1 and not needing paging at all, the universe would be at peace once again, but unfortunately the design decision has already been made that get_page_of_comment() won't worry about that. At this late stage, I'm not sure how the value returned by get_page_of_comment() could be fixed -- e.g., '0' or 'false', etc. -- to make that distinction without requiring further modifications in other code. (Or maybe not that much would be broken anyway? Maybe I'm being too cautious?)

I'm happy to supply the plugin code if it's felt that the behaviour of get_comment_link() should be changed to make up for the assumption made by get_page_of_comment(), but I'm skeptical that such a big and inefficient change should really be forced on get_comment_link()...

Either way, I'm puzzled by why this would be moved to "Future Release" when it's trivial to at least change the single line required to add the filter to get_comment_link()... I hope it doesn't ruffle anyone's feathers that I've changed it back to 2.7.2... I'm afraid I really don't know how decisions are made about when to fix what bug, so I'm happy to be corrected on this!

All the best,

#6 @Viper007Bond
14 years ago

  • Milestone changed from 2.7.2 to Future Release
  • Type changed from defect (bug) to enhancement

Only bug fixes and security issues go into the 2.7 branch for the most part. Things such as this go into trunk (2.8 currently) and then if deemed necessary get back ported into the current branch (2.7 currently). When it gets committed, or is at least getting close with a working patch file, then it gets a proper milestone.

Anyway, nothing wrong with get_page_of_comment() -- it's doing the job it was designed to do and should always return an integer if paging is enabled. get_comment_link() is the one, that in your scenario, should be deciding what to do with that value. However as I said, I think it's best if the page number is always in the URL for comment permalinks. Having a bit of stuff in the head telling Google to ignore that URL would be a good addition though BTW.

And incase you didn't realize, what page a comment is on will never change unless: a) the comments-per-page is changed, or b) a comment is deleted. Oldest comments are always on page 1 and newest are on the latest page, regardless of settings. The settings just control which page gets shown first (number 1 or the last) and which comments show up at the top and bottom of each page.

So that means if a user switches from oldest comments being displayed by default to newest comments being displayed by default, all comments will remain on the page that they were on before. However a different page number will be displayed by default, that's all.

#7 @Viper007Bond
14 years ago

And yes, I have full plans on re-writing part of get_comment_link() so that can be filtered. Just haven't had a chance yet. :)

#8 @GregMulhauser
14 years ago

Yes, get_page_of_comment() might be doing the job it was designed to do -- however the design decision to return an integer when paging is enabled but paging is not necessary is forcing other functions to pick up the slack with extra DB queries if they want to avoid entirely unnecessary duplicate content issues. In other words, design things differently, and all the problems go away -- but as you have insisted multiple times now, "there's nothing wrong with get_page_of_comment()". I'm hearing loud and clear that there is little or no interest in evaluating whether the basic design itself is doing it's job, and little or no receptivity to positive contributions to helping WordPress do a better job.

(With regard to whether I realize how comment paging works, I'm not sure how that bears on the design issue and the problems which it has created. In case you didn't realize, I've spent a fair bit of time with your code, too -- see Greg's Threaded Comment Numbering plugin. Yeah, it's still a bit rough around the edges, hasn't even been nicely OOP wrapped yet, but maybe you can tell the pains I've gone to in order to interact with what is at times deeply problematic code.)

You mentioned "full plans on re-writing part of get_comment_link()" so it can be filtered, which will render the design decisions (but unfortunately not their performance ramifications) moot -- whether poor or otherwise. Here's a trivial change for line 475 of comment-template.php which will do the trick:

return apply_filters('get_comment_link',user_trailingslashit(
trailingslashit( get_permalink( $comment->comment_post_ID ) ) .
'comment-page-' . $args['page'], 'comment' ) . '#comment-' .

I leave it to you folks to decide whether to fix it -- or even whether just to insist it all works exactly as designed and so there's really no need to look at anything critically or with a view to improving it. Bye for now.

#9 @Viper007Bond
14 years ago

I'm not speaking for anyone other than me. I'm just a user who contributes code, nothing more.

Anyway, if you're serious about your ideas, the best way to further it's cause is to actually write up some code and attach a patch for it. It'll come down to the core developers whether to commit it or not.

#10 @GregMulhauser
14 years ago

  • Owner Viper007Bond deleted
  • Type changed from enhancement to defect (bug)

As Viper007Bond assigned this ticket to himself but then declined to use the provided fix -- or to consider the possibility that a much better fix would be to design get_page_of_comment() itself so as to avoid the problem in the first place -- I'd suggest setting this back to unassigned. I'm sure there are other folks out there with an interest in avoiding unnecessary creation of duplicate content but perhaps with less personal investment in whether one function or the other function
should be named as the culprit.

All the best, Greg

#11 @Viper007Bond
14 years ago

That could have been said nicer, but whatever. Yes, it should go back to unassigned though (I had forgotten that I had assigned it to myself).

Anyway, I've forked this ticket and created one for adding a filter: #9327. A patch is there as well if you need it.

#12 @Denis-de-Bernardy
14 years ago

  • Milestone Future Release deleted
  • Resolution set to invalid
  • Status changed from new to closed

#9327 got closed, so let's close this one as wontfix/invalid. The order is defined in the template, rather than in the query, this makes the entire ticket rather baseless.

Note: See TracTickets for help on using tickets.