Make WordPress Core

Opened 15 years ago

Closed 9 years ago

Last modified 4 years ago

#11694 closed defect (bug) (fixed)

WP should do sanity checks for paginated posts, pages and comments

Reported by: denis-de-bernardy's profile Denis-de-Bernardy Owned by: wonderboymusic's profile wonderboymusic
Milestone: 4.4 Priority: normal
Severity: major Version: 2.9
Component: Posts, Post Types Keywords: has-patch 3.9-early
Focuses: template Cc:

Description

Create a post with two pages using the <!--nextpage--> tag. Publish, and browse the post's *third* page.

WP should return a 404 here, rather than the last page of the post.

Along the same lines, it should return the correct canonical urls for paginated posts. Currently, rel=canonical will only ever return the post's first page.

Attachments (7)

11694.diff (803 bytes) - added by dd32 15 years ago.
canonical-pagination.patch (1021 bytes) - added by joostdevalk 11 years ago.
Canonical - Pagination Patch
pagination-rel-canonical.patch (709 bytes) - added by joostdevalk 11 years ago.
Patch for rel=canonical on paginated posts
pagination-rel-canonical.2.patch (3.0 KB) - added by joostdevalk 11 years ago.
rel-canonical patch which adds paginated post link API call
pagination-rel-canonical.3.patch (3.0 KB) - added by joostdevalk 11 years ago.
Patch for rel=canonical on paginated posts
11694-tests.patch (813 bytes) - added by boonebgorges 10 years ago.
Removing from trunk as per #30284
11694.2.diff (2.6 KB) - added by wonderboymusic 9 years ago.

Download all attachments as: .zip

Change History (53)

#1 @Denis-de-Bernardy
15 years ago

  • Version set to 2.9

#2 @Denis-de-Bernardy
15 years ago

  • Summary changed from WP should do sanity check for paginated posts and pages to WP should do sanity checks for paginated posts and pages

#3 @elky
15 years ago

The rel=canonical issue really ought to be split out in to a separate bug, imho.

#4 @dd32
15 years ago

  • Keywords has-patch added
  • Milestone changed from 3.0 to 3.1
  • Owner set to dd32
  • Status changed from new to accepted

I'm attaching a patch I've got locally for this.

This will need a more thought out patch however.

@dd32
15 years ago

#5 @nacin
14 years ago

  • Milestone changed from Awaiting Triage to Future Release

#6 @SergeyBiryukov
12 years ago

Closed #21278 as a duplicate. Related: #11857 (for archive pages).

As noted in ticket:21278:9, paginated comments are affected too.

Last edited 12 years ago by SergeyBiryukov (previous) (diff)

#7 @sirzooro
12 years ago

  • Cc sirzooro added

#8 @mdgl
12 years ago

See also #21579 where a similar issue occurs with paged comments. Invalid comment pages, for example http://example.com/content/comment-page-999/ do not return a 404 but equally do not find any comments! Is a new ticket needed or can we use this one to fix the 404 issue in both cases and #21579 to generate the proper rel="canonical" in both cases?

#9 @SergeyBiryukov
12 years ago

  • Summary changed from WP should do sanity checks for paginated posts and pages to WP should do sanity checks for paginated posts, pages and comments

Replying to mdgl:

Is a new ticket needed or can we use this one to fix the 404 issue in both cases

No need for a new ticket. Updated the summary.

#10 @here
11 years ago

See also #18660

@elky was correct 3 years ago, this is two issues.

1) site/post/## - where ## is non-existant - should 404 or 301 not 200 duplicate content
2) site/post/## - where ## > 1 - rel=canonical should be site/post/## not duplicate first page site/post based on :

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1663744

"rel="next" and rel="prev" are orthogonal concepts to rel="canonical". You can include both declarations."

Last edited 11 years ago by here (previous) (diff)

@joostdevalk
11 years ago

Canonical - Pagination Patch

#11 @joostdevalk
11 years ago

The patch I just added fixes point 1 in @here's comment above: ie. the fact that WP serves a 200 for a page that doesn't exist by, if the post is paginated, checking how much pages are in the post and acting accordingly, by either redirecting to the first page of the post or by providing a proper URL.

@joostdevalk
11 years ago

Patch for rel=canonical on paginated posts

#12 @joostdevalk
11 years ago

This second patch adds the page query variable to the canonical. If it's not used in conjunction with the other patch, this should check whether this page can actually exist.

#13 @joostdevalk
11 years ago

Looking at how it's dealt with in post-template.php a bit more, this probably should actually be done a bit different. I think _wp_link_page should be split up and part of that should be in link-template.php, which should then be used by _wp_link_page and the functions above...

@joostdevalk
11 years ago

rel-canonical patch which adds paginated post link API call

#14 @joostdevalk
11 years ago

The last ticket introduces get_paginated_post_link, for which the functionality is removed from _wp_link_page and now used in both _wp_link_page in post-template.php and rel_canonical in link-template.php. This looks a lot cleaner to me than what we had so far and also allows for future use of get_paginated_post_link within a patch for #18660.

Version 0, edited 11 years ago by joostdevalk (next)

@joostdevalk
11 years ago

Patch for rel=canonical on paginated posts

#15 @joostdevalk
11 years ago

v3 of the patch changes get_paginated_post_link into the semantically correct name get_paginated_post_url.

Last edited 11 years ago by joostdevalk (previous) (diff)

#16 @SergeyBiryukov
11 years ago

  • Keywords 3.7-early added

#17 @ocean90
11 years ago

  • Cc ocean90 added

#18 @wonderboymusic
11 years ago

  • Milestone changed from Future Release to 3.7

these are all marked 3.7-early

#19 @dd32
11 years ago

  • Owner dd32 deleted
  • Status changed from accepted to assigned

#20 @johnbillion
11 years ago

  • Keywords 3.7-early removed
  • Milestone changed from 3.7 to 3.8

#22 @SergeyBiryukov
11 years ago

#26123 was marked as a duplicate.

#23 @nacin
11 years ago

  • Keywords 3.9-early added
  • Milestone changed from 3.8 to Future Release

There is some cool work from joostdevalk here, but it's too late for 3.8. Maybe we can tackle the messiness of pagination in 3.9.

#24 @nacin
11 years ago

  • Component changed from Template to Posts, Post Types
  • Focuses template added

This ticket was mentioned in IRC in #wordpress-dev by helen. View the logs.


10 years ago

This ticket was mentioned in IRC in #wordpress-dev by SergeyBiryukov. View the logs.


10 years ago

@boonebgorges
10 years ago

Removing from trunk as per #30284

#29 @ocean90
9 years ago

#32505 was marked as a duplicate.

#30 @wonderboymusic
9 years ago

  • Milestone changed from Future Release to 4.4

I found an amazingly awesome bug in Rewrite - makes it where get_query_var( 'page' ) returns /4 .... no problem, let's just cast to (int) .... what's that you say? The cast results in 0?!?!?

See 11694.2.diff

This ticket was mentioned in Slack in #core by wonderboymusic. View the logs.


9 years ago

#32 @wonderboymusic
9 years ago

  • Owner set to wonderboymusic

#33 @wonderboymusic
9 years ago

In 34492:

Canonical/Rewrite: sanity check posts that are paged with <!--nextpage-->. Page numbers past the max number of pages are returning the last page of content and causing infinite duplicate content.

Awesome rewrite bug: the page query var was being set to '/4' in $wp. When cast to int, it returns 0 (Bless you, PHP). WP_Query calls trim( $page, '/' ) when setting its own query var. The few places that were checking page before posts were queried now have sanity checks, so that these changes work without flushing rewrites.

Adds/updates unit tests.

Props wonderboymusic, dd32.
See #11694.

#34 @wonderboymusic
9 years ago

In 34494:

After [34492], no need to import the global instance when we are, in fact, currently, that instance.

See #11694.

#35 @wonderboymusic
9 years ago

In 34496:

Canonical/Rewrite: After [34492], fix rel="canonical" URLs for paginated posts.

Props wonderboymusic, joostdevalk.
See #11694.

#36 @wonderboymusic
9 years ago

The only remaining piece is redirecting out-of-bounds comment pages - which can be tackled once we finish ripping apart WP_Comment_Query in #8071

#37 @wonderboymusic
9 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

Closing as fixed because #11248 is the last piece.

#38 @SergeyBiryukov
9 years ago

#34194 was marked as a duplicate.

#39 follow-up: @danielbachhuber
9 years ago

Changing rewrite regex means anyone filtering rewrite_rules will experience unexpected behavior with this change.

#40 in reply to: ↑ 39 ; follow-up: @wonderboymusic
9 years ago

Replying to danielbachhuber:

Changing rewrite regex means anyone filtering rewrite_rules will experience unexpected behavior with this change.

what's your point? you want to keep the bugs?

#41 in reply to: ↑ 40 @danielbachhuber
9 years ago

Replying to wonderboymusic:

what's your point? you want to keep the bugs?

My comment was observational, not inflammatory. Just noting that r34492 breaks backwards compatibility in the manner I specified. I've had to update WP-CLI's test accordingly.

#42 @dd32
7 years ago

Just noting that #40773 is a follow up to this.

[34492] only applies when <!--nextpage--> exists in a post, therefor, /post-name/1234567/ will only redirect if there's pages defined, if no pages it just displays the duplicate content.

This ticket was mentioned in Slack in #forums by sergey. View the logs.


7 years ago

#44 @SergeyBiryukov
4 years ago

In 47727:

Canonical: Redirect paged requests for non-paginated posts to the post permalink.

This avoids displaying duplicate content of the same post under different URLs and ensures the canonical URL is correct.

Previously, requests for invalid page numbers were only redirected to the post permalink if the post was actually paginated using the <!--nextpage--> marker.

Follow-up to [34492].

Props jeremyfelt, prografika, sachit.tandukar, subrataemfluence, hronak, ekatherine, henry.wright, chesio, dd32, SergeyBiryukov.
Fixes #40773. See #45337, #28081, #11694.

#45 @SergeyBiryukov
4 years ago

In 47760:

Canonical: Redirect paged requests for a static page assigned as the "Posts page".

This avoids displaying duplicate content of the home page under different URLs with appended page numbers.

This change only affects the <!--nextpage--> pagination (page query variable) and not the regular multiple posts pagination (paged query variable).

The posts page does not support the <!--nextpage--> pagination, so requests for invalid page numbers should be redirected to the page permalink, applying the logic previously implemented for single posts or pages.

Follow-up to [34492], [47727].

Props jeremyfelt, sachit.tandukar, SergeyBiryukov.
Fixes #45337. See #40773, #28081, #11694.

#46 @SergeyBiryukov
4 years ago

In 47761:

Canonical: Only redirect non-existing page requests to the post permalink if the post is found.

Follow-up to [47760].

See #45337, #40773, #28081, #11694.

Note: See TracTickets for help on using tickets.