WordPress.org

Make WordPress Core

Opened 6 years ago

Closed 7 months ago

Last modified 6 months ago

#11694 closed defect (bug) (fixed)

WP should do sanity checks for paginated posts, pages and comments

Reported by: Denis-de-Bernardy Owned by: wonderboymusic
Milestone: 4.4 Priority: normal
Severity: major Version: 2.9
Component: Posts, Post Types Keywords: has-patch 3.9-early
Focuses: template Cc:

Description

Create a post with two pages using the <!--nextpage--> tag. Publish, and browse the post's *third* page.

WP should return a 404 here, rather than the last page of the post.

Along the same lines, it should return the correct canonical urls for paginated posts. Currently, rel=canonical will only ever return the post's first page.

Attachments (7)

11694.diff (803 bytes) - added by dd32 6 years ago.
canonical-pagination.patch (1021 bytes) - added by joostdevalk 3 years ago.
Canonical - Pagination Patch
pagination-rel-canonical.patch (709 bytes) - added by joostdevalk 3 years ago.
Patch for rel=canonical on paginated posts
pagination-rel-canonical.2.patch (3.0 KB) - added by joostdevalk 3 years ago.
rel-canonical patch which adds paginated post link API call
pagination-rel-canonical.3.patch (3.0 KB) - added by joostdevalk 3 years ago.
Patch for rel=canonical on paginated posts
11694-tests.patch (813 bytes) - added by boonebgorges 18 months ago.
Removing from trunk as per #30284
11694.2.diff (2.6 KB) - added by wonderboymusic 7 months ago.

Download all attachments as: .zip

Change History (48)

#1 @Denis-de-Bernardy
6 years ago

  • Version set to 2.9

#2 @Denis-de-Bernardy
6 years ago

  • Summary changed from WP should do sanity check for paginated posts and pages to WP should do sanity checks for paginated posts and pages

#3 @elky
6 years ago

The rel=canonical issue really ought to be split out in to a separate bug, imho.

#4 @dd32
6 years ago

  • Keywords has-patch added
  • Milestone changed from 3.0 to 3.1
  • Owner set to dd32
  • Status changed from new to accepted

I'm attaching a patch I've got locally for this.

This will need a more thought out patch however.

@dd32
6 years ago

#5 @nacin
5 years ago

  • Milestone changed from Awaiting Triage to Future Release

#6 @SergeyBiryukov
4 years ago

Closed #21278 as a duplicate. Related: #11857 (for archive pages).

As noted in ticket:21278:9, paginated comments are affected too.

Last edited 4 years ago by SergeyBiryukov (previous) (diff)

#7 @sirzooro
4 years ago

  • Cc sirzooro added

#8 @mdgl
4 years ago

See also #21579 where a similar issue occurs with paged comments. Invalid comment pages, for example http://example.com/content/comment-page-999/ do not return a 404 but equally do not find any comments! Is a new ticket needed or can we use this one to fix the 404 issue in both cases and #21579 to generate the proper rel="canonical" in both cases?

#9 @SergeyBiryukov
4 years ago

  • Summary changed from WP should do sanity checks for paginated posts and pages to WP should do sanity checks for paginated posts, pages and comments

Replying to mdgl:

Is a new ticket needed or can we use this one to fix the 404 issue in both cases

No need for a new ticket. Updated the summary.

#10 @here
3 years ago

See also #18660

@elky was correct 3 years ago, this is two issues.

1) site/post/## - where ## is non-existant - should 404 or 301 not 200 duplicate content
2) site/post/## - where ## > 1 - rel=canonical should be site/post/## not duplicate first page site/post based on :

http://support.google.com/webmasters/bin/answer.py?hl=en&answer=1663744

"rel="next" and rel="prev" are orthogonal concepts to rel="canonical". You can include both declarations."

Last edited 3 years ago by here (previous) (diff)

@joostdevalk
3 years ago

Canonical - Pagination Patch

#11 @joostdevalk
3 years ago

The patch I just added fixes point 1 in @here's comment above: ie. the fact that WP serves a 200 for a page that doesn't exist by, if the post is paginated, checking how much pages are in the post and acting accordingly, by either redirecting to the first page of the post or by providing a proper URL.

@joostdevalk
3 years ago

Patch for rel=canonical on paginated posts

#12 @joostdevalk
3 years ago

This second patch adds the page query variable to the canonical. If it's not used in conjunction with the other patch, this should check whether this page can actually exist.

#13 @joostdevalk
3 years ago

Looking at how it's dealt with in post-template.php a bit more, this probably should actually be done a bit different. I think _wp_link_page should be split up and part of that should be in link-template.php, which should then be used by _wp_link_page and the functions above...

@joostdevalk
3 years ago

rel-canonical patch which adds paginated post link API call

#14 @joostdevalk
3 years ago

The last ticket introduces get_paginated_post_url, for which the functionality is removed from _wp_link_page and now used in both _wp_link_page in post-template.php and rel_canonical in link-template.php. This looks a lot cleaner to me than what we had so far and also allows for future use of get_paginated_post_url within a patch for #18660.

Last edited 3 years ago by joostdevalk (previous) (diff)

@joostdevalk
3 years ago

Patch for rel=canonical on paginated posts

#15 @joostdevalk
3 years ago

v3 of the patch changes get_paginated_post_link into the semantically correct name get_paginated_post_url.

Last edited 3 years ago by joostdevalk (previous) (diff)

#16 @SergeyBiryukov
3 years ago

  • Keywords 3.7-early added

#17 @ocean90
3 years ago

  • Cc ocean90 added

#18 @wonderboymusic
3 years ago

  • Milestone changed from Future Release to 3.7

these are all marked 3.7-early

#19 @dd32
3 years ago

  • Owner dd32 deleted
  • Status changed from accepted to assigned

#20 @johnbillion
3 years ago

  • Keywords 3.7-early removed
  • Milestone changed from 3.7 to 3.8

#22 @SergeyBiryukov
2 years ago

#26123 was marked as a duplicate.

#23 @nacin
2 years ago

  • Keywords 3.9-early added
  • Milestone changed from 3.8 to Future Release

There is some cool work from joostdevalk here, but it's too late for 3.8. Maybe we can tackle the messiness of pagination in 3.9.

#24 @nacin
2 years ago

  • Component changed from Template to Posts, Post Types
  • Focuses template added

This ticket was mentioned in IRC in #wordpress-dev by helen. View the logs.


2 years ago

This ticket was mentioned in IRC in #wordpress-dev by SergeyBiryukov. View the logs.


23 months ago

@boonebgorges
18 months ago

Removing from trunk as per #30284

#29 @ocean90
11 months ago

#32505 was marked as a duplicate.

#30 @wonderboymusic
7 months ago

  • Milestone changed from Future Release to 4.4

I found an amazingly awesome bug in Rewrite - makes it where get_query_var( 'page' ) returns /4 .... no problem, let's just cast to (int) .... what's that you say? The cast results in 0?!?!?

See 11694.2.diff

This ticket was mentioned in Slack in #core by wonderboymusic. View the logs.


7 months ago

#32 @wonderboymusic
7 months ago

  • Owner set to wonderboymusic

#33 @wonderboymusic
7 months ago

In 34492:

Canonical/Rewrite: sanity check posts that are paged with <!--nextpage-->. Page numbers past the max number of pages are returning the last page of content and causing infinite duplicate content.

Awesome rewrite bug: the page query var was being set to '/4' in $wp. When cast to int, it returns 0 (Bless you, PHP). WP_Query calls trim( $page, '/' ) when setting its own query var. The few places that were checking page before posts were queried now have sanity checks, so that these changes work without flushing rewrites.

Adds/updates unit tests.

Props wonderboymusic, dd32.
See #11694.

#34 @wonderboymusic
7 months ago

In 34494:

After [34492], no need to import the global instance when we are, in fact, currently, that instance.

See #11694.

#35 @wonderboymusic
7 months ago

In 34496:

Canonical/Rewrite: After [34492], fix rel="canonical" URLs for paginated posts.

Props wonderboymusic, joostdevalk.
See #11694.

#36 @wonderboymusic
7 months ago

The only remaining piece is redirecting out-of-bounds comment pages - which can be tackled once we finish ripping apart WP_Comment_Query in #8071

#37 @wonderboymusic
7 months ago

  • Resolution set to fixed
  • Status changed from assigned to closed

Closing as fixed because #11248 is the last piece.

#38 @SergeyBiryukov
7 months ago

#34194 was marked as a duplicate.

#39 follow-up: @danielbachhuber
6 months ago

Changing rewrite regex means anyone filtering rewrite_rules will experience unexpected behavior with this change.

#40 in reply to: ↑ 39 ; follow-up: @wonderboymusic
6 months ago

Replying to danielbachhuber:

Changing rewrite regex means anyone filtering rewrite_rules will experience unexpected behavior with this change.

what's your point? you want to keep the bugs?

#41 in reply to: ↑ 40 @danielbachhuber
6 months ago

Replying to wonderboymusic:

what's your point? you want to keep the bugs?

My comment was observational, not inflammatory. Just noting that r34492 breaks backwards compatibility in the manner I specified. I've had to update WP-CLI's test accordingly.

Note: See TracTickets for help on using tickets.