Make WordPress Core

Opened 9 months ago

Closed 2 months ago

Last modified 2 months ago

#60806 closed defect (bug) (fixed)

Remove query alteration from build_comment_query_vars_from_block

Reported by: cybr's profile Cybr Owned by: gziolo's profile gziolo
Milestone: 6.7 Priority: normal
Severity: major Version: 6.0
Component: Comments Keywords: has-patch has-unit-tests
Focuses: Cc:

Description

In the function build_comment_query_vars_from_block(), set_query_var( 'cpage', ... ) is called. This alters the query in unexpected ways.

The function is called via blocks core/comment-template, core/comments-pagination-next, and core/comments-pagination-numbers.

None of these blocks rely on cpage internally or indirectly. Adjusting cpage was probably vestigial of the initial development of the comment blocks.

This bug causes the canonical URL to be incorrect for singular posts (when not using an SEO plugin). As a result, all these sites' singular content will be misindexed or, at worst, deindexed. Some 200 plugins are also affected by this issue.

This only happens when comment pagination is turned on.

This bug was only discovered recently because blocks had to be parsed before wp_head() for it to become a canonical URL issue, which is happening because of FSE and other custom block implementations.

Discovery: https://wordpress.org/support/topic/the-seo-framework-adds-noindex-to-posts-with-multiple-comment-pages-%e2%9a%a0%ef%b8%8f/
Slack thread: https://wordpress.slack.com/archives/C02RQBWTW/p1710771272739539?thread_ts=1710709493.297079&cid=C02RQBWTW

I was requested to resolve this upstream via Gutenberg, but this is a more pertinent issue with Core. build_comment_query_vars_from_block() is also not part of Gutenberg but was created during its merger with WP 6.0.

I propose to remove using set_query_var() altogether in the method. No functionality changes are apparent, but it resolves all aforementioned issues. PR will follow soon.

Change History (37)

This ticket was mentioned in PR #6291 on WordPress/wordpress-develop by @Cybr.


9 months ago
#1

  • Keywords has-patch added; needs-patch removed

This ticket was mentioned in Slack in #core by cybr. View the logs.


9 months ago

@Cybr commented on PR #6291:


9 months ago
#3

You can test this by adding the comment block to a singular page using an FSE "block" theme and turning on comment pagination.
Inspect the canonical URL and test whether pagination still works.

@Cybr commented on PR #6291:


9 months ago
#4

Test test_build_comment_query_vars_from_block_sets_cpage_var() ought to be deleted, for it no longer serves a purpose.

#5 @Cybr
9 months ago

Adding a unit test to assert the canonical URL to ensure it won't be mangled would be a great addition. As a bonus, I think that would also reveal issues with the Query Block.

#6 @Cybr
9 months ago

The use of set_query_var() should also be reconsidered in comments_template() and wp_list_comments()—these functions also adjust the entire query only to serve their specific needs.

With that, Core would rid itself of such patches.

However, it will become nigh impossible to do so in comments_template() since that will break implementations via filter comments_template or the included custom theme files. wp_list_comments() will only set it after comments_template(), because of globals $overridden_cpage.

I think maintaining focus on the block issue would be best since all FSE sites are affected by this.

#7 @davidbaumwald
5 months ago

  • Milestone changed from Awaiting Review to 6.7

Moving in to the 6.7 milestone so this can be investigated and actioned.

@Cybr commented on PR #6291:


5 months ago
#8

I recommend removing that code as proposed in this PR and fixing the cause that deemed the code necessary.

@Bernhard Reiter commented on PR #6291:


5 months ago
#9

I recommend removing that code as proposed in this PR and fixing the cause that deemed the code necessary.

I wouldn't presently merge this PR as it would cause a regression of comments pagination, as @SantosGuillamot mentioned, so we'd be trading one bug for another 😕 (We have e2e test coverage for the issue that was originally fixed by https://github.com/WordPress/gutenberg/pull/39227, and I've verified that this PR would indeed break one of the tests. To reproduce, apply the diff locally and run npm run test:e2e -- editor/blocks/comments.spec.js from within the Gutenberg repo.)

I had a bit of a deeper look into comments pagination related functions. FWIW, the result of `get_previous_comments_link` is determined by the value of the `cpage` query var, as is get_next_comments_link. Both functions are widely used by Classic Themes.

That might be because of an off-by-one error. Patching the query should not be considered a fix for the original error.

However, that is what the (equally "classic") comments_template() function does as well: https://github.com/WordPress/wordpress-develop/blob/dea027eac82ca7244eff421642562b2d75918e19/src/wp-includes/comment-template.php#L1596-L1600

My impression is that what we're doing in the Comments Pagination blocks is largely following established WP Core practices, so we should probably continue to set the cpage query var (as it does seem to be the "correct" way that get_previous_comments_link and friends are controlled). There's probably a bug somewhere in the logic that we use to set the value of cpage (and/or maybe one of the other $comment_args?) that we'll need to fix.

---

Maybe we can add an e2e test to cover the bug that this PR seeks to fix; that would make it easier to develop a solution that fixes both this issue without introducing a regression of https://github.com/WordPress/gutenberg/pull/39227.

@santosguillamot commented on PR #6291:


5 months ago
#10

I've been running some tests and it seems cpage was behaving the same way even before introducing that new block in 6.0 while using comment_template: link.

According to my tests, when I set the option to “show last page displayed by default”, cpage is always a value greater than 1 when there is pagination. It takes the latest page. These are the use cases I tested:

  • Block theme in trunk using the new comments block.
  • Block theme in trunk using the legacy comments block, which uses comments_template.
  • Classic theme in trunk.
  • Classic theme in 5.9 branch.

I've been using the Query Monitor plugin to check the value of cpage. If I check it using get_query_var( 'cpage' ) in an init or wp_head filter, it is still an empty string at that point.

For this reason, I believe this would require a deeper discussion about what is the expected value of cpage with the mentioned settings.

@Bernhard Reiter commented on PR #6291:


5 months ago
#11

@SantosGuillamot Thank you for your investigation!

@sybrew It seems like we aren't able to find any discernible differences of how cpage is set by Block Themes vs Classic Themes. Could you let us know which URLs are specifically broken when using Block Themes (e.g. which <link> tags in the <head>, which <a>s in the <body>, etc.), and what Discussion settings will cause them to break? Knowing what to look out for -- and to compare to the Classic Themes case -- will help us resolve this issue.

@santosguillamot commented on PR #6291:


4 months ago
#12

Thanks a lot for the reproduction steps, those are really useful 🙂

I've been triaging it again, and I believe all the issues mentioned in this thread can be solved by changing this conditional to check $comment_args[https://github.com/WordPress/wordpress-develop/blob/c1ffe3309abbda7abecdd292ad43309f3ea59495/src/wp-includes/comment-template.php#L1608 'paged'] > 1 instead of > 0. It follows the same logic as the old comments implementation, which seems to be the appropriate approach: [link].

These are the things I reproduced:

## Canonical URL appends #comments

As reported by @sybrew here, the current implementation is adding #comments in the canonical URL.

With the suggested fix, the #comments is not added anymore and it works as expected.

## "Older"/"Newer" comments links keep working as expected
The modified code was introduced to solve an issue with the Older/Newer comments links here.

With the suggested fix, it keeps working as expected, while the rest of the issues are solved.

## Missing canonical URL
As reported by @ockham here, there wasn't a canonical tag caused by this logic.

With the suggested fix, the canonical link is added as expected.

---
With that in mind, I believe we should adapt the conditional to follow the old approach instead of removing it.

@sybrew, Could you please confirm if that solves the issues you were facing and if everything keeps working as expected on your side?

@Cybr commented on PR #6291:


4 months ago
#13

The main query should not be adjusted after it has been established. Using set_query_var() is an anti pattern, something that should be reserved for niche plugin authors only.

I recommend removing its usage and adjusting the off-by-one logic where necessary. The pagination query variables will flow from the requested URL and should not be tinkered with afterwards.

@santosguillamot commented on PR #6291:


4 months ago
#14

It seems the logic to set_query_var in that specific use case has been there for 16 years (since 2008): link. I would personally feel more comfortable replicating how the old comments system has always worked. There could be other plugins relying on cpage being the updated value because it has been like that for a long time. I'd love to know more opinions though.

@Cybr commented on PR #6291:


4 months ago
#15

Correct. This issue was introduced back in 2008 via a patch, but its implications weren't apparent because set_query_var() only runs the moment the comment fields are being outputted, which almost always happens near the end of the page, where most links have already been rendered.

With FSE, all the page's blocks, including the comment blocks, are parsed right before action wp_head (after wp). Because of this, when it's time to render the page, the main query is already affected, so all links on the page are changed.

<details>
<summary>Dump of actions until wp_head in the test scenario described in my earlier comment (click me to view)</summary>

The comment: https://github.com/WordPress/wordpress-develop/pull/6291#issuecomment-2251215907

mu_plugin_loaded
muplugins_loaded
registered_taxonomy
registered_taxonomy_category
registered_taxonomy_post_tag
registered_taxonomy_nav_menu
registered_taxonomy_link_category
registered_taxonomy_post_format
registered_taxonomy_wp_theme
registered_taxonomy_wp_template_part_area
registered_taxonomy_wp_pattern_category
registered_post_type
registered_post_type_post
registered_post_type_page
registered_post_type_attachment
registered_post_type_revision
registered_post_type_nav_menu_item
registered_post_type_custom_css
registered_post_type_customize_changeset
registered_post_type_oembed_cache
registered_post_type_user_request
registered_post_type_wp_block
registered_post_type_wp_template
registered_post_type_wp_template_part
registered_post_type_wp_global_styles
registered_post_type_wp_navigation
registered_post_type_wp_font_family
registered_post_type_wp_font_face
plugin_loaded
plugins_loaded
load_textdomain
sanitize_comment_cookies
wp_roles_init
setup_theme
unload_textdomain
after_setup_theme
auth_cookie_malformed
auth_cookie_valid
set_current_user
init
widgets_init
wp_register_sidebar_widget
wp_default_styles
wp_sitemaps_init
registered_taxonomy_faq_cat
registered_taxonomy_faq_tag
registered_post_type_faq
wp_loaded
wp_cache_set_last_changed
parse_request
parse_query
pre_get_posts
posts_selection
parse_term_query
pre_get_terms
metadata_lazyloader_queued_objects
send_headers
wp
template_redirect
wp_default_scripts
admin_bar_init
add_admin_bar_menus
parse_tax_query
loop_start
the_post
loop_no_results
render_block_core_template_part_file
parse_comment_query
pre_get_comments
comment_form_before
comment_form_top
comment_form_logged_in_after
comment_form
comment_form_after
loop_end
wp_head

</details>

In short, the old way was also wrong, and this mistake should not be repeated for the appeal of tradition. I already mentioned this in the ticket (comment:6) and its undesired implications because of existing filters (i.e., comments_template), among other things.

I believe the inconsistency following this patch in the comment block is acceptable. Especially since the Block Editor already largely deviates from the Classic Editor and is unaffected by the same filters, it's worth considering improving upon things with 16 years of gained knowledge.

@gziolo commented on PR #6291:


4 months ago
#16

Thank you @sybrew for all the detailed insights. It’s super helpful to work on the proper fix. In particular all the nuances between the classic and block themes are extremely important here. Excellent work pointing it out so precisely which is crucial to make the good final decision.

I have an idea to consider of how to apply the fix proposed but at the same time mitigate the impact explained by @sybrew. We could clone the global $wp_query before calling these functions that depend on it and restore it to the previous state afterward. A good example in the codebase for REST API preload in the editor:
https://github.com/WordPress/wordpress-develop/blob/9ed3553da6a61a38a1d4c5585e5ac3c9d13d351a/src/wp-includes/block-editor.php#L743L776

I saw the same pattern in other places.

@Bernhard Reiter commented on PR #6291:


4 months ago
#17

With FSE, all the page's blocks, including the comment blocks, are parsed right before action wp_head (after wp). Because of this, when it's time to render the page, the main query is already affected, so all links on the page are changed.

To corroborate this, here's an inconsistency in individual comments' permalinks that @darerodz and I found:

https://github.com/user-attachments/assets/9823196e-f915-4431-a8d1-844e2c7859c5

@santosguillamot commented on PR #6291:


4 months ago
#18

If we get rid of the set_query_var I feel we need to solve the mentioned issue it was solving.

As suggested by @ockham , I believe it could be solved passing a new argument to get_previous_comments_link and get_next_comments_link functions.

I started a draft pull request to modify the comments block and pass it as an argument. It would need modifications in core functions receive $page as an argument and only run get_query_var if $page doesn't exist:

In get_previous_comments_link and get_next_comments_link, wrap that line with something like this:

if ( empty( $page ) ) {
                $page = get_query_var( 'cpage' );
        }

Let me know what you think. I can help pushing that forward.

This ticket was mentioned in PR #7275 on WordPress/wordpress-develop by @santosguillamot.


3 months ago
#19

  • Keywords has-unit-tests added

Follow-up to https://github.com/WordPress/wordpress-develop/pull/6291

As discussed in the original pull request and ticket, it seems set_query_var should be removed. However, that change was breaking the comments pagination in some scenarios (see https://github.com/WordPress/wordpress-develop/pull/6291#pullrequestreview-2185357355). In this pull request I'm removing set_query_var while fixing the mentioned issue.

In order to do that, I'm passing a new $page argument to the relevant functions and only use the cpage query var if that is not defined.

This pull request has to be tested with https://github.com/WordPress/gutenberg/pull/63698#issuecomment-2318796798, which modifies the comment pagination blocks to pass the $page argument.

Trac ticket: https://core.trac.wordpress.org/ticket/60806

@santosguillamot commented on PR #6291:


3 months ago
#20

I've created one pull request in core with these changes plus the ones needed to avoid causing the old bug: https://github.com/WordPress/wordpress-develop/pull/7275.

With that, along the relevant Gutenberg PR to modify the blocks, it should be possible to remove the set_query_var while keeping the pagination links working as expected.

@Bernhard Reiter commented on PR #7275:


3 months ago
#21

Thank you, this is looking good! (Well, the argument order of get_next_comments_link -- max_page before page -- is a bit unfortunate, as is the inconsistency with get_prev_comments_link, but there's not a lot we can do about that 😅)

The tricky part is how to land this, and https://github.com/WordPress/gutenberg/pull/63698. It's a bit of a 🐔🥚 problem:

  • If we land this PR first, we'd re-introduce this bug.
  • If we land the GB PR first, we'd pass an extraneous argument to both get_next_comments_link and get_prev_comments_link -- even thought they don't accept that extra argument yet. This will cause an error.

I think we can solve this problem via the following strategy:

  1. In this PR, we only keep the changes to src/wp-includes/link-template.php -- i.e. the extra arguments added to get_next_comments_link and get_prev_comments_link. (In other words, we remove all changes that are also present in https://github.com/WordPress/wordpress-develop/pull/6291). We can then land this PR. (We might want to add a bit of unit test coverage though.)
  2. We add some safeguards to the GB PR. (I'll comment there with more detail.) We can then land that PR.
  3. We wait for the GB changes to be synced to Core. This will likely happen on the Friday or Monday before Beta 1, i.e. on Sept 27 or 30.
  4. In the (short) window that's left before Beta 1, we merge https://github.com/WordPress/wordpress-develop/pull/6291.

This plan should ensure that nothing breaks at any point.

@SantosGuillamot Could you take care of item number 1.? 🙌 😊

@santosguillamot commented on PR #7275:


3 months ago
#22

If we land the GB PR first, we'd pass an extraneous argument to both get_next_comments_link and get_prev_comments_link -- even thought they don't accept that extra argument yet. This will cause an error.

I just answered here, but I am not sure if this will cause an error. From what I understood, the extra argument passed to those functions will just be ignored, but it won't throw an error. If that's the case, I believe we could still manage everything in the same merge:

  1. Merge Gutenberg PR.
  2. Wait for GB changes to be synced to Core.
  3. Merge this PR.

Having said that, I might be mistaken. I'll start working on adding some unit test coverage until it is clarified.

@santosguillamot commented on PR #7275:


3 months ago
#23

I've added a couple of unit tests in this commit based on the existing ones for those functions. Let me know if you think that's not enough.

@Bernhard Reiter commented on PR #7275:


3 months ago
#24

Test coverage looks fine, since most code paths are shared with the case where the $page arg isn't set (but instead inferred from the cpage query arg), and those have some solid coverage already.

I just answered here, but I am not sure if this will cause an error. From what I understood, the extra argument passed to those functions will just be ignored, but it won't throw an error.

Ah yeah, looks like you're right; I replied there.

If that's the case, I believe we could still manage everything in the same merge: [...]

Alright, we can do that. I left one more minor note; otherwise this PR is looking good, code-wise. I'll give it some more smoke testing for good measure.

@Bernhard Reiter commented on PR #7275:


3 months ago
#25

Test coverage looks fine, since most code paths are shared with the case where the $page arg isn't set (but instead inferred from the cpage query arg), and those have some solid coverage already.

Ah, one more request -- we should probably add one test for each get_next_comments_link and get_prev_comments_link where we both pass the $page arg _and_ set the cpage query var (but to different values) and assert that the functions respect the value of $page and ignore the cpage query var 😊

@santosguillamot commented on PR #7275:


3 months ago
#26

Maybe it even makes sense to include it in the same test?

@Bernhard Reiter commented on PR #7275:


3 months ago
#28

Yep, looks good! Thank you 👍

@santosguillamot commented on PR #7275:


3 months ago
#29

The relevant Gutenberg pull request, that pass the $page argument in core blocks, has just been merged. As mentioned in this comment, I believe once those changes are synced, this should be ready.

@santosguillamot commented on PR #7275:


3 months ago
#30

Now that the packages sync has been committed, I've rebased this branch, and it should be ready. In my testing, the old bug is not there anymore and we are removing the conflicting set_query_var.

#31 @gziolo
2 months ago

  • Component changed from General to Comments

@gziolo commented on PR #7275:


2 months ago
#32

@desrosj, is it possible that the props bot doesn't pick up edits to the PR's description when the link to Trac gets added after the comment was created? See this:

https://github.com/user-attachments/assets/897986db-ba17-4495-aa0f-2aa2a85d1532

Let's see if the props-bot label helps here.

@gziolo commented on PR #7275:


2 months ago
#33

Still, no luck. No changes in https://github.com/WordPress/wordpress-develop/pull/7275#issuecomment-2320761796. There should be listed the author of the ticket that helped in the process.

#34 @gziolo
2 months ago

  • Owner set to gziolo
  • Resolution set to fixed
  • Status changed from new to closed

In 59081:

Comments: Pass $page as argument to comments functions

Removes query alteration from build_comment_query_vars_from_block by introducing a new way to pass the $page as argument to functions handling pagination for the comments.

Props cybr, santosguillamot, bernhard-reiter, gziolo.
Fixes #60806.

@Cybr commented on PR #7275:


2 months ago
#35

I'll just leave this comment here; may that resolve the props-bot issue 🕺

@gziolo commented on PR #7275:


2 months ago
#36

@sybrew, that helped. I listed you in props manually, anyway. Major props for all your help to identify and fix the issue😀

Changed committed with https://core.trac.wordpress.org/changeset/59081.

@gziolo commented on PR #6291:


2 months ago
#37

Landed the alternative patch with https://core.trac.wordpress.org/changeset/59081. Thank you for all the help.

Note: See TracTickets for help on using tickets.