Make WordPress Core

Opened 9 years ago

Closed 9 years ago

#35192 closed defect (bug) (fixed)

Comments_clauses filter

Reported by: firebird75's profile firebird75 Owned by: boonebgorges's profile boonebgorges
Milestone: 4.4.2 Priority: normal
Severity: normal Version: 4.4
Component: Comments Keywords: fixed-major
Focuses: Cc:

Description

Hello,

If I use this filter to modify the comment query :

$clauses = apply_filters_ref_array( 'comments_clauses', array( compact( $pieces ), &$this ) );

It works fine. The problem is that, due to change in WP 4.4, later on, you reach the fill_descendants function if you have some hierarchical comments. This one reuses the filtered fields from the comments_clauses filter, except the "where". So if you have modified the "from" and the "where", only the "from" is taken into consideration there :

$where = 'WHERE ' . implode( ' AND ', $where_clauses ) . ' AND comment_parent IN (' . implode( ',', array_map( 'intval', $parent_ids ) ) . ')';
			$comment_ids = $wpdb->get_col( "{$this->sql_clauses['select']} {$this->sql_clauses['from']} {$where} {$this->sql_clauses['groupby']} ORDER BY comment_date_gmt ASC, comment_ID ASC" );

As you can see, the "where" is overriden. This will generate some SQL errors, especially if you have modified the "from" to include some table join that you then reference into the "where" to filter further.

So either, we should have a way to filter as well as in the fill_descendants function, either the filtered from should be cleaned up to init it back to what it is by default. Otherwise, we have random behaviours here...

Attachments (2)

35192.diff (1.7 KB) - added by boonebgorges 9 years ago.
35192.2.diff (5.9 KB) - added by boonebgorges 9 years ago.

Download all attachments as: .zip

Change History (14)

#1 @boonebgorges
9 years ago

  • Milestone changed from Awaiting Review to 4.4.2

Hm, this is an unpleasant problem. It'd be far easier to reset the 'from' clause, but I'm afraid that this might result in incorrect query data in some cases.

Can you share your comment_clauses callback, or explain it in more details, so I can get a sense of a real-world example of how you're filtering here? Are you joining another table just in order to get more data in the query, or are you using the information from that join to exclude/include particular results?

I have a feeling that the proper solution here is to use (a version of) the post-comment_clauses-filter WHERE clause when building the query in fill_descendants(). Unfortunately, this is not straightforward - in order to remove the parent, parent__in, and parent__not_in clauses, it's probably going to be necessary to do some string manipulation.

I'll put this in 4.4.2 because it's a regression in 4.4, but if the changes required to make it work are too extensive, it may have to wait for 4.5.

@boonebgorges
9 years ago

#2 @boonebgorges
9 years ago

  • Keywords 2nd-opinion added

35192.diff is the kind of string manipulation I have in mind. It (a) stores the post-filter WHERE clause, and then (b) removes the parent etc clauses before building the query in fill_descendants(). I think more work needs to be done to deal with the various ways that a clause may or may not be preceded or followed by ' AND ' (ugh) but the patch should give you an idea of what I have in mind.

@firebird75 What do you think?

#3 @firebird75
9 years ago

The code is below. It works fine with WP 4.4 with the hacks I have put but it isn't clean as it will throw an SQL error in the fill_descendants query (because of the unability to filter there).

So first, let me tell you that I really appreciate the fact that you are looking into this ticket so quickly!

Then, if you want my honest feedback, I believe we need to keep it as simple as possible. The way the comments display code has been modified in WP 4.4 isn't really simple. I had a lot of fun to understand the code logic :)

Your patch will probably work but it makes it even more complex to understand :)

I am wondering if it is really a good idea to have a quick fix put in WP 4.4.2 now. I have a workaround that works. It isn't good and clean but it works. Maybe we should take the time to have a more complete code revisit in WP 4.5 rather than having multiple changes that will puzzle a lot of plugin authors...

My suggestion would be to have one and only one query for comments display. Having multiple queries has several drawbacks :

  • more load on the server
  • more risks to miss something for plugin authors and never notice this (the problem here only showed for nested comments which isn't the majority of sites right now)
  • more complex queries to create via filters

Then if you still want to keep the current code logic with a separate query for descendants, then we probably need a second filter there as well...

<?php
function my_get_comments_clauses($clauses)
{
        global $settings, $current_user;

        // WP 4.4 fix (fields is now comment_ID rather than * which makes it ambiguous if you have some joins)
        if ($clauses['fields'] !== '*')
                $clauses['fields'] = 'wp1.comment_ID';
                
        $clauses['where'] = str_replace('comment_approved','wp1.comment_approved',$clauses['where']);
        $clauses['where'] = str_replace('comment_post_ID','wp1.comment_post_ID',$clauses['where']);
        $clauses['where'] = str_replace($wpdb->posts.'.post_status','wp3.post_status',$clauses['where']);
        
        // WP 4.4 hack (remove the parent filter to display everything now rather than relying on descendant)
        $clauses['where'] = str_replace('AND comment_parent = 0','',$clauses['where']);

        $clauses['orderby'] = str_replace($wpdb->comments.'.comment_date_gmt','wp1.comment_date_gmt',$clauses['orderby']);
        $clauses['orderby'] = str_replace($wpdb->comments.'.comment_ID','wp1.comment_ID',$clauses['orderby']);

        $clauses['join'] = " AS wp1 LEFT JOIN ".$wpdb->posts." AS wp3 ON wp1.comment_post_id = wp3.ID ";

        $clauses['where'] = str_replace('user_id','wp1.user_id',$clauses['where']);
        $clauses['join'] .= " LEFT JOIN ".$wpdb->usermeta." as wp4 ON wp3.post_author = wp4.user_id ";
        $clauses['join'] .= " LEFT JOIN ".$wpdb->usermeta." as wp5 ON wp5.user_id = ".$current_user->ID." ";
        $where = " OR (wp4.meta_key = '".$settings['company_profile']."' AND wp5.meta_key = '".$settings['company_profile']."' AND wp4.meta_value = wp5.meta_value) ";
        $clauses['groupby'] = "wp1.comment_ID";
        
        $clauses['where'] .= " AND wp3.post_status = 'publish' AND (wp3.post_author = ".$current_user->ID." ".$where." OR wp3.post_type != 'my_custom_type') AND NOT EXISTS (SELECT * FROM ".$wpdb->commentmeta." AS wp2 WHERE wp1.comment_ID = wp2.comment_id AND wp2.meta_key = 'internal_update' AND wp2.meta_value = 1) ";

        return $clauses;
}

#4 @boonebgorges
9 years ago

  • Keywords 2nd-opinion removed

Thanks for the feedback, @firebird75 .

My suggestion would be to have one and only one query for comments display

This is the crux of the issue. It used to be that a post's comments were always fetched with a single query. But this meant fetching all of a post's comments on every comment-page, which could mean loading thousands of comment objects just to display 20 or 50 of them. While it's true that the overhauled logic in 4.4 sometimes results in a larger number of queries, they are now far more performant and far more cacheable than they previously were. See #8071 for a very extensive backstory.

An additional filter in fill_descendants() seems to me like the wrong way to target this problem. The vast majority of the time, the parameters used to fetch top-level comments should also be used to fetch the non-top-level comments. Thus we should only require plugins to filter the parameters in one place.

The "ideal" solution would be for us to pass an array of 'where' sub-clauses to the 'comments_clauses' filter, rather than the imploded clause. But this is a non-starter for backward compatibility reasons.

So, while I'm not a huge fan of the string manipulation option I've described here, I think it's the only way to ensure compatibility with existing uses of the filter, without requiring an unreasonable amount of work from plugin authors. And I don't see how waiting until 4.5 is going to change anything: as I noted earlier, a move back to a single query is not in the cards. I'll plug away and try to improve the implementation suggested in 35192.diff.

@boonebgorges
9 years ago

#5 @boonebgorges
9 years ago

  • Keywords has-patch added

Okey dokey: 35192.2.diff is a more general way to process the post-filter SQL. Basically, I excise the 'parent' etc clauses, including any AND that precedes or trails them, then I reassemble the WHERE clause, adding ANDs as necessary. A couple of tests demonstrate that proper syntax is maintained.

This approach should allow 'comments_clauses' filters to continue to work as they did prior to 4.4, since the SQL that comes out of that filter is the same SQL (with some mods) that is used to fetch children in fill_descendants().

@firebird75 Any chance you could apply this to your test environment to be sure it's generating the proper results?

#6 @firebird75
9 years ago

Hello Boonebgorges!

Thanks a lot for your outstanding support!

So few remarks about those changes :
1/ All the SQL errors I had are now gone away, nice!
2/ I get way more comments than I was without this fix through the query but there is still one case which this fix doesn't address and that I am only able to address by adding this filtering clause :

<?php
$clauses['where'] = str_replace('AND comment_parent = 0','',$clauses['where']);

Basically, this is this case :

  • Comment A (level 0)
    • Comment B (Level 1, ie reply to A)
      • Comment C (Level 2, ie reply to B)

If I set B to be a comment which isn't available to all users through my filters, then C will disappear as well, even if it isn't set to be unavailable to all users. I guess this is because the hierachy is somehow broken somewhere in the functions because I had removed B. I am able to workaround that with this code addition in my filtering function but then comment C isn't showing as being nested comment :

<?php
$clauses['where'] = str_replace('AND comment_parent = 0','',$clauses['where']);

Anyway, even if the fix isn't complete. It is already a pretty good step as it addresses the SQL error and most of the missing comments.

#7 @boonebgorges
9 years ago

I guess this is because the hierachy is somehow broken somewhere in the functions because I had removed B

Yeah. This breakage represents a more fundamental issue, which is that hierarchical comment queries now only work correctly for uninterrupted hierarchies. The same thing would likely happen if you included B as part of the comment__not_in param of WP_Comment_Query. This is something I didn't think of during 4.4 development, and even now, I'm not really sure what the "right" behavior should be. My inkling is that, if you specify that you want your comments returned hierarchically, you would expect that excluding B would exclude its descendants as well, but I could see this going either way. I'm going to open up a separate ticket for consideration.

In the meantime, thanks for sticking with me on this. I'm going to go ahead with the current fix for 4.4.2.

Last edited 9 years ago by boonebgorges (previous) (diff)

#8 @boonebgorges
9 years ago

  • Owner set to boonebgorges
  • Resolution set to fixed
  • Status changed from new to closed

In 36277:

Use the post-filter WHERE clause when querying for comment descendants.

The descendant query in WP_Comment_Query::fill_descendants() uses the clauses
of the main get_comment_ids() query as a basis, discarding the parent,
parent__in, and parent__not_in clauses. As implemented in WP 4.4 [34546],
the WHERE clause was assembled in such a way that any modifications applied
using the comments_clauses filter were not inherited by fill_descendants().
This resulted in descendant queries that did not always properly filter
results, and sometimes contained syntax errors.

The current changeset fixes the problem by using the post-filter WHERE clause
as the basis for the fill_descendants() query. This change requires a new
approach for eliminating the unneeded parent-related clauses: instead of
eliminating values in an associative array, we must use regular expressions.

Props boonebgorges, firebird75.
Fixes #35192.

#9 @boonebgorges
9 years ago

  • Keywords fixed-major added; has-patch removed
  • Resolution fixed deleted
  • Status changed from closed to reopened

Reopening for 4.4.2.

#10 @boonebgorges
9 years ago

@firebird75 #35431 discusses the issue of interrupted hierarchies. Let's talk it over there.

#11 @firebird75
9 years ago

@boonebgorges Thank you so much for your outstanding support in this situation! You really made a great job, thanks!!!

#12 @dd32
9 years ago

  • Resolution set to fixed
  • Status changed from reopened to closed

In 36357:

Comments: Use the post-filter WHERE clause when querying for comment descendants.

The descendant query in WP_Comment_Query::fill_descendants() uses the clauses
of the main get_comment_ids() query as a basis, discarding the parent,
parent__in, and parent__not_in clauses. As implemented in WP 4.4 [34546],
the WHERE clause was assembled in such a way that any modifications applied
using the comments_clauses filter were not inherited by fill_descendants().
This resulted in descendant queries that did not always properly filter
results, and sometimes contained syntax errors.

The current changeset fixes the problem by using the post-filter WHERE clause
as the basis for the fill_descendants() query. This change requires a new
approach for eliminating the unneeded parent-related clauses: instead of
eliminating values in an associative array, we must use regular expressions.

Merges [36277] to the 4.4 branch.
Props boonebgorges, firebird75.
Fixes #35192.

Note: See TracTickets for help on using tickets.