WordPress.org

Make WordPress Core

Opened 4 years ago

Closed 3 years ago

#30478 closed defect (bug) (fixed)

Comment query results returned in an inconsistent order on some versions of MySQL

Reported by: boonebgorges Owned by: boonebgorges
Milestone: 4.2 Priority: normal
Severity: normal Version:
Component: Comments Keywords: has-patch
Focuses: Cc:

Description

The default 'orderby' for comment queries is 'comment_date_gmt'. When multiple comments have the same value for 'comment_date_gmt', MySQL has no consistent way to break the tie. This is particularly evident when running the unit tests on older versions of MySQL, as the details of this tiebreaking algorithm appear to be non-deterministic in MySQL 5.0.x. See https://travis-ci.org/pento/develop.wordpress/jobs/41917240 and #30462.

Attachments (2)

30478.tests.diff (1.7 KB) - added by boonebgorges 4 years ago.
A couple of tests to demonstrate the underlying issue. Failing on recent versions of MySQL.
30478.diff (14.8 KB) - added by boonebgorges 3 years ago.

Download all attachments as: .zip

Change History (10)

#1 @boonebgorges
4 years ago

The failing test in this case is test_get_comments_by_user(). That test is not designed to assert anything about the *order* of comments returned by get_comments(), so my first step will be to force 'orderby=comment_ID' for that test, in order to fix the build.

A true fix for this issue will mean adding a second column to the ORDER BY clause for comment queries that will guarantee consistent order even when there's a tie in the first column. That second column will probably be 'comment_ID', and the order (DESC/ASC) should be determined based on the primary 'orderby'/'order' pair. That is, 'orderby=comment_date_gmt&order=desc' should translate to ORDER BY comment_date_gmt DESC, comment_ID DESC; 'orderby=comment_date_gmt&order=asc' would be ORDER BY comment_date_gmt ASC, comment_id ASC; 'orderby=comment_author_email&order=ASC' etc is a coin-flip (probably DESC for consistency).

#2 @boonebgorges
4 years ago

  • Keywords needs-patch needs-unit-tests added

#3 @boonebgorges
4 years ago

In 30548:

Declare an explicit 'order' in test_get_comments_by_user().

This change fixes broken builds on older versions of MySQL. See #30462.

Assertions regarding indeterminate secondary ORDER BY clauses will go in
separate test methods. See #30478.

#4 @boonebgorges
4 years ago

  • Milestone changed from Awaiting Review to Future Release

I've investigated the issue a bit more, and quelle surprise! it's more complicated than I thought.

Comment queries almost always have an ORDER BY clause (the ability to omit this clause will come in 4.1; see #29902). When no 'orderby' argument is provided, the clause defaults to ORDER BY comment_date_gmt DESC. Let's say you have multiple comments with the same date_modified_gmt. Different versions of MySQL appear to act as follows:

  • 5.0-5.1: Fall back on the primary key, using the order specified in the existing ORDER BY clause. So, implicitly, ORDER BY comment_date_gmt DESC, comment_ID DESC.
  • 5.5+: Fall back on the primary key, using MySQL's default sort order. So, implicitly, ORDER BY comment_date_gmt DESC, comment_ID ASC.

Fixing this will require a non-trivial refactoring of the way that the ORDER BY clause is built. In [12518], support for an array of 'orderby' values was introduced. However, 'order' ASC/DESC has always just been tacked on at the end. So it's not currently easy for us to tack a final comment_ID DESC onto this clause. What's needed is support for independent orderby/order pairs. And if we're going to build it internally, we may as well expose it in 'orderby' parameter. In other words, #17065, but for comments.

Let's try to do this for 4.2.

@boonebgorges
4 years ago

A couple of tests to demonstrate the underlying issue. Failing on recent versions of MySQL.

This ticket was mentioned in Slack in #core by boone. View the logs.


4 years ago

@boonebgorges
3 years ago

#6 @boonebgorges
3 years ago

  • Keywords has-patch added; needs-patch needs-unit-tests removed
  • Milestone changed from Future Release to 4.2

30478.diff is a first pass at the improved orderby functionality in comment queries. The new syntax is exactly the same as in the case of WP_Query: 'orderby' => array( 'comment_date_gmt' => 'ASC', 'user_id' => 'DESC' ), etc. A couple of comments about the implementation:

  • comment_ID is the only guaranteed-unique column, so it must always be used as the tiebreaker in order to resolve the issue described in this ticket. So in 30478.diff, if it's detected that 'comment_ID' is not one of the columns explicitly declared in 'orderby', it's tacked on at the end. The ASC|DESC value for the 'comment_ID' clause is determined as follows: if ordering by 'comment_date' or 'comment_date_gmt', inherit that ASC|DESC, otherwise grab the first ASC or DESC available in the collected orderby clauses and use it. The patch contains unit tests describing these various cases.
  • Previously, the orderby clause did not use a table alias, except when joining against commentmeta (so, comment_date ASC rather than wp_comments.comment_date ASC). Reproducing this logic ended up being overly complicated, so I've gone ahead and added the $wpdb->comments alias for all values of orderby. This required changing a few existing unit tests for the new SQL syntax. In certain edge cases, this might affect plugins that are doing sloppy manipulation of the SQL string at the 'comments_clauses' filter. I assume that we aren't concerned with this kind of breakage.
  • For demonstration purposes, I've added sleep() statements into test_orderby_meta() so that I could regularize the failure conditions that very occasionally pop up (they depend on comment_date_gmt being different, which is why sleeping for a second or two does the trick). See https://travis-ci.org/aaronjorbin/develop.wordpress/jobs/44397988. The patch fixes these cases. These sleep() commands should not be committed - they're just for testing/demonstration.

I think this is a nice enhancement to the API, bringing it closer in line with WP_Query. It also has the benefit of fixing the odd orderby bugs we see in some Travis builds. Feedback welcome.

#7 @boonebgorges
3 years ago

The most recent patch should be updated to support the new 'name' clause of WP_Meta_Query introduced in [31312]. See #31045.

#8 @boonebgorges
3 years ago

  • Owner set to boonebgorges
  • Resolution set to fixed
  • Status changed from new to closed

In 31467:

Improve 'orderby' syntax for WP_Comment_Query.

Since [29027], WP_Query has supported an array of values for the $orderby
parameter, with field names as array keys and ASC/DESC as the array values.
This changeset introduces the same syntax to WP_Comment_Query.

We leverage the new support for multiple ORDER BY clauses to fix a bug that
causes comments to be queried in an indeterminate order when sorting by the
default comment_date_gmt and comments share the same value for
comment_date_gmt. By always including a comment_ID subclause at the end of
the ORDER BY statement, we ensure that comments always have a unique fallback
for sorting.

This changeset also includes improvements paralleling those introduced to
WP_Query in [31312] and [31340], which allow $orderby to accept array keys
from specific $meta_query clauses. This change lets devs sort by multiple
clauses of an associated meta query. See #31045.

Fixes #30478. See #31265.

Note: See TracTickets for help on using tickets.