WordPress.org

Make WordPress Core

Opened 19 months ago

Last modified 8 days ago

#47280 new enhancement

SQL_CALC_FOUND_ROWS is deprecated as of MySQL 8.0.17

Reported by: javorszky Owned by:
Milestone: 5.7 Priority: normal
Severity: normal Version:
Component: Database Keywords: has-patch
Focuses: Cc:

Description

Per https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_found-rows

The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS() function are deprecated as of MySQL 8.0.17 and will be removed in a future MySQL version. As a replacement, considering executing your query with LIMIT, and then a second query with COUNT(*) and without LIMIT to determine whether there are additional rows.

This is not yet immediately important because most hosts are on 5.5, or 5.6, rarely 5.7, but given the speed with which trac tickets move that impact very core functionalities, I thought it best to open this ticket to get the work started.

This impacts all the 6 places where it's being used, though one of them is in the WP_Query definition.

Change History (10)

#1 @johnbillion
19 months ago

  • Keywords needs-patch added
  • Milestone changed from Awaiting Review to Future Release
  • Severity changed from minor to normal
  • Version trunk deleted

#2 follow-up: @johnbillion
14 months ago

@javorszky Is this something you're willing to work on? As SQL_CALC_FOUND_ROWS can get slow for a large data set, does the replacment SELECT COUNT(*) WHERE ... method recommended in the MySQL docs have a performance advantage?

#3 @javorszky
14 months ago

No. For various reasons I have stepped back from contributing to WordPress in any way except for this ticket via comments.

Regarding the performance advantage, as long as the select count(column) from table where ... happens on indices, and uses only one column, it should be okay.

Also per mysql docs, there's not much in terms of alternatives.

Alternative is to not have the number of total rows available any more, which I suspect would break bc.

#4 in reply to: ↑ 2 @morgantocker
6 months ago

Former MySQL Product Manager here. Just a reply to this question specifically:

does the replacment SELECT COUNT(*) WHERE ... method recommended in the MySQL docs have a performance advantage?

Yes it can. What happens with SQL_CALC_FOUND_ROWS is that it disables the optimizations that MySQL can apply when there is a LIMIT.

The corresponding COUNT(*) query will have its own optimizations available too, such as covering indexes.

So the problem with SQL_CALC_FOUND_ROWS is you kind of get the worst of both worlds - with neither types of optimizations applying. Issuing two queries means one more network roundtrip, but on the MySQL-side both of the queries will be more efficient.

This ticket was mentioned in PR #330 on WordPress/wordpress-develop by morgo.


6 months ago

  • Keywords has-patch added; needs-patch removed

Signed-off-by: Morgan Tocker <tocker@…>

Trac ticket: https://core.trac.wordpress.org/ticket/47280

This ticket was mentioned in Slack in #core by antpb. View the logs.


8 days ago

#7 @SergeyBiryukov
8 days ago

  • Milestone changed from Future Release to 5.7

#8 @wpe_bdurette
8 days ago

Awesome to see this is happening. The GitHub PR only includes fixes for two of the occurrences SQL_CALC_ROWS_FOUND. There are others. I'm working on a PR to address all instances in core.

The usage in class-wp-query.php as some interesting implications. There is a filter on the posts query allowing for a completely arbitrary query to be run. Options that come to mind are to try to parse that query to replace the fields with COUNT(*) or simply wrap the query (e.g., SELECT COUNT(*) from ($query) p). For safety, I plan to do the latter, but I have no idea how that will play with the query optimizer.

Edit: Realizing the above won't work. At a minimum I need to strip any LIMIT clauses off the original query.

Last edited 8 days ago by wpe_bdurette (previous) (diff)

#9 @morgantocker
8 days ago

@wpe_bdurette I started a reply before seeing your edit. But I will include it here since it might still be useful.

Consider this example:

CREATE TABLE t1 (id INT NOT NULL);
INSERT INTO t1 VALUES (1),(2),(3),(4),(5);
SELECT * FROM t1 LIMIT 3;
mysql> SELECT COUNT(*) FROM (SELECT * FROM t1 LIMIT 3) p; # applies LIMIT, returns incomplete total
+----------+
| COUNT(*) |
+----------+
|        3 |
+----------+
1 row in set (0.00 sec)


mysql> SELECT COUNT(*) FROM (SELECT * FROM t1) p; # correct
+----------+
| COUNT(*) |
+----------+
|        5 |
+----------+
1 row in set (0.00 sec)

In terms of efficiency:

In MySQL 5.7+ it is just as efficient as rewriting the query to not use the subquery. You should be able to see this in explain:

mysql [localhost:8022] {msandbox} (test) > explain SELECT COUNT(*) FROM (SELECT * FROM t1) p;
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | SIMPLE      | t1    | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | NULL  |
+----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)

mysql [localhost:8022] {msandbox} (test) > show warnings;
+-------+------+---------------------------------------------------------------+
| Level | Code | Message                                                       |
+-------+------+---------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select count(0) AS `COUNT(*)` from `test`.`t1` |
+-------+------+---------------------------------------------------------------+
1 row in set (0.00 sec)

The optimization that rewrites subqueries in the from clause is called "derived_merge". It is a MySQL 5.7+ feature, and enabled by default. Earlier versions will be less efficient:

mysql [localhost:8022] {msandbox} (test) > set optimizer_switch="derived_merge=off";
Query OK, 0 rows affected (0.00 sec)

mysql [localhost:8022] {msandbox} (test) > explain SELECT COUNT(*) FROM (SELECT * FROM t1) p;
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------+
| id | select_type | table      | partitions | type | possible_keys | key  | key_len | ref  | rows | filtered | Extra |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------+
|  1 | PRIMARY     | <derived2> | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | NULL  |
|  2 | DERIVED     | t1         | NULL       | ALL  | NULL          | NULL | NULL    | NULL |    5 |   100.00 | NULL  |
+----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)

mysql [localhost:8022] {msandbox} (test) > show warnings;
+-------+------+-------------------------------------------------------------------------------------------------------------------------+
| Level | Code | Message                                                                                                                 |
+-------+------+-------------------------------------------------------------------------------------------------------------------------+
| Note  | 1003 | /* select#1 */ select count(0) AS `COUNT(*)` from (/* select#2 */ select `test`.`t1`.`id` AS `id` from `test`.`t1`) `p` |
+-------+------+-------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

I don't think the worse performance in 5.6 and lower will be that impactful for the typical sized database, but I will let others judge that one. There is technically a difference.

#10 @wpe_bdurette
8 days ago

Thanks @morgantocker! That's super helpful.

Whether or not I use a subselect or just replace the select_expr from the original query, I'm going to have to do some surgery on a query string in PHP. This is less than ideal without a full parser. It strikes me as safer to strip the LIMIT clause and move to a subselect, so that's what I'll do.

Note: See TracTickets for help on using tickets.