Opened 22 months ago
Last modified 4 weeks ago
#47280 new enhancement
SQL_CALC_FOUND_ROWS is deprecated as of MySQL 8.0.17
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | Future Release | Priority: | normal |
Severity: | normal | Version: | |
Component: | Database | Keywords: | has-patch early |
Focuses: | Cc: |
Description
Per https://dev.mysql.com/doc/refman/8.0/en/information-functions.html#function_found-rows
The SQL_CALC_FOUND_ROWS query modifier and accompanying FOUND_ROWS() function are deprecated as of MySQL 8.0.17 and will be removed in a future MySQL version. As a replacement, considering executing your query with LIMIT, and then a second query with COUNT(*) and without LIMIT to determine whether there are additional rows.
This is not yet immediately important because most hosts are on 5.5, or 5.6, rarely 5.7, but given the speed with which trac tickets move that impact very core functionalities, I thought it best to open this ticket to get the work started.
This impacts all the 6 places where it's being used, though one of them is in the WP_Query definition.
Change History (13)
#1
@
22 months ago
- Keywords needs-patch added
- Milestone changed from Awaiting Review to Future Release
- Severity changed from minor to normal
- Version trunk deleted
#3
@
17 months ago
No. For various reasons I have stepped back from contributing to WordPress in any way except for this ticket via comments.
Regarding the performance advantage, as long as the select count(column) from table where ...
happens on indices, and uses only one column, it should be okay.
Also per mysql docs, there's not much in terms of alternatives.
Alternative is to not have the number of total rows available any more, which I suspect would break bc.
#4
in reply to:
↑ 2
@
9 months ago
Former MySQL Product Manager here. Just a reply to this question specifically:
does the replacment
SELECT COUNT(*) WHERE ...
method recommended in the MySQL docs have a performance advantage?
Yes it can. What happens with SQL_CALC_FOUND_ROWS
is that it disables the optimizations that MySQL can apply when there is a LIMIT
.
The corresponding COUNT(*)
query will have its own optimizations available too, such as covering indexes.
So the problem with SQL_CALC_FOUND_ROWS
is you kind of get the worst of both worlds - with neither types of optimizations applying. Issuing two queries means one more network roundtrip, but on the MySQL-side both of the queries will be more efficient.
This ticket was mentioned in PR #330 on WordPress/wordpress-develop by morgo.
9 months ago
- Keywords has-patch added; needs-patch removed
Signed-off-by: Morgan Tocker <tocker@…>
Trac ticket: https://core.trac.wordpress.org/ticket/47280
This ticket was mentioned in Slack in #core by antpb. View the logs.
3 months ago
#8
@
3 months ago
Awesome to see this is happening. The GitHub PR only includes fixes for two of the occurrences SQL_CALC_ROWS_FOUND
. There are others. I'm working on a PR to address all instances in core.
The usage in class-wp-query.php
as some interesting implications. There is a filter on the posts query allowing for a completely arbitrary query to be run. Options that come to mind are to try to parse that query to replace the fields with COUNT(*)
or simply wrap the query (e.g., SELECT COUNT(*) from ($query) p
). For safety, I plan to do the latter, but I have no idea how that will play with the query optimizer.
Edit: Realizing the above won't work. At a minimum I need to strip any LIMIT clauses off the original query.
#9
@
3 months ago
@wpe_bdurette I started a reply before seeing your edit. But I will include it here since it might still be useful.
Consider this example:
CREATE TABLE t1 (id INT NOT NULL); INSERT INTO t1 VALUES (1),(2),(3),(4),(5); SELECT * FROM t1 LIMIT 3;
mysql> SELECT COUNT(*) FROM (SELECT * FROM t1 LIMIT 3) p; # applies LIMIT, returns incomplete total +----------+ | COUNT(*) | +----------+ | 3 | +----------+ 1 row in set (0.00 sec) mysql> SELECT COUNT(*) FROM (SELECT * FROM t1) p; # correct +----------+ | COUNT(*) | +----------+ | 5 | +----------+ 1 row in set (0.00 sec)
In terms of efficiency:
In MySQL 5.7+ it is just as efficient as rewriting the query to not use the subquery. You should be able to see this in explain:
mysql [localhost:8022] {msandbox} (test) > explain SELECT COUNT(*) FROM (SELECT * FROM t1) p; +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+ | 1 | SIMPLE | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 5 | 100.00 | NULL | +----+-------------+-------+------------+------+---------------+------+---------+------+------+----------+-------+ 1 row in set, 1 warning (0.00 sec) mysql [localhost:8022] {msandbox} (test) > show warnings; +-------+------+---------------------------------------------------------------+ | Level | Code | Message | +-------+------+---------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select count(0) AS `COUNT(*)` from `test`.`t1` | +-------+------+---------------------------------------------------------------+ 1 row in set (0.00 sec)
The optimization that rewrites subqueries in the from clause is called "derived_merge". It is a MySQL 5.7+ feature, and enabled by default. Earlier versions will be less efficient:
mysql [localhost:8022] {msandbox} (test) > set optimizer_switch="derived_merge=off"; Query OK, 0 rows affected (0.00 sec) mysql [localhost:8022] {msandbox} (test) > explain SELECT COUNT(*) FROM (SELECT * FROM t1) p; +----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------+ | id | select_type | table | partitions | type | possible_keys | key | key_len | ref | rows | filtered | Extra | +----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------+ | 1 | PRIMARY | <derived2> | NULL | ALL | NULL | NULL | NULL | NULL | 5 | 100.00 | NULL | | 2 | DERIVED | t1 | NULL | ALL | NULL | NULL | NULL | NULL | 5 | 100.00 | NULL | +----+-------------+------------+------------+------+---------------+------+---------+------+------+----------+-------+ 2 rows in set, 1 warning (0.00 sec) mysql [localhost:8022] {msandbox} (test) > show warnings; +-------+------+-------------------------------------------------------------------------------------------------------------------------+ | Level | Code | Message | +-------+------+-------------------------------------------------------------------------------------------------------------------------+ | Note | 1003 | /* select#1 */ select count(0) AS `COUNT(*)` from (/* select#2 */ select `test`.`t1`.`id` AS `id` from `test`.`t1`) `p` | +-------+------+-------------------------------------------------------------------------------------------------------------------------+ 1 row in set (0.00 sec)
I don't think the worse performance in 5.6 and lower will be that impactful for the typical sized database, but I will let others judge that one. There is technically a difference.
#10
@
3 months ago
Thanks @morgantocker! That's super helpful.
Whether or not I use a subselect or just replace the select_expr from the original query, I'm going to have to do some surgery on a query string in PHP. This is less than ideal without a full parser. It strikes me as safer to strip the LIMIT clause and move to a subselect, so that's what I'll do.
This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.
6 weeks ago
This ticket was mentioned in Slack in #core by hellofromtonya. View the logs.
4 weeks ago
#13
@
4 weeks ago
- Keywords early added
- Milestone changed from 5.7 to Future Release
With 5.7 Beta 1 landing in less than 6 days and no activity, we agreed during core scrub today to punt this ticket to Future Release
. Marking it as early
for 5.8.
If any maintainer or committer feels this can be resolved in time, or wishes to assume ownership during a specific cycle, feel free to update the milestone accordingly.
@javorszky Is this something you're willing to work on? As
SQL_CALC_FOUND_ROWS
can get slow for a large data set, does the replacmentSELECT COUNT(*) WHERE ...
method recommended in the MySQL docs have a performance advantage?