WordPress.org

Make WordPress Core

Opened 5 years ago

Closed 5 years ago

Last modified 5 years ago

#31171 closed defect (bug) (duplicate)

Very slow db query in get_lastpostmodified() which is called by WP feeds results in large table scan and sort using filesort

Reported by: archon810 Owned by:
Milestone: Priority: normal
Severity: normal Version: 4.1
Component: Feeds Keywords:
Focuses: performance Cc:
PR Number:

Description

Hey guys,

In my quest to improve performance of core Wordpress functionality (see #31071 and #31072), I'm back with another optimization that is quite significant for large WP installations.

As far as I can tell, the issue comes up in a function called _get_last_post_time(), which is used by get_lastpostmodified() and in turn all the feeds feed-rss2.php, etc.

The query is as follows:

SELECT post_modified_gmt FROM wp_posts WHERE post_status = 'publish' AND post_type IN ('post', 'page', 'attachment') ORDER BY post_modified_gmt DESC LIMIT 1

In our database with 286k rows in wp_posts, this query scans 22,471 rows and then sorts using filesort - an especially nasty operation.

Here is the result of EXPLAIN:

+----+-------------+----------+-------+-----------------------------------------------------------------+------------------+---------+------+-------+-----------------------------+
| id | select_type | table    | type  | possible_keys                                                   | key              | key_len | ref  | rows  | Extra                       |
+----+-------------+----------+-------+-----------------------------------------------------------------+------------------+---------+------+-------+-----------------------------+
|  1 | SIMPLE      | wp_posts | range | type_status_date,post_status,post_type_mime_type,post_type_date | type_status_date | 124     | NULL | 22471 | Using where; Using filesort |
+----+-------------+----------+-------+-----------------------------------------------------------------+------------------+---------+------+-------+-----------------------------+

No index is used, as you can see.

Testing it with SQL_NO_CACHE, this query runs anywhere between 1s and 3+s. Needless to say, feeds get hit a lot by various crawlers and this query gets run a lot too.

The fix, which results in several orders of magnitude of speed improvements is as follows:

CREATE INDEX `modified_gmt_status_type` ON `wp_posts`(`post_modified_gmt`, `post_status`, `post_type`);

EXPLAIN says this afterwards:

+----+-------------+----------+-------+-----------------------------------------------------------------+--------------------------+---------+------+------+--------------------------+
| id | select_type | table    | type  | possible_keys                                                   | key                      | key_len | ref  | rows | Extra                    |
+----+-------------+----------+-------+-----------------------------------------------------------------+--------------------------+---------+------+------+--------------------------+
|  1 | SIMPLE      | wp_posts | index | type_status_date,post_status,post_type_mime_type,post_type_date | modified_gmt_status_type | 132     | NULL |   13 | Using where; Using index |
+----+-------------+----------+-------+-----------------------------------------------------------------+--------------------------+---------+------+------+--------------------------+

Not only is this query faster to query the database, it even returns the result from the index directly, thus running even faster.

Please consider testing this and adding to WP core.

Change History (2)

#1 follow-up: @nacin
5 years ago

  • Milestone Awaiting Review deleted
  • Resolution set to duplicate
  • Status changed from new to closed

Duplicate of #15499.

#2 in reply to: ↑ 1 @archon810
5 years ago

Replying to nacin:

Duplicate of #15499.

Good call, though the solution proposed in #15499 is not as optimized. I'll comment there.

Note: See TracTickets for help on using tickets.