#8351 closed enhancement (fixed)
Poorly optimized queries in wp_get_calendar()
Reported by: |
|
Owned by: | |
---|---|---|---|
Milestone: | 3.1 | Priority: | normal |
Severity: | normal | Version: | |
Component: | Optimization | Keywords: | has-patch commit |
Focuses: | Cc: |
Description
Performance of the queries in is too bad wp_get_archives() and wp_get_calendar() if you have a lot of records in wp_posts table (so bad that hosting company suspends the account).
1. wp_get_archives():
EXPLAIN SELECT DISTINCT YEAR(post_date) AS `year`, MONTH(post_date) AS `month`, count(ID) as posts FROM wp_posts WHERE post_type = 'post' AND post_status = 'publish' GROUP BY YEAR(post_date), MONTH(post_date) ORDER BY post_date DESC +----+-------------+----------+------+------------------+------------------+---------+-------------+------+-----------------------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+------+------------------+------------------+---------+-------------+------+-----------------------------------------------------------+ | 1 | SIMPLE | wp_posts | ref | type_status_date | type_status_date | 124 | const,const | 3594 | Using where; Using index; Using temporary; Using filesort | +----+-------------+----------+------+------------------+------------------+---------+-------------+------+-----------------------------------------------------------+
Nothing you can do with this query, as grouping by expression cannot use indices by definition. The only thing is to remove DISTINCT from the query, as grouping by year and month guarantees that every (year, month) pair in the result set will be unique. Same for yearly and daily archives.
2. wp_get_calendar():
Although we can do almost nothing to wp_get_archives() (due to DB design deficiencies), we can significantly boost wp_get_calendar().
First, wp_get_calendar() checks if we have any posts at all:
SELECT ID from wp_posts WHERE post_type = 'post' AND post_status = 'publish' ORDER BY post_date DESC LIMIT 1
Since we only need to check the existence of the record, we don't have any reason to use ORDER BY (although it does nothing harmful as MySQL is able to use index).
If the query succeeded, WP starts searching for the posts in past months:
SELECT DISTINCT MONTH(post_date) AS month, YEAR(post_date) AS year FROM wp_posts WHERE post_date < '2008-11-01' AND post_type = 'post' AND post_status = 'publish' ORDER BY post_date DESC LIMIT 1
What is wrong: since we need only ONE record (see LIMIT 1), it will be unique in the result set anyway, DISTINCT is not needed. Here's the EXPLAIN of the query:
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+ | 1 | SIMPLE | wp_posts | range | type_status_date,post_type | type_status_date | 132 | NULL | 3238 | Using where; Using index; Using temporary | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
As you can see, MySQL needs a temporary table (it doesn't matter that it will have only one row, it will be created). If we remove DISTINCT, we can get rid of the temporary table:
EXPLAIN SELECT MONTH(post_date) AS month, YEAR(post_date) AS year FROM wp_posts WHERE post_date < '2008-11-01' AND post_type = 'post' AND post_status = 'publish' ORDER BY post_date DESC LIMIT 1 +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+ | 1 | SIMPLE | wp_posts | range | type_status_date,post_type | type_status_date | 132 | NULL | 3238 | Using where; Using index | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+
Then WP looks for the posts written in the next months:
SELECT DISTINCT MONTH(post_date) AS month, YEAR(post_date) AS year FROM wp_posts WHERE post_date > '2008-11-01' AND MONTH( post_date ) != MONTH( '2008-11-01' ) AND post_type = 'post' AND post_status = 'publish' ORDER BY post_date ASC LIMIT 1 +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+ | 1 | SIMPLE | wp_posts | range | type_status_date,post_type | type_status_date | 132 | NULL | 356 | Using where; Using index; Using temporary | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
Same as above: DISTINCT is not needed. But we still can rewrite the query in a better way:
EXPLAIN SELECT MONTH(post_date) AS month, YEAR(post_date) AS year FROM wp_posts WHERE post_date > '2008-11-31 23:59:59' AND post_type = 'post' AND post_status = 'publish' ORDER BY post_date ASC LIMIT 1;+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+ | 1 | SIMPLE | wp_posts | range | type_status_date,post_type | type_status_date | 132 | NULL | 1 | Using where; Using index | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+--------------------------+
We got rid of Using temporary AND we were able to make MySQL scan much less rows!
Next, WP looks for the days in the month when at least one post was published:
SELECT DISTINCT DAYOFMONTH(post_date) FROM wp_posts WHERE MONTH(post_date) = '11' AND YEAR(post_date) = '2008' AND post_type = 'post' AND post_status = 'publish' AND post_date < '2008-11-24 18:30:04'
Only range optimization can be used here. But, as I have already mentioned, MySQL is unable to use indices for expressions. Thus, our range search will only be limited from the top:
+----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+ | 1 | SIMPLE | wp_posts | range | type_status_date,post_type | type_status_date | 132 | NULL | 3594 | Using where; Using index; Using temporary | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
It is very easy to rewrite the query:
SELECT DISTINCT DAYOFMONTH(post_date) FROM wp_posts WHERE post_date >= '2008-11-01 00:00:00' AND post_type = 'post' AND post_status = 'publish' AND post_date < '2008-11-24 18:30:04' +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+ | 1 | SIMPLE | wp_posts | range | type_status_date,post_type | type_status_date | 132 | NULL | 267 | Using where; Using index; Using temporary | +----+-------------+----------+-------+----------------------------+------------------+---------+------+------+-------------------------------------------+
As you can see, MySQL needed to scan only 267 rows instead 3594, and the more records you have in your wp_posts table, the more performance is.
Finally, WP fetches the days and post titles:
SELECT post_title, DAYOFMONTH(post_date) as dom FROM wp_posts WHERE YEAR(post_date) = '2008' AND MONTH(post_date) = '11' AND post_date < '2008-11-24 18:30:04' AND post_type = 'post' AND post_status = 'publish'
The above optimization can be applied here.
Attachments (3)
Change History (23)
#1
@
16 years ago
- Keywords early added
- Milestone changed from 2.7 to 2.8
Patch seems good, but we're too close to the 2.7 final release to change queries. Let's schedule this for early 2.8 and consider backporting to a 2.7 dot release.
#7
@
16 years ago
- Cc vladimir@… added
- Resolution fixed deleted
- Status changed from closed to reopened
Attched is the patch that fixes the bug for branches/2.7 - no posts for subsequent months will be shown and queries become optimized.
NB: the patch is for branches/2.7 and has not been tested on trunk.
#9
@
16 years ago
- Resolution set to fixed
- Status changed from reopened to closed
There probably won't be another 2.7 release, and we had to revert some of the calendar queries because they broke the calendar. If you want to try a new patch for that against 2.8, reopen.
#10
@
16 years ago
- Component changed from General to Optimization
- Keywords needs-patch added; has-patch tested removed
- Resolution fixed deleted
- Status changed from closed to reopened
we had to revert some of the calendar queries because they broke the calendar
I saw that. The latest patch takes this into account.
If you want to try a new patch for that against 2.8, reopen.
OK, will create a patch for 2.8.
#11
@
16 years ago
- Summary changed from Poorly optimized queries in wp_get_archives() and wp_get_calendar() to Poorly optimized queries in wp_get_calendar()
#12
@
16 years ago
- Milestone changed from 2.8 to Future Release
punting to future, pending new patch
#15
@
15 years ago
- Keywords has-patch needs-testing added; needs-patch removed
Added a new patch against rev. 11862
#17
@
15 years ago
- Keywords tested commit added; needs-testing removed
calendar is working fine and as expected here. calendar queries run as so:
without:
SELECT DISTINCT MONTH(post_date) AS month, YEAR(post_date) AS year FROM www_posts WHERE post_date < '2009-11-01' AND post_type = 'post' AND post_status = 'publish' ORDER BY post_date DESC LIMIT 1 0.7ms SELECT DISTINCT MONTH(post_date) AS month, YEAR(post_date) AS year FROM www_posts WHERE post_date > '2009-11-01' AND MONTH( post_date ) != MONTH( '2009-11-01' ) AND post_type = 'post' AND post_status = 'publish' ORDER BY post_date ASC LIMIT 1 0.8ms SELECT DISTINCT DAYOFMONTH(post_date) FROM www_posts WHERE MONTH(post_date) = '11' AND YEAR(post_date) = '2009' AND post_type = 'post' AND post_status = 'publish' AND post_date < '2010-03-01 19:58:02' 1.4ms SELECT post_title, DAYOFMONTH(post_date) as dom FROM www_posts WHERE YEAR(post_date) = '2009' AND MONTH(post_date) = '11' AND post_date < '2010-03-01 19:58:02' AND post_type = 'post' AND post_status = 'publish' 7.4ms
with:
SELECT MONTH(post_date) AS month, YEAR(post_date) AS year FROM www_posts WHERE post_date < '2009-11-01' AND post_type = 'post' AND post_status = 'publish' ORDER BY post_date DESC LIMIT 1 0.7ms SELECT MONTH(post_date) AS month, YEAR(post_date) AS year FROM www_posts WHERE post_date > '2009-11-30 23:59:59' AND post_type = 'post' AND post_status = 'publish' ORDER BY post_date ASC LIMIT 1 0.8ms SELECT DISTINCT DAYOFMONTH(post_date) FROM www_posts WHERE post_date >= '2009-11-01 00:00:00' AND post_type = 'post' AND post_status = 'publish' AND post_date <= '2009-11-30 23:59:59' 0.6ms SELECT post_title, DAYOFMONTH(post_date) as dom FROM www_posts WHERE post_date >= '2009-11-01 00:00:00' AND post_date <= '2009-11-30 23:59:59' AND post_type = 'post' AND post_status = 'publish' 0.8ms
Patch for the Ticket #8351