#24044 closed defect (bug) (fixed)
Add index to wp_options to aid/improve performance
Reported by: | DanBUK | Owned by: | pento |
---|---|---|---|
Milestone: | 5.3 | Priority: | normal |
Severity: | normal | Version: | |
Component: | Options, Meta APIs | Keywords: | has-patch has-dev-note |
Focuses: | performance | Cc: |
Description
I work for a hosting company and we manage many '000s of WP sites for people. In order to improve the performance of all of these instances I find myself adding an index onto wp_options autoload many times daily.
I'm aware this is a small gain, but any gain is good.
This index will be hit on every single page load.
Attachments (9)
Change History (96)
#1
follow-up:
↓ 40
@
11 years ago
Could you show some statistics for how this helps performance on your sites? See ticket:14258#comment:3 for why we have not done this.
#2
@
11 years ago
- Summary changed from Add index to wp_options to aid improve performance to Add index to wp_options to aid/improve performance
This may be the case, but there are some godforsaken ugly plugins out there which mean that wp_options ends up '000s of rows. In these situations this will help with performance.
Also the maintenance of such an index is not placing a lot of burden on the DB. Overall this is reducing the work to deliver the page, that is what matters.
#3
@
11 years ago
Also the index will still be used for statistics, rather than assuming a full table scan every time. The index has a high level of cardinality.
#4
follow-up:
↓ 5
@
11 years ago
And assuming that all wp_options tables will have > 30% autoload == yes isn't prudent, there are many ways in which people use the wp_options table.
#5
in reply to:
↑ 4
@
11 years ago
Replying to DanBUK:
And assuming that all wp_options tables will have > 30% autoload == yes isn't prudent, there are many ways in which people use the wp_options table.
I disagree; it's a very fair assumption. "Ugly" plugins that add thousands of rows to a table are still very, very likely to autoload all of them, because that's the default. The alternative is each get_option() is an extra query, which you'd probably notice as a host.
The maintenance of an index is not placing a lot of burden, no. But if the index is just sitting there waiting to be ignored, then it doesn't do us much good. The larger issue is that I haven't seen any evidence — in terms of performance or otherwise — to actually show that an index here would reduce the work to deliver the page.
I'm not saying you've added an index to thousands of sites yet assuredly have no performance improvements to show for it. I'm saying, show us the measurable improvement so we can see it for ourselves.
#9
@
11 years ago
On a table with above 1k rows in wp_options
Before autoload index:
mysql> explain select option_name, option_value from wp_options where autoload = 'yes'; +----+-------------+------------+------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | wp_options | ALL | NULL | NULL | NULL | NULL | 1460 | Using where | +----+-------------+------------+------+---------------+------+---------+------+------+-------------+
After adding index named i1
mysql> create index i1 on wp_options(autoload, option_name, option_value(50)); mysql> explain select option_name, option_value from wp_options where autoload = 'yes'; +----+-------------+------------+------+---------------+------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+------+---------------+------+---------+-------+------+-------------+ | 1 | SIMPLE | wp_options | ref | i1 | i1 | 62 | const | 178 | Using where | +----+-------------+------------+------+---------------+------+---------+-------+------+-------------+
After the index was created the total rows searched dropped from 1460 to 178.
#10
@
11 years ago
- Keywords dev-feedback added
On a WordPress 'app' in which lots of non-autoloaded transients are used, I got similar results, running the same queries as @krazybean.
Before:
mysql> explain select option_name, option_value from wp_options where autoload = 'yes'; +----+-------------+------------+------+---------------+------+---------+------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+------+---------------+------+---------+------+------+-------------+ | 1 | SIMPLE | wp_options | ALL | NULL | NULL | NULL | NULL | 8196 | Using where | +----+-------------+------------+------+---------------+------+---------+------+------+-------------+ 1 row in set (0.01 sec)
After:
mysql> create index i1 on wp_options(autoload, option_name, option_value(50)); Query OK, 0 rows affected (0.52 sec) Records: 0 Duplicates: 0 Warnings: 0 mysql> explain select option_name, option_value from wp_options where autoload = 'yes'; +----+-------------+------------+------+---------------+------+---------+-------+------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+------+---------------+------+---------+-------+------+-------------+ | 1 | SIMPLE | wp_options | ref | i1 | i1 | 62 | const | 155 | Using where | +----+-------------+------------+------+---------------+------+---------+-------+------+-------------+ 1 row in set (0.00 sec)
Understood this may not affect the general population, but I wonder how we could best help folks who make extensive use of transients in certain situations.
#11
follow-up:
↓ 41
@
11 years ago
What's the purpose behind adding option_value to that index? It looks like it would only serve to increase the length of the index.
What's the performance difference on these queries? Reducing rows examined from 1460 to 178 (or even 10,000 to 100) is still peanuts. DanBUK mentioned adding this index many times daily, but I am trying to figure out if that was simply cathartic or if it was actually making a difference.
#12
@
11 years ago
I was just trying to replicate his results.
I wish I had tracked the difference when I added it to a much larger version of that same app—important queries went from several seconds in some cases to < 1 in all. I assure you it was directly related to adding an autoload index.
Again, I know this isn't the most common configuration, but it's allowing me to work with tons of transients at once with decent page load times.
If you can tell me what sort of feedback/results you need, I'll find a way to get it for you.
#13
@
11 years ago
- Keywords reporter-feedback added; dev-feedback removed
There are a few things to consider here.
First up, there's unlikely to be any benefit to adding (option_name, option_value(50))
to the index. There are no queries in WordPress core that would use them.
So, if we were going to add an index here, it would just be on autoload
, which would have a very low cardinality. For some sites, this would potentially avoid a table scan, but I'm inclined to think that most sites would have >30% of rows with autoload
set to yes
.
Anyway, to determine if this index would be helpful to you, and the degree of helpfulness, it would be most useful to see how long the query takes to run with and without the index, rather than looking at the number of rows read. (If you're running on a test machine, run the query at least twice, to ensure the cache is warmed up.
If you're not using InnoDB, I'd also be curious to see how switching to that, with a large innodb_buffer_pool_size
, changes the query time.
Finally, if you're keeping stats on page generation time, I'd be interested to to see how this changes in production with an autoload
index.
#14
@
11 years ago
Agreed on adding that specific index—again, just trying to replicate results. Simply adding an autoload index is what made the monumental change in my case.
I'll run these tests and post the results here. Thanks!
#16
@
11 years ago
On an InnoDB database with about 615K rows in wp_options
, here's what came across, running locally in MAMP. I started with an autoload
index, removed it, and added it back.
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'; 628 rows in set (0.01 sec) ALTER TABLE wp_options DROP INDEX autoload; SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'; 628 rows in set (2.30 sec) CREATE INDEX autoload ON wp_options(autoload, option_name); SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'; 628 rows in set (0.00 sec)
Does that help?
#17
@
11 years ago
Also, I went through these steps again and ran the SELECT query a few more times after removing the index as requested. Still got:
628 rows in set (2.24 sec) 628 rows in set (2.24 sec) 628 rows in set (2.23 sec)
Then:
CREATE INDEX autoload ON wp_options(autoload, option_name); SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'; 628 rows in set (0.01 sec) 628 rows in set (0.01 sec) 628 rows in set (0.00 sec)
#19
@
11 years ago
Because the Transients API has been perfect for a specific project needing lots of individual external API calls cached.
Again, not arguing for wp_options
size, just trying to show where the index can make a big difference.
#20
follow-up:
↓ 22
@
11 years ago
I figured it would be transients.
You should enable an external object cache, that way transients aren't stored in the database. If the lack of an index on this table forces hosting companies to consider the availability of an external object cache for their users, I'm not necessarily going to be bothered by that.
I find it hard to justify an index that benefits an infinitesimal number of sites, probably slows down many others, especially were the sites affected have better steps to take to improve performance.
#21
@
11 years ago
Actually, in this specific case, object caching _was_ enabled and kept getting overloaded, so this was the only option after having to disable it.
Again, totally understood this is not a common case, but I also wanted to demonstrate that it could make a big difference for some.
Here's a patch just in case. :)
#22
in reply to:
↑ 20
;
follow-ups:
↓ 23
↓ 24
@
11 years ago
Sorry I never came back with statistics. I'll try and find some of the customer sites I've worked on and generate some data.
Replying to pento:
If you're not using InnoDB, I'd also be curious to see how switching to that, with a large
innodb_buffer_pool_size
, changes the query time.
In almost all cases everyone should be using InnoDB for a WP database, excluding the tables that have FULLTEXT indexes of course. (Then again, full text indexing should really be shifted to something like Solr, but that is another story...) The nature of the wp_comments especially cannot scale with MyISAM, the write level of a busy WP site means that you'll end up with table locks. And the number of plugins that utilise wp_meta or wp_options to store post views, or similar that get updated on every view... Again leads to locking that blocks the viewing of the site.
Just by having a large buffer pool that doesn't change how the MySQL query planner/statistic collator/table scanning works. It's not about if the data is in RAM or not, the wp_options table even in the cases I've seen with up to 10k rows, isn't actually a lot of data, and will either end up in the buffer pool or the OS file system caches. The issue is that the planning has to examine all the rows in the table, every time the page is loaded. That's dependent upon CPU time.
Heck you can put your buffer pool up to 5x your database size and you'll see 0 performance improvement over 1x. Ideally the buffer pool should be as big as the dataset that is accessed often. Ergo if you have a 10GB InnoDB dataset and 9GB of that is archive of logs/actions/very old posts/etc you only really need a 1GB buffer pool.
But that is the thing, we are moving into the realm of database optimisation, and to be honest from my view that isn't always seen by developers. Don't even get me started on Magento.
Replying to nacin:
You should enable an external object cache, that way transients aren't stored in the database. If the lack of an index on this table forces hosting companies to consider the availability of an external object cache for their users, I'm not necessarily going to be bothered by that.
Quite often the hosting company cannot control the customer fully. The customer may have limited budget and we work within their constraints.
Also there are lots of plugins out there that are already using transients, most users of WP that are not developers/sys admins/etc do not understand that maybe they have used the inappropriate method of storing data.
Replying to nacin:
I find it hard to justify an index that benefits an infinitesimal number of sites, probably slows down many others, especially were the sites affected have better steps to take to improve performance.
How do you think this will slow things down? It is an index of boolean cardinality, therefore the binary tree will have only two branches (YES|NO). The addition/updating of rows only require very small index changes. I really cannot understand how you think it will slow things down.
#23
in reply to:
↑ 22
@
11 years ago
Replying to DanBUK:
But that is the thing, we are moving into the realm of database optimisation, and to be honest from my view that isn't always seen by developers. Don't even get me started on Magento.
My main basis for this is that developers don't see the application later in production on much busier sites that their test box that takes 10 requests per minute. Once you are dealing with a site that takes 100 requests per second minor changes to indexes in this suggestion make a huge difference. The Read/Write ratio is what counts.
#24
in reply to:
↑ 22
;
follow-up:
↓ 25
@
11 years ago
Replying to DanBUK:
But that is the thing, we are moving into the realm of database optimisation, and to be honest from my view that isn't always seen by developers.
Pardon me :) I'm just watching over this ticket, playing devil's advocate and working to turn assumptions into data. On the other hand, pento spent something like four years at MySQL AB and Oracle. So feel free to continue to geek out with him.
#25
in reply to:
↑ 24
@
11 years ago
Replying to nacin:
I'm just watching over this ticket, playing devil's advocate and working to turn assumptions into data.
That's needed! I've been using this for a while now, but never saw a ton of good data on it either.
How can we test the impact of the index on a DB on the other extreme and test the assumption it might slow things down?
#26
@
11 years ago
Okay, here are some test results! All of this was on MySQL 5.5, I didn't test on other versions. The MyISAM results should be largely the same on other versions, the InnoDB results may be a little slower on older versions, but I don't expect there to be massive differences.
Anyway, here are the variables tested:
- Engine: InnoDB or MyISAM
- Rows: 100, 1000, 10000, 100000, 1000000
- Percentage of rows selected: 5%, 20%, 50%, 80%, 95%
The full testing script and results are attached, but here's the summary:
- Adding this index causes nearly all queries to be slower, except for InnoDB at 100k rows. I have no idea why this is the case.
- Once MyISAM gets to 100K rows, adding the index makes it exponentially slower.
- I killed the test script before it finished MyISAM 1M rows, because each query was taking > 2 minutes to run.
#27
@
11 years ago
- Keywords reporter-feedback removed
I've performed some slightly different testing. With some restrictions.
As we are only issuing queries with autoload=yes the results for =no are not relevant.
With any site that has a high level of read/write concurrency MyISAM won't scale. (You would be surprised how many modules update wp_options on every page load.)
The results indicate that only in the case when 50% or more of the rows have autoload=yes things slow down. But this is by a negligible amount, even with 100k rows in wp_options the difference was only 0.02s (6.45%). Yet in the case of 100k rows and 20% of the rows have autoload=yes the gains in speed are 0.07s (35%).
Therefore I propose that the gains offered by adding this index outweigh the losses.
#28
@
11 years ago
And just for completeness, similar results as I mentioned for innodb are the same for MyISAM.
#29
follow-up:
↓ 31
@
11 years ago
With attachment:dan_test.php, I still had the same result of MyISAM getting really slow for large tables:
myisam_VARCHAR20_Size_100000_Percent_50 'YES' 0.97143130302429 myisam_VARCHAR20_INDEXED_Size_100000_Percent_50 'YES' 11.403997087479 >> Slower myisam_VARCHAR20_Size_100000_Percent_20 'YES' 0.96122975349426 myisam_VARCHAR20_INDEXED_Size_100000_Percent_20 'YES' 5.91277384758 >> Slower myisam_VARCHAR20_Size_100000_Percent_5 'YES' 0.94139242172241 myisam_VARCHAR20_INDEXED_Size_100000_Percent_5 'YES' 1.2993580341339 >> Slower
As much as I'd love to ignore MyISAM, the vast majority of WordPress sites use it, including WordPress.com. While we're clearly getting different results here, we can't take the risk that other sites will suffer the same slowness I'm seeing.
#30
@
11 years ago
So, the utility of this index is (roughly) inversely proportional to the percentage of rows where autoload = 'yes'? That'd make sense, considering, in my case, 0.001% of rows are such.
To me, this now goes back to:
Replying to nacin:
"Ugly" plugins that add thousands of rows to a table are still very, very likely to autoload all of them, because that's the default.
If that's the case, is there any way we can 'police' this behavior better with plugins, or define and encourage a best practice for it?
#31
in reply to:
↑ 29
;
follow-up:
↓ 32
@
11 years ago
Replying to pento:
How do you have your MySQL instance configured? (I'm really surprised your getting such slow results.)
I have the following:-
thread_cache_size = 8
innodb_buffer_pool_size = 1G
key_buffer_size = 128M
table_cache = 1024
(Running on a 2CPU / 2GB KVM Centos 6 VM, gah I just noticed it's only MySQL 5.1, will install 5.5 and re-test.)
I added my MyISAM results as an attachment earlier.
#32
in reply to:
↑ 31
@
11 years ago
Replying to DanBUK:
(Running on a 2CPU / 2GB KVM Centos 6 VM, gah I just noticed it's only MySQL 5.1, will install 5.5 and re-test.)
I'm getting similar results on 5.5 MyISAM:-
myisam_VARCHAR20_Size_100000_Percent_50 'YES' 0.23201403617859 myisam_VARCHAR20_INDEXED_Size_100000_Percent_50 'YES' 0.29787020683289 >> Slower myisam_VARCHAR20_Size_100000_Percent_20 'YES' 0.13488883972168 myisam_VARCHAR20_INDEXED_Size_100000_Percent_20 'YES' 0.12055621147156 >> Faster myisam_VARCHAR20_Size_100000_Percent_5 'YES' 0.086687040328979 myisam_VARCHAR20_INDEXED_Size_100000_Percent_5 'YES' 0.030073165893555 >> Faster
#33
@
11 years ago
I've had a bit more of a play around with my MySQL settings, but I haven't managed to make performance suck less on MyISAM.
I'm using a standard VVV install for testing. You're welcome to play around with it and see if you can get better performance out of it, but I'm strongly inclined to not add this index. VVV is a pretty popular base for WordPress dev shops to create sites with these days, so if I'm running into performance problems, there's a good chance a bunch of other sites will, too.
#34
@
10 years ago
At Pagely on a semi-regular basis we end up with sites with hundreds of thousands of rows in wp_options. The cause of this is normally storing sessions in wp_options. In a case like this having the index on autoload is huge win.
Before index
mysql> explain select option_name, option_value from wp_options where autoload = 'yes'; +----+-------------+------------+------+---------------+------+---------+------+--------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+------+---------------+------+---------+------+--------+-------------+ | 1 | SIMPLE | wp_options | ALL | NULL | NULL | NULL | NULL | 807663 | Using where | +----+-------------+------------+------+---------------+------+---------+------+--------+-------------+
After index:
mysql> explain select option_name, option_value from wp_options where autoload = 'yes'; +----+-------------+------------+------+---------------+----------+---------+-------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +----+-------------+------------+------+---------------+----------+---------+-------+------+-----------------------+ | 1 | SIMPLE | wp_options | ref | autoload | autoload | 62 | const | 265 | Using index condition | +----+-------------+------------+------+---------------+----------+---------+-------+------+-----------------------+
We are talking about reducing query time by 1 second in this instance. As the average size of wp_options grows and grows its a nice low cost index to add that protects people from plugins that use wp_options as a temp data store.
#36
@
10 years ago
Even without thousands of rows, we found benefits in adding this index ourselves. I'd advocate for it to be added in core.
#37
@
10 years ago
I hope you don't mind me barging in here... A referrer from ticket #29637 to my site brought me to this ticket, and apparently it has been submitted more than once (#14258). So there is some demand for this.
In a live-situation, adding an index on the autoload column resulted in a drop of examined rows, from 33927 to 1508. I call that significant. The real performance gain depends on a lot of factors, I know, but I just wanted to share these numbers.
PS: before one starts to install and test with MySQL 5.5 (or 5.6 for that matter), notice the innodb_buffer_pool_instances
*and* innodb_buffer_pool_size
settings. One can even increase InnoDB performance by adjusting innodb_write_io_threads
and innodb_read_io_threads
. As long as you have the RAM. Don't use MyISAM (IMO).
#38
@
10 years ago
- Keywords close added
I think we need to pull the trigger and make a decision here rather than letting this ticket waste away.
All the testing shows that adding it reduces performance for MyISAM which if we take a "first, do no harm" approach, means that this comes accross as a non starter. I think we should close as maybelater and if evidence of it not harming performance for MyISAM comes up, we can revisit.
This of course doesn't prevent individual site owners from adding the index themselves.
#39
@
10 years ago
- Keywords close removed
- Milestone Awaiting Review deleted
- Resolution set to maybelater
- Status changed from new to closed
Agreed, I'm open to revisiting this when we have more InnoDB users. As InnoDB use is steadily increasing, we'll get there eventually. :-)
In the mean time, for folks who happen upon this ticket, you have two options:
- Try adding this index yourself, and see if it improves performance for you
- Use an external object cache to avoid hitting the DB
#40
in reply to:
↑ 1
@
9 years ago
Replying to nacin:
Could you show some statistics for how this helps performance on your sites? See ticket:14258#comment:3 for why we have not done this.
Here are some real world stats. Here you can see when the index was added to the options table:
Here you can see average transaction times decrease when the index was added:
And here you can see the load on the database decrease dramatically when the index was added:
I think adding the index is highly valuable. This comes from a site running WooCommerce where a large percentage of the records in the options table are transients, so there's no reason for them all to be loaded on every page load.
#41
in reply to:
↑ 11
;
follow-up:
↓ 42
@
9 years ago
Replying to nacin:
What's the purpose behind adding option_value to that index? It looks like it would only serve to increase the length of the index.
What's the performance difference on these queries? Reducing rows examined from 1460 to 178 (or even 10,000 to 100) is still peanuts. DanBUK mentioned adding this index many times daily, but I am trying to figure out if that was simply cathartic or if it was actually making a difference.
How about reducing 553,749 to 711?
#42
in reply to:
↑ 41
@
9 years ago
Replying to fightthecurrent:
Replying to nacin:
What's the purpose behind adding option_value to that index? It looks like it would only serve to increase the length of the index.
What's the performance difference on these queries? Reducing rows examined from 1460 to 178 (or even 10,000 to 100) is still peanuts. DanBUK mentioned adding this index many times daily, but I am trying to figure out if that was simply cathartic or if it was actually making a difference.
How about reducing 553,749 to 711?
Nice graphs @fightthecurrent.
We all know the performance gain - on InnoDB - with having an index on the autoload column of the wp_options table.
The following is somewhat based on assumptions, and needs some investigation (it's early Sunday morning here...): nowadays at least most hosting providers are on MySQL version 5.5+, which has InnoDB as the default storage engine. The WordPress installer (wp-admin/install.php
) already checks the MySQL version being used, and there is no engine definition in wp-admin/includes/schema.php
. Thus making an extra check on MySQL storage engine quite possible:
ITTT: If MySQL version >= 5.5 && storage engine = InnoDB, add an index on the autoload column.
There is more than one way to lookup the storage engine: http://stackoverflow.com/questions/213543/how-can-i-check-mysql-engine-type-for-a-specific-table. For plugins that require FULLTEXT indexes on table columns: MySQL >= 5.5.4 supports FULLTEXT indexes in InnoDB (http://dev.mysql.com/doc/refman/5.6/en/fulltext-restrictions.html).
#43
@
9 years ago
- Resolution maybelater deleted
- Status changed from closed to reopened
- Version 3.6 deleted
I was having unusual (and climbing) CPU usages and frequent showings in the slow query log of this query, so I added the index for each Wordpress blog on my SQL server (118 of them), CPU usage for MySQL alone dropped by a factor of 3+ (from about 30-35% to 10%). NO OTHER changes were made..
The databases are a mix of MyISAM (older blogs) and InnoDB (newer ones), ALL blogs except 1 are current Wordpress 4.3. MySQL is 5.5.44 running on Ubuntu 14.04 64 bit on a VMware ESXi host backed by enterprise grade SAN storage. The MySQL server is quad (virtual) core, 8GB ram.
The change to all blogs was made at 13:20 (see picture linked below). The remaining blogs had the index added over a 10 minute period starting around 13:30
[http://imgur.com/qOce6KT
]
This is a fairly busy MySQL server handing about 2100 queries per second pushing about 100Mbits/sec of traffic continuously.
I encourage that this index be added as the drop in resource consumption is significant nad it has essentially no downside to having it there.
#44
follow-up:
↓ 49
@
9 years ago
- Severity changed from normal to major
Here are some results from a test some months old:
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'; 628 rows in set (2.24 sec) 628 rows in set (2.24 sec) 628 rows in set (2.23 sec)
CREATE INDEX autoload ON wp_options(autoload, option_name); SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'; 628 rows in set (0.01 sec) 628 rows in set (0.01 sec) 628 rows in set (0.00 sec)
As you can see, the difference is huge. I may roll out a local copy of this site to do some test, if it helps.
#45
follow-up:
↓ 46
@
9 years ago
From my understanding, MySQL 5.5 by default uses InnoDB. Looks like statistics have most likely gone up since this ticket was created. About 70% are on MySQL 5.5+, and 2% are on MariaDB :joy:
So unless there are hosts that are switching from InnoDB to MyISAM by default, seems like the vast majority are now on InnoDB.
And judging from the results people have been posting in the past 4 months, seems like there are huge advantages to adding the index. Time to revisit?
#46
in reply to:
↑ 45
@
9 years ago
Replying to KalenJohnson:
[...]
So unless there are hosts that are switching from InnoDB to MyISAM by default, seems like the vast majority are now on InnoDB.
Unfortunately, upgrading your MySQL version to 5.5+ doesn't convert your existing MyISAM tables automagically... You'll need to do that yourself, either through MySQL/phpMyAdmin or using a plugin.
And judging from the results people have been posting in the past 4 months, seems like there are huge advantages to adding the index. Time to revisit?
Agreed! :) There is a number of plugins to convert MyISAM tables to InnoDB, and it's up to "us" (I work for a hosting company) to inform clients and customers about such important changes, and to recommend them to convert their existing database tables.
#47
@
9 years ago
So if we go around in circles... As I have with my view of how it's been received.
The original resistance to adding this index was due to MyISAM usage; even by WordPress.org. Which at any time is a bad idea. We are now at a time when InnoDB or other newer storage engines exist and are used by default.
- Every lookup of a page on a wordpress site checks the wp_options for how the plugins should operate.
- They depend upon autoload being true
- It's cardinality of index is very high, its either true or false.
Just add the index...
#49
in reply to:
↑ 44
@
9 years ago
I have a wp_options table with over 600,000 records, mostly WooCommerce sessions and transients, which used to load the initial "autoload" query in 2.5 seconds, just like @francogilio. After creating a secondary index on "autoload", it's loading in 0.01 seconds and the site is MUCH faster in general, possibly because this query prevented table updates while it was being scanned without the index.
To me, such a small index is not something that should be so strictly opposed, considering it affects a core query executed by every page. It's just likely to speed many sites up, including some sites whose owners don't know phpMyAdmin from a bar of soap.
#51
@
9 years ago
Lets create an conditional alter statement which adds an index only when innodb is used! We have the year 2016, innodb is used as the default db in all serious hosting companies today.
Make a cut and do not stuck in the past. This small change will save energy, costs and natural resources worldwide. Save a tree, add an index to wp_options!
#52
@
9 years ago
I am just looking into performance issues on someone's WordPress site and why after 3 years has nothing been done?
There is no logic to making autoload of type VARCHAR(20) as it wastes space and is inefficient as shown above and an index would drastically improve performance for all the many installs out there.
I would have thought a type of CHAR(1) would be best with Y and N but obviously this would take development time whereas adding the index would not.
#53
@
9 years ago
Robin has a good point. Would MyISAM still suffer a performance hit at all if the column was an indexed CHAR(1) instead of an indexed VARCHAR(20)?
#54
@
8 years ago
+1 to changing autoload field to another type like TINYINT or ENUM('no','yes') if you dont add an index to the autoload field
University of Southern California's (USC) WP site was hit by this issue luckily during move-in week. We use WP as an Enterprise portal. All USC members (up to 100k users) access our WP portal as a gateway for services. Every users's portal page is unique = no page caching.
USC has:
mysql InnoDB
WP multisite
lots of users which indirectly has lots of wp_options rows
uses transients to store complex queries/data
does not use an object cache (work in progress)
huge database server sized for our usage + growth (8 vCPUs, 16GB RAM)
One day our database CPU went to 100%. I noticed lots of slow queries doing full table scans. When I ran 'show processlist' I saw hundreds of these -
SELECT option_name, option_value FROM mywp_options WHERE autoload = 'yes';
I immediately dropped an index on it and the cpu usage went from 100% to under 5%:
create index uscautoload on wp_options(autoload, option_name);
Crisis averted and I get to keep my job. I was the one who pushed to use WP as our portal
#55
@
8 years ago
On MariaDB 10.1.9
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes' (262 total, Query took 0.0156 seconds.)
CREATE INDEX autoload ON wp_options(autoload, option_name); SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes' (262 total, Query took 0.0000 seconds.)
#56
@
8 years ago
Is this ever likely to happen?
It seems it was actually requested here 6 years ago and closed as won't fix: #14258
There are two issues here:
- There is a missing index (probably resulting in a table scan each time)
- An unsuitable data type has been used of VARCHAR(20) (CHAR(1) would be much better with Y/N or better still BIT with 0/1)
The two issues compound each other and I feel these are fairly basic relational database design principles.
The first would require no product changes and would be low risk to implement
The second would require some development work (table could be altered to VARCHAR(3) for a smaller improvement)
Still after 6 years of asking, nothing has yet been done.
Robin
#57
@
8 years ago
The test results showed that while the speed improvement is common for innodb
table, but is the exact opposite for myisam
tables. Therefore, it's not a change with a low risk as it has the potential to make many sites slower.
In the last few years, with mysql 5.5+ becoming much more popular, it's likely that new sites are using innodb
, but I don't know if we have any stats on innodb
vs myisam
. @pento - Any idea on that?
An issue with only implementing the change on sites that are using innodb
for the options table is that people move DBs around without knowing the versions. This was seen a lot during the switch from utf8
to utf8mb4
For the sites that benefit from this index, I do encourage them to add the index themselves (with a prefix). Core DB migrations don't remove individually added indexes.
If someone wants to work up some tests on BIT, with an index, without an index, on multiple versions of MySQL using both innodb and myisam than we can explore that as an option. A change like that would need to be committed early in a dev cycle (like we are right now) but can't be made without data showing it would be beneficial for some (though ideally, it would be beneficial for everyone) and have no little to no harm for everyone.
#58
@
8 years ago
It should be easy enough to write code that checks for the storage engine and does the best for what it finds.
It should be equally as easy to decide that InnoDB is the way to go for wp_options on MySQL (5.5+), given the the update frequency and granularity, an simply ensure that that's the case, possibly during install and core updates.
In fact, the code can even run some test queries with and without the index, and leave the better one.
It's a wonderful world of WordPress out there :)
Personally, I've switched to InnoDB and created the autoload index on wp_options and all of my sites smile at me as they zoom by.
#59
@
8 years ago
@jorbin, thanks for the reply.
I'm just wondering but could an upgrade script not possibly move WordPress (or at least that table) to InnoDB maybe with a prompt during the upgrade stating the benefits of InnoDB as otherwise people may just carry on using MyISAM without realising there is a choice and the issue will never be able to be solved with possible further fragmentation in future.
I helped a client solve an issue where they were using MyISM and Dutch as the codepage because the previous hosting provider had these set as default (the website was not Dutch).
Also the Ticket URLs in emails never work either and display this error:
No handler matched request to /ticket/24044#comment:58
This is because the # is converted into %23
#60
@
8 years ago
So i just finished a set of tests using ~750 real wp_options tables on mysql 5.6.
Here are the takeaways.
For 90% of my sample set didn't matter either way. Those sites has 2-500 options, and autoloaded 90%+ of them. With either Myisam or innodb index or not, the performance difference is in noise land.
The queries normally took around 0.001 seconds and the speed difference was in the 0.0002 range.
In general MyISAM is slightly slower in these cases, and Innodb is slightly faster, but I wouldn't consider either difference meaningful.
The final 10% are mostly 2 cases.
Sites with huge #s of options all autoloaded, this is slower in both myisam and innodb, but the absolute performance already sucked, and the php performance of autoloading 100k options is going to be a couple orders of magnitude worse then the db side so I think we can just ignore that outlier.
The remaining case seems to be 20% or less of the options autoloaded.
So my overall take is this change would be mostly a wash for the majority of site, with big performance wins for innodb in the case of small autoload, large # of rows (myisam is also helped in this situation). With the biggest downside being for innodb as well, large rows all autoloaded.
I'm not seeing the same huge performance losses for myisam on a large # of rows, though I would imagine that depending on the database configuration and amount of memory that could be an issue.
The one thing none of this testing takes into account though is the worst case situation. If you are scanning 100k rows in wp_options to return 500, and do that query at volume you can easily run your database out of cpu, and then performance of that query will fall apart.
Summary from my run at.
https://gist.github.com/jeichorn/d33dfcb6ba687dd84fb65945f59a4f5a
#61
@
8 years ago
Going back to a point that was mentioned a while back, transients likely account for the vast majority of options that are not autoloaded, so perhaps the solution should be applied only when a site has enough of them to make a difference.
Personally, I've seen loads of them created by WooCommerce, and some by Yoast SEO. Perhaps this is not a core issue, but one that should be addressed according to the plugins' use of transients. Problem is I can't think of a way to enforce this as a core edict, only as an appeal to the respective plugin developers.
Maybe there can be an upgrade or periodic task to look at the ratio of autoloaded options to the total rows and if it exceeds, say, 0.5, the table can be altered. Alternatively, the number of transients can be counted and if greater than, say, 500, the table can be altered.
#62
@
8 years ago
I am doing some post-op investigation after our outage. It looks like there are multiple triggers to our failure and not due just to the lack of the autoload index. I am trying to figure out two things:
1) when is this query triggered - select option_name, option_value from wp_options where autoload = 'yes';
2) why is the query being run multiple times? During our outage, there were hundreds of the query running.
I think a high traffic site exacerbates this problem. Info on our install:
- WP multisite
- five 8 CPU 16GB RAM WP instances
- one 8CPU 16GB database MySQL InnoDB
- 100 WP requests per second during peak traffic. These are dynamic requests, not static/assets
- Our wp_options row count fluctuates in the 50-100k range, autoload='yes' count is in the 100s
We are planning to add a caching layer, either Redis or memcached, which will increase our use of transients. Will we run into a problem even with the autoload index when we have thousands of options autoloaded? I need to find an answer to my two questions above.
Running the query below on our test system without an index, runs really fast even though it is doing a full table scan (.01 seconds); but the row count on wp_options is 3k. This is why I suspect the lack of index alone is not an issue at least for our case.
select option_name, option_value from wp_options where autoload = 'yes';
#63
@
8 years ago
@jkhongusc yeah so thats what i mentioned at the end of my notes. Table scan queries even when fast are heavy cpu users on the database. If you do your benchmarks at high concurrency you should see a larger effect from the index.
#64
@
8 years ago
@jeichorn -
I was implying that there may be a flaw in the logic when all options are loaded; essentially when calling wp_load_alloptions() . wp_options is site-wide data. IMO autoloading data should optimally occur once per server instance, across all instances pseudo-simultaneously.
I submit that in a high traffic environment, like ours having 100 (dynamic) requests per second, could result in a similar database CPU lockup even with the autoload index. When the 'alloption' cache is deleted, WP will attempt to refresh the cache for every single request until that cache is added. If there are lots of transients (thousands) the query could take too long to fulfill before the next set of requests come in. Then a cascading issue occurs and the database CPU locked at 100%. Sounds like a design flaw that will only hit high traffic sites.
#65
@
8 years ago
It looks like the function wp_load_alloptions() is called many times, but it caches its results using wp_cache_add().
Alas, it doesn't cache options when is_multisite() is true.
#67
@
8 years ago
Having an index on a boolean column can make a significant difference - more so in more recent versions of MySQL and MariaDB.
Let's just check this in the real world, as that's what matters (data from a live server):
> explain SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'; +------+-------------+------------+------+---------------+------+---------+------+-------+-------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+------------+------+---------------+------+---------+------+-------+-------------+ | 1 | SIMPLE | wp_options | ALL | NULL | NULL | NULL | NULL | 60453 | Using where | +------+-------------+------------+------+---------------+------+---------+------+-------+-------------+ > ALTER TABLE wp_options ADD INDEX (autoload); > explain SELECT option_name, option_value FROM wp_options WHERE autoload = 'no'; +------+-------------+------------+------+---------------+----------+---------+-------+-------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+------------+------+---------------+----------+---------+-------+-------+-----------------------+ | 1 | SIMPLE | wp_options | ref | autoload | autoload | 82 | const | 30228 | Using index condition | +------+-------------+------------+------+---------------+----------+---------+-------+-------+-----------------------+ 1 row in set (0.00 sec) > explain SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes'; +------+-------------+------------+------+---------------+----------+---------+-------+------+-----------------------+ | id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra | +------+-------------+------------+------+---------------+----------+---------+-------+------+-----------------------+ | 1 | SIMPLE | wp_options | ref | autoload | autoload | 82 | const | 2329 | Using index condition | +------+-------------+------------+------+---------------+----------+---------+-------+------+-----------------------+
As you can see, there is quite a difference with an index, particularly for the 'yes' case.
And in both cases the server does choose to use the index rather than do a tablescan - so the optimiser thinks that it's more beneficial.
Now, mind that with the InnoDB storage engine a secondary index (such as autoload) causes a double lookup. That is, once it finds an entry in such an index, it has a pointer to the primary key where it then has to look up the rest of the row. So there is a cost to using a secondary index (rather than scanning the table using the clustered PRIMARY KEY), and the optimiser knows it.
With this in mind, I think we can safely say that having the index makes sense. It may not make a huge
difference per query, but WP does a lot of queries and thus every optimisation helps.
It is absolutely true that having more indexes slightly slows down writes, but that's not an argument against having indexes as such - it's an argument in having ineffective (superfluous) indexes.
This one is clearly useful. And since most operations on this table are in fact reads, not writes, the balance is really seriously in favour of having the index (as also indicated by the EXPLAIN output above).
In MyISAM the gains would not necessarily be the same, however it will benefit. I would also note that any new install now will use InnoDB, as that has been the default engine in MySQL and MariaDB for some years already now. Granted, old installed may well still be using MyISAM, but if those installs cared for performance (or scaled up) they'd have done something about the engine they use (ALTER TABLE ... ENGINE=INNODB is all that's needed for that) as well as some decent server tuning. It's not something we want to be worrying about here, pandering to an archaic lowest denominator hurts users now.
Thanks
#68
@
7 years ago
So I run a multisite install and our based _options table has 596386 rows in it
- 298024 rows with a keyname that starts with "_transient_timeout_jpp" (JetPack related)
- 298108 rows that start with "_transient_" but not "_transient_timeout_jpp"
- 254 rows that do not start with "_transient_".
Due to this we had severe performance issues with the autoload query from this table. Adding the index on autoload immediately improved performance. Adding the index by default seems to be such a small thing to do by default for the cases where people exceed the norm (which will be the majority of small websites).
Hopefully this ticket can be implemented so others don't have the same problem.
#70
@
7 years ago
@pento curious if your opinions have changed at all since your findings in https://core.trac.wordpress.org/ticket/24044#comment:33? Perhaps we could detect if the database is InnoDB and add the index conditionally as suggested elsewhere in the thread?
#72
@
6 years ago
- Severity changed from major to normal
Adding this index will improve concurrency for all sites, even though the perceived performance gain is minimal.
Doing 50000 row table scans may take only 0.03 seconds - but the machine is using 100% CPU for those 0.03 seconds. CPU that could be used by PHP, the web server, the firewall or any other service running on the server. (This is also wasted energy, in most cases energy generated from fossil fuels for billions of page views generated by WordPress.)
Also, in ordinary LRU caching schemes, reading the entire table from disk to do a table scan will cause valuable file system memory caching space to be wasted.
The argument that most users only have a few hundred rows in the table is invalid - adding this index will be completely unnoticeable for them.
The cost of adding the index is minimal/non-existent and the benefits are obvious for those users that do benefit from it.
I measured this on a customer:
Before index - query appeared in slow query log:
# Query_time: 0.038143 Lock_time: 0.000081 Rows_sent: 1156 Rows_examined: 50089
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes';
After index, turned on profiling because it was no longer in slow queries log:
show profiles;
+----------+------------+---------------------------------------------------------------------------------+
| Query_ID | Duration | Query |
+----------+------------+---------------------------------------------------------------------------------+
| 2 | 0.00617425 | SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes' |
+----------+------------+---------------------------------------------------------------------------------+
#73
@
6 years ago
- Type changed from enhancement to defect (bug)
Here we are, 5 years and 4 months on. Gutenberg has been worked on and the lack of a boolean index has been wasting millions of CPU cycles, possibly even contributing towards global warming; who knows. What we do know is that WordPress "dev team" or what ever is the term, seem stubborn to adding an index that is logical and has been for the last 5 years and 4 months. I only remembered this ticket as I stumbled upon some agencies blog mentioning this radical change to WordPress, oh it's a GUI update to the editor, let's see if they ever added that index...
#74
@
6 years ago
OMG, I wish I had seen this thread somewhat sooner. We have added an index on the autoload column last week and our performance has increased significantly! I agree with @DanBUK that the lack of this index has caused a lot of global energy. Not joking. Check this link to show the difference with/without index (CPU and Response times): https://www.use.com/OhOsu
It might not make a difference for small sites, but for big sites this is really a major improvement!
#75
@
6 years ago
Could we get motion on this seriously? It's such an easy fix that doesn't break BC
This does improve performances a lot, without indexes my admin dashboard usually takes 4s to load, with this indexes it takes 2s
With query debug enabled you can see a lot of slow options queries before taking 0.08s, and then they all disappear when the index is added
We can also take it further and add an index on option_values of size 20 because the query of update_transient does a lookup on the option_value and is therefore super slow (0.06s each seriously),
I added this index as well and the performances went from 2s to .5s all the transient update queries now take 0.0001s
Also each update transient does the request 2 times (for the timeout as well) and the index of 20 will make sure that the timeout is always indexed (I think it fits in 10 chars if I'm not mistaken)
#76
@
5 years ago
Any updates on this ticket?
It is a tiny optimization that can improve a lot the performances in a lot of cases!
#77
@
5 years ago
This is always a problem on larger WordPress sites and I can't understand why after 9 years no one has added this index (there is an older ticket for this recorded as won't fix: #14258).
Also there is no logic to making the autoload column VARCHAR(20) as it wastes space and is inefficient. It should be either BIT/CHAR(1). However this would take dev time though whereas a simple index would not.
What is the reason for not implementing this change? It is a minor change with a major positive impact.
In the 9 years that have passed since this was first flagged has anyone ever experienced a negative impact to this change?
In that time a lot has moved to the cloud and we are charged by each CPU cycle so as well as being slower it also costs more.
#78
@
5 years ago
- Keywords commit added
- Milestone changed from Awaiting Review to 5.3
Hey, everyone! How are things going over on this ticket?
Since we have a couple of months until 5.3 is likely released, let's add this index now and get some folks testing it, to check that it doesn't have a negative impact.
#79
@
5 years ago
- Owner set to pento
- Resolution set to fixed
- Status changed from reopened to closed
In 45805:
#80
follow-up:
↓ 81
@
5 years ago
Just a little clarification this affects only new websites installation and not previous installation, I am right?
#81
in reply to:
↑ 80
@
5 years ago
Replying to Mte90:
Just a little clarification this affects only new websites installation and not previous installation, I am right?
No, this change will be applied to all installations when WP is updated ( by using dbDelta() ) but as already said it's very unlikely that this change will affect any of your sites negatively.
#82
@
5 years ago
It's great news this is finally being added by default but what will happen if this index already exists as users have added it themselves? Will this part fail gracefully if they happened to name it using the same convention? If not then I guess two indexes would exist, not that this should cause any issues and it is certainly better having this in place. Hopefully it won't cause the entire upgrade to fail midway but maybe someone is able to test this?
Running various instances of WordPress on my hosting servers I do expect to see a noticeable positive impact (assuming people are keeping up to date), especially on the larger/busier sites.
#83
follow-up:
↓ 84
@
5 years ago
@robinwilson16: If they use the same index name, a warning will be generated when the DB upgrade happens (on single sites, when wp-admin
is visited next, on multisites, when a network admin does the DB upgrade). If they have a different index name, MySQL will create the index, IIRC. Having a duplicate index is sub-optimal, but not a deal breaker.
As a general rule, we don't try to handle compatibility with altered schemas during upgrades: if you've changed the schema, you'll need to handle an incompatibilities caused.
#84
in reply to:
↑ 83
@
5 years ago
@pento ok thank you for letting me know.
Replying to pento:
@robinwilson16: If they use the same index name, a warning will be generated when the DB upgrade happens (on single sites, when
wp-admin
is visited next, on multisites, when a network admin does the DB upgrade). If they have a different index name, MySQL will create the index, IIRC. Having a duplicate index is sub-optimal, but not a deal breaker.
As a general rule, we don't try to handle compatibility with altered schemas during upgrades: if you've changed the schema, you'll need to handle an incompatibilities caused.
#85
follow-up:
↓ 86
@
5 years ago
@DanBUK should get honorable mention in the next release. For real.
Its funny reading these threads from 5-8 years ago. The original posts make perfect sense, and the push-back just seems so arbitrary. And then - no more push-back. (Tho this is really very similar to the push-back I read in make.core regarding things like major version auto-updates. The push-back from leads is real, almost un-open-source-like in its essence. But the reasoning behind it just doesn't stand the test of time).
#86
in reply to:
↑ 85
@
5 years ago
Totally agree! Thanks @DanBUK at least we got there in the end!
At a college we used a large popular college MI system that had zero clustered indexes on any tables. It took a lot of evidence and perseverance to convince the software suppliers they needed these with them just saying their software was "special" and there was no benefit! I had to provide sustained evidence of the massive improvements (processes went from running all night to taking about a minute) and then one day they suddenly agreed with me and used my scripts to apply the change to all customers across the UK. It's difficult to fix performance issues when it is poor dev practices and not the servers, plus amending databases can invalidate support agreements.
Replying to apedog:
@DanBUK should get honorable mention in the next release. For real.
Its funny reading these threads from 5-8 years ago. The original posts make perfect sense, and the push-back just seems so arbitrary. And then - no more push-back. (Tho this is really very similar to the push-back I read in make.core regarding things like major version auto-updates. The push-back from leads is real, almost un-open-source-like in its essence. But the reasoning behind it just doesn't stand the test of time).
#87
@
5 years ago
- Keywords has-dev-note added; commit removed
Mentioned in the Miscellaneous Developer Focused Changes dev note for 5.3: https://make.wordpress.org/core/2019/10/15/miscellaneous-developer-focused-changes-in-5-3/
Patch to add index onto wp_options autoload