WordPress.org

Make WordPress Core

Opened 15 months ago

Last modified 10 days ago

#24044 new enhancement

Add index to wp_options to aid/improve performance

Reported by: DanBUK Owned by:
Milestone: Awaiting Review Priority: normal
Severity: normal Version: 3.6
Component: Options, Meta APIs Keywords: has-patch
Focuses: performance Cc:

Description

I work for a hosting company and we manage many '000s of WP sites for people. In order to improve the performance of all of these instances I find myself adding an index onto wp_options autoload many times daily.

I'm aware this is a small gain, but any gain is good.

This index will be hit on every single page load.

Attachments (7)

autoload_key.patch (552 bytes) - added by DanBUK 15 months ago.
Patch to add index onto wp_options autoload
24044.diff (566 bytes) - added by cliffseal 6 months ago.
24044-test.php (6.2 KB) - added by pento 6 months ago.
24044-test-results.txt (5.7 KB) - added by pento 6 months ago.
dan_results.txt (1.7 KB) - added by DanBUK 6 months ago.
test results
dan_test.php (3.3 KB) - added by DanBUK 6 months ago.
dan_results_myisam.txt (1.7 KB) - added by DanBUK 6 months ago.
test results myisam

Download all attachments as: .zip

Change History (41)

DanBUK15 months ago

Patch to add index onto wp_options autoload

comment:1 nacin15 months ago

Could you show some statistics for how this helps performance on your sites? See ticket:14258#comment:3 for why we have not done this.

comment:2 DanBUK15 months ago

  • Summary changed from Add index to wp_options to aid improve performance to Add index to wp_options to aid/improve performance

This may be the case, but there are some godforsaken ugly plugins out there which mean that wp_options ends up '000s of rows. In these situations this will help with performance.

Also the maintenance of such an index is not placing a lot of burden on the DB. Overall this is reducing the work to deliver the page, that is what matters.

comment:3 DanBUK15 months ago

Also the index will still be used for statistics, rather than assuming a full table scan every time. The index has a high level of cardinality.

comment:4 follow-up: DanBUK15 months ago

And assuming that all wp_options tables will have > 30% autoload == yes isn't prudent, there are many ways in which people use the wp_options table.

comment:5 in reply to: ↑ 4 nacin15 months ago

Replying to DanBUK:

And assuming that all wp_options tables will have > 30% autoload == yes isn't prudent, there are many ways in which people use the wp_options table.

I disagree; it's a very fair assumption. "Ugly" plugins that add thousands of rows to a table are still very, very likely to autoload all of them, because that's the default. The alternative is each get_option() is an extra query, which you'd probably notice as a host.

The maintenance of an index is not placing a lot of burden, no. But if the index is just sitting there waiting to be ignored, then it doesn't do us much good. The larger issue is that I haven't seen any evidence — in terms of performance or otherwise — to actually show that an index here would reduce the work to deliver the page.

I'm not saying you've added an index to thousands of sites yet assuredly have no performance improvements to show for it. I'm saying, show us the measurable improvement so we can see it for ourselves.

comment:6 toscho15 months ago

  • Cc info@… added

comment:7 pento15 months ago

  • Cc gary@… added

comment:8 DeanMarkTaylor14 months ago

  • Cc DeanMarkTaylor added

comment:9 krazybean13 months ago

On a table with above 1k rows in wp_options

Before autoload index:

mysql> explain select option_name, option_value from wp_options where autoload = 'yes';
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | wp_options | ALL  | NULL          | NULL | NULL    | NULL | 1460 | Using where |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+

After adding index named i1

mysql> create index i1 on wp_options(autoload, option_name, option_value(50));

mysql> explain select option_name, option_value from wp_options where autoload = 'yes';
+----+-------------+------------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+------------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | wp_options | ref  | i1            | i1   | 62      | const |  178 | Using where |
+----+-------------+------------+------+---------------+------+---------+-------+------+-------------+

After the index was created the total rows searched dropped from 1460 to 178.

Last edited 13 months ago by SergeyBiryukov (previous) (diff)

comment:10 cliffseal6 months ago

  • Keywords dev-feedback added

On a WordPress 'app' in which lots of non-autoloaded transients are used, I got similar results, running the same queries as @krazybean.

Before:

mysql> explain select option_name, option_value from wp_options where autoload = 'yes';
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | wp_options | ALL  | NULL          | NULL | NULL    | NULL | 8196 | Using where |
+----+-------------+------------+------+---------------+------+---------+------+------+-------------+
1 row in set (0.01 sec)

After:

mysql> create index i1 on wp_options(autoload, option_name, option_value(50));
Query OK, 0 rows affected (0.52 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> explain select option_name, option_value from wp_options where autoload = 'yes';
+----+-------------+------------+------+---------------+------+---------+-------+------+-------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref   | rows | Extra       |
+----+-------------+------------+------+---------------+------+---------+-------+------+-------------+
|  1 | SIMPLE      | wp_options | ref  | i1            | i1   | 62      | const |  155 | Using where |
+----+-------------+------------+------+---------------+------+---------+-------+------+-------------+
1 row in set (0.00 sec)

Understood this may not affect the general population, but I wonder how we could best help folks who make extensive use of transients in certain situations.

comment:11 nacin6 months ago

What's the purpose behind adding option_value to that index? It looks like it would only serve to increase the length of the index.

What's the performance difference on these queries? Reducing rows examined from 1460 to 178 (or even 10,000 to 100) is still peanuts. DanBUK mentioned adding this index many times daily, but I am trying to figure out if that was simply cathartic or if it was actually making a difference.

comment:12 cliffseal6 months ago

I was just trying to replicate his results.

I wish I had tracked the difference when I added it to a much larger version of that same app—important queries went from several seconds in some cases to < 1 in all. I assure you it was directly related to adding an autoload index.

Again, I know this isn't the most common configuration, but it's allowing me to work with tons of transients at once with decent page load times.

If you can tell me what sort of feedback/results you need, I'll find a way to get it for you.

comment:13 pento6 months ago

  • Keywords reporter-feedback added; dev-feedback removed

There are a few things to consider here.

First up, there's unlikely to be any benefit to adding (option_name, option_value(50)) to the index. There are no queries in WordPress core that would use them.

So, if we were going to add an index here, it would just be on autoload, which would have a very low cardinality. For some sites, this would potentially avoid a table scan, but I'm inclined to think that most sites would have >30% of rows with autoload set to yes.

Anyway, to determine if this index would be helpful to you, and the degree of helpfulness, it would be most useful to see how long the query takes to run with and without the index, rather than looking at the number of rows read. (If you're running on a test machine, run the query at least twice, to ensure the cache is warmed up.

If you're not using InnoDB, I'd also be curious to see how switching to that, with a large innodb_buffer_pool_size, changes the query time.

Finally, if you're keeping stats on page generation time, I'd be interested to to see how this changes in production with an autoload index.

comment:14 cliffseal6 months ago

Agreed on adding that specific index—again, just trying to replicate results. Simply adding an autoload index is what made the monumental change in my case.

I'll run these tests and post the results here. Thanks!

Last edited 6 months ago by cliffseal (previous) (diff)

comment:15 nacin6 months ago

  • Component changed from Performance to Options and Meta
  • Focuses performance added

comment:16 cliffseal6 months ago

On an InnoDB database with about 615K rows in wp_options, here's what came across, running locally in MAMP. I started with an autoload index, removed it, and added it back.

SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes';

628 rows in set (0.01 sec)

ALTER TABLE wp_options DROP INDEX autoload;
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes';

628 rows in set (2.30 sec)

CREATE INDEX autoload ON wp_options(autoload, option_name);
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes';

628 rows in set (0.00 sec)

Does that help?

comment:17 cliffseal6 months ago

Also, I went through these steps again and ran the SELECT query a few more times after removing the index as requested. Still got:

628 rows in set (2.24 sec)
628 rows in set (2.24 sec)
628 rows in set (2.23 sec)

Then:

CREATE INDEX autoload ON wp_options(autoload, option_name);
SELECT option_name, option_value FROM wp_options WHERE autoload = 'yes';
628 rows in set (0.01 sec)
628 rows in set (0.01 sec)
628 rows in set (0.00 sec)

comment:18 nacin6 months ago

Why do you have 615K rows in your wp_options table?

comment:19 cliffseal6 months ago

Because the Transients API has been perfect for a specific project needing lots of individual external API calls cached.

Again, not arguing for wp_options size, just trying to show where the index can make a big difference.

comment:20 follow-up: nacin6 months ago

I figured it would be transients.

You should enable an external object cache, that way transients aren't stored in the database. If the lack of an index on this table forces hosting companies to consider the availability of an external object cache for their users, I'm not necessarily going to be bothered by that.

I find it hard to justify an index that benefits an infinitesimal number of sites, probably slows down many others, especially were the sites affected have better steps to take to improve performance.

comment:21 cliffseal6 months ago

Actually, in this specific case, object caching _was_ enabled and kept getting overloaded, so this was the only option after having to disable it.

Again, totally understood this is not a common case, but I also wanted to demonstrate that it could make a big difference for some.

Here's a patch just in case. :)

Last edited 6 months ago by cliffseal (previous) (diff)

cliffseal6 months ago

comment:22 in reply to: ↑ 20 ; follow-ups: DanBUK6 months ago

Sorry I never came back with statistics. I'll try and find some of the customer sites I've worked on and generate some data.

Replying to pento:

If you're not using InnoDB, I'd also be curious to see how switching to that, with a large innodb_buffer_pool_size, changes the query time.

In almost all cases everyone should be using InnoDB for a WP database, excluding the tables that have FULLTEXT indexes of course. (Then again, full text indexing should really be shifted to something like Solr, but that is another story...) The nature of the wp_comments especially cannot scale with MyISAM, the write level of a busy WP site means that you'll end up with table locks. And the number of plugins that utilise wp_meta or wp_options to store post views, or similar that get updated on every view... Again leads to locking that blocks the viewing of the site.

Just by having a large buffer pool that doesn't change how the MySQL query planner/statistic collator/table scanning works. It's not about if the data is in RAM or not, the wp_options table even in the cases I've seen with up to 10k rows, isn't actually a lot of data, and will either end up in the buffer pool or the OS file system caches. The issue is that the planning has to examine all the rows in the table, every time the page is loaded. That's dependent upon CPU time.

Heck you can put your buffer pool up to 5x your database size and you'll see 0 performance improvement over 1x. Ideally the buffer pool should be as big as the dataset that is accessed often. Ergo if you have a 10GB InnoDB dataset and 9GB of that is archive of logs/actions/very old posts/etc you only really need a 1GB buffer pool.

But that is the thing, we are moving into the realm of database optimisation, and to be honest from my view that isn't always seen by developers. Don't even get me started on Magento.

Replying to nacin:

You should enable an external object cache, that way transients aren't stored in the database. If the lack of an index on this table forces hosting companies to consider the availability of an external object cache for their users, I'm not necessarily going to be bothered by that.

Quite often the hosting company cannot control the customer fully. The customer may have limited budget and we work within their constraints.

Also there are lots of plugins out there that are already using transients, most users of WP that are not developers/sys admins/etc do not understand that maybe they have used the inappropriate method of storing data.

Replying to nacin:

I find it hard to justify an index that benefits an infinitesimal number of sites, probably slows down many others, especially were the sites affected have better steps to take to improve performance.

How do you think this will slow things down? It is an index of boolean cardinality, therefore the binary tree will have only two branches (YES|NO). The addition/updating of rows only require very small index changes. I really cannot understand how you think it will slow things down.

comment:23 in reply to: ↑ 22 DanBUK6 months ago

Replying to DanBUK:

But that is the thing, we are moving into the realm of database optimisation, and to be honest from my view that isn't always seen by developers. Don't even get me started on Magento.

My main basis for this is that developers don't see the application later in production on much busier sites that their test box that takes 10 requests per minute. Once you are dealing with a site that takes 100 requests per second minor changes to indexes in this suggestion make a huge difference. The Read/Write ratio is what counts.

comment:24 in reply to: ↑ 22 ; follow-up: nacin6 months ago

Replying to DanBUK:

But that is the thing, we are moving into the realm of database optimisation, and to be honest from my view that isn't always seen by developers.

Pardon me :) I'm just watching over this ticket, playing devil's advocate and working to turn assumptions into data. On the other hand, pento spent something like four years at MySQL AB and Oracle. So feel free to continue to geek out with him.

comment:25 in reply to: ↑ 24 cliffseal6 months ago

Replying to nacin:

I'm just watching over this ticket, playing devil's advocate and working to turn assumptions into data.

That's needed! I've been using this for a while now, but never saw a ton of good data on it either.

How can we test the impact of the index on a DB on the other extreme and test the assumption it might slow things down?

pento6 months ago

pento6 months ago

comment:26 pento6 months ago

Okay, here are some test results! All of this was on MySQL 5.5, I didn't test on other versions. The MyISAM results should be largely the same on other versions, the InnoDB results may be a little slower on older versions, but I don't expect there to be massive differences.

Anyway, here are the variables tested:

  • Engine: InnoDB or MyISAM
  • Rows: 100, 1000, 10000, 100000, 1000000
  • Percentage of rows selected: 5%, 20%, 50%, 80%, 95%

The full testing script and results are attached, but here's the summary:

  • Adding this index causes nearly all queries to be slower, except for InnoDB at 100k rows. I have no idea why this is the case.
  • Once MyISAM gets to 100K rows, adding the index makes it exponentially slower.
  • I killed the test script before it finished MyISAM 1M rows, because each query was taking > 2 minutes to run.

DanBUK6 months ago

test results

DanBUK6 months ago

comment:27 DanBUK6 months ago

  • Keywords reporter-feedback removed

I've performed some slightly different testing. With some restrictions.

As we are only issuing queries with autoload=yes the results for =no are not relevant.

With any site that has a high level of read/write concurrency MyISAM won't scale. (You would be surprised how many modules update wp_options on every page load.)

The results indicate that only in the case when 50% or more of the rows have autoload=yes things slow down. But this is by a negligible amount, even with 100k rows in wp_options the difference was only 0.02s (6.45%). Yet in the case of 100k rows and 20% of the rows have autoload=yes the gains in speed are 0.07s (35%).

Therefore I propose that the gains offered by adding this index outweigh the losses.

DanBUK6 months ago

test results myisam

comment:28 DanBUK6 months ago

And just for completeness, similar results as I mentioned for innodb are the same for MyISAM.

comment:29 follow-up: pento6 months ago

With attachment:dan_test.php, I still had the same result of MyISAM getting really slow for large tables:

myisam_VARCHAR20_Size_100000_Percent_50 'YES' 0.97143130302429
myisam_VARCHAR20_INDEXED_Size_100000_Percent_50 'YES' 11.403997087479
>> Slower

myisam_VARCHAR20_Size_100000_Percent_20 'YES' 0.96122975349426
myisam_VARCHAR20_INDEXED_Size_100000_Percent_20 'YES' 5.91277384758
>> Slower

myisam_VARCHAR20_Size_100000_Percent_5 'YES' 0.94139242172241
myisam_VARCHAR20_INDEXED_Size_100000_Percent_5 'YES' 1.2993580341339
>> Slower

As much as I'd love to ignore MyISAM, the vast majority of WordPress sites use it, including WordPress.com. While we're clearly getting different results here, we can't take the risk that other sites will suffer the same slowness I'm seeing.

comment:30 cliffseal6 months ago

So, the utility of this index is (roughly) inversely proportional to the percentage of rows where autoload = 'yes'? That'd make sense, considering, in my case, 0.001% of rows are such.

To me, this now goes back to:

Replying to nacin:

"Ugly" plugins that add thousands of rows to a table are still very, very likely to autoload all of them, because that's the default.

If that's the case, is there any way we can 'police' this behavior better with plugins, or define and encourage a best practice for it?

comment:31 in reply to: ↑ 29 ; follow-up: DanBUK6 months ago

Replying to pento:

How do you have your MySQL instance configured? (I'm really surprised your getting such slow results.)

I have the following:-

thread_cache_size = 8
innodb_buffer_pool_size = 1G
key_buffer_size = 128M
table_cache = 1024

(Running on a 2CPU / 2GB KVM Centos 6 VM, gah I just noticed it's only MySQL 5.1, will install 5.5 and re-test.)
I added my MyISAM results as an attachment earlier.

comment:32 in reply to: ↑ 31 DanBUK6 months ago

Replying to DanBUK:

(Running on a 2CPU / 2GB KVM Centos 6 VM, gah I just noticed it's only MySQL 5.1, will install 5.5 and re-test.)

I'm getting similar results on 5.5 MyISAM:-

myisam_VARCHAR20_Size_100000_Percent_50 'YES' 0.23201403617859
myisam_VARCHAR20_INDEXED_Size_100000_Percent_50 'YES' 0.29787020683289
>> Slower

myisam_VARCHAR20_Size_100000_Percent_20 'YES' 0.13488883972168
myisam_VARCHAR20_INDEXED_Size_100000_Percent_20 'YES' 0.12055621147156
>> Faster

myisam_VARCHAR20_Size_100000_Percent_5 'YES' 0.086687040328979
myisam_VARCHAR20_INDEXED_Size_100000_Percent_5 'YES' 0.030073165893555
>> Faster

comment:33 pento5 months ago

I've had a bit more of a play around with my MySQL settings, but I haven't managed to make performance suck less on MyISAM.

I'm using a standard VVV install for testing. You're welcome to play around with it and see if you can get better performance out of it, but I'm strongly inclined to not add this index. VVV is a pretty popular base for WordPress dev shops to create sites with these days, so if I'm running into performance problems, there's a good chance a bunch of other sites will, too.

comment:34 jeichorn10 days ago

At Pagely on a semi-regular basis we end up with sites with hundreds of thousands of rows in wp_options. The cause of this is normally storing sessions in wp_options. In a case like this having the index on autoload is huge win.

Before index

mysql> explain select option_name, option_value from wp_options where autoload = 'yes';
+----+-------------+------------+------+---------------+------+---------+------+--------+-------------+
| id | select_type | table      | type | possible_keys | key  | key_len | ref  | rows   | Extra       |
+----+-------------+------------+------+---------------+------+---------+------+--------+-------------+
|  1 | SIMPLE      | wp_options | ALL  | NULL          | NULL | NULL    | NULL | 807663 | Using where |
+----+-------------+------------+------+---------------+------+---------+------+--------+-------------+

After index:

mysql> explain select option_name, option_value from wp_options where autoload = 'yes';
+----+-------------+------------+------+---------------+----------+---------+-------+------+-----------------------+
| id | select_type | table      | type | possible_keys | key      | key_len | ref   | rows | Extra                 |
+----+-------------+------------+------+---------------+----------+---------+-------+------+-----------------------+
|  1 | SIMPLE      | wp_options | ref  | autoload      | autoload | 62      | const |  265 | Using index condition |
+----+-------------+------------+------+---------------+----------+---------+-------+------+-----------------------+

We are talking about reducing query time by 1 second in this instance. As the average size of wp_options grows and grows its a nice low cost index to add that protects people from plugins that use wp_options as a temp data store.

Last edited 10 days ago by DrewAPicture (previous) (diff)
Note: See TracTickets for help on using tickets.