Opened 13 years ago
Closed 13 years ago
#17998 closed defect (bug) (fixed)
dbDelta uses unrestricted "show all" which doesn't scale or work with DB sharding plugins
Reported by: | brianlayman | Owned by: | |
---|---|---|---|
Milestone: | 3.3 | Priority: | normal |
Severity: | major | Version: | 3.2 |
Component: | Database | Keywords: | |
Focuses: | Cc: |
Description
The dbdelta function does a straight "show tables" which is not scalable to large multisite installs.
The routine is built to look at and upgrade or create all of the tables in the 'global' table list (blogs, sitemeta & etc) as well as tables for the individual site being upgraded (comments, options & etc). In the current version, it asks MySQL to list all of the tables it knows and then iterates that entire list picking out just those tables listed above. On large sites, this simple "show tables" request can take 5+ minutes to complete.
This patch uses the standard MySQL show tables call with a where clause to limit the query to the just the global tables and the tables that start with the prefix for the site we are looking at.
It should be fully compatible with both standalone and MS WordPress installations.
It requires MySQL 5.0 and therefore will only work with WordPress 3.2+
I use this in the field with two independent clients each with more than 10,000 sites in their MS network. With over 90,000 tables, the one site would have taken 8 days to complete its upgrade. With this patch, it took less than 4 hours to complete the upgrade. The other network was timing out with every new user that signed up. This caused inconsistent issues ranging from tables never being created for the blogs to the default themes and settings never being applied. Sometimes, if another blog was created right away, a blog creation would succeed. I think this was when the query result was still cached by the MySQL server.
Attachments (4)
Change History (29)
#3
@
13 years ago
- Keywords reporter-feedback added
If you only ugrade the tables for a particular site, that means that you will have to run it for each site in the network. How is that faster?
#4
@
13 years ago
Because this routine is already called for each site. Specifying a where clause that restricts the records to only those records likely to match the criteria in the if statement later in this routine, you reduce the possible result set from tens of thousands of records (or more) to perhaps a dozen or two records. That's how it is faster, going minutes to milliseconds. With this routine called thousands of times in an upgrade, the improvement is multiplied many many times.
The if statement is still required as plugins may have added additional per site tables.
#5
@
13 years ago
- Keywords reporter-feedback removed
- Milestone changed from Awaiting Review to 3.3
Ok, then that makes sense.
#8
@
13 years ago
We've had similar issues, but they're due to the fact that our data is sharded, so each table may be in a different database or a different server. This means that just running SHOW TABLES
and then looping through them will never work. The only solution that I can think of is to loop through each table in $cqueries
and check that table's existence individually. It's not as performant, but it opens a spot for plugins like HyperDB to route that query to the right database.
Thoughts?
#9
@
13 years ago
Why does dbDelta bother with SHOW TABLES at all? Why not just loop
over the table names and do DESCRIBE TABLE on each one?
foreach ( $cqueries as $table => $qry ) { $tablefields = $wpdb->get_results( "DESCRIBE $table" ); if ( $tablefields ) { // Generate ALTER TABLE statements } else { // Use CREATE TABLE statement } }
Related:
http://lists.automattic.com/pipermail/hyperdb/2011-October/000228.html
http://lists.automattic.com/pipermail/hyperdb/2011-October/000229.html
#10
@
13 years ago
- Keywords needs-patch added; has-patch removed
- Summary changed from New Site & Upgrade process uses unrestricted "show all" causing timeouts to dbDelta uses unrestricted "show all" which doesn't scale or work with DB sharding plugins
Fully agree, it's wasteful an doesn't scale to run a SHOW TABLES. This issue breaks dbDelta with our multidb plugin too (http://premium.wpmudev.org/project/multi-db) for the same reason. Plugins that use it trigger db errors as dbDelta tries to create an already existing table.
If it could be switched to a "DESCRIBE $table" then sharding plugins can intercept that and direct it to the proper DB.
#12
@
13 years ago
Someone already has https://core.trac.wordpress.org/ticket/19007
#13
@
13 years ago
Took a shot at andy's approach. 17998.diff. Wholly untested. I left stuff improperly indented so as not to muddle the diff.
#14
@
13 years ago
Updated it so we actually use the value for the foreach of cqueries, in 17998.2.diff.
#15
@
13 years ago
Ack! This original patch wasn't applied to 3.3 yet? It's Milestone 3.3, it really should have been included in beta 1...
I was just going to argue that this ticket shouldn't be changed as it is the most compatible with 3.2 and earlier and it's too late in the game to switch to all new code for this release.
The change in the original is required in order to update large sites. It's been tested already on several large sites I've managed (210,000 blogs on one 12,000 on another), so I know it works and is stable.
Since there's another ticket for the alternative method, I'd really like to see it applied in 3.4 and this one make it in 3.3.
#16
@
13 years ago
17998.2.diff generates some warnings during scratch install, but otherwise works:
WordPress database error: [Table 'wordpress.wp_test_users' doesn't exist] DESCRIBE wp_test_users;
#17
@
13 years ago
17998.3.diff does a simple suppress of potential errors from DESCRIBE.
Adds a where clause to the show all command