Make WordPress Core

Opened 15 years ago

Last modified 14 months ago

#10404 reopened enhancement

dbDelta creates duplicate indexes when index definition contains spaces

Reported by: denis-de-bernardy's profile Denis-de-Bernardy Owned by:
Milestone: Priority: normal
Severity: normal Version: 2.8.1
Component: Database Keywords: has-patch
Focuses: Cc:

Description

I was adding a much needed index in wp_object_term_relationships, and testing revealed it was getting added multiple times.

This works as intended:

CREATE TABLE $wpdb->term_relationships (
 object_id bigint(20) unsigned NOT NULL default 0,
 term_taxonomy_id bigint(20) unsigned NOT NULL default 0,
 term_order int(11) NOT NULL default 0,
 PRIMARY KEY  (object_id,term_taxonomy_id),
 UNIQUE KEY reverse_pkey (term_taxonomy_id,object_id),
 KEY term_taxonomy_id (term_taxonomy_id)
) $charset_collate;");

This doesn't:

CREATE TABLE $wpdb->term_relationships (
 object_id bigint(20) unsigned NOT NULL default 0,
 term_taxonomy_id bigint(20) unsigned NOT NULL default 0,
 term_order int(11) NOT NULL default 0,
 PRIMARY KEY  (object_id,term_taxonomy_id),
 UNIQUE KEY reverse_pkey (term_taxonomy_id, object_id),
 KEY term_taxonomy_id (term_taxonomy_id)
) $charset_collate;");

the only difference between the two is a space in the reverse_pkey column list. we should remove spaces in there to avoid potential bugs.

Attachments (5)

dbDelta_match_extra_spacing.diff (2.1 KB) - added by jbutkus 11 years ago.
Ignore whitespace issues within dbDelta duplicates recognition
10404.patch_case_insensitive_key_matching (2.1 KB) - added by charlestonsw 10 years ago.
Update case insensitive compare on dbDelta plus minor phpDoc updates.
10404.patch_caseinsensitive_and_noindexname_compare (5.5 KB) - added by charlestonsw 10 years ago.
Case insensitive index AND simple index with NO NAME patch (see comments for details)
10404_make_key_case_insensitive_wp_4_1.diff (1.4 KB) - added by charlestonsw 9 years ago.
4.1 patch candidate for case sensitivity fix
10404_case_sensitivity_fix_CORRECT_4_1.diff (917 bytes) - added by charlestonsw 9 years ago.
Case insensitive KEY without syntax errors.

Download all attachments as: .zip

Change History (28)

#1 @nacin
14 years ago

  • Keywords needs-testing added

Patch should fix it, but I can't seem to fully reproduce. Can you test and confirm?

#2 @nacin
14 years ago

  • Keywords needs-testing removed
  • Milestone changed from 2.9 to Future Release
  • Type changed from defect (bug) to task (blessed)

Patch does nothing, I jumped the gun there. The big problem here is that dbDelta wants exacting whitespace standards. The two spaces after "PRIMARY" is documented in the codex, but it also expects exactly one space elsewhere, and as you can see here, exactly no spaces separating field names in an index.

I'm setting this to Future Release and a blessed task to see if someone will pick it up, document dbDelta, and make it less fussy over whitespace.

See also #11322. At the very least, $wp_queries should lead by example.

#3 @nacin
13 years ago

  • Type changed from task (blessed) to enhancement

@jbutkus
11 years ago

Ignore whitespace issues within dbDelta duplicates recognition

#4 @jbutkus
11 years ago

  • Cc butkus.justas@… added
  • Keywords has-patch added

I get, that this ticket is rather old, but I had experienced the very same issues with WP 3.4.2.

My patch is created against trunk version (https://core.svn.wordpress.org/trunk).

In one place (matching columns) preg_match regexp is extended by adding '\s*' instead of plain ' ' (single space - 0x20).

For indexes - I provided a function, which attempts to normalize indexes, by unifying various edge cases, namely: remove multiple whitespaces, replace 'INDEX' for 'KEY' at the beginning of declaration, etc.

These are based on DB DDLs used by our plugin (All-in-One Event Calendar by Timely). If someone encounters another edge case, which is not covered by these modifications - I may pay time to fix for these as well.

#5 @charlestonsw
11 years ago

WP3.5.1 running WPMU is having the same issue.

Here is the error after a recent plugin activation:

WordPress database error Too many keys specified; max 64 keys allowed for query ALTER TABLE wp_2_store_locator ADD INDEX (sl_store) made by require('wp-blog-header.php'), require_once('wp-load.php'), require_once('wp-config.php'), require_once('wp-settings.php'), include_once('/plugins/store-locator-le/store-locator-le.php'), wpCSL_plugin__slplus->__construct, call_user_func_array, SLPlus_Activate::update, SLPlus_Activate->install_main_table, SLPlus_Activate->dbupdater, dbDelta

Here is the dbDelta structure from my plugin:

        	$sql = "CREATE TABLE $table_name (
                	sl_id mediumint(8) unsigned NOT NULL auto_increment,
                	sl_store varchar(255) NULL,
                	sl_address varchar(255) NULL,
                	sl_address2 varchar(255) NULL,
                	sl_city varchar(255) NULL,
                	sl_state varchar(255) NULL,
                	sl_zip varchar(255) NULL,
                	sl_country varchar(255) NULL,
                	sl_latitude varchar(255) NULL,
                	sl_longitude varchar(255) NULL,
                	sl_tags mediumtext NULL,
                	sl_description text NULL,
                	sl_email varchar(255) NULL,
                	sl_url varchar(255) NULL,
                	sl_hours varchar(255) NULL,
                	sl_phone varchar(255) NULL,
                	sl_fax varchar(255) NULL,
                	sl_image varchar(255) NULL,
                	sl_private varchar(1) NULL,
                	sl_neat_title varchar(255) NULL,
                	sl_linked_postid int NULL,
                	sl_pages_url varchar(255) NULL,
                	sl_pages_on varchar(1) NULL,
                	sl_option_value longtext NULL,
                	sl_lastupdated  timestamp NOT NULL default current_timestamp,   		 
                	PRIMARY KEY  (sl_id),
                	INDEX (sl_store),
                	INDEX (sl_longitude),
                	INDEX (sl_latitude)
                	)
                	$charset_collate
                	";
dbDelta($sql)

The table now has a number of duplicate indexes:

sl_store
sl_store_2
sl_store_3
...
sl_store_21

sl_latitude
sl_latitude_2
...
sl_latitude_21

sl_longitude
sl_longitude_2
...
sl_longitude_21

Am I using dbDelta incorrectly or is this still a known issue? Any workarounds?

Last edited 11 years ago by SergeyBiryukov (previous) (diff)

#6 follow-up: @RonStrilaeff
11 years ago

I also get that this is old but it apparently there is still a problem with dbDelta creating duplicate indexes. I am creating a table within the init action hook like this:

function sv_create_favorites_table(){
	global $wpdb;
	$table_name = $wpdb->prefix . "favorites"; 
	$sql = "CREATE TABLE IF NOT EXISTS $table_name (
		fav_id bigint(20) not null auto_increment,
		post_id bigint(20) not null,
		user_id bigint(20) not null,
		fav_date timestamp not null,
		PRIMARY KEY  (fav_id),
		KEY (post_id),
		KEY (user_id)
		);";	
		
	require_once(ABSPATH . 'wp-admin/includes/upgrade.php');
	dbDelta($sql);
};

add_action( 'init', 'sv_create_favorites_table');

... and the only way I could prevent it from creating post_id_2, 3, 4, (and user_id_2, 3, 4), etc was to add the "IF NOT EXISTS" clause to the CREATE TABLE statement (as in my snippet above).

Last edited 11 years ago by RonStrilaeff (previous) (diff)

#7 @SergeyBiryukov
11 years ago

  • Component changed from General to Database

#8 in reply to: ↑ 6 @charlestonsw
11 years ago

Replying to RonStrilaeff:

... and the only way I could prevent it from creating post_id_2, 3, 4, (and user_id_2, 3, 4), etc was to add the "IF NOT EXISTS" clause to the CREATE TABLE statement (as in my snippet above).

How does dbDelta handle the "IF NOT EXISTS" when the data structure changes, for example adding a column during a plugin update? Standard SQL would not add the column because the table exists.

That could be a bad side effect of the workaround.

#9 @sebet
11 years ago

I've ran into the duplicate key name problem when adding an index to a table. After some debugging on the dbDelta function I've found why this was happening. This criteria:

if (!(($aindex = array_search($index_string, $indices)) === false)) {

will only match an existing index if the column name includes the index length.

Example:

The comparison for this index will not match and will cause a duplicate index because the length was not specified

KEY movie_type_idx (movie_type),

The comparison for this index will match and the index will not be duplicated because the length was included

KEY movie_type_idx (movie_type(255)),

#10 follow-up: @charlestonsw
10 years ago

When using dbDelta you MUST use an uppercase KEY qualifier on indices. If you use lowercase the query string comparison will not match the database meta data which comes back with an uppercase "KEY". The loop that compares the index portion of the SQL statement is case sensitive.

The patch file above: 10404.patch_case_insensitive_key_matching has phpDoc updates, some comments, and most importantly turns the index comparison into a case insensitive operation.

The next issue I am investigating is the way the query string is built. In my specific instance I am using this to build a key:

create table x (
id mediumint(8) not null autoincrement,
KEY (id)
)

That will never match the index comparison check as the database meta data return info that dbDelta builds into the following string:

KEY id (id)

Even with the string insensitive search the resulting KEY (ID) does not match KEY ID (ID).

I can fix this by updating my create table command to match the meta data exactly by replaceing KEY (id) in my data definition with KEY id (id), but IMO dbDelta should be a little more forgiving. Otherwise there are a lot of plugins creating extra indices and thus server load on millions of WordPress installations out there. The "Too many keys specified; max 64 keys allowed" message is far too prevalent.

#11 in reply to: ↑ 10 @SergeyBiryukov
10 years ago

Replying to charlestonsw:

The patch file above: 10404.patch_case_insensitive_key_matching has phpDoc updates, some comments, and most importantly turns the index comparison into a case insensitive operation.

Looks like 10404.patch_case_insensitive_key_matching only contains four changes to the inline docs and one unrelated change in wp-admin/includes/post.php. Could you upload the correct patch?

@charlestonsw
10 years ago

Update case insensitive compare on dbDelta plus minor phpDoc updates.

#12 @charlestonsw
10 years ago

Revised patch added.

There are some phpDoc updates, more comments, and only a couple of lines of code changed to call strtoupper() in the appropriate places.

Still coding and testing a patch to help convert simple index creation statements like "KEY (id)" to the meta data generated "KEY id (id)" pattern to eliminate the mis-match on simple keys.

@charlestonsw
10 years ago

Case insensitive index AND simple index with NO NAME patch (see comments for details)

#13 @charlestonsw
10 years ago

A new patch file "10404.patch_caseinsensitive_and_noindexname_compare​" has been uploaded to help reduce the number of duplicate indices that are created by plugins that do not use the WordPress index creation methodology. Someone should probably make some notes around dbDelta in the WordPress docs as well as the syntax is rather specific. What the patch has:

  • A good bit more phpDoc info.
  • A LOT more comments around each code block.
  • Case insensitive comparison of index declaration blocks by shifting both the incoming index creation string and the metadata build strings to uppercase.
  • If the original version of the index comparison fails to find an EXACT match, it then falls back to check for "simple index keys". These are keys WITHOUT an index name defined, such as KEY (id) to index the id field.

The last part, which is well commented in the code, looks at the metadata-derived strings and converts the string that has been built as KEY <idxname> (<fldname>) and drops the optional index name portion if the index name EXACTLY MATCHES the field name portion.

In other words it converts the metadata-derived KEY <idxname> (<fldname>) to KEY (<fldname>) if both idxname and fldname are identical.

Since many plugins seem to like to use the shorthand KEY (<fldname>) syntax for building keys, this may help reduce the "too many keys" errors and improve performance during record insert/update operations on the plugin-provided tables.

This does not fix ALL possible dbDelta "too many keys" errors but it should help reduce the most common errors.

For developers the best method for using dbDelta with regard to key building is:

  • Always use an uppercase KEY directive.
  • Always include an index name in lowercase.
  • Always include your field list in lowercase.
  • Do not add field lengths to the keys.
  • When creating compound keys always eliminate whitespace. The means no spaces before or after parens or around commas.

In other words:
KEY mycompoundkey (id,slid)

Not:
key ( id , slid )

Personally I find the later (spaces around parens) hard to do after training myself to start ADDING spaces around parens to meet the WordPress coding standards which dictate logic operations to be ( a !== b ) versus (a!==b). :/

HTH some other code geeks out there.

@charlestonsw
9 years ago

4.1 patch candidate for case sensitivity fix

@charlestonsw
9 years ago

Case insensitive KEY without syntax errors.

#15 @charlestonsw
9 years ago

file: 10404_case_sensitivity_fix_CORRECT_4_1.diff

Added a patch based on 4.1 source to address the case sensitivity issue on the KEY and index names. This will prevent the following from creating duplicate indices with dbDelta():

CREATE ... key (A,A);
CREATE ... KEY (B,B);

Tested in 4.1 via manual code test.

Do we need to add phpUnit testing?

#16 @charlestonsw
9 years ago

An article I wrote tonight: http://www.storelocatorplus.com/wordpress-dbdelta-better-in-4-2-not-yet-perfect/ DBDelta Review in 4.2.2. This is after cleaning up several websites that had n+ indices on the same exact key phrase. The clean-up happened after tracing performance issues. Turns out having a few-dozen of the same index declaration on a table can make for some very slow add/update operations on those tables.

I saw some https://wordpress.slack.com/archives/core/p1414565793002046 threads on Slack last October about forcing dbDelta() to adhere to a "subset of clean MySQL". No problems here, but we have a notable issue, IMO, when forcing plugin and theme developers to use this subset of SQL degrades the performance of websites that employ otherwise-well-written plugins. Especially when the new docs that tell devs "the right way" were just updated 6 months ago. There are 22,000+ plugins that wrote their CREATE TABLE command years ago and I doubt they will ever read that doc page again.

If you scan the plugin (and theme) source files you will find numerous cases where these two things happen quite frequently:

1) PRIMARY KEY is NOT followed by double spaces.

2) KEY IS followed by more than ONE space.

Item #1 does not cause long term performance issues. However it does force a data structure I/O on every single theme/plugin update. I just fixed this double-space issue in my plugin. For clients that are hosting 250,000+ locations in my custom table that one "small thing" that was hidden from the user was causing a VERY LONG WAIT during a simple plugin update while MySQL recreate, unnecessarily, a primary key.

Item #2 is a bigger issue. This creates duplicate indices on a table. If a table has 5 indices defined and the user updates 10 times over a couple of years they have 50+ indices. It is only a matter of time before MySQL starts throwing out more problematic maximum index count errors and warnings. Upgrades take longer. Data I/O takes longer. All-in-all not good for the performance of the WordPress site.

The question is:

Do you force developers to adhere to very strict standards when using dbDelta()?

If so, http://codex.wordpress.org/Creating_Tables_with_Plugins#Creating_or_Updating_the_Table the online doc updates are a step in the right direction but they need to be a LOT MORE DETAILED.

Also, you need to consider that many users of WordPress plugins and themes will suffer the hidden consequences of poor performance as the "index cruft" builds up through subsequent updates.

Yes, core is clean, but WordPress on a install site is an all-inclusive system.

IMO, dbDelta() should do whatever it can, within reason, to address basic variances in SQL statement structures. At the very least addressing idiosyncrasies in the KEY command itself "double space here, single space there" should be dealt with to make it consistent. That would be far more disruptive to core, however.

I'm willing to write some "clean up inbound SQL for dbDelta" code and build on the recent phpUnit tests that was noted on another ticket to validate the changes, but I don't want to burn time on Core code that won't get used.

My plugin now adheres to all of the quirks of dbDelta, so this is not a "for my plugin" concern. It is a concern for all those sites that now have 64 indexes in place, are taking a performance hit every single day without anyone having a clue.

The next question is how flexible do you make this? What is the real performance hit? dbDelta is run how often v. index queries and updates? If there is a concern, wrap a "DBDELTA_STRICT_DEFINES" constant around code you don't want to add the overhead to. Combine that with the newly-introduced globals list and skip the extra "massaging of SQL" for those things as well. The performance hit could be minimal while making dbDelta more code friendly.

As an aside, there are a LOT of plugins that use whitespace formatting to make nice readable code, using extra spaces to line up the column names and index names, then the field declaration. That is what my code used to look like, an was far easier to read the data structure from. Clean readable code and the dbDelta functional requirements are in direct opposition.

Possibly big changes going into dbDelta()... scary stuff given the potential impact. Maybe starting with a grep of plugin & theme code on the repo and getting some idea of malformed SQL is warranted to determine just how far-reaching this issue is.

Core Team comments?

#17 @charlestonsw
8 years ago

Is it time to decompose this ticket into its several components and close this "mother ticket"?

The issues:

  • Missing spaces in the field list not creating indexes.

Creates Index:

 UNIQUE KEY reverse_pkey (term_taxonomy_id, object_id),

Does Not Create Index:

 UNIQUE KEY reverse_pkey (term_taxonomy_id,object_id),
  • Not specifying key length creates duplicate indexes.

Duplicates:

  KEY movie_type_idx (movie_type),

No Dupes:

 KEY movie_type_idx (movie_type(255)),
  • Case sensitive KEY keyword (must be upper) creating duplicate indexes:

Duplicates:

  key my_key ( slug ),

No Duplicates:

  KEY MY_KEY ( SLUG ),
  • Not specifying an index name creates duplicate indexes.

Duplicates:

  KEY (id ),

No Dupes:

  KEY id ( id ),
  • Double vs. single space after PRIMARY KEY not creating primary key.

Works:

 PRIMARY KEY my_pkey ( id ),

Does Not Work:

 PRIMARY KEY  my_pkey ( id ),
  • Double vs. single space after KEY creating duplicate indexes.

Works:

 PRIMARY KEY my_pkey ( id ),

Does Not Work:

 PRIMARY KEY  my_pkey ( id ),
  • Case sensitive KEY fields creating duplicate indexes:

Duplicates:

  KEY my_key ( slug ),

No Duplicates:

  KEY MY_KEY ( SLUG ),

In 4.4, some of these appear to be fixed.

How do we best decompose this issue into several tickets that address each corner case?

In short , the dbDelta() function is (was?) very particular on how an SQL statement had to be crafted. Rather than this "mega dbDelta" ticket, should we not break it into bite-sized chunks and close this one out and work through it's children?

#18 @charlestonsw
8 years ago

Decomposed into a half-dozen tickets. Many may be fixed in 4.4.

#34874 dbDelta Case Sensitive Key Names Duplicates Indexes Database normal normal Awaiting Review defect (bug) 12/06/15

#34869 dbDelta Index Definition Spaces Duplicate Indexes Database normal normal Awaiting Review defect (bug) 12/06/15

#34871 dbDelta Lowercase KEY Keyword Duplicates Indexes Database normal normal Awaiting Review defect (bug) 12/06/15

#34872 dbDelta Missing Index Name Creates Duplicate Indexes Database normal normal Awaiting Review defect (bug) 12/06/15

#34870 dbDelta Not Specifying Key Length Duplicate Indexes Database normal normal Awaiting Review defect (bug) 12/06/15

#34873 dbDelta PRIMARY KEY Single Space Recreates Index Database normal normal Awaiting Review enhancement 12/06/15

#32528 Create a new search_form_content hook in get_search_form() General normal normal Awaiting Review feature request 06/17/15

Closing time for 10404 and start tracking the "child" tickets?

#19 @jdgrimes
8 years ago

I've added a patch on #34872, which includes some basic unit tests to get us started. I may work up a more extensive patch that covers more of these issues.

#20 @pento
8 years ago

  • Milestone Future Release deleted
  • Resolution set to invalid
  • Status changed from new to closed

Closing this ticket in favour of the various sub-tickets.

#21 @arena
4 years ago

  • Resolution invalid deleted
  • Status changed from closed to reopened

Sorry to be back on this ticket but i have the issue on wp trunk version (5.5-alpha-47528)

this is a CREATE TABLE in schema.php

CREATE TABLE $wpdb->postmeta (
	meta_id bigint(20) unsigned NOT NULL auto_increment,
	post_id bigint(20) unsigned NOT NULL default '0',
	meta_key varchar(255) default NULL,
	meta_value longtext,
	PRIMARY KEY  (meta_id),
	KEY post_id (post_id),
	KEY meta_key (meta_key($max_index_length))
) $charset_collate;

This is my CREATE TABLE in a plugin

CREATE TABLE $wpdb->mp_mailmeta (
	meta_id bigint(20) unsigned NOT NULL auto_increment,
	mp_mail_id bigint(20) unsigned NOT NULL default '0',
	meta_key varchar(255) NOT NULL default '',
	meta_value longtext,
	PRIMARY KEY  (meta_id),
	KEY mp_mail_id (mp_mail_id),
	KEY meta_key (meta_key)
) $charset_collate;

On deactivation/activation of my plugin i have this :

The plugin generated xxx characters of unexpected output during activation. If you notice “headers already sent” messages, problems with syndication feeds or other issues, try deactivating or removing this plugin.

and in apache log

WordPress database error Duplicate key name 'mp_mail_id' for query ALTER TABLE wp_mailpress_mailmeta ADD KEY `mp_mail_id` (`mp_mail_id`) made by activate_plugin, do_action('activate_mailpress/MailPress.php'), WP_Hook->do_action, WP_Hook->apply_filters, MailPress::install, include('/plugins/mailpress/mp-admin/includes/install/mailpress.php'), dbDelta, referer: http://127.0.0.1/wordpress/wp-admin/plugins.php?plugin_status=all&paged=1&s

thank you for your help !

Regards

could be related to #34870 ?

#22 @andrija
14 months ago

Notorious dbDelta - I've got an issue with it as well.

Here's my SQL query that gets triggered whenever I deactivate/activate the plugin:

CREATE TABLE $table_name (
    id mediumint(9) UNSIGNED NOT NULL AUTO_INCREMENT,
    page_id bigint(20) UNSIGNED NOT NULL UNIQUE,
    PRIMARY KEY  (id)
) $charset_collate;

There are other columns, but the issue happens even with this reduced version - every time I reactivate the plugin, a new index appears.

So currently I have:
PRIMARY
page_id
page_id_2
page_id_3

That happens only when page_id is UNIQUE, of course - if it's not UNIQUE, not indexes are created.

Also - that doesn't happen on another column which is varchar(255), only on the bigint(20) column.

Also - if there was an index created and I remove the word UNIQUE from my SQL query, then reactivate the plugin - the index is not removed from the table.

Last edited 14 months ago by andrija (previous) (diff)

#23 @andrija
14 months ago

Regarding my previous comment - everything works just fine if I use lowercase unsigned instead of UNSIGNED.

Note: See TracTickets for help on using tickets.