Make WordPress Core

Opened 9 years ago

Closed 9 years ago

#33376 closed enhancement (fixed)

Create index on user_email

Reported by: chriscct7's profile chriscct7 Owned by: chriscct7's profile chriscct7
Milestone: 4.5 Priority: normal
Severity: normal Version:
Component: Database Keywords: has-patch commit
Focuses: performance Cc:

Description

In order to improve performance, an index could be added on user_email (indexes already exist on user_login and user_nicename), to make it faster to query against.

In addition, the db structure could be altered to add a UNIQUE constraint on user_nicename, user_email, and user_nicename for those columns on the wp_users table, which would improve performance when a site has many users, and eliminates to run a pre-insert SELECT check when inserting users.

The patch attached adds the schema changes proposed which would apply to new installs automatically, as well as the upgrade routine for existing installs (though the db_version compared against would likely need to be altered by the time this could get merged).

In addition to the performance improvements, this would also eliminate the chance of race conditions such as proposed by #30959.

Attachments (4)

33376.patch (1.7 KB) - added by chriscct7 9 years ago.
33376-no-upgrade.patch (986 bytes) - added by chriscct7 9 years ago.
33376 without updating existing sites
33376-enhanced-upgrade.patch (2.1 KB) - added by chriscct7 9 years ago.
33376 with deduplication of user_email routine
33376.2.patch (772 bytes) - added by ocean90 9 years ago.

Download all attachments as: .zip

Change History (18)

@chriscct7
9 years ago

#1 @chriscct7
9 years ago

  • Owner set to chriscct7
  • Status changed from new to assigned

#2 follow-up: @knutsp
9 years ago

This may fail for user_email, as a plugin could have allowed users with the same email.

#3 in reply to: ↑ 2 ; follow-up: @chriscct7
9 years ago

Replying to knutsp:

This may fail for user_email, as a plugin could have allowed users with the same email.

Replying to knutsp:

This may fail for user_email, as a plugin could have allowed users with the same email.

To do so would either require direct SQL calls or overriding tons of checks for email_exists in order to avoid the fact that WordPress prevents users from sharing an email address. I'm sure a plugin could be devised, if so desired, that could break on any nearly any change to WordPress core.

A plugin that allows the same email to be shared would also break other issues like #9568 (login via an email address; which user gets logged in when they share an email?), as well as developers ability to count on predicable results to core functions like get_user_by('email', $value). As such, changing the tables to not support duplicates simply complements checks WordPress already has.

If it was desired, we could skip the upgrades on existing installs (patch for that attached), but in any case. A better option would be to query all rows where duplicates exist and update the emails to be username+{email}. Patch also attached for this.

@chriscct7
9 years ago

33376 without updating existing sites

@chriscct7
9 years ago

33376 with deduplication of user_email routine

#4 in reply to: ↑ 3 ; follow-up: @DrewAPicture
9 years ago

Replying to chriscct7:

Replying to knutsp:

This may fail for user_email, as a plugin could have allowed users with the same email.

To do so would either require direct SQL calls or overriding tons of checks for email_exists in order to avoid the fact that WordPress prevents users from sharing an email address. I'm sure a plugin could be devised, if so desired, that could break on any nearly any change to WordPress core.

A plugin that allows the same email to be shared would also break other issues like #9568 (login via an email address; which user gets logged in when they share an email?), as well as developers ability to count on predicable results to core functions like get_user_by('email', $value). As such, changing the tables to not support duplicates simply complements checks WordPress already has.

The Allow Multiple Accounts plugin by @coffee2code has been a popular choice for years for allowing multiple accounts with the same email. So it's not really unheard of.

And for what it's worth, it leverages the pre_user_email hook to do it.

#5 @chriscct7
9 years ago

  • Keywords early removed

While I agree it's not unheard of (Scott's plugin is a great example), the question is whether the few users doing something WordPress doesn't natively support (5k out of many many millions) and in fact has code to prevent against are enough reason to block a potential performance impact for all of the other users (the clear majority of WordPress user's don't do that), which would be very significant on large sites with many users (particularly when going a get user by email call). While I admit if this goes in it makes a plugin like Scott's a bit harder to accomplish, it wouldn't be impossible (you could for example store users in a custom user table or add a table that maps additional usernames to an email, or something along those lines. And the third provided patch takes care of users who already share an email via a plugin like that.

#6 follow-up: @knutsp
9 years ago

If this goes in, please use localpart+username@… to make duplicate email unique and not vice versa. Gmail supports localpart+somestring@… as aliases for localpart@…. Then, for all users with Gmail, and maybe more, this will work out of the box.

Note:
My main business' main system depends on "Allow Multiple Accounts" plugin (case: same employee sometimes employed by several employers, but system currently only allows each employee to attach to one employer). We will adapt and change our system, but I think a warning should be given months ahead. If I spot trouble here, I guess many will have.

Core has allowed duplicate emails on users for a decade (?), even if it's not the internal default and intention.

I do understand and respect the reasons for this (proposed) change, though.

Last edited 9 years ago by knutsp (previous) (diff)

#7 in reply to: ↑ 6 @chriscct7
9 years ago

Replying to knutsp:

If this goes in, please use localpart+username@… to make duplicate email unique and not vice versa. Gmail supports localpart+somestring@… as aliases for localpart@…. Then, for all users with Gmail, and maybe more, this will work out of the box.

The patch does exactly this :-) so no worries there.

Core has allowed duplicate emails on users for a decade (?), even if it's not the internal default and intention.

Core actually hasn't supported this. Calling any register user function checks fora unique email and has for a while. Plugins like multiple email employ methods that work around all of WordPress's internal checks.

That being said this is still a proposal and there is alot more that will be done on this. If you subscribe to the ticket, you'll get an email anytime someone comments or the status of this proposal changes.

Last edited 9 years ago by chriscct7 (previous) (diff)

#8 @jorbin
9 years ago

In theory, these changes sound like they will improve performance, but I would like to see some actual benchmarks vs both MyISAM and Innodb.

The unique changes for one worry me. While core hasn't directly supported non-unique user_email, it is something that I know many installs have supported. I would be especially hesitant to essentially remove the ability for this to function without some clear numbers showing a lot of benefit.

#9 @dd32
9 years ago

Unfortunately I feel the UNIQUE index is probably out of reach for us, and since we're already enforcing uniqueness at the application layer, a unique index vs non-unique index doesn't seem like the performance benefit would outweigh any compatibility issues.

#10 @chriscct7
9 years ago

  • Keywords needs-refresh added; dev-feedback removed
  • Summary changed from Create index on user_email and UNIQUE on user_nicename, user_email and user_login to Create index on user_email

Yeah at this point UNIQUE is out of range. Index could still be done

Last edited 9 years ago by chriscct7 (previous) (diff)

#11 @ocean90
9 years ago

#35909 was marked as a duplicate.

#12 in reply to: ↑ 4 @ocean90
9 years ago

Replying to DrewAPicture:

The Allow Multiple Accounts plugin by @coffee2code has been a popular choice for years for allowing multiple accounts with the same email. So it's not really unheard of.

That's an interesting plugin. I'm curious if any of the sites are using a persistent object cache because there is a useremail cache group which gets used by WP_User::get_data_by().

cc: @pento

@ocean90
9 years ago

#13 @pento
9 years ago

  • Keywords commit added; needs-refresh removed
  • Milestone changed from Future Release to 4.5

WP.com has a UNIQUE index on user_email, but I'm inclined to agree that we can't practically do that for all sites. Also, enforcing UNIQUE keys has burned us in the past when we later decide to make something non-UNIQUE.

The performance difference between UNIQUE and non-UNIQUE indexes is negligible, so let's commit 33376.2.patch (with $wp_db_version bump, of course).

#14 @ocean90
9 years ago

  • Resolution set to fixed
  • Status changed from assigned to closed

In 36654:

Schema: Add an index to wp_users.user_email.

Improves lookup of an email address on large user tables.

See #9568.
Fixes #33376.

Note: See TracTickets for help on using tickets.