Make WordPress Core

Opened 14 years ago

Last modified 6 years ago

#15706 reviewing enhancement

Allow wildcarded domains in multisite limited email domains

Reported by: djcp's profile djcp Owned by: westi's profile westi
Milestone: Priority: normal
Severity: normal Version:
Component: Login and Registration Keywords: needs-patch
Focuses: multisite Cc:

Description

Here at blogs.law.harvard.edu, we want to allow all harvard.edu subdomains to create blogs in our multisite install. There are hundreds of domains and it would be difficult to get a complete list because of the complexity of our DNS infrastructure.

I propose allowing the inclusion of a single prefix wildcard character in the limited email domains feature. If a limited email domain contains a "*", we would create a regex and match that specific entry via a wildcard. So "*.harvard.edu" would match "cyber.law.harvard.edu", "fas.harvard.edu", etc. To match the root TLD, you'd just manually enter "harvard.edu".

We have a variant of this applied as a core hack to our wordpress install at http://blogs.law.harvard.edu and it's been working fine for years. I will package it up as a patch if there's interest. Thoughts?

I don't think it'd make sense to allow embedded wildcards (dom*ain.org).

Attachments (11)

multisite-fuzzy-limited-domains-15706.patch (1.9 KB) - added by djcp 14 years ago.
15706.patch (1.2 KB) - added by SergeyBiryukov 13 years ago.
15706.diff (821 bytes) - added by markjaquith 13 years ago.
15706.2.patch (1.0 KB) - added by SergeyBiryukov 13 years ago.
15706.3.patch (4.8 KB) - added by boonebgorges 11 years ago.
15706.3.unit-tests.patch (2.7 KB) - added by boonebgorges 11 years ago.
15706.4.diff (8.7 KB) - added by jeremyfelt 11 years ago.
15706.5.diff (9.3 KB) - added by jeremyfelt 11 years ago.
15706.2.diff (9.1 KB) - added by DrewAPicture 11 years ago.
docs fixes
15706.3.diff (9.2 KB) - added by helen 11 years ago.
15706.4.patch (9.4 KB) - added by boonebgorges 11 years ago.

Download all attachments as: .zip

Change History (51)

#1 @nacin
14 years ago

  • Keywords needs-patch added
  • Milestone changed from Awaiting Review to Future Release

Seems sensible. A patch would be great.

#2 follow-up: @djcp
14 years ago

  • Keywords has-patch added; needs-patch removed

So if a limited domain begins with "*.", we nick off those characters and check it against the right side of the user's email domain. If a limited domain doesn't begin with "*.", we just check it normally while iterating through the limited domains.

This should be fully backwards compatible, we've just expanded out the in_array to inspect each limited_domain value, making those prefixed with "*." match their subdomains.

So here at Harvard, our allowed domain list looks like:

harvard.edu
hbs.edu
radcliffe.edu
*.harvard.edu
*.hbs.edu
*.radcliffe.edu

It's important that we keep wildcarding to subdomains and not actual domain names. Otherwise, if ibm.com was using this feature with a wildcard thusly:

*ibm.com

I could register "notibm.com" and exploit their multisite install. Since we require wildcarding on the subdomain level ("*.ibm.com"), that's not possible.

I suppose you could do something stupid like enter "*.com" and allow anyone with a .com address to register in your multisite install, but c'mon. You can't fix stupid.

I also added the "*" to the limited_email_domain regex in wp-admin/network/edit.php.

#3 in reply to: ↑ 2 @nacin
14 years ago

Replying to djcp:

I suppose you could do something stupid like enter "*.com" and allow anyone with a .com address to register in your multisite install, but c'mon. You can't fix stupid.

:-) I like your thinking. (That said, I see a legitimate use case in *.gov or *.edu for certain themed networks, for example.)

We use tabs, not spaces, and if you can re-upload it with a diff or patch extension, it'll be presented better in Trac.

#4 @djcp
14 years ago

Done, sorry about that. I can't see a way to remove the old .txt file (not that familiar with trac).

#5 @djcp
14 years ago

  • Cc djcp added

#6 follow-up: @nacin
14 years ago

  • Keywords 3.2-early added

Cool. (Not a problem, also. I killed the attachment.)

Patch looks great, at a glance. All I see are coding standards improvements to be addressed. Marking for 3.2-early.

  • foreach ( and if ( rather than foreach( and if(. Same goes for opening braces, space before.
  • substr( $limited_domain, 0, 2 ) == '*.' should probably be 0 === strpos( $limited_domain, '*.' ).
  • The direct domain match can be made into an elseif, instead of the nested if.
  • Once a match is found, you should break; out of the foreach; no need to keep checking.
  • No need for the braces around the one-line if at the bottom.
  • $matched_domain == false should be ! $matched_domain.

#7 follow-ups: @andrea_r
14 years ago

  • Cc andrea_r added

+1, yes please. As an extra, anyway you could make the same in the DISallowed email field? So super admins can block all *.info or *.ru domains from signing up (for example).

#8 in reply to: ↑ 6 ; follow-up: @djcp
14 years ago

Replying to nacin:

Cool. (Not a problem, also. I killed the attachment.)

Patch looks great, at a glance. All I see are coding standards improvements to be addressed. Marking for 3.2-early.

As this is my first contribution to the wordpress core, I'm a little unsure of standards. Sorry! Should I make those change and re-upload?

#9 in reply to: ↑ 7 @djcp
14 years ago

Replying to andrea_r:

+1, yes please. As an extra, anyway you could make the same in the DISallowed email field? So super admins can block all *.info or *.ru domains from signing up (for example).

That seems like a fine idea but my time is a little tight right now.

#10 in reply to: ↑ 8 @nacin
14 years ago

Replying to djcp:

Replying to nacin:
As this is my first contribution to the wordpress core, I'm a little unsure of standards. Sorry! Should I make those change and re-upload?

Not a problem. Congrats on your first patch :-) If you can do that, that'd be great. Otherwise, I'll clean it up in about a month when 3.2 development gets underway.

If there's no patch on this ticket come 3.2 for that Andrea mentions, I'll port this over and get it handled there as well.

#11 @PeteMall
14 years ago

Replying to djcp:

Replying to nacin:

Cool. (Not a problem, also. I killed the attachment.)

Patch looks great, at a glance. All I see are coding standards improvements to be addressed. Marking for 3.2-early.

As this is my first contribution to the wordpress core, I'm a little unsure of standards. Sorry! Should I make those change and re-upload?

WordPress Coding Standards

You don't have to fix it for this patch. Patch looks good for the 3.2 cycle.

#12 @westi
14 years ago

  • Cc westi added

I wonder how many people use the limited email domains feature - didn't even realise it existed.

I also wonder if this would not be better served by better hooks and a plugin?

#13 @djcp
14 years ago

I am a sad panda. This didn't make it into 3.2 beta 1, and I think this has real utility for academia and any large institution that wants to create easy to manage multisite installs. Any chance this could get reviewed? It's highly unlikely to cause any backwards compatibility issues.

Last edited 14 years ago by djcp (previous) (diff)

#14 @SergeyBiryukov
13 years ago

  • Keywords 3.2-early removed
  • Milestone changed from Future Release to 3.3

Refreshed for 3.3, adjusted as per nacin's comments.

#15 in reply to: ↑ 7 @SergeyBiryukov
13 years ago

Replying to andrea_r:

As an extra, anyway you could make the same in the DISallowed email field? So super admins can block all *.info or *.ru domains from signing up (for example).

is_email_address_unsafe() seems to support regular expressions:

if (
	strstr( $email_domain, $banned_domain ) ||
	(
		strstr( $banned_domain, '/' ) &&
		preg_match( $banned_domain, $email_domain )
	)
)
return true;

According to comments on #14695, it's possible to use something like /.*\.info/.

#16 @jane
13 years ago

  • Keywords dev-feedback added

We're at freeze for 3.3.

Leads/committers, please weigh in on commit vs punt.

As always an incremental improvement over the course of several releases is better than not improving anything until something can be made perfect 3-4 releases from now. If this patch is good, let's get it in? If not, please leave specific feedback about what makes it not commit-worthy so the contributors can learn and/or revise for next round.

#17 @westi
13 years ago

I still think a better solution here is a filter which allows for a plugin to implement this or any other rule for allowed domains rather than complicating the core code in this case.

We should just augment this line of code with an extra apply_filters call:

if ( in_array( $emaildomain, $limited_email_domains ) == false )

@markjaquith
13 years ago

#18 @markjaquith
13 years ago

  • Milestone changed from 3.3 to Future Release
  • Owner set to westi
  • Status changed from new to reviewing

Tend to agree that a filter would be better. A filter on the response, that passes in the emaildomain. That can do all the wildcard regexes it wants.

Patch uploaded with suggested filter solution.

#19 @SergeyBiryukov
13 years ago

Since "Banned Email Domains" field already supports regexps, wouldn't it make sense to do the same in "Limited Email Registrations" for consistency? Done in 15706.2.patch.

#20 @boonebgorges
13 years ago

  • Cc boonebgorges@… added

#21 @holizz
13 years ago

  • Cc tom@… added

#24 @boonebgorges
11 years ago

  • Keywords needs-refresh added; has-patch removed

Patch will need refreshing, because sanitization changes introduced in r21993 (#21552) mean that items containing asterisks are now stripped when the fields are saved.

#25 @boonebgorges
11 years ago

  • Keywords has-patch added; needs-refresh removed

15706.3.patch is a refresh against trunk that does the following (some overlap with #21730):

  • Introduces is_email_domain_in_list(), a utility function for checking whether an email passes a domain whitelist/blacklist check. It supports wildcard URL chunks using *.
  • Refactors is_email_address_unsafe() to use is_email_domain_in_list()
  • Introduces is_email_address_allowed() to parallel is_email_address_unsafe()

15706.3.unit-tests.patch provides unit tests for the new wildcard functionality. Everything passes.

#26 @SergeyBiryukov
11 years ago

  • Milestone changed from Future Release to 3.7

@jeremyfelt
11 years ago

#27 @jeremyfelt
11 years ago

15706.4.diff does a few things:

  • Merges Boone's 2 patches into one for develop.svn
  • Accounts for the changes in [25197] that added case insensitivity to the email check
  • Removes unused local $limited_email_domains left over after introduction of is_email_address_allowed()

Just realized it still needs docs on the filters as well.

#28 @DrewAPicture
11 years ago

  • Keywords needs-docs added

@jeremyfelt
11 years ago

#29 @jeremyfelt
11 years ago

  • Keywords needs-docs removed

15706.5.diff adds docs to the previous patch.

@DrewAPicture
11 years ago

docs fixes

#30 @DrewAPicture
11 years ago

15706.2.diff fixes up some missing periods in the docs and other things.

@helen
11 years ago

#31 @helen
11 years ago

Patch works for me, tests come out as expected before and after patch, docs look good. I tweaked a little in 15706.3.diff to add @ticket annotations to the tests and remove one whitespace change that seemed accidental.

#32 @nacin
11 years ago

  • Keywords commit added; dev-feedback removed

I've asked westi to look this over.

#33 @nacin
11 years ago

So I think the only objection I have to this is the function naming. is_email_domain_in_list() is okay, but not great. It receives a full email address, but its name suggests it is not just checking only the email's domain, but also only receives the domain. I think we might be able to generalize it further. What about is_domain_in_list() that is allowed to receive an email address? If @ exists, we explode on it, otherwise, we just assume the whole string is the email. The reason for this is it could be potentially useful when whitelisting domains in the HTTP API, such as safe redirection and such. It would be nice to be able to do is_domain_in_list( 'make.wordpress.org', array( '*.wordpress.org', '*.wordcamp.org' ) ).

I don't actually care much about that, as much as I care that these are opposites:

  • is_email_address_allowed(), which searches limited_email_domains
  • is_email_address_unsafe(), which searches banned email domains

"unsafe" in is_email_address_unsafe() implies security, which is wrong. "allowed" does not strongly imply whitelisted.

The UI isn't even well thought out. It names the settings "Limited Email Registrations" and "Banned Email Domains", as if they are two unrelated settings, versus being direct opposites and more or less mutually exclusive? (I guess you could allow a domain but block a particular subdomain, like allowing *.school.edu (professors/administrators) but blocking students.school.edu.)

What about is_email_address_whitelisted() and is_email_address_blacklisted()? Should these be multisite-specific? ms_email_whitelisted() ms_email_blacklisted()?

#34 @boonebgorges
11 years ago

What about is_domain_in_list() that is allowed to receive an email address? If @ exists, we explode on it, otherwise, we just assume the whole string is the email.

I like that idea. See 15706.4.patch, where I also pare down some of the new testcases that shouldn't have been there.

As for the function names, I agree that they're not good. is_email_address_unsafe() is, as you note, highly misleading, but it seemed beyond the scope of this ticket to change something that's already working as intended. The newly proposed is_email_address_allowed() just parallels the naming convention, which comes from the name of the saved site option. I also agree that the UX is not great, though I don't have any bright ideas about how to make it much better.

In any case, I hope that the peripheral crumminess of existing UX and function names, etc, doesn't get in the way of getting the current enhancement - wildcard domains in the white/blacklists - into 3.7. It seems to me that the UX stuff could be handled in another go-round, independent of the specific fix being considered in this ticket.

http://core.trac.wordpress.org/attachment/ticket/15706/15706.4.patch

#35 @nacin
11 years ago

I guess I don't quite see "allowed" and "unsafe" being parallel. Maybe I'm wrong? I guess "safe" would make more sense, but that's not good either because it isn't strictly an inverse.

Maybe we go with whatever our ideal function name is for this new function? (And leave the current unsafe() function alone.) Any good ideas? Unless we don't actually know what an ideal function name is, in which case "allowed" is fine. But I will note the option names are "limited" and "banned", not "allowed" and "unsafe".

#36 @boonebgorges
11 years ago

But I will note the option names are "limited" and "banned", not "allowed" and "unsafe".

Ah right, I was misremembering. I guess I made up "allowed".

The clearest terms to me are 'whitelist' and 'blacklist', so I guess I'd lean toward is_email_address_whitelisted().

#37 @jeremyfelt
11 years ago

I'm a fan of _whitelist and _blacklist. The change to is_domain_in_list() is appealing too.

#38 @nacin
11 years ago

  • Milestone changed from 3.7 to Future Release

I started to break this down by removing is_email_address_allowed() all together, and using is_domain_in_list() directly in the validate signup function. I then realized that wildcards were actually useless here, because the logic that handles banned domains (and is now in is_domain_in_list()) already blocks subdomains when a higher level domain is blocked. Limited email domains, however, do not currently have this logic. So this is a code problem *and* a UX problem. There is just too much weird stuff going on here, need to move this out of 3.7.

http://irclogs.wordpress.org/chanlog.php?channel=wordpress-dev&day=2013-09-27&sort=asc#m697332

#39 @jeremyfelt
11 years ago

  • Component changed from Multisite to Login and Registration
  • Focuses multisite added
  • Keywords commit removed

#40 @chriscct7
9 years ago

  • Keywords needs-patch added; has-patch removed

Needs improved patch. See comment:38

Note: See TracTickets for help on using tickets.