MediaWiki should do validation of e-mail addresses
Closed, ResolvedPublic

Description

Currently MediaWiki doesn't seem to verify e-mail addresses entered by users, seeing as many e-mails get sent to addresses that don't even have an apostrophe (@). These e-mails get lost or end up in the wrong places without the user ever receiving a warning, as the mails cannot be bounced either.

MediaWiki should be doing a (better) job at verifying e-mail addresses. Ideally it would use a fully RFC 2822-compliant address checker with MX record verification, but we could at least check if the address contains an apostrophe.


Version: unspecified
Severity: minor
See Also:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=11225

Details

Reference
bz22449

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:55 PM
bzimport set Reference to bz22449.

mike.lifeguard+bugs wrote:

(In reply to comment #0)

apostrophe (@)...
at least check if the address contains an apostrophe.

apostrophe = '
at symbol = @

couldn't help myself :(

Haha, right. I guess the Dutch word 'apestaartje' (@) tricked me there. ;)

overlordq wrote:

We[1] did a poor man's e-mail validation which simply split at the @ and checked the host portion for MX/A, as regexes to check for valid addresses can[2] be pretty nasty[3].

1 - http://toolserver.org/~acc/
2 - http://toolserver.org/~overlordq/regex.txt
3 - http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html
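
For illustration, the split-and-resolve approach described above could look roughly like the sketch below. It is not the toolserver code: the function names are made up, and because the Python standard library has no MX lookup, the default resolver only checks for address records (A/AAAA). An MX-aware resolver would need a third-party DNS library and can be passed in through the hypothetical `resolver` parameter.

```python
import socket
from typing import Callable


def host_resolves(host: str) -> bool:
    """Return True if the host has at least one address record (A/AAAA).

    Assumption: MX records are not checked here, since the standard
    library offers no MX lookup.
    """
    try:
        socket.getaddrinfo(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def poor_mans_check(address: str,
                    resolver: Callable[[str], bool] = host_resolves) -> bool:
    """Split at the last '@' and ask the resolver about the host part."""
    local, sep, host = address.rpartition("@")
    if not sep or not local or not host:
        # No '@' at all, or nothing on one side of it.
        return False
    return resolver(host)
```

Passing the resolver in as a parameter keeps the check testable without network access and leaves room for a stricter MX-based lookup.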

Cf. the similar bug 959. As pointed out in comment #33, trying to check for full RFC compliance is both difficult and unnecessary.

However, it's been pointed out that some simple sanity checks (length, obviously bogus characters) could at least be applied.

This would be especially nice since it's very easy to copy and paste an email address for a new user with extraneous markup such as 'Name <name@wikimedia.org>' which upon passwordless account creation will trigger a 'mailing error' and leave the user in a tricky state.
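
A minimal sketch of such clean-up and sanity checks, assuming Python's standard `email.utils.parseaddr` is an acceptable way to strip the display name. The helper names, the 254-character limit and the set of rejected characters are illustrative assumptions, not anything MediaWiki does.

```python
from email.utils import parseaddr

MAX_LENGTH = 254  # assumed upper bound, not taken from this thread


def extract_address(pasted: str) -> str:
    """Pull the bare address out of input such as 'Name <user@example.org>'."""
    _display_name, address = parseaddr(pasted.strip())
    return address


def passes_sanity_checks(address: str) -> bool:
    """Cheap checks only: length, exactly one '@', no obviously bogus characters."""
    if not address or len(address) > MAX_LENGTH:
        return False
    if address.count("@") != 1:
        return False
    # Reject whitespace, control characters, angle brackets and double quotes.
    return not any(
        ch.isspace() or ord(ch) < 32 or ch in '<>"' for ch in address
    )
```

With this, pasted input like 'Name <name@wikimedia.org>' fails the sanity check as-is but passes once `extract_address` has been applied to it.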

So, I looked into this recently for an extension. I've been running for years with a specific regex to check for email validity but it doesn't handle "everything".

The problem is all sorts of exceptions to four billion RFCs (such as allowing "+" characters, etc.). The best I've seen is this: http://www.dominicsayers.com/isemail/ but it is pretty hard core. I'm not sure if doing anything *less* is worthwhile.

r75627 adds a very simple email validator helper. Does not enforce anything though.

ayg wrote:

HTML5 has a good real-world definition:

http://www.whatwg.org/specs/web-apps/current-work/multipage/states-of-the-type-attribute.html#valid-e-mail-address

It works out to something like

"^$username_char+@$domain_char+(\.$domain_char+)+$"

where

$username_char = [-a-zA-Z0-9!#$%&\'*+/=?^_`{|}~.];
$domain_char = [-a-zA-Z0-9]

I did some research on how overly restrictive this definition was in practice, using an older and more restrictive definition:

http://lists.whatwg.org/htdig.cgi/whatwg-whatwg.org/2009-August/022220.html

It looks like it will be fine, at least for English users. Almost none would be affected if we did some sanitization (e.g., to fix up stuff like "Foo" <foo@bar.example>). I've been meaning to do some more testing on the databases to see if other languages might be affected more, but never got around to it. If obviously bogus addresses are causing problems, then we may as well go ahead with it. It's a pretty trivial change to get MediaWiki to reject these on the server side.
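
As a rough illustration, the pattern quoted above can be transcribed into Python as follows. This is only a sketch of that approximation, not the exact HTML5 grammar and not the MediaWiki code; `is_valid_html5_style` is a made-up name. `re.fullmatch` stands in for the `^...$` anchors because `$` in Python would also accept a trailing newline.

```python
import re

# Character classes transcribed from the comment above.
USERNAME_CHAR = r"[-a-zA-Z0-9!#$%&'*+/=?^_`{|}~.]"
DOMAIN_CHAR = r"[-a-zA-Z0-9]"

# username_char+ @ domain_char+ ( . domain_char+ )+
EMAIL_RE = re.compile(rf"{USERNAME_CHAR}+@{DOMAIN_CHAR}+(\.{DOMAIN_CHAR}+)+")


def is_valid_html5_style(address: str) -> bool:
    """True if the whole string matches the approximate pattern."""
    return EMAIL_RE.fullmatch(address) is not None
```

Under this pattern a plain address such as foo@bar.example is accepted, while "Foo" <foo@bar.example> and dot-less hosts such as user@localhost are rejected.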

ayg wrote:

*** Bug 26287 has been marked as a duplicate of this bug. ***

r75876, r75862, r75627 marked as resolved with this bug being tagged upstream and remote bug logged

Marking bug as blocking MediaWiki 1.17 release tarball (bug 26676).

Reedy probably meant r75682, not r75862

Ashar, looks like you still have a FIXME on the code to get it to allow @localhost email addresses. Might only want to allow that where people specifically request it.

Any idea if you can add that in the next week?

It is on my TODO list, hopefully this weekend if time allows.

I have altered the regexp so the second part (the dot-separated domain labels) is 0 to n elements instead of at least 1, which lets @localhost addresses through.

See: r80694
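
A sketch of what that relaxation amounts to, assuming the pattern has the shape quoted earlier in this thread. The actual code in r80694 may differ, and `is_valid_relaxed` is a made-up name. The only change is that the trailing group uses `*` (zero or more) instead of `+` (one or more).

```python
import re

# Character classes as quoted earlier in the thread.
USERNAME_CHAR = r"[-a-zA-Z0-9!#$%&'*+/=?^_`{|}~.]"
DOMAIN_CHAR = r"[-a-zA-Z0-9]"

# The dot-separated part may now repeat zero or more times.
RELAXED_RE = re.compile(rf"{USERNAME_CHAR}+@{DOMAIN_CHAR}+(\.{DOMAIN_CHAR}+)*")


def is_valid_relaxed(address: str) -> bool:
    """True if the whole string matches the relaxed pattern."""
    return RELAXED_RE.fullmatch(address) is not None
```

With the relaxed group, user@localhost is accepted, while an empty host (user@) or a host starting with a dot (user@.example) is still rejected.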

I am assuming this issue is fixed in MediaWiki regardless of the w3.org specification.

Merged in 1.17 by Roan with r80722.

ayg wrote:

As it happens, a matching change was made in the HTML5 specification:

http://www.w3.org/Bugs/Public/show_bug.cgi?id=11225