
Mask email addresses to discourage casual harvesting
Closed, Resolved, Public

Description

Author: marco

Description:
At the moment, an email address given to MediaWiki as [mailto:CDL-Klever@gmx.net M]
results in the HTML
<a href="mailto:CDL-Klever@gmx.net" class='external text'
title="mailto:CDL-Klever@gmx.net" rel="nofollow">M</a>

As a result, spambots that open a wiki user page (ignoring robots.txt) can parse
the HTML and find the email address. One way to stop them from gathering
Wikipedians' addresses would be to character-mask the link in <a href="(.*)">.
Is this possible?
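
A minimal sketch of what such a character mask could look like, assuming "charmask" means writing every character of the href as a decimal HTML character reference. The function names are hypothetical and this is not MediaWiki code. The last line shows that one standard-library call recovers the address.

```python
import html


def charmask(text: str) -> str:
    """Encode every character as a decimal HTML character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)


def masked_mailto_link(address: str, label: str) -> str:
    """Build a mailto link whose href contains only character references."""
    href = charmask(f"mailto:{address}")
    return f'<a href="{href}" class="external text" rel="nofollow">{html.escape(label)}</a>'


link = masked_mailto_link("CDL-Klever@gmx.net", "M")

# Browsers decode the references transparently, and so can any parser:
decoded = html.unescape(charmask("mailto:CDL-Klever@gmx.net"))
```

The plain address no longer appears in the page source, but the masking is only as strong as a bot's unwillingness to decode character references.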


Version: unspecified
Severity: normal

Details

Reference
bz5911

Event Timeline

bzimport raised the priority of this task from to Low. Nov 21 2014, 9:13 PM
bzimport set Reference to bz5911.
bzimport added a subscriber: Unknown Object (MLST).

I'd guess that most spam bots are smart enough to decode character references
these days, so I think this is not such a good idea. There are alternatives
(rendering the address into an image, using JavaScript to decode a scrambled
address, simply inserting junk, etc.), but all of them work only to some extent
and have disadvantages for some people, especially blind users.

I would suggest implementing a hook for email obfuscation. That way, it would
be simple to write extensions for all kinds of address obfuscation / protection.

robchur wrote:

Of course, the solution to not making your email address available to the
public...is to not publish it on a public web site which anyone can read.

I like the idea of a Unicode character that looks like an @ but is not the 0x40
character, so bots see something different.
Users won't notice it, but they may have problems if they copy and paste the
address into their mail client. Is there any workaround for that?

Normal: @
Unicode: ﹫

See also http://es.wikipedia.org/wiki/Usuario:Platonides/simbolos
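
One possible answer to the copy-and-paste question, as a sketch: the look-alike shown above is U+FE6B (SMALL COMMERCIAL AT), and Unicode compatibility normalization (NFKC) maps it back to the ordinary @. The function names are hypothetical. Note that NFKC also rewrites other compatibility characters, and that a harvesting bot can apply the same normalization.

```python
import unicodedata

SMALL_COMMERCIAL_AT = "\ufe6b"  # the look-alike character: ﹫


def disguise(address: str) -> str:
    """Swap the ASCII @ (0x40) for the look-alike character."""
    return address.replace("@", SMALL_COMMERCIAL_AT)


def restore(text: str) -> str:
    """NFKC normalization folds U+FE6B back to the ordinary @."""
    return unicodedata.normalize("NFKC", text)


shown = disguise("CDL-Klever@gmx.net")
pasted = restore(shown)
```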

ibloodyhatespam wrote:

I agree with Rob; mail addresses should not be visible. Whatever you add to a
mail address to hinder bots, if it's visible and in significant numbers, it will
be of interest to a spammer to create a conversion filter for.

A better way of handling accounts and mail addresses would be to have user names
and give a user the opportunity to choose mail address visibility themselves
(the same way most wiki projects work). Bugzilla after all will be able to mail
updates regardless of public mail address visibility. I've taken a lot of
precautions to prevent ending up on a spam list over the past ten years, so I'd
quite like to see this fixed before my inbox starts overflowing.

ibloodyhatespam wrote:

You're saying people either shouldn't be submitting bugs here at all, or go
through the bother of creating expiring mail addresses to help track bugs.
I'm assuming here that the vast majority of Bugzilla participants mildly,
possibly even strongly, dislike unsolicited bulk email.
Is the coding to give people a choice in this matter that difficult to implement?

This bug was about obfuscation in MediaWiki. I opened a new bug
for our bugzilla: Bug 9872.

ui2t5v002 wrote:

(In reply to comment #2)

Of course, the solution to not making your email address available to the
public...is to not publish it on a public web site which anyone can read.

(In reply to comment #5)

Wontfix as per comment #2

Wouldn't it be nice if everyone was as technically-savvy as you? But they're not.

Why not do what Google Groups does? All email addresses would be obfuscated automatically, with things like CDL-Klever@gmx.net becoming CDL-Kle...@gmx.net, and the only way to access the actual email would be to pass a CAPTCHA.

Remember that MediaWiki is not just used on Wikipedia.
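
A rough sketch of the truncation step described above, assuming the rule is "drop the last three characters of the local part", which is what the CDL-Kle...@gmx.net example suggests. The function name and the `hidden` parameter are hypothetical, and the CAPTCHA step that would reveal the full address is not shown.

```python
def truncate_local_part(address: str, hidden: int = 3) -> str:
    """Replace the last `hidden` characters of the local part with '...'."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local:
        return address  # not something that looks like an address; leave it alone
    keep = max(len(local) - hidden, 1)  # always show at least one character
    return f"{local[:keep]}...@{domain}"


masked = truncate_local_part("CDL-Klever@gmx.net")
```

Unlike character references or look-alike characters, the hidden part is simply absent from the page, so there is nothing for a bot to decode.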

(In reply to comment #8)

Remember that MediaWiki is not just used on Wikipedia.

Yes, and most wikis have only a few email links in their pages; many even want those few to be clickable.
Anyway, moving to extension requests and marking as fixed, since [[mw:Extension:EmailAddressImage]] was created in the meantime (five years ago, or a week after the last comment, actually).