Abuse filters can be fooled by using U+200B ZERO WIDTH SPACE (ccnorm doesn't remove/normalize them)
Closed, Resolved (Public)

Assigned To

None

Authored By

	He7d3r
	Feb 28 2014, 12:17 PM

Description

As you can check on
https://test.wikipedia.org/wiki/Special:AbuseFilter/tools
ccnorm("BAD")!==ccnorm("BAD")
where the first string has just 3 characters and the second one has a few invisible characters inside it.

Therefore, anyone can fool abuse filters that try to block offenses, bad words, etc. simply by inserting invisible characters into the text.
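For illustration only, the bypass can be reproduced outside MediaWiki with a rough Python sketch. The `naive_norm` and `filter_matches` functions below are hypothetical stand-ins, not the real ccnorm or AbuseFilter code; the sketch assumes a normalizer that folds case and a few look-alike characters but knows nothing about zero-width characters.

```python
ZWSP = "\u200b"  # U+200B ZERO WIDTH SPACE

def naive_norm(text: str) -> str:
    """Hypothetical normalizer: upper-cases and folds a few look-alikes.

    It is NOT the real ccnorm; it only mimics the relevant gap, namely
    that zero-width characters pass through untouched.
    """
    lookalikes = str.maketrans({"4": "A", "@": "A", "8": "B", "0": "O"})
    return text.upper().translate(lookalikes)

def filter_matches(text: str, badword: str = "BAD") -> bool:
    """Hypothetical filter: substring match on the normalized text."""
    return badword in naive_norm(text)

plain = "bad"
disguised = "b" + ZWSP + "a" + ZWSP + "d"  # renders the same as "bad"

print(len(plain), len(disguised))   # 3 5
print(filter_matches(plain))        # True
print(filter_matches(disguised))    # False
```

The two strings look identical when rendered, but the second one contains two extra code points, so the substring match fails.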

Version: unspecified
Severity: normal

Details

Reference: bz62049

Event Timeline

• bzimport raised the priority of this task from to Medium. Nov 22 2014, 3:03 AM

• bzimport added a project: AntiSpoof.

• bzimport set Reference to bz62049.

• bzimport added a subscriber: Unknown Object (MLST).

He7d3r created this task. Feb 28 2014, 12:17 PM

To fix this, we would either need to add these characters to AntiSpoof's maintenance/equivset.in (and make them normalize to an empty string) or, if that's not possible or desired, we could extend our own ccnorm function.
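As a rough sketch of the first option (in Python, not the actual AntiSpoof code or the real equivset.in format), mapping the character to an empty string in an equivalence table would look something like this. The `EQUIV` table and `norm_with_equivset` are hypothetical names used only for illustration.

```python
# Hypothetical equivalence table: each key is rewritten to its value
# before strings are compared. Only the first entry reflects the change
# proposed here; the others are made-up examples of ordinary mappings.
EQUIV = {
    "\u200b": "",  # U+200B ZERO WIDTH SPACE -> empty string
    "4": "A",
    "@": "A",
}

def norm_with_equivset(text: str) -> str:
    """Upper-case the text, then apply the table character by character."""
    return "".join(EQUIV.get(ch, ch) for ch in text.upper())

plain = "bad"
disguised = "b\u200ba\u200bd"

print(norm_with_equivset(plain) == norm_with_equivset(disguised))  # True
```

With the empty-string mapping in place, both inputs normalize to the same three-character string, so a filter comparing normalized text would no longer be fooled by this particular character.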

Seems like AntiSpoof would be the right place for this.

Change 117640 had a related patch set uploaded by Hoo man:
Map U+200B (zero width space) to an empty string

https://gerrit.wikimedia.org/r/117640

Change 117640 merged by jenkins-bot:
Map U+200B (zero width space) to an empty string

https://gerrit.wikimedia.org/r/117640

Chris approved my patch.
