
Ability to match text based on a negative lookbehind/lookahead regex
Closed, Invalid (Public)

Description

Hi,

AbuseFilter needs a feature that provides a unique code to be included, as a comment such as <!--exceptioncode-->, next to a given string in an article.

Sub-feature: the exception keyword (or a word chosen in the wiki's language) should be standard, and the software should suggest a randomly generated code to be added next to the word, e.g. <!--exceptionThecode-->

Reason: In certain contexts, certain abusive words remain encyclopedic. For example, in the article http://en.wikipedia.org/wiki/13_June, in the 1934 entry about the meeting between Hitler and Mussolini, certain abusive words do have encyclopedic value according to historians. So we may need to allow an exception at that place only and nowhere else. This will save work for abuse filter managers and patrollers.

Thanks


Version: unspecified
Severity: enhancement

Details

Reference
bz47495

Event Timeline

bzimport raised the priority of this task from to Lowest. Nov 22 2014, 1:25 AM
bzimport added a project: AbuseFilter.
bzimport set Reference to bz47495.
bzimport added a subscriber: Unknown Object (MLST).

You can already exclude pages containing a string, very easily. Are you saying that the kind of regular expression you're looking for doesn't work with AbuseFilter? (I don't remember exactly what subset of regex we use.)

Anyway, your approach seems wrong: you can't determine context reliably with regex. You should tag edits that add words you don't like and then check them manually.

(In reply to comment #1)

You can already exclude pages containing a string, very easily. Are you saying that the kind of regular expression you're looking for doesn't work with AbuseFilter? (I don't remember exactly what subset of regex we use.)

Anyway, your approach seems wrong: you can't determine context reliably with regex. You should tag edits that add words you don't like and then check them manually.

:The logic is that the exemption is needed only on a particular line of a particular page. Use anywhere else is abuse and can be disallowed straight away.

:This will increase work for the abuse filter only initially. Once people know they cannot cheat the system easily, they will automatically desist. Once people desist, there is less work for filters, filter managers and patrollers.

:Tagging only continues the patrollers' work. Wikis like en wiki have large manpower available to do that. Most of the wikis under the Wikimedia umbrella are smaller wikis with less manpower. If we save time on patrolling, our editors will have more time at their disposal for content contribution.

Mahitgar, you can achieve that easily by changing the condition to

( ( YOURCONDITION ) & article_text != '13 June' )
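
To make the scope of that condition concrete, here is a minimal Python model of it. It is only a sketch: it assumes `article_text` holds the page title, and the function name and parameters are hypothetical, not part of AbuseFilter.

```python
def filter_triggers(your_condition: bool, page_title: str,
                    exempt_title: str = "13 June") -> bool:
    """Hypothetical model of ( ( YOURCONDITION ) & article_text != '13 June' ).

    Assumes article_text is the page title. The filter fires only when the
    original condition holds AND the page is not the exempt one.
    """
    return your_condition and page_title != exempt_title


# Every edit to the exempt page passes, whatever line it touches.
print(filter_triggers(True, "13 June"))   # False
print(filter_triggers(True, "14 June"))   # True
```

Note that the exemption in this model is keyed on the page title alone, so it applies to the whole page rather than to one line of it.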

(In reply to comment #3)

Mahitgar, you can achieve that easily by changing the condition to

( ( YOURCONDITION ) & article_text != '13 June' )

Yes, I know, but this exempts the complete page.

Thanks

Mahitgar set Security to None.

One can already write a filter that checks for something like

("word" in added_lines) & ! ("<!-- hidden tag -->word" in added_lines)

and similar constructions, where editors agree in advance to use a particular marker to indicate that a word or phrase is okay. That seems like a problem for filter writers / article editors to decide and implement. Beyond that, I'm not sure what the actionable request is here.
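
For comparison, the negative-lookbehind idea from the task title can be sketched in Python's `re` module. This is an illustration only, not AbuseFilter syntax: the marker text, the function names, and the sample strings are assumptions, and this thread does not confirm whether AbuseFilter's regex operators accept the same construct.

```python
import re

# Hypothetical marker that editors would agree on in advance.
MARKER = "<!-- hidden tag -->"


def build_pattern(word: str) -> re.Pattern:
    """Match `word` only where it is NOT immediately preceded by MARKER."""
    return re.compile(
        r"(?<!" + re.escape(MARKER) + r")\b" + re.escape(word) + r"\b",
        re.IGNORECASE,
    )


def has_unmarked(word: str, added_lines: str) -> bool:
    """Return True if at least one occurrence of `word` lacks the marker."""
    return build_pattern(word).search(added_lines) is not None


print(has_unmarked("word", "a word here"))                      # True
print(has_unmarked("word", "a <!-- hidden tag -->word here"))   # False
print(has_unmarked("word", "<!-- hidden tag -->word and word")) # True
```

The last call shows the difference from the `in`-based condition: the lookbehind is evaluated per occurrence, so an unmarked use is still caught when a marked use appears in the same added text, whereas the `in`-based condition would let that edit through because the marked form is present.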

Nemo_bis claimed this task.

Beyond that, I'm not sure what the actionable request is here.

Thanks for looking, let's mark this closed. Can be reopened if someone specifies an actionable request (like a missing standard regex feature).