Invalid values of added_lines, removed_lines, edit_diff
Closed, Resolved (Public)

Description

Author: beau

Description:
Something is wrong with AbuseFilter's diff function. In the edit at the provided URL, a bot added one interwiki link (one line), but from AbuseFilter's point of view it looks as if the whole content of the page was replaced.


Version: unspecified
Severity: major
URL: http://pl.wikipedia.org/w/index.php?title=Specjalna:Rejestr_nadu%C5%BCy%C4%87&details=14746

Details

Reference
bz20310

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 10:54 PM
bzimport added a project: AbuseFilter.
bzimport set Reference to bz20310.
bzimport added a subscriber: Unknown Object (MLST).

beau wrote:

The diff shown at the top of the page and the value of edit_diff do not match.

alexsm333 wrote:

Sometimes added_lines and removed_lines both contain one unchanged line from above the edit. (As a result, the edit_diff variable is wrong as well.)

Example: [[Special:AbuseFilter/examine/327603336]]; diff: http://en.wikipedia.org/w/index.php?diff=318457689

This actually happens quite a lot (at least on ru.wp), leading to false positives and forcing me to make extra checks on old_wikitext. Bumping severity to major.

I don't have access to the URL in beau's post, but I suspect it's the same behaviour.

alexsm333 wrote:

Update: it looks like the unchanged line above is always included in added_lines and removed_lines whenever a user makes changes at the end of the page.

Although AbuseFilter might technically be right (a linebreak \n is indeed added to or removed from the "unchanged line"), this is clearly not the behaviour filter maintainers expect.
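
A minimal sketch of the suspected effect, written in Python rather than taken from AbuseFilter's code. The function `changed_lines` and its `keep_line_endings` flag are hypothetical names used only for illustration, and the sketch assumes the diff compares lines together with their terminators. Under that assumption, appending a line to a page that has no trailing newline turns the last line from "Last line" into "Last line\n", so it is reported as both removed and added.

```python
import difflib


def changed_lines(old_text, new_text, keep_line_endings):
    """Return (added_lines, removed_lines) from a line-based diff.

    Hypothetical helper, not AbuseFilter's implementation. With
    keep_line_endings=True each line keeps its terminator, so "foo"
    and "foo\n" compare as different lines.
    """
    old_lines = old_text.splitlines(keepends=keep_line_endings)
    new_lines = new_text.splitlines(keepends=keep_line_endings)
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    added, removed = [], []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(old_lines[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(new_lines[j1:j2])
    return added, removed


# A page without a trailing newline; the edit appends one interwiki link.
old = "First line\nLast line"
new = "First line\nLast line\n[[en:Example]]"

# Terminators kept: the untouched last line appears in both lists.
added_raw, removed_raw = changed_lines(old, new, keep_line_endings=True)
print(added_raw)    # ['Last line\n', '[[en:Example]]']
print(removed_raw)  # ['Last line']

# Terminators dropped: only the genuinely new line is reported.
added_norm, removed_norm = changed_lines(old, new, keep_line_endings=False)
print(added_norm)    # ['[[en:Example]]']
print(removed_norm)  # []
```

Comparing lines without their terminators reports only the new interwiki link, which is probably closer to what filter maintainers expect. Whether the eventual fix works this way is not stated in this report.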

Bug 20851 looks similar to this one.

Seems to be the same as bug 19716.

beau wrote:

*** This bug has been marked as a duplicate of bug 19716 ***

beau wrote:

Bug 19716 is about a different issue.

beau wrote:

*** Bug 20851 has been marked as a duplicate of this bug. ***

beau wrote:

*** Bug 26433 has been marked as a duplicate of this bug. ***

beau wrote:

I have submitted gerrit #6090 for review.

*** Bug 23304 has been marked as a duplicate of this bug. ***