
Denote diffs complexity with edit distance rather than bytecount change
Open, Lowest, Public, Feature

Description

Author: gschriss

Description:

As a user scanning recentchanges/watchlist or history/contributions, I want a concise indicator of how much an edit changed, so that I can navigate diffs more easily. The bytecount is too weak an indicator; a standard metric is edit distance. https://en.wikipedia.org/wiki/Edit_distance

As an extension to Bug 1085 ('Display size of changes every line in
Special:Recentchanges'), change the way page character counts are displayed to
reflect the type of edit made. The current system obscures whether the edit was
a simple text addition/removal or if existing text was altered. Examples:

Before: ...end of sentence.
After: ...end of sentence. This is a sentence just added...
Character change: (+30)
Simple text addition.

Before: ...end of sentence. Some dubious statement...
After: ...end of sentence.
Character change: (-30)
Simple text removal.

Before: ...some tpyo...
After: ...some typo...
Character change: (0)
Complex change

Before: ...the man is a laureate...
After: ...the guy is an a__hole...
Character change: (0)
Complex change with edit summary: 'fixed typo'
Looks like nothing really happened
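
A minimal Python sketch of how the two metrics differ on the last two examples above. The function names `levenshtein` and `byte_delta` are illustrative assumptions, not MediaWiki code, and Levenshtein distance is only one of several possible edit distance definitions.

```python
def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance: fewest single-character inserts, deletes, substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def byte_delta(old: str, new: str) -> int:
    """Signed change in UTF-8 byte length, as a bytecount indicator would show."""
    return len(new.encode("utf-8")) - len(old.encode("utf-8"))


examples = [
    ("...some tpyo...", "...some typo..."),
    ("...the man is a laureate...", "...the guy is an a__hole..."),
]
for old, new in examples:
    print(f"bytes {byte_delta(old, new):+d}, edit distance {levenshtein(old, new)}")
```

Both pairs have a byte delta of 0, yet the typo fix has an edit distance of 2 while the second pair has a larger one, which is the information the bytecount hides.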

One possibility would be to designate (+/-N) for simple changes, ((+N, -N)) for
multiple simple changes, <+/-N> for complex changes, and <<+/-N>> for multiple
complex changes, aka article rewrites. Or something similar that gives
article reviewers more information at a glance.
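
One way the proposed notation could be derived is sketched below in Python. This is only one interpretation of the proposal: the name `change_marker` is hypothetical, and treating a pure insertion or deletion as "simple", a replacement as "complex", and more than one changed region as "multiple" are all assumptions. Sizes are counted in characters rather than bytes.

```python
import re
from difflib import SequenceMatcher


def change_marker(old: str, new: str) -> str:
    """Return a bracketed size marker in the proposed style (one interpretation)."""
    # Split into runs of whitespace and non-whitespace so token lengths add up exactly.
    old_tokens = re.findall(r"\s+|\S+", old)
    new_tokens = re.findall(r"\s+|\S+", new)
    matcher = SequenceMatcher(None, old_tokens, new_tokens, autojunk=False)
    ops = [op for op in matcher.get_opcodes() if op[0] != "equal"]
    if not ops:
        return "(0)"

    added = sum(len("".join(new_tokens[j1:j2])) for _, _, _, j1, j2 in ops)
    removed = sum(len("".join(old_tokens[i1:i2])) for _, i1, i2, _, _ in ops)
    net = added - removed
    is_complex = any(tag == "replace" for tag, *_ in ops)  # existing text was altered
    is_multiple = len(ops) > 1                             # more than one changed region

    if is_complex:
        return f"<<{net:+d}>>" if is_multiple else f"<{net:+d}>"
    if is_multiple:
        return f"((+{added}, -{removed}))"
    return f"({net:+d})"


print(change_marker("end of sentence.", "end of sentence. This is new."))  # (+13)
print(change_marker("some tpyo", "some typo"))                             # <+0>
```

Working on word tokens rather than single characters keeps a swapped-letter typo classified as one altered word instead of a separate insertion and deletion.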

Thanks, George
[[User:GChriss]]


Version: 1.24rc
Severity: enhancement

Details

Reference
bz8571

Related Objects

Event Timeline

bzimport raised the priority of this task from to Lowest. Nov 21 2014, 9:31 PM
bzimport set Reference to bz8571.
bzimport added a subscriber: Unknown Object (MLST).

Why would editors need this? Editors look for good or bad edits, but these different types of "complex" diffs cannot make that distinction. Besides, it would take up a lot of extra space in the history.

I believe this is a wontfix.

Nemo_bis renamed this task from "Page character counts: denote simple vs. complex changes" to "Denote diffs complexity with edit distance rather than bytecount change". Jan 16 2015, 9:50 PM
Nemo_bis updated the task description.
Nemo_bis set Security to None.

This is a very reasonable request (now that I've extracted its spirit and an actionable item), but it's going to be rather hard. I think edit distance would be the easiest option, but it would still require a schema change.

Other possibilities could be:

  • visual diff percentage (there is some on/off experimentation in this area),
  • HTML output bytecount difference/edit distance.

AbuseFilter already has such things (added/removed lines, old vs. new html); maybe one of these could be added there for a start (for the anti-vandalism side).
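
As a rough illustration of the added/removed lines idea, here is a Python sketch using the standard library's `difflib`. It is not AbuseFilter's implementation; the function name `line_change_counts` is hypothetical, and counting a modified line as one removal plus one addition is an assumption.

```python
from difflib import SequenceMatcher


def line_change_counts(old_text: str, new_text: str) -> tuple[int, int]:
    """Return (added, removed) line counts between two revisions of a page."""
    old_lines = old_text.splitlines()
    new_lines = new_text.splitlines()
    added = removed = 0
    matcher = SequenceMatcher(None, old_lines, new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed += i2 - i1  # lines present only in the old revision
        if tag in ("replace", "insert"):
            added += j2 - j1    # lines present only in the new revision
    return added, removed


# One line altered and one line appended:
print(line_change_counts("a\nb\nc", "a\nB\nc\nd"))  # (2, 1)
```

Unlike a single signed bytecount, the pair of counts stays non-zero when text is replaced by other text of the same length.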

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:02 AM
Aklapper removed a subscriber: wikibugs-l-list.
In T10571#130430, @Qgil wrote:

Why would editors need this? Editors look for good or bad edits, but these different types of "complex" diffs cannot make that distinction. Besides, it would take up a lot of extra space in the history.

I believe this is a wontfix.

I know I'm replying to a very old comment, but editors need this to see if anything's actually changed on the page. If I see a page with two edits and a bytecount change of 0, that could either be vandalism that's already been reverted, in which case I can ignore it, or it could be two separate edits that resulted in an article of the same length, in which case I would need to check to see if it's a good edit or a bad edit.

That would be so much more informative!

The byte size difference doesn't tell you much about an edit in most day-to-day work. A few changed bytes, for example, may indicate either typo fixes or a fundamental rewrite of the lede. Displaying the edit distance isn't going to be a silver bullet, but it would be a massive improvement.