
Click tracking not associating valid rev_id with some edit_success events.
Closed, Resolved, Public

Description

Relevant to #40775

I'm running into edit_success events that do not have a valid revision_id associated with them. For example, see the last such event in clicktracking.log-20121022.gz:

$ zcat clicktracking.log-20121022.gz | grep "cta_edit-edit_success" | tail -n1
enwiki ext.articleFeedbackv5@10-option6X-cta_edit-edit_success 20121022062446 0 sE35RgbXXhnXHM5rluqKSK4QOCe8ik3w 0 0 0 0 0 Dowry system in India|0

Notice the "Dowry system in India|0" at the end of the event. This portion of the event should contain <title>|<new revision id>, but the <new revision id> == 0.

This bug appears to affect 3731/69933 (about 5%) of the edit_success events captured in my sample. That wouldn't be a big deal on its own, but it affects more than a third of the cta_edit-edit_success events (668/1694, about 39%).
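Counts like these can be reproduced with a short script. The following is a minimal sketch, assuming every line has the layout of the sample event (event name as the second whitespace-separated field, and `<title>|<new revision id>` at the end of the line). The names `parse_event` and `count_missing` are hypothetical helpers, not part of the ClickTracking extension.

```python
def parse_event(line):
    """Return (event_name, rev_id) for one log line, or None if it is malformed.

    Assumes the sample layout: the event name is the second
    whitespace-separated field and the line ends with '<title>|<rev id>'.
    """
    line = line.rstrip("\n")
    fields = line.split()
    # rpartition splits on the LAST '|', so titles containing '|' still work.
    _, sep, rev = line.rpartition("|")
    if len(fields) < 2 or not sep:
        return None
    try:
        return fields[1], int(rev)
    except ValueError:
        return None


def count_missing(lines, needle):
    """Count events whose name contains `needle`.

    Returns (missing, total), where `missing` is the number of matching
    events logged with a revision id of 0.
    """
    missing = 0
    total = 0
    for line in lines:
        parsed = parse_event(line)
        if parsed is None:
            continue
        event, rev_id = parsed
        if needle not in event:
            continue
        total += 1
        if rev_id == 0:
            missing += 1
    return missing, total
```

To run it against the archive, the lines can be read with `gzip.open("clicktracking.log-20121022.gz", "rt")` and passed to `count_missing(f, "cta_edit-edit_success")`, which mirrors the `zcat | grep` pipeline above.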


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=40775

Details

Reference
bz41336

Event Timeline

bzimport raised the priority of this task to Unbreak Now!. Nov 22 2014, 1:11 AM
bzimport set Reference to bz41336.
bzimport added a subscriber: Unknown Object (MLST).

I did a little more digging and found that the event I referenced above has no corresponding revision in the history of the article "Dowry system in India".

http://en.wikipedia.org/w/index.php?title=Dowry_system_in_India&action=history

Currently, the above history page shows that the last revision was saved at 2012-10-14 10:22. In the log, the edit_success event has a timestamp of 2012-10-22 06:24, nearly eight days later.

The enwiki database confirms this:

select max(rev_timestamp) from revision where rev_page = 31999935;
+--------------------+
| max(rev_timestamp) |
+--------------------+
| 20121014102253     |
+--------------------+
1 row in set (0.11 sec)
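The gap between the last saved revision and the logged event can be checked directly from the two 14-digit timestamps. This is a small sketch using only the values quoted above; the helper name `parse_mw_timestamp` is made up for illustration.

```python
from datetime import datetime

# 14-digit YYYYMMDDHHMMSS format, as seen in the log line and in rev_timestamp.
MW_TIMESTAMP_FORMAT = "%Y%m%d%H%M%S"


def parse_mw_timestamp(value):
    """Parse a 14-digit timestamp string into a naive datetime."""
    return datetime.strptime(value, MW_TIMESTAMP_FORMAT)


last_revision = parse_mw_timestamp("20121014102253")  # max(rev_timestamp) from the query
logged_event = parse_mw_timestamp("20121022062446")   # timestamp on the logged event

gap = logged_event - last_revision
print(gap)  # 7 days, 20:01:53
```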

(Copying from the thread.) This is a known issue, though not necessarily a bug. If you save an empty edit, i.e. submit an edit with no modification, AFT still logs it as an edit_success. Because MW does not generate a rev_id for that transaction, a value of 0 is stored to flag it as a "silent fail". Ideally we would not log these events at all, since they potentially inflate edit success counts; in the past we have simply discarded them when processing the log data.
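Discarding these events during processing could look roughly like the sketch below. It is not the script that was actually used. It assumes the event name is the second whitespace-separated field and that the line ends with `<title>|<rev id>`, as in the sample event. `has_revision` and `discard_silent_failures` are hypothetical names.

```python
def has_revision(line):
    """True if the trailing '<title>|<rev id>' field carries a nonzero id."""
    _, sep, rev = line.rstrip("\n").rpartition("|")
    return bool(sep) and rev.isdecimal() and int(rev) != 0


def discard_silent_failures(lines, suffix="edit_success"):
    """Yield every line except edit_success events without a real revision.

    Events of other types pass through untouched, whatever their trailing
    field contains.
    """
    for line in lines:
        fields = line.split(None, 2)
        event = fields[1] if len(fields) > 1 else ""
        if event.endswith(suffix) and not has_revision(line):
            continue  # empty edit: logged as a success, but no rev_id exists
        yield line
```

Filtering on the event name rather than on the whole line avoids dropping unrelated events whose trailing field happens to be 0.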

Wondering how severe this is. Should this block the next deployment? Fabrice?

I've pushed a patchset that no longer registers such no-modification submissions as "edit_success"; it registers them as "edit_norevision" instead.

The change is part of https://gerrit.wikimedia.org/r/#/c/29766/