
Key performance indicator: Top contributors: Find good Ranking algorithm fix bugs on page
Closed, Resolved (Public)

Description

http://korma.wmflabs.org/browser/top-contributors.html should answer these questions:

Wikimedia professionals apart, who are the top tech community contributors, what are their areas of activity, and where are they based? Let's list everybody, not just the top 10. This will help the WMF and the Wikimedia movement know and support these contributors better.

Tables are good, no need for graphs.


Version: unspecified
Severity: normal

Details

Reference
bz62221

Event Timeline

bzimport raised the priority of this task from to High. Nov 22 2014, 2:57 AM
bzimport set Reference to bz62221.

What we have discussed so far:

  • currently the universe is the contributors who rank in all 5 activities; it should include everybody who ranks in 4
  • currently the ranking is calculated from the average position across the 5 activities; it should be based on the best 4 (discarding the worst of the 5 results)
  • it should be very easy to rank only the amateurs (independent and unknown), leaving aside the paid contributors
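As a rough sketch of the three points above (not the code korma actually runs), the "best 4 of 5" rule could look like this. The activity keys follow the areas named elsewhere in this task; the data layout, the function names and the sample contributors are all made up for illustration.

```python
from statistics import mean

# Hypothetical activity keys; the real dashboard may name them differently.
ACTIVITIES = ("git_gerrit", "bugzilla", "mediawiki", "mailman", "irc")


def best_four_score(positions):
    """Average of the 4 best positions (1 = best).

    `positions` maps an activity to the contributor's position in it and
    only contains activities where the contributor is ranked at all.
    Returns None for contributors ranked in fewer than 4 activities.
    """
    ranked = sorted(positions.values())
    if len(ranked) < 4:
        return None
    return mean(ranked[:4])  # the 5th (worst) result, if any, is discarded


def rank_contributors(contributors, only_affiliations=None):
    """Return names ordered by best-4 score, optionally filtered by affiliation."""
    scored = []
    for name, info in contributors.items():
        if only_affiliations is not None and info["affiliation"] not in only_affiliations:
            continue
        score = best_four_score(info["positions"])
        if score is not None:
            scored.append((score, name))
    return [name for _score, name in sorted(scored)]


# Invented sample data, for illustration only.
contributors = {
    "alice": {"affiliation": "Independent",
              "positions": {"git_gerrit": 1, "bugzilla": 2, "mediawiki": 3,
                            "mailman": 4, "irc": 50}},
    "bob": {"affiliation": "WMF",
            "positions": {"git_gerrit": 2, "bugzilla": 1, "mediawiki": 1,
                          "mailman": 2}},
    "carol": {"affiliation": "Unknown",
              "positions": {"git_gerrit": 3, "bugzilla": 3, "mediawiki": 2}},
    "dave": {"affiliation": "Independent",
             "positions": {a: 5 for a in ACTIVITIES}},
}

everybody = rank_contributors(contributors)
amateurs = rank_contributors(contributors, only_affiliations={"Independent", "Unknown"})
```

With this layout the "everybody" table and the amateurs-only table come from the same function, which is one way of keeping both views.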

Still, it is useful (and fun) to keep the table with everybody.

I think that's all?

Forgot one detail discussed: the rankings are meant to be calculated based on the data of the last 12 months. This way the old glories will leave room to the new challengers sooner.
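To make the 12-month idea concrete, here is a minimal sketch that counts only recent activity. It assumes events arrive as (contributor, date) pairs and approximates "12 months" as 365 days; both are assumptions for the example, not how the dashboard stores or filters its data.

```python
from datetime import date, timedelta


def count_recent(events, today, window_days=365):
    """Count events per contributor that fall within the last `window_days`."""
    cutoff = today - timedelta(days=window_days)
    counts = {}
    for contributor, event_date in events:
        if cutoff <= event_date <= today:
            counts[contributor] = counts.get(contributor, 0) + 1
    return counts


# Invented sample data, for illustration only.
events = [
    ("old_glory", date(2012, 1, 1)),    # outside the window, ignored
    ("old_glory", date(2014, 1, 15)),
    ("challenger", date(2014, 6, 1)),
    ("challenger", date(2014, 10, 1)),
]
recent = count_recent(events, today=date(2014, 11, 22))
```

Rankings computed from such windowed counts would let recent activity outweigh older contributions.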

One main motivation of this KPI is to identify current top independent contributors.

This table is now considered ready:

http://korma.wmflabs.org/browser/top-contributors.html

We are still missing better strings, but this is a task for me. I'm taking the bug. Help / patches welcome.

https://www.mediawiki.org/wiki/Community_metrics#Global_Pending_List still mentions "Count only the last 12 months" as a nice-to-have improvement. If this item is still open when we close this bug, we will open a new bug report for it.

Qgil lowered the priority of this task from High to Low. Nov 23 2014, 11:05 PM
Qgil removed Qgil as the assignee of this task. Dec 16 2014, 8:16 AM

The page seems to have become rather unreliable over time:
It currently lists only three people, the displayed data is questionable (e.g. does "Tickets" use the Maniphest backend from T96238? I doubt it), and the timeframe the data refers to is not stated.

(Note to myself: HTML code is at https://github.com/Bitergia/mediawiki-dashboard/blob/master/browser/top-contributors.html and data loading is done in https://github.com/Bitergia/mediawiki-dashboard/blob/master/browser/js/mediawiki.js )

> what are their areas of activity

@Qgil: How to gather "areas of activity" and how to summarize them? I do not see us listing dozens of code repositories or Phabricator projects in a table?

> and where are they based?

I'll repeat what I wrote in T112528#1748039: Country sounds like a huge waste of time to me. Only 0.31% of the korma profiles have country information, and most of those profiles are random IRC login names that are rarely matched to any "real" (relevant) profiles.

These two empty sections at the end of http://korma.wmflabs.org/browser/top-contributors.html can be removed.

Areas of activity refers to Git/Gerrit, Bugzilla, MediaWiki, Mailman, IRC... The table already has them.

That table also has Location. I still think it is interesting to know where on the globe these top contributors are located. I don't see any need to remove that column, nor any need for us to invest our time asking for their location. Let's wait until the day when contributors can update their own data.

The table mixing professionals and volunteers is interesting, and we should keep it, but the original intention was to detect the most prolific volunteers. Therefore, maybe we need a second table? (easy) Or a way to filter by affiliation. (probably tricky)

> These two empty sections can be removed.

https://github.com/Bitergia/mediawiki-dashboard/pull/70 (together with explaining the algorithm applied)

I like what I'm seeing (though my vanity is hurt by not finding myself on the list).

When I looked just now, Krinkle was listed with rank 1 despite not having data in a few columns. This appears to be because having no rank is treated as 0, which (I guess) counts as a higher rank than 1?
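To illustrate the suspected cause (a guess from the symptom, not a reading of the actual dashboard code): if a missing position is counted as 0 when averaging, a sparse profile beats a complete one. The function names and the penalty rule below are assumptions for the example.

```python
from statistics import mean


def naive_average(positions):
    """Suspected buggy behaviour: a missing position (None) counts as 0."""
    return mean([0 if p is None else p for p in positions])


def penalized_average(positions, worst_position):
    """One possible fix: a missing position counts as one place below the worst."""
    return mean([worst_position + 1 if p is None else p for p in positions])


# Invented sample data: positions in 5 activities, None = no data.
sparse_profile = [3, 4, None, None, None]
full_profile = [2, 2, 2, 2, 2]

# A lower average means a better overall rank.
naive_sparse = naive_average(sparse_profile)  # 1.4, ranks ahead of 2
naive_full = naive_average(full_profile)
fixed_sparse = penalized_average(sparse_profile, worst_position=225)
fixed_full = penalized_average(full_profile, worst_position=225)
```

Excluding contributors with too few ranked activities would be another way to avoid the effect; penalizing is just the variant shown here.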

Aklapper renamed this task from Key performance indicator: Top contributors to Key performance indicator: Top contributors: Should have sane Ranking algorithm which takes (un)reliability of user data into account. Nov 27 2015, 2:28 PM
Aklapper renamed this task from Key performance indicator: Top contributors: Should have sane Ranking algorithm which takes (un)reliability of user data into account to Key performance indicator: Top contributors: Find good Ranking algorithm fix bugs on page. Jan 28 2016, 9:13 PM
Aklapper updated the task description.

Note: I assume the timeframe of http://korma.wmflabs.org/browser/top-contributors.html is all time. Which might not be the best choice if we want to find current top contributors in T85600: Who are the top 50 independent contributors and what do they need from the WMF?

General comments on the user account data in korma:

Usernames in different data sources feel pretty much "detached by default": they are not connected or merged into a single UUID identity.
This negatively influences the quality of the data displayed on korma.wmflabs.org.

After getting access to the JSON identities file and spending quite a few hours searching for names and manually merging the UUIDs of the same persons in our user database in korma, I am sure that in the Wikimedia tech community we will fail to recognize quite a few emerging top contributors

  1. as long as underlying algorithms do not get better at proposing or assuming which UUIDs are likely the same person (see e.g. data in T124475: Eliminate duplicated «"source": "wikimedia:its"» identities in korma identities DB),
  2. without a tool (T60585: Allow contributors to update their own details in tech metrics directly) that allows users to merge their UUIDs into one identity and without a motivation why users should care and use that tool.
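As a rough idea of what "proposing which UUIDs are likely the same person" could mean, here is a minimal sketch that groups identities sharing a normalized email or name. The record layout (uuid, source, name, email) is an assumption for the example and not the actual schema of the korma identities file; the output is meant as a list of candidates for human review, not automatic merges.

```python
from collections import defaultdict


def normalize(value):
    """Lowercase and drop everything except letters and digits."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def propose_merges(identities):
    """Return sorted lists of UUIDs that share an email or a normalized name."""
    groups = defaultdict(set)
    for ident in identities:
        email = ident.get("email")
        if email:
            groups[("email", email.strip().lower())].add(ident["uuid"])
        name = ident.get("name")
        if name:
            groups[("name", normalize(name))].add(ident["uuid"])
    # Deduplicate identical candidate sets found via both email and name.
    candidates = {frozenset(uuids) for uuids in groups.values() if len(uuids) > 1}
    return sorted(sorted(uuids) for uuids in candidates)


# Invented sample records, for illustration only.
identities = [
    {"uuid": "a1", "source": "wikimedia:its", "name": "Jane Doe", "email": "jane@example.org"},
    {"uuid": "b2", "source": "irc", "name": "janedoe", "email": None},
    {"uuid": "c3", "source": "mailman", "name": "J. Doe", "email": "Jane@Example.org"},
    {"uuid": "d4", "source": "gerrit", "name": "Someone Else", "email": "else@example.org"},
]
merge_candidates = propose_merges(identities)
```

Exact matching like this misses many real duplicates (nicknames, changed addresses) and can produce false positives for common names, which is why a self-service tool as in T60585 would still be needed.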

This KPI is requiring a lot more hours than I expected when I proposed it a couple of years ago or so. We need to ask ourselves whether identifying top contributors and T85600: Who are the top 50 independent contributors and what do they need from the WMF? is worth the effort.

> 2. without a tool (T60585: Allow contributors to update their own details in tech metrics directly) that allows users to merge their UUIDs into one identity and without a motivation why users should care and use that tool,

Motivation could easily come from publicizing the data and making sure contributors are involved in the fora (wikitech-l, etc.) that the data is publicized in.

> These two empty sections can be removed.

> https://github.com/Bitergia/mediawiki-dashboard/pull/70 (together with explaining the algorithm applied)

Merged the explanation on top of http://korma.wmflabs.org/browser/top-contributors.html

I think we can close this task as "good enough" once

  1. Input welcome: we have decided whether we really want all-time data here or just "last 2 years" or whatever
  2. Input welcome: we have decided if being able to sort by affiliation Independent is needed or if we can survive extracting from the current list
  3. T117871: Many profiles on profile.html do not display identity's name though data is available is fixed.
  4. T125797: top-contributors.html is not sorted by rank anymore is reproduced and fixed.
  5. T123929: Mailing lists recently added to korma do not have "Top senders" data created (JSON file is 404) is fixed.

For the record, http://korma.wmflabs.org/browser/top-contributors.html lists 225 people. As stats are all-time, people can have (or have had) multiple affiliations. Hence the sum of the following numbers is not 225: 80 are tagged as WMF, 80 as Independent, 8 as WMDE, 2 as Hallo Welt, 1 as Wikia, 1 as WikiWorks, and 62 have no affiliation data.

Aklapper changed the task status from Open to Stalled. Feb 16 2016, 4:01 PM

All blocking tasks are fixed, hence closing as resolved.

If someone does not want all-time data but just "last 2 years" or such on http://korma.wmflabs.org/browser/top-contributors.html, please file a new separate task.

To find people with Independent affiliation, use your browser's search and its highlight function. (That should not be a topic for this task anyway, but only for T85600).