
Create a list of primary contributors to a page
Closed, ResolvedPublic

Description

Author: titoxd.wikimedia

Description:
As discussed on [[MediaWiki talk:Cite text]], it would be very helpful if we
could obtain a list of primary contributors to a given page.

Previously, this was done using a query on the toolserver:
(http://tools.wikimedia.de/~tim/cgi-bin/contribution-counter?page={{PAGENAMEE}}&namespace={{NAMESPACEE}}&dbname=enwiki_p)
however, as you all know, the enwiki database in the toolserver is dead, so we
cannot use that any longer. The source code of the tool is here:
(http://tools.wikimedia.de/~tim/cgi-bin/contribution-counter?source=on)

We have seen other tools spring up to solve this, such as
http://vs.aka-online.de/wppagehiststat/. However, these are much more expensive
than a direct SQL query, which would be extremely helpful for the English
Wikipedia and other MediaWiki users in general. This information is also
necessary for some citation formats, and useful for mirrors for GFDL compliance.


Version: 1.8.x
Severity: enhancement
URL: http://en.wikipedia.org/wiki/MediaWiki_talk:Cite_text/Archive_1#Primary_Contributors

Details

Reference
bz7988

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 9:27 PM
bzimport set Reference to bz7988.
bzimport added a subscriber: Unknown Object (MLST).

robchur wrote:

I could swear blind there's a Page class method that does this. I could further
swear blind that it's available as an option to be switched on in MediaWiki. I
suspect, however, it might be considered too expensive, although we could store
it in the cached ParserOutput in some fashion, perhaps.

robchur wrote:

(In reply to comment #1)
That deviated a bit. We could do it in quite a simple fashion as a special page
which would allow such queries, but that might get abused. I think it would be
neater if it was provided on the page view itself. Then again, Wikipedians might
complain about feeling undervalued if their names didn't appear on the "credit
reel", as it were. Swings and roundabouts.

titoxd.wikimedia wrote:

So, perhaps something like the second tool I linked above, which basically is a
special page with cached results? Then making a link to this page, similar to
"Show logs for this page" on the history view? That would reduce the number of
queries slightly.

robchur wrote:

We could adopt a model similar to existing query pages, which use a
periodically-regenerated cache. Another approach might be to cache on demand in
a table somewhere, this being "purged" when the page is touched.
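
The second approach (cache on demand, purge when the page is touched) could look roughly like the sketch below. This is only an illustration, not the design of any actual MediaWiki code: the `ContributorCache` class, its `compute` callback, and the in-memory dictionary standing in for a database table are all assumptions made for the example.

```python
class ContributorCache:
    """Cache contributor lists on demand; drop an entry when its page is touched."""

    def __init__(self, compute):
        # compute: hypothetical expensive function mapping page_id -> contributor list
        self._compute = compute
        self._store = {}   # stands in for a cache table keyed by page id
        self.misses = 0    # number of times the expensive query actually ran

    def get(self, page_id):
        # Only run the expensive query if nothing is cached for this page.
        if page_id not in self._store:
            self.misses += 1
            self._store[page_id] = self._compute(page_id)
        return self._store[page_id]

    def purge(self, page_id):
        # Call this whenever the page is edited or otherwise touched.
        self._store.pop(page_id, None)
```

Pages that nobody ever asks about never cost a query, and an edit only invalidates the one entry it affects.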

titoxd.wikimedia wrote:

(In reply to comment #4)
Probably the second method would be more efficient: having to update yet
another index for all pages would probably be excessive, and about as
expensive as regenerating them manually.

titoxd.wikimedia wrote:

(In reply to comment #5)
s/"regenerating them manually"/"generating them from an external, off-wiki
query"/ ... mental screw-up.

robchur wrote:

I like the second method better, too; we're talking about a GROUP BY which is
applied to all pages if we do a massive caching event; this tool would be used
on, realistically, a low proportion of pages. Prematurely caching information
about a page that might never be required is totally pointless and a waste of
processing time.
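
For reference, the kind of per-page GROUP BY being discussed can be sketched as below. This is a minimal illustration run against an in-memory SQLite database; the simplified `revision` table and its `rev_page` and `rev_user_text` columns are assumptions modelled loosely on MediaWiki naming, not the real schema, and `top_contributors` and `demo_connection` are hypothetical helpers.

```python
import sqlite3


def top_contributors(conn, page_id, limit=10):
    """Return (user_name, edit_count) pairs for one page, most edits first."""
    rows = conn.execute(
        """
        SELECT rev_user_text, COUNT(*) AS edits
        FROM revision
        WHERE rev_page = ?
        GROUP BY rev_user_text
        ORDER BY edits DESC, rev_user_text ASC
        LIMIT ?
        """,
        (page_id, limit),
    )
    return rows.fetchall()


def demo_connection():
    """Build a tiny in-memory table with made-up revisions for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE revision ("
        "rev_id INTEGER PRIMARY KEY, rev_page INTEGER, rev_user_text TEXT)"
    )
    conn.executemany(
        "INSERT INTO revision (rev_page, rev_user_text) VALUES (?, ?)",
        [(1, "Alice"), (1, "Bob"), (1, "Alice"), (1, "Alice"),
         (1, "Bob"), (1, "Carol"), (2, "Dave")],
    )
    return conn
```

Restricting the query to one `rev_page` is what keeps it per-page; dropping the WHERE clause would give the all-pages grouping that a mass caching run would need.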

robchur wrote:

I've written and checked-in the Contributors extension, which adds an includable
special page. It could do with a little optimisation work, though; despite
effective result set caching, the initial SQL is still quite expensive.

titoxd.wikimedia wrote:

Awesome. It works great. Who do I have to bribe to enable it on Wikimedia wikis? :)

robchur wrote:

The usual suspects, although I wouldn't mind some further optimisations, as I
said above.

brian wrote:

(In reply to comment #0)

As discussed on [[MediaWiki talk:Cite text]], it would be very helpful if we could obtain a list of primary contributors to a given page.

[snip]

I can't find anywhere on that page nor here any hint of a definition of "primary contributors". It also seems pretty
meaningless when "Anonymous" keeps appearing at the top of the list.

titoxd.wikimedia wrote:

(In reply to comment #11)

I can't find anywhere on that page nor here any hint of a definition of "primary contributors". It also seems pretty meaningless when "Anonymous" keeps appearing at the top of the list.

By primary contributors we mean the editors with most edits to the page. As for "anonymous", it can be filtered out,
but whether that is a good idea or not is a completely different beast.

le.korrigan wrote:

Hello. Is there anything which prevents enabling the Contributors extension on Wikimedia wikis? Or is it just too expensive for the servers? This feature would be very useful to help with properly reusing the GFDL content. Thanks.

http://wikidashboard.appspot.com/ appears to do this for all sorts of pages and give more fine grained info than just "these are the top contributors" but also *when* they contributed.

I'm thinking this WORKSFORME. Any objections to closing it that way?

(In reply to comment #14)

I'm thinking this WORKSFORME. Any objections to closing it that way?

Apart from it being a third party tool.

We actually have two ways to do this in core, apparently: ?action=credits and ?action=info, although they are both disabled on the cluster and, I think, also in the default packages.

It works on translatewiki.net: http://translatewiki.net/wiki/Project:News?action=credits

Since this is a request for a feature that now exists (in fact, existed maybe 2+ years before the bug was opened: http://svn.wikimedia.org/viewvc/mediawiki/trunk/?pathrev=4187), I'm closing this RESOLVED. If you want this enabled on the cluster, that is a different bug. Please open one if that is what you want.

If you want the Contributors extension deployed on the cluster, please see http://www.mediawiki.org/wiki/Writing_an_extension_for_deployment for the steps needed to move it along there. You may need to open a new bug to get the extension reviewed.