Integrate wikimetrics with mediawiki-utilities
Closed, Declined (Public)

Description

https://github.com/halfak/mediawiki-utilities


Version: unspecified
Severity: normal

Details

Reference
bz63203

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 2:52 AM
bzimport set Reference to bz63203.
bzimport added a subscriber: Unknown Object (MLST).

bingle-admin wrote:

Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1502

Given that mediawiki-utilities uses plain SQL and we are set to use Alembic, I do not see how this 'integration' could happen. Are we sure we want to keep this bug open?

Wikimetrics uses SQLAlchemy, which is a bit of a mismatch with mediawiki-utilities. I don't think that's too big of a deal; we could integrate the tools if it's a good idea. But that depends on which way Wikimetrics goes as a product and how we structure our data pipeline.

One possibility is to have Wikimetrics become the ETL tool for public data. It could restructure our OLTP + recent changes + event streams into a more traditional, easy-to-work-with data warehouse. In that case, the logic from mediawiki-utilities would be very useful. We may wish to convert some of it to SQLAlchemy, but that's a minor point.

Another possibility is to have a separate ETL process, based on an existing tool or a combination of tools. Wikimetrics would then be refashioned to query on top of the resulting data warehouse. In that case, mediawiki-utilities could be used to inform the ETL process, but it would have a very different purpose from Wikimetrics.

I'm not opinionated on which way we go, but I think we should keep this bug open as a reminder of the great logic encapsulated in mediawiki-utilities.

mforns subscribed.

Declining because Wikimetrics is being discontinued. See: T211835.