Automate generation of fundraiser tracking data, long term storage, and data mining
Closed, Declined (Public)

Description

Currently we generate stats on locke into a single landing page or banner impressions file. Someone then has to rsync it to dataset for long-term archive storage. After that, we have to run the appropriate mining script and ship the fundraiser data.

Let's automate this.

We'll need to set up a regular job that pulls a specific set of log files from locke to dataset1. Then a cron job can kick off at a specific point in time (say 30 minutes after the fundraiser) and email the fundraisers a CSV version of the data.
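
To make the proposal more concrete, here is a minimal Python sketch of the two pieces: building the rsync pull and packaging the mined data as a CSV email attachment. Everything named in it is an assumption for illustration only: the function names, the log paths, the column names, and the addresses do not come from this task, and the actual mining script is not shown. It only builds the command and the message; running rsync and sending the mail are left to the caller.

```python
import csv
import io
from email.message import EmailMessage


def build_rsync_command(source_host, source_paths, dest_dir):
    """Return an rsync argv that pulls the given files from source_host.

    The host name and paths are supplied by the caller; nothing here
    reflects the real layout on locke or dataset1.
    """
    sources = [f"{source_host}:{path}" for path in source_paths]
    # -a preserves timestamps and permissions; -z compresses in transit.
    return ["rsync", "-az", *sources, dest_dir]


def rows_to_csv(header, rows):
    """Render a header and data rows as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def build_report_email(sender, recipients, subject, csv_text, filename):
    """Build (but do not send) an email with the CSV attached."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = ", ".join(recipients)
    message["Subject"] = subject
    message.set_content("Fundraiser tracking data is attached as CSV.")
    message.add_attachment(csv_text, subtype="csv", filename=filename)
    return message
```

The cron job would then pass the command list to something like `subprocess.run(command, check=True)` and hand the message to `smtplib.SMTP.send_message`, both of which are standard library calls; the schedule itself stays in the crontab.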


Version: unspecified
Severity: enhancement

Details

Reference
bz25205

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 11:21 PM
bzimport set Reference to bz25205.

The two-step process is now well automated thanks to Nimish. Documentation to come next.

ngautam wrote:

There is now a cron running on locke which has user file_mover automatically compress the data, send it to dataset1, run analysis on it, and email the CSV to tfinc@wikimedia at 55 min past the hour.
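
For readers who want to picture the compression step, the following is a rough Python sketch, not the script that actually runs on locke. The function name `compress_log` and the `.gz` naming are assumptions made for illustration. The crontab line in the comment uses standard cron syntax for "minute 55 of every hour"; the script path in it is hypothetical.

```python
import gzip
import shutil
from pathlib import Path

# A crontab entry for "55 minutes past every hour" would look like:
#   55 * * * * /path/to/hypothetical_pipeline_script
# (the script path is a placeholder, not the real one).


def compress_log(source_path):
    """Gzip a log file next to the original and return the new path.

    The original file is left in place so a later step can decide
    whether to delete it once the transfer has succeeded.
    """
    source = Path(source_path)
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as raw, gzip.open(target, "wb") as packed:
        # Stream in chunks so large log files are not read into memory.
        shutil.copyfileobj(raw, packed)
    return target
```

Keeping the uncompressed file until the copy to dataset1 has been verified is a design choice of this sketch, not something stated in the comment above.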