
Make sure 2013 traffic logs are gone from /a/squids/archive on stat1002
Closed, Declined; Public

Description

We're >90 days into 2014, so according to the Data retention guidelines [1],
we should not have traffic logs from 2013.

Let's check that no 2013 traffic logs are still around, and remove any
remnants that we find in stat1002's /a/squids/archive.

[1] http://meta.wikimedia.org/wiki/Data_retention_guidelines


Version: unspecified
Severity: normal
Whiteboard: u=AnalyticsEng c=EventLogging p=0 s=2014-10-30

Details

Reference
bz63543

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 22 2014, 3:15 AM)
bzimport set Reference to bz63543.
bzimport added a subscriber: Unknown Object (MLST).

bingle-admin wrote:

Prioritization and scheduling of this bug is tracked on Mingle card https://wikimedia.mingle.thoughtworks.com/projects/analytics/cards/cards/1529

Couldn't we just rm them? It'd be pretty simple to do (heck, I could do it, assuming the permissions change hasn't gone through yet).

(In reply to Oliver Keyes from comment #3)

Couldn't we just rm them?

On stat1002? Yes, we could (if we had sufficient privileges).
But soon they'd be back, as they would get rsynced over again
from their various real sources :-)

It'd be pretty simple to do (heck, I could do it,
assuming the permissions change hasn't gone through yet).

The permissions change has gone through.

(In reply to christian from comment #4)

(In reply to Oliver Keyes from comment #3)

Couldn't we just rm them?

On stat1002? Yes, we could (if we had sufficient privileges).
But soon they'd be back, as they would get rsynced over again
from their various real sources :-)

Aha; I thought they were generated on stat2. Cool!

It'd be pretty simple to do (heck, I could do it,
assuming the permissions change hasn't gone through yet).

The permissions change has gone through.

Darn. Timing! :p

Need ops help on this. Deferring, however painful.

It seems RT 8760 (which is about auditing log retention on stat1002) might
be relevant here.

Discussions started on what to do with each directory under
/a/squid/archive:
api
arabic-banner
bannerImpressions
blog
edits
edits-geocoded
glam_nara
mobile
mobile-geocoded
sampled
sampled-geocoded
sopa
teahouse
zero
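
For the per-directory discussion, a report of which subdirectories still hold 2013 files could be a useful starting point. Below is a minimal Python sketch of such a check. It assumes that archived log file names embed a YYYYMMDD date stamp (the example name in the comment is hypothetical), and the function name `find_logs_for_year` is made up for illustration. It only reports and deletes nothing, since, as noted above, removed files would be rsynced back from their sources anyway.

```python
import os
import re
from collections import defaultdict

# Assumption: archived log file names embed a YYYYMMDD date stamp,
# e.g. "sampled-1000.tsv.log-20130415.gz" (hypothetical name).
DATE_STAMP = re.compile(
    r"(?<!\d)(\d{4})(?:0[1-9]|1[0-2])(?:0[1-9]|[12]\d|3[01])(?!\d)"
)


def find_logs_for_year(root, year):
    """Map each top-level subdirectory of root to its files stamped with year."""
    wanted = str(year)
    hits = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        rel = os.path.relpath(dirpath, root)
        # Group nested directories under their top-level subdirectory.
        top = "." if rel == "." else rel.split(os.sep)[0]
        for name in sorted(filenames):
            if any(m.group(1) == wanted for m in DATE_STAMP.finditer(name)):
                hits[top].append(os.path.join(dirpath, name))
    return dict(hits)


if __name__ == "__main__":
    # Report only; nothing is deleted.
    report = find_logs_for_year("/a/squid/archive", 2013)
    for subdir, paths in sorted(report.items()):
        print(f"{subdir}: {len(paths)} file(s) from 2013")
```

Run against the archive root, this would print one line per subdirectory that still contains matching files; an empty output would mean no file names carry a 2013 date stamp under that assumption.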