UniversalLanguageSelector-tofu logging too much data
Closed, Resolved (Public)

Description

On 2014-06-25, there was a throughput alarm for EventLogging [1].

Nuria started investigating, and it turned out to be caused by
UniversalLanguageSelector-tofu [2].

Currently, there is some discussion with the Language team about the
way forward.

Meanwhile, it turned out that the schema's table is unusually
large [3].

Not sure about the way forward, but it seems the outcome of the discussion
with Language will dictate the next steps.

[1] http://lists.wikimedia.org/pipermail/analytics/2014-June/002260.html
[2] search for “tofu” on http://bots.wmflabs.org/~wm-bot/logs/%23wikimedia-analytics/20140625.txt

[3] http://lists.wikimedia.org/pipermail/analytics/2014-July/002269.html


Version: unspecified
Severity: normal
Whiteboard: u=AnalyticsEng c=EventLogging p=0 s=2014-07-10

Details

Reference
bz67463

Event Timeline

bzimport raised the priority of this task from to Needs Triage. (Nov 22 2014, 3:30 AM)
bzimport set Reference to bz67463.

The Language team was logging data to that table at a very high rate without being aware of it. Since their intention was to run an experiment for a couple of weeks, they stopped logging data, and we agreed that we could create a table in staging with the data they needed and probably get rid of the UniversalLanguageSelector-tofu_7629564 table, which was getting very large (~100G).

We have sent an e-mail to the localisation team asking them whether it is OK for us to remove the large table.

Now, we also need to take steps on our end to prevent huge tables being created in the future due to programming errors on the client.

I have created a backlog item to this effect:
https://bugzilla.wikimedia.org/show_bug.cgi?id=67470
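
As a rough illustration of the kind of guard this could mean, here is a minimal sketch of a per-schema rate check over a sliding time window. It is not EventLogging code: the class name `SchemaRateGuard`, its parameters, and the idea of rejecting (rather than only alerting on) excess events are all assumptions made for the example.

```python
from collections import defaultdict, deque


class SchemaRateGuard:
    """Hypothetical guard: limits events per schema over a sliding window."""

    def __init__(self, max_events, window_seconds):
        self.max_events = max_events
        self.window_seconds = window_seconds
        # schema name -> timestamps of recently accepted events
        self._timestamps = defaultdict(deque)

    def allow(self, schema, now):
        """Return True if an event for `schema` at time `now` is within the limit."""
        recent = self._timestamps[schema]
        # Drop timestamps that have fallen out of the window.
        while recent and now - recent[0] >= self.window_seconds:
            recent.popleft()
        if len(recent) >= self.max_events:
            return False
        recent.append(now)
        return True


# Example: at most 2 events per schema in any 10-second window.
guard = SchemaRateGuard(max_events=2, window_seconds=10)
results = [guard.allow("UniversalLanguageSelector-tofu", t) for t in (0, 1, 2)]
```

In this example the third event falls inside the window of the first two, so it would be rejected. Whether such a check belongs on the client or on the server side is left open here.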

We have dropped the large table UniversalLanguageSelector-tofu_7629564 and created a table on staging with the data so the i18n team can do the needed research.

Closing the bug, as I think all items on the analytics side have been addressed.