Different results with queries in labs versus production around September 21st.
The following query returns very different results in production than it does in labs:
Production:
SELECT count(log_user)
FROM enwiki.logging
/* exclude proxy registrations */
WHERE log_type = 'newusers'
/* only include self-created users, exclude attached and proxy-registered users */
AND log_action = 'create'
AND log_timestamp BETWEEN 20140921000000 AND 20140922000000;
Returns: 8027
Labs:
SELECT count(log_user)
FROM logging
/* exclude proxy registrations */
WHERE log_type = 'newusers'
/* only include self-created users, exclude attached and proxy-registered users */
AND log_action = 'create'
AND log_timestamp BETWEEN 20140921000000 AND 20140922000000;
Returns: 6842
Halfak did some digging and placed the missing rows in
analytics-store:staging.missing_labs_new_user_20140921
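The report does not say how the missing rows were identified. One plausible approach is an anti-join on the log's key column, sketched below against an in-memory SQLite database. Everything here is an assumption for illustration: the table names prod_logging and labs_logging are hypothetical, log_id is assumed to be the key shared by both copies, and the four rows are made-up toy data, not rows from enwiki.

```python
import sqlite3

# In-memory stand-ins for the production and labs copies of the logging table.
# Table names and rows are hypothetical toy data, not real enwiki content.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod_logging (log_id INTEGER PRIMARY KEY, log_timestamp TEXT);
    CREATE TABLE labs_logging (log_id INTEGER PRIMARY KEY, log_timestamp TEXT);
""")

prod_rows = [
    (1, "20140921075900"),
    (2, "20140921083000"),
    (3, "20140921091500"),
    (4, "20140921130100"),
]
# The labs copy lacks rows 2 and 3 in this toy setup.
labs_rows = [
    (1, "20140921075900"),
    (4, "20140921130100"),
]
conn.executemany("INSERT INTO prod_logging VALUES (?, ?)", prod_rows)
conn.executemany("INSERT INTO labs_logging VALUES (?, ?)", labs_rows)

# Anti-join: keep production rows that have no match in labs on log_id.
missing = conn.execute("""
    SELECT p.log_id, p.log_timestamp
    FROM prod_logging AS p
    LEFT JOIN labs_logging AS l ON l.log_id = p.log_id
    WHERE l.log_id IS NULL
    ORDER BY p.log_id
""").fetchall()

print(missing)
```

The same LEFT JOIN ... IS NULL pattern should carry over to MySQL, provided both copies are reachable from one server or the labs keys are first exported to a staging table.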
It looks like there are 5 hours of the day during which rows are missing.
mysql:research@analytics-store.eqiad.wmnet [staging]> select LEFT(log_timestamp, 10) as hour, count(*) from missing_labs_new_user_20140921 GROUP BY 1;
+------------+----------+
| hour       | count(*) |
+------------+----------+
| 2014092108 |       83 |
| 2014092109 |      336 |
| 2014092110 |      304 |
| 2014092111 |      344 |
| 2014092112 |      118 |
+------------+----------+
5 rows in set (0.01 sec)
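As a quick sanity check, the hourly counts can be compared against the gap between the two query results. The short script below uses only the numbers quoted in this report; the variable names are just for illustration.

```python
# Counts copied from the two queries and the hourly breakdown in this report.
production_count = 8027
labs_count = 6842
missing_by_hour = {
    "2014092108": 83,
    "2014092109": 336,
    "2014092110": 304,
    "2014092111": 344,
    "2014092112": 118,
}

# Difference between the production and labs results.
gap = production_count - labs_count
# Total number of rows in the missing-rows breakdown.
missing_total = sum(missing_by_hour.values())

print("gap between production and labs:", gap)
print("rows in the hourly breakdown:", missing_total)
print("breakdown accounts for the whole gap:", gap == missing_total)
```

Both figures come to 1185, so the rows between 08:00 and 13:00 on 2014-09-21 account for the entire discrepancy between the two counts.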
Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=72226