
Erik Zachte's list of discrepancies for old vs new traffic reports
Closed, DeclinedPublic

Description

migrated from Asana
The bug is partially solved; all comments were copied 'as is' from Asana.

Stefan Petrea created task.Dec 19, 2012
Stefan Petrea Report SquidReportRequests.htm:

  1. One issue:

old totals: text/vnd.wap.wml 72 M, application/x-www-form-urlencoded 26.2 M
new totals: text/vnd.wap.wml 26.2 M, application/x-www-form-urlencoded 72 M

so the two values have miraculously been swapped.
I have no idea whether the old or the new report is telling the truth here.

-----------------------------------------------

Report SquidReportOrigins.htm:

Section: Requests with external origins:

  2. new report shows strange domains, like 2620-, 2001-, 2a01, etc.

totals old report: 136,377 M, pages 6,828 M, images 107,737 M, other 21,812 M
totals new report: 137,541 M, pages 6,829 M, images 107,745 M, other 22,968 M
I'd say 'other' is a significant difference between old and new, worth investigating
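
A quick way to quantify the gap is to compute the absolute and relative differences per category. This is only a sketch using the four totals quoted above (in millions); the dictionary keys and the helper name `compare` are made up for illustration and do not come from the report scripts.

```python
# Totals in millions of requests, copied from the old and new
# SquidReportOrigins.htm figures quoted above.
old = {"total": 136_377, "pages": 6_828, "images": 107_737, "other": 21_812}
new = {"total": 137_541, "pages": 6_829, "images": 107_745, "other": 22_968}


def compare(old_totals, new_totals):
    """Return {category: (absolute_diff, relative_diff_in_percent)}."""
    return {
        key: (
            new_totals[key] - old_totals[key],
            100.0 * (new_totals[key] - old_totals[key]) / old_totals[key],
        )
        for key in old_totals
    }


if __name__ == "__main__":
    for key, (diff, pct) in compare(old, new).items():
        print(f"{key:7s} {diff:+6d} M  ({pct:+.2f}%)")
```

With these figures 'other' differs by 1,156 M (about 5.3%), which accounts for nearly all of the 1,164 M difference in the grand total; pages and images differ by 1 M and 8 M.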

-----------------------------------------------

Report SquidReportDevices.htm

  3. PM: broken in old and new (this is another bug, which can be addressed separately)

Report SquidReportClients.htm

  4. many discrepancies:

old has many more tablets (3.35% vs 0.71% in new report)

Safari/Android/Mozilla are missing in new report in 'browsers, tablets'
Safari in 'browsers, other mobile' is 5.73% in old report and 8.13% in new
Safari occurs twice in old report in section 'browsers, other mobile'
Total for 'Mobile applications' is much higher (10x) in old report; some entries are missing in new report
'Browser version, tablet': major mismatches between old and new
'Mobile application versions': likewise

-----------------------------------------------

Report SquidReportGoogle.htm

  5. GoogleBot?/Other is 21.8 M in old report / 28.4 M in new (GoogleBot? stands for probable impostors: requests using the agent string GoogleBot but coming from an unexpected IP address)

Report SquidReportCountryData.htm

  6. Global North/South: Android is much lower in the new report, and N+S don't add up to the global total, not even close (in sections 'All pageviews' and 'To mobile site')
  7. again, the tablets column is way out of sync, with old values 5 times as high as new, but that is just another manifestation of the bug already mentioned for another report

Dec 19, 2012 at 3:41pm •
Stefan Petrea marked today.Dec 19, 2012
Stefan Petrea 1, 2 and 7 are solved and 5 is underway

Dec 19, 2012 at 3:43pm •
Stefan Petrea 5 solved as well
Dec 20, 2012 at 6:36pm •
Stefan Petrea waiting for review https://gerrit.wikimedia.org/r/#/c/39587/

Dec 20, 2012 at 6:36pm •
Stefan Petrea 4 solved as well (through the gerrit patchset above). MobileDeviceTypes.csv was being read from a non-existent path (because some csv files had been moved to csv/meta)

Dec 20, 2012 at 7:06pm •
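
For illustration only, a lookup-table reader can be made to fail loudly when a file such as MobileDeviceTypes.csv is not where it is expected, so a moved file does not go unnoticed. The sketch below is Python, not the actual report code, and the function name `read_meta_csv` and its parameters are hypothetical; only the csv/meta directory and the file name come from the comment above.

```python
import csv
from pathlib import Path


def read_meta_csv(base_dir, filename):
    """Read a lookup CSV from <base_dir>/csv/meta.

    Hypothetical helper: raises FileNotFoundError instead of continuing
    with an empty table when the file is not at the expected path.
    """
    path = Path(base_dir) / "csv" / "meta" / filename
    if not path.is_file():
        raise FileNotFoundError(f"lookup table not found: {path}")
    with path.open(newline="", encoding="utf-8") as handle:
        # Skip blank lines so they do not show up as empty records.
        return [row for row in csv.reader(handle) if row]
```

Raising an error here is a design choice for the sketch: a report built from an empty device table would still render, just with wrong numbers.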
Stefan Petrea Some of the totals in SquidReportClients.htm are >100%. I need to fix that and have started writing tests for it. I was able to identify a small dataset that reproduces the problem. A solution is underway.

Dec 21, 2012 at 4:46pm •
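
A test of the kind mentioned above could look roughly like the following. It is a sketch in Python with made-up numbers; the function name `check_percentages` and the tolerance are assumptions, and it only checks that the shares within one list stay within 0..100% and do not sum to more than 100%.

```python
def check_percentages(percentages, tolerance=0.5):
    """Return a list of problems found in one list of percentage shares.

    `percentages` maps a category name to its share in percent.
    `tolerance` allows for rounding in the published figures.
    """
    problems = []
    for name, value in percentages.items():
        if not 0.0 <= value <= 100.0:
            problems.append(f"{name}: share {value:.2f}% is outside 0..100%")
    total = sum(percentages.values())
    if total > 100.0 + tolerance:
        problems.append(f"shares sum to {total:.2f}%, which is more than 100%")
    return problems


if __name__ == "__main__":
    # Hypothetical sample in which one category was counted twice.
    sample = {"browser A": 60.0, "browser B": 30.0, "browser A (dup)": 60.0}
    for problem in check_percentages(sample):
        print(problem)
```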
Stefan Petrea I have also discussed with Erik what each of the totals in SquidReportClients.htm means. At the moment my understanding is that some of the knowledge related to those totals has been lost. We will have to do something to get it back. Options are digging through the code, contacting a former Director of Mobile, or redefining the totals so we can document them and then know what all the numbers mean.…
Dec 21, 2012 at 4:48pm •
Stefan Petrea Forgot to mention that they're not obvious to me. If we can have another discussion on them, that would be great. I will be digging through the code in the meantime.
Dec 21, 2012 at 4:50pm •
Erik Zachte Yes, it is puzzling to me as well. Here is the gist of what we discussed last night:

Until about a year ago the reports were pretty intuitive (Stefan agreed on this):

http://stats.wikimedia.org/archive/squid_reports/2011-08/SquidReportClients.htm
There was
list 'Browsers, non mobile' html 87.6%
list 'Browsers, mobile' html 14.2%
list 'Browser version, non mobile' html…

Dec 21, 2012 at 5:22pm
Stefan Petrea unmarked today.Feb 22


Version: unspecified
Severity: normal

Details

Reference
bz46265

Event Timeline

bzimport raised the priority of this task from to Low.Nov 22 2014, 1:26 AM
bzimport set Reference to bz46265.
bzimport added a subscriber: Unknown Object (MLST).

These new versions of the reports were reverted more than a year ago, as no one could find the time to fix the open issues. Right now maintenance on squid log based reports is still on hold, pending replacement or restructuring in a Hadoop context.

Aklapper edited subscribers, added: Aklapper; removed: ezachte, wikibugs-l-list.

Closing this ticket as Wikistats version 1 is dead per https://stats.wikimedia.org/Wikistats_1_announcements.htm . In case this ticket is still a valid bug report or feature request for Wikistats 2, then please reopen. Thanks a lot!