
Stripping of trailing characters from zero tags in varnish logs not effective
Closed, Resolved (Public)

Description


Version: unspecified
Severity: normal

Details

Reference
bz66833

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:25 AM
bzimport set Reference to bz66833.
bzimport added a subscriber: Unknown Object (MLST).

Although the zero VCL includes code to strip a trailing character from
a zero tag [1], we are still seeing tags such as

zero=404-01b

in the zero TSVs and the mobile-sampled-100 TSVs [2] [3], where we'd expect

zero=404-01

instead. While these requests are few and harmless, it seems the
code that strips the trailing character is not working as expected.
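
To spot affected log lines, a check along the following lines could help. It is a minimal sketch, not the VCL logic from [1]: it assumes the X-Analytics value is a `;`-separated list of `key=value` pairs and that a well-formed zero tag is two digit groups joined by a hyphen (as in `404-01`). The names `ZERO_TAG`, `split_zero_tag` and `has_unstripped_tag` are hypothetical.

```python
import re

# Assumption: a clean zero tag looks like "<digits>-<digits>"; any letters
# directly after it are the trailing characters that should have been stripped.
ZERO_TAG = re.compile(r"(?:^|;)zero=(\d+-\d+)([A-Za-z]*)(?:;|$)")


def split_zero_tag(x_analytics):
    """Return (tag, trailing) for the zero tag in an X-Analytics value.

    Returns None if the value carries no zero tag of the assumed shape.
    """
    match = ZERO_TAG.search(x_analytics)
    if match is None:
        return None
    return match.group(1), match.group(2)


def has_unstripped_tag(x_analytics):
    """True if the zero tag still carries trailing letters."""
    parsed = split_zero_tag(x_analytics)
    return parsed is not None and parsed[1] != ""
```

Running `has_unstripped_tag` over the X-Analytics column of a TSV would flag lines like the `zero=404-01b` case above while leaving `zero=404-01` alone.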

[1] https://git.wikimedia.org/blob/operations%2Fpuppet.git/a23daab91b67cb7a41b71aa4f77828359dc53170/templates%2Fvarnish%2Fzero.inc.vcl.erb#L312

[2] The volume of requests is extremely low (roughly 20-100 requests/day).
The first occurrence in the zero TSVs is at 2014-05-22T19:20:30.

The requests go to

http://{en,pl}.{m,zero}.wikipedia.org/wiki/Main_Page

and come from an IP in 10.68.0.0/16, so they are probably intentional.

[3] Up to 2014-06-19, no such requests are present in the sampled-1000
stream. That may simply be due to the sampling factor combined with the low
volume of requests, so I expect the occasional such request to show up in
the sampled-1000 logs eventually.
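
The expectation in [3] can be made a bit more concrete with a rough estimate. This sketch assumes each request ends up in the sampled stream independently with probability 1/sampling_factor; the real sampler may work differently (for example by picking every 1000th line), so treat the result as a ballpark only. The function name `prob_at_least_one` is hypothetical.

```python
def prob_at_least_one(requests_per_day, days, sampling_factor=1000):
    """Chance that at least one request shows up in a sampled stream.

    Assumption: every request is kept independently with probability
    1 / sampling_factor.
    """
    total_requests = requests_per_day * days
    miss_probability = (1.0 - 1.0 / sampling_factor) ** total_requests
    return 1.0 - miss_probability
```

For example, `prob_at_least_one(20, 28)` and `prob_at_least_one(100, 28)` bracket the chance of a hit over the four weeks between the first occurrence and 2014-06-19, given the 20-100 requests/day from [2].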

Sent a heads up to Adam and Yuri, since the bug is filed in bugzilla's
“Analytics” product.

Just a thought - could this be the result of a test run? Some testing harness faking the headers from non-carrier IPs?

(In reply to Yuri Astrakhan from comment #3)

> Just a thought - could this be the result of a test run?

Who knows what people test :-)
At its worst, it hit us with ~30K requests/day.
One “test” every 3 seconds on average is certainly not nice testing :-)
But nothing stops carriers (or Wikipedia Zero users) from doing that.

> Some testing harness faking the headers from non-carrier IPs?

We log the X-Analytics of the response (not the request).
Given the agreements about when X-Analytics gets set on the response,
"faked headers from non-carrier IPs" should not be an issue.

But since the bug got attention again, I looked at the current logs. The
issue is gone, so I bisected the dates, and it seems those
requests died off on 2014-06-24 around 15:29:30.

Ibe45b47218c633973c8d3bcdb209346944955876 happened ~1 hour before, and fixed
wrong imports. I am not sure whether the wrong imports backfired in some way,
so that fixing them also made the requests go away.

Anyway, the requests are gone, and the relevant code in puppet has been
refactored since. Hence, closing the bug.