
Set geoiplookup on a different IP than bits
Closed, Invalid (Public)

Description

Author: me.myself99

Description:
screenshots in zip file

Recently (~ 2011.2.1), pages began being displayed without Wikipedia page "template" formatting, normally delivered via CSS I suspect.

I found a high correlation between this symptom and my firewall blocking connections to either of the GeoIP lookup servers. That is, when I block connections to the following 2 servers, the symptom appears. When I suspend the blocking, pages appear normally.

geoiplookup.wikipedia.org
geoiplookup.wikimedia.org

Screen shots of normal ("FormattingOK") and broken ("NoFormatting") page renderings attached.

Neither clearing the browser cache nor purging the server cache rectified the problem. This symptom may disappear during non-peak times (I haven't tested for this), but it is definitely happening now (afternoon, Pacific Time).

If this correlation indeed identifies the cause of the problem, it reveals a bad website design, because blocking connections with non-content-oriented servers, such as a GeoIP server, is and should be viewed as *entirely* reasonable by a user (and sysadmins). Such blocking should not break a website.

BTW, this symptom appeared briefly about a couple of months ago (I think it was), and then went away. Was GeoIP being tested live then?


Version: unspecified
Severity: normal
OS: Windows XP
Platform: PC

Attached:

Details

Reference
bz27347

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 11:13 PM)
bzimport set Reference to bz27347.

Which browser are you using? I don't recognize it in your screenshots.

I tried blocking geoip (by redirecting it in /etc/hosts ) and could not reproduce the issue. Does it still happen?
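For anyone repeating this test, here is a minimal sketch of how one might check which names a hosts file actually overrides. The sample file contents and the `parse_hosts` helper are hypothetical, written for illustration; the loopback redirect is an assumption about how the blocking was done, and real resolver behaviour may differ in details.

```python
def parse_hosts(text):
    """Map each hostname in hosts-file text to the address it is pinned to."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Keep the first entry seen (an assumption about resolver behaviour).
            mapping.setdefault(name.lower(), address)
    return mapping


# Hypothetical hosts-file contents that redirect only the GeoIP names.
SAMPLE = """\
127.0.0.1 localhost
# redirect GeoIP lookups to loopback
127.0.0.1 geoiplookup.wikipedia.org geoiplookup.wikimedia.org
"""

overrides = parse_hosts(SAMPLE)
for host in ("geoiplookup.wikimedia.org", "bits.wikimedia.org"):
    print(host, "->", overrides.get(host, "not overridden"))
```

A name-based override like this leaves bits.wikimedia.org untouched, which may be why the broken rendering did not reproduce with this method.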

geoip really should not be able to cause this...

p.s. /me bets seamonkey on the browser.

I tried installing seamonkey on my windows host and then editing lmhosts to block the geoip and nothing looks funky.

I blocked geoiplookup on my Linux machine, verified it was blocked, cleared the cache and hit shift-reload in Chrome, and didn't see the problem.

I suspect something other than geoip. Are you sure you're not blocking bits.wikimedia.org? I was able to get this sort of retro look when I blocked that site.

me.myself99 wrote:

Ah, I didn't mention browser. It is SeaMonkey. (v2.0.2, will upgrade to latest soon).

I just tried wikipedia again, and it's still doing it (any page does it).

To specify, I'm using a soft firewall (Symantec). I block outbound connections to those 2 servers, and this symptom appears. If I turn off the blocking rule (using a toggle in the firewall software--trust me, it's pretty simple), they render fine. Turn the rule back on, the pages render without formatting again.

Keep in mind, the article content is delivered, such as TOC links, refs, pictures, etc. But the wikipedia logo and the rest of what appears to be the article formatting template, horizontal rules, the table border around the TOC if there is one, etc. are what's missing.

me.myself99 wrote:

Update: Tried with Firefox, same thing happens. Do I have to break out Wireshark?

me.myself99 wrote:

UPDATE:
I added a rule ahead of my GeoIP blocking rule, specifically allowing bits.wikimedia.org, and pages render correctly.

So, this is either:

  1. a confused router or server-side problem, in which blocked connections to geoiplookup trump any subsequent fetch attempts from bits.wikimedia.org, or
  2. a major flaw in Symantec's software (go figure)

I'm asking Rob on OPS to look at this, to see if they can set geoiplookup on a different IP address, but he can close it if he doesn't think we should do that.

Given that bits and geoiplookup are on the same IP address, I'm guessing your firewall just blocks the IP.

Better make sure whitelisting bits doesn't also whitelist geoiplookup if you still want to block it.
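To illustrate both points, here is a rough sketch of a first-match firewall that resolves hostname rules to IP addresses before evaluating them. This is an assumption about how the firewall behaves, not a description of Symantec's product. The address 192.0.2.10 is a documentation placeholder standing in for the shared IP, and `compile_rules`, `first_match`, and `decide` are hypothetical names.

```python
# Hypothetical resolution table: all three names share one placeholder IP.
RESOLVED = {
    "bits.wikimedia.org": "192.0.2.10",
    "geoiplookup.wikipedia.org": "192.0.2.10",
    "geoiplookup.wikimedia.org": "192.0.2.10",
}


def compile_rules(host_rules, resolved):
    """Turn (action, hostname) rules into (action, ip) rules."""
    return [(action, resolved[host]) for action, host in host_rules]


def first_match(ip_rules, dest_ip, default="allow"):
    """Return the action of the first rule whose IP equals dest_ip."""
    for action, ip in ip_rules:
        if ip == dest_ip:
            return action
    return default


def decide(host_rules, host, resolved=RESOLVED):
    """Decide what happens to an outbound connection to host."""
    return first_match(compile_rules(host_rules, resolved), resolved[host])


block_geoip = [
    ("block", "geoiplookup.wikipedia.org"),
    ("block", "geoiplookup.wikimedia.org"),
]
allow_bits_first = [("allow", "bits.wikimedia.org")] + block_geoip

# Blocking only the GeoIP names also blocks bits, since the IP is shared.
print("bits with GeoIP blocked:", decide(block_geoip, "bits.wikimedia.org"))
# Allowing bits ahead of the block rule also lets GeoIP through.
print("GeoIP with bits allowed first:",
      decide(allow_bits_first, "geoiplookup.wikimedia.org"))
```

Under these assumptions the firewall cannot tell the hostnames apart once they are reduced to one address, so any rule for one name applies to all of them.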

Just finished talking to Rob, where he explained that using up an additional IP to handle this situation wasn't feasible or good policy.

However, if you can't make your firewall behave, Rob and I recommend you check out Tor (http://www.torproject.org/). GeoIP will still point somewhere, but it won't be your location.

Just to make sure you understand that blocking geoip is a fruitless effort for more than just the problem you are having...

The geoip stuff simply takes your IP address, and correlates it with the country you are in. We could likely go down to the city scale, but currently are not.
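As a rough illustration of that correlation step (not the actual geoiplookup implementation), a lookup can be as simple as matching the client address against a table of network ranges. The table below is entirely made up: the networks are reserved documentation ranges and the country codes attached to them are placeholders.

```python
import ipaddress

# Hypothetical table; real GeoIP databases are far larger.
# These are documentation ranges with made-up country codes.
TABLE = [
    (ipaddress.ip_network("192.0.2.0/24"), "US"),
    (ipaddress.ip_network("198.51.100.0/24"), "DE"),
    (ipaddress.ip_network("203.0.113.0/24"), "AU"),
]


def country_for(ip_text):
    """Return the country code for an address, or None if it is unknown."""
    ip = ipaddress.ip_address(ip_text)
    for network, country in TABLE:
        if ip in network:
            return country
    return None


for addr in ("192.0.2.44", "203.0.113.7", "10.1.2.3"):
    print(addr, "->", country_for(addr))
```

The only input is the source address of the request, which every server the client contacts sees anyway.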

You are accomplishing absolutely nothing by blocking the geoip server. Every time you visit any site anywhere, or send a packet to any server in existence, you are sending them the same exact information you are sending to the geoip server. It isn't a breach of privacy as the information you are sending is totally public, and we don't share this information with anyone. The information is added to your session, and isn't available to anyone else.

We don't use this information to track you. In fact, we *do* use the information to adjust your content. Currently it is only being used for fundraiser banners, but gadget writers have plans for using the information as well. So, if you block this server, at minimum expect your user experience to not be as good. (Remember that gadgets are JavaScript, and as such are client side, using information that is only available to you, and not to the gadget writers.)

FYI, we also use geoip services for load balancing to our caching servers. We do geoip lookup based on your DNS resolver. This is so that we can send you to faster servers for your area.

I'm a fairly paranoid person, but I at least am paranoid about things I actually should be paranoid about, and not paranoid about everything that /sounds scary/.

me.myself99 wrote:

Understood.

Thank you to Mark for indicating that GeoIP and bits are on same IP. Will worry about differentiating the two on my end (obviously).

Thanks, Ryan, for the info on the use of GeoIP.

My motivation for blocking was simply seeing this new (to me, at least) use of GeoIP, without any explanation of what it's for. With the new inclusion of GeoIP functionality in Mozilla browsers, I simply wasn't sure what I as a user was getting, and defaulted to blocking it.

I'll look around, but as I haven't yet found any coverage in the Tech Village Pump or Bugzilla, is there any possibility of writing up a short overview of all the tech used on wikipedia/wikimedia, to allay these kinds of worries? If there isn't one, perhaps we can add one to the wikipedia article on wikipedia?

Thanks for the info and help everybody.