Page MenuHomePhabricator

Provide a JavaScript API for geoip lookups
Closed, ResolvedPublic

Description

There is code in several extensions (WikimediaShopLink, UniversalLanguageSelector, CentralNotice) for loading https://bits.wikimedia.org/geoiplookup and https://geoiplookup.wikimedia.org/.

This exceeds the scope of what extensions are entitled to determine, in my opinion, since access to GeoIP look-ups has monetary, legal & performance implications (the latter because it is often used in a manner that adds a request to each page load).

GeoIP lookups should be wrapped by a JavaScript API that de-duplicates / caches lookups. The API should make it possible to apply limits to GeoIP usage that are enforced across all extensions.


Version: wmf-deployment
Severity: enhancement

Details

Reference
bz57126

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 2:15 AM
bzimport set Reference to bz57126.
bzimport added a subscriber: Unknown Object (MLST).

Thanks Ori, I'm excited to see progress in this area.

From our (techops) side I have two requests for this common API:
a) Identify what exactly is it that we need from GeoIP; we currently provide alot of information, including city, lat long as well as a netmask that we had to hardcode to 24 (i.e. useless) due to performance issues with libgeoip (cf. https://gerrit.wikimedia.org/r/#/c/34280/). I couldn't identify any code that actually used anything but country from my cursory lookup.

b) Deprecate the bits.wikimedia.org/geoiplookup URL and rely solely on geoiplookup.wikimedia.org. As a reminder, the reason that we used a separate top-level domain is that IPv6 city databases are non-existent, so we have to force users to use IPv4 by using a separate domain; depending on (a), we might not need this anymore, but having a separate domain gives us more possibilities, so let's keep that.

Additionally, note that depending on the use case, it might be possible to do some GeoIP lookups on the Varnish layer but instead of serving JSON, tagging regular requests to MediaWiki with a custom header that it can be then Vary-ed on. It looks possible we e.g. might going to end up doing that for CentralAuth.

Finally -and this falls outside of ops- it might be helpful to look at MaxMind's own Javascript API ( http://www.maxmind.com/en/javascript ) for ideas on what could be implemented in ours from an API perspective.

(In reply to comment #1)

b) Deprecate the bits.wikimedia.org/geoiplookup URL and rely solely on
geoiplookup.wikimedia.org. As a reminder, the reason that we used a separate
top-level domain is that IPv6 city databases are non-existent, so we have to
force users to use IPv4 by using a separate domain; depending on (a), we
might not need this anymore, but having a separate domain gives us more
possibilities, so let's keep that.

Bug 55907 comment 2 and documentation ([[wikitech:geoiplookup]] && [[m:geoiplookup]]) suggest that geoiplookup.wikimedia.org is deprecated in favor of bits.wikimedia.org/geoiplookup.

If geoiplookup.wikimedia.org is going to be un-deprecated, we'll need to make sure that developers are aware and that the relevant documentation is updated.

ULS uses the country information for language selection suggestions on the frontend.

We use the information that is already loaded on WMF sites (except for third parties where we add a call to retrieve that info), but it could potentially be delayed until click, but it would add to the delay of showing the language selector widget for the first time.

I agree with all the points made in this task, but want to add that it's no longer a performance hit to query the GeoIP information. This is included by Varnish as a Cookie header in every request, and Javascript simply parses that value. As I mentioned in card T102848, CentralNotice is currently parsing the cookie and storing in a window global variable, but I'd like to push that code out of there and into core or a library.

I agree with all the points made in this task, but want to add that it's no longer a performance hit to query the GeoIP information. This is included by Varnish as a Cookie header in every request, and Javascript simply parses that value. As I mentioned in card T102848, CentralNotice is currently parsing the cookie and storing in a window global variable, but I'd like to push that code out of there and into core or a library.

If I may be pedantic: it still is for IPv6 traffic (where we do an additional request to geoiplookup which only has an IPv4 address). IPv6 currently accounts for > 6% of our traffic, not insignificant. This is expected to be fixed with T99226, though.

TheDJ set Security to None.
TheDJ added projects: Performance Issue, IPv6.
TheDJ removed a subscriber: wikibugs-l-list.

grr, who said edit conflicts where a bad thing....

faidon claimed this task.

This task been resolved a long time ago with the GeoIP cookies.