
API langlinks does not return title of Main Pages
Closed, Invalid (Public)

Description

http://en.wikipedia.org/w/api.php?action=query&prop=langlinks&titles=Main%20Page

returns something like

<langlinks>
  <ll lang="ar" xml:space="preserve" />
  <ll lang="bg" xml:space="preserve" />
  <ll lang="ca" xml:space="preserve" />

It should be <ll lang="ar" xml:space="preserve">blah blah</ll>

This happens with every wikipedia.
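
For reference, a minimal sketch of how a client might detect the empty entries in such a response. The XML fragment is hand-written to mirror the report; the "fr" entry and its title are invented purely to show a link that does carry a title, and `parse_langlinks` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment modelled on the report above. The "fr" entry and its
# title are made up; they are not taken from a real API response.
SAMPLE = """<langlinks>
  <ll lang="ar" xml:space="preserve" />
  <ll lang="bg" xml:space="preserve" />
  <ll lang="fr" xml:space="preserve">Exemple</ll>
</langlinks>"""


def parse_langlinks(xml_text):
    """Return (lang, title) pairs; title is '' when the <ll> element is empty."""
    root = ET.fromstring(xml_text)
    return [(ll.get("lang"), ll.text or "") for ll in root.iter("ll")]


links = parse_langlinks(SAMPLE)
# Languages whose langlink carries no title at all.
untitled = [lang for lang, title in links if not title]
print(untitled)
```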


Version: 1.23.0
Severity: normal

Details

Reference
bz62020

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:02 AM
bzimport set Reference to bz62020.
bzimport added a subscriber: Unknown Object (MLST).

All the langlinks on enwiki's main page have no title specified, since an interwiki link with no title specified happens to link to that wiki's main page: https://en.wikipedia.org/w/index.php?title=Template:Main_Page_interwikis&action=edit#.

The API result merely reflects this situation. See https://en.wikipedia.org/wiki/GIGO for details.

I'm very confused why this got closed as invalid. I understand why it's showing up as an empty string, but I don't understand why we would continue to do this.

What is the perceived difficulty of doing this?

Surely we can resolve an empty string to the title it relates to (just like our client would have to)? It's a bit unfair to expect an API client, e.g. an app, to know about and deal with this behaviour itself, which is why I suspect @Nullzero raised this in the first place.

> I'm very confused why this got closed as invalid. I understand why it's showing up as an empty string, but I don't understand why we would continue to do this.
>
> What is the perceived difficulty of doing this?

How would code running in the context of enwiki know that [[de:]] is supposed to be [[de:Wikipedia:Hauptseite]]? Or worse, how would code running on some third-party wiki that includes [[en:]] and [[de:]] to link to the English Wikipedia's and German Wikipedia's main pages know to turn the first into [[en:Main Page]] and the second into [[de:Wikipedia:Hauptseite]]?

Well... something somewhere must know how that's resolved to the dewiki main page, as otherwise that link wouldn't work.

If we are relying on a Varnish redirect to take a user to the right place, that is a little alarming, and we should stop doing that by either ignoring empty strings as inputs or resolving them when rendered.

> Well... something somewhere must know how that's resolved to the dewiki main page, as otherwise that link wouldn't work.

The "something somewhere" is dewiki itself. Giving every other wiki in the universe knowledge of every other wiki's configuration might be possible (via API requests to the target wiki's api.php to query it from siteinfo, for example), but would be massive overkill for this situation.
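
To make the cost concrete, here is a rough sketch of what per-target resolution via siteinfo could look like. It assumes the target wiki's api.php answers `action=query&meta=siteinfo&siprop=general` with a `mainpage` field under `query.general`; the sample response is hand-written, and `siteinfo_url` and `resolve_title` are hypothetical helper names. Every distinct target wiki would need its own request (or a cache), which is the overhead being objected to.

```python
from urllib.parse import urlencode


def siteinfo_url(api_base):
    """Build the siteinfo query URL for a target wiki's api.php."""
    params = {
        "action": "query",
        "meta": "siteinfo",
        "siprop": "general",
        "format": "json",
    }
    return api_base + "?" + urlencode(params)


def resolve_title(title, siteinfo):
    """Return the title unchanged, or the target's main page if it is empty."""
    if title:
        return title
    # Assumed response shape: {"query": {"general": {"mainpage": ...}}}
    return siteinfo["query"]["general"]["mainpage"]


# Hand-written stand-in for a decoded JSON response from the target wiki.
sample_siteinfo = {"query": {"general": {"mainpage": "Wikipedia:Hauptseite"}}}

print(siteinfo_url("https://de.wikipedia.org/w/api.php"))
print(resolve_title("", sample_siteinfo))
```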

> If we are relying on a varnish redirect to take a user to the right place

MediaWiki itself handles the redirect. More specifically, MediaWiki::parseTitle() finds no title in the request and so uses the wiki's main page, and then MediaWiki::tryNormaliseRedirect() sees that the requested URL with an empty title doesn't match the canonical URL for the main page and redirects.
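
As a simplified model of those two steps (this is not MediaWiki's PHP code; the function names, the URL scheme, and the `canonical_url` helper are all assumptions made for illustration):

```python
def parse_title(requested_title, main_page):
    """Fall back to the wiki's main page when the request names no title."""
    return requested_title or main_page


def canonical_url(title):
    # Assumed URL scheme, for illustration only.
    return "https://de.wikipedia.org/wiki/" + title.replace(" ", "_")


def normalise_redirect(requested_url, title):
    """Return the URL to redirect to, or None if the request is already canonical."""
    target = canonical_url(title)
    return target if requested_url != target else None


# A request with an empty title, as produced by a [[de:]] link.
title = parse_title("", "Wikipedia:Hauptseite")
redirect = normalise_redirect("https://de.wikipedia.org/wiki/", title)
print(redirect)
```

The point of the model is that both steps run on the target wiki, which is the only place where the main page name is known.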

I can't say there isn't a redirect at the Varnish level on Wikimedia wikis that prevents this code from being reached, but if there is, it's only an optimization.

> that is a little alarming and we should stop doing that by either ignoring empty strings as inputs or resolving them when rendered.

If you really want to start an RFC to make interwiki links like [[de:]] break, feel free to do so in a separate task. Resolving them when rendered does not seem feasible, as I've described at length above.