
Sister project (interwiki) links should be stored in their own table
Closed, Resolved (Public)

Description

We store all kinds of links in the database, but not sister project links. ("Sister project" is not quite the right term, because interwiki prefixes link not only to sister projects but also to other wikis and sites.)

These links should be stored in the database as well, to make data mining easier, especially checking for broken links.


Version: unspecified
Severity: enhancement

Details

Reference
bz14473

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 10:12 PM
bzimport set Reference to bz14473.

I agree -- we have language links and even external links, but we don't have interwiki links in the db. This is a bit annoying. And as suggested, checking for broken links to sister projects would be a prime application for this on Wikimedia sites.

Bug 15986 has been marked as a duplicate of this bug.

christensenc wrote:

Thanks, Alexandre! I guess I didn't search the bug list as well as I thought, but I was looking for interwiki not sisterwiki :)
Here are my comments from 15986 so no one needs to follow that link:

The default behavior of links seems to be to add a link to a table:
categorylinks
pagelinks
imagelinks
externallinks
langlinks
templatelinks

But there is no table to store non-language interwiki links.

It would be really useful (to me anyway) to store interwiki links somehow, even if they are stored inside of one of these other tables.

I think with the proper planning this could fix (or help fix) the following bugs:
https://bugzilla.wikimedia.org/show_bug.cgi?id=167 - store IW links as metadata
https://bugzilla.wikimedia.org/show_bug.cgi?id=1394 - IW what links here from common
https://bugzilla.wikimedia.org/show_bug.cgi?id=1886 - more general IW what links here
https://bugzilla.wikimedia.org/show_bug.cgi?id=4591 - local IW links made bold

Bug 23195 has been marked as a duplicate of this bug.

Assigning to myself; how did we manage to forget to do this for like 7 years?

Fixed in r65104.

Added iwlinks table to track inline interwiki link usage.

Like langlinks, this stores the interwiki prefix (as iwl_prefix) and full page title (as iwl_title), attached to the page doing the linking (as iwl_from -> page_id).
Unlike langlinks, there can be multiple entries stored per interwiki prefix.
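To make the table shape concrete, here is a minimal sketch using Python's built-in sqlite3 module. The three column names come from the description above. The column types, the index names, and the links_to() helper are assumptions made for illustration, not the schema committed in r65104. The sample rows show a single page storing two links with the same prefix, and the query is the kind of interwiki "what links here" lookup requested earlier in this thread.

```python
import sqlite3

# Column names follow the commit message; types and indexes are assumed.
SCHEMA = """
CREATE TABLE iwlinks (
    iwl_from   INTEGER NOT NULL,  -- page_id of the page containing the link
    iwl_prefix TEXT    NOT NULL,  -- interwiki prefix, e.g. 'wikt'
    iwl_title  TEXT    NOT NULL   -- full page title on the target wiki
);
CREATE UNIQUE INDEX iwl_from_prefix_title
    ON iwlinks (iwl_from, iwl_prefix, iwl_title);
CREATE INDEX iwl_prefix_title ON iwlinks (iwl_prefix, iwl_title);
"""


def links_to(conn, prefix, title):
    """Hypothetical helper: page ids that link to prefix:title."""
    rows = conn.execute(
        "SELECT iwl_from FROM iwlinks "
        "WHERE iwl_prefix = ? AND iwl_title = ? ORDER BY iwl_from",
        (prefix, title),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany(
    "INSERT INTO iwlinks (iwl_from, iwl_prefix, iwl_title) VALUES (?, ?, ?)",
    [
        (1, "wikt", "apple"),
        (1, "wikt", "pear"),  # same page, same prefix, different title
        (2, "wikt", "apple"),
        (2, "commons", "File:Apple.jpg"),
    ],
)

print(links_to(conn, "wikt", "apple"))
```

A broken-link checker could walk the distinct (iwl_prefix, iwl_title) pairs in the same way and test each target once, instead of re-parsing page text.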

Updater to add the table confirmed on MySQL, untested on SQLite but should work.
Someone may still need to add and test a PostgreSQL updater.

Refactored makeWhereFrom2d() out of LinkBatch to Database so it could be reused for the similar mapping for the interwiki links, which need a string prefix rather than an int namespace key.
Also cleaned it up internally to reuse existing code for building where clauses from arrays. (Tim & Domas -- if the previous, more verbose code was there to reduce function call and array processing overhead on very large link lists, feel free to unroll it again if the difference is measurable. Just swap the var names around from the old LinkBatch code and escape the base key value if it's not an integer; it'll be functionally equivalent.)
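The idea behind a two-level where-clause builder can be sketched as follows. This is a Python illustration, not the PHP code in Database. The function name make_where_from_2d, its signature, and the use of bound "?" parameters in place of the value escaping mentioned above are all assumptions. It takes a map from a base key (a namespace integer or an interwiki prefix string) to a list of titles and produces one OR-joined condition.

```python
def make_where_from_2d(data, base_key, sub_key):
    """Build (sql, params) from a map {base_value: [sub_values]}.

    Hypothetical sketch: returns (None, []) when there is nothing to match.
    Column names are interpolated as-is and must come from trusted code;
    the values are passed as bound parameters.
    """
    clauses = []
    params = []
    for base, subs in data.items():
        subs = list(subs)
        if not subs:
            continue  # skip base keys with no titles
        placeholders = ", ".join("?" for _ in subs)
        clauses.append(f"({base_key} = ? AND {sub_key} IN ({placeholders}))")
        params.append(base)  # works for int namespaces and string prefixes
        params.extend(subs)
    if not clauses:
        return None, []
    return " OR ".join(clauses), params


sql, params = make_where_from_2d(
    {"wikt": ["apple", "pear"], "commons": ["File:Apple.jpg"]},
    "iwl_prefix",
    "iwl_title",
)
print(sql)
print(params)
```

Because the base value is bound like any other parameter, the same function serves both the integer namespace case and the string prefix case.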