
API: return a list of supported wikis
Closed, Resolved, Public

Description

Currently my bot has hard-coded that commonswiki, incubatorwiki and specieswiki are not supported. Please add a way to fetch the list of supported wikis (wiki id or domain), so that I can filter my langlinks against this list before submitting them.

This can be implemented by a new module or by extending sitematrix.
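For illustration, a minimal Python sketch of the bot-side filtering such a list would enable; the supported set is assumed to come from the requested API call rather than being hard-coded:

def filter_langlinks(langlinks, supported):
    """Keep only links pointing to wikis that Wikidata accepts."""
    return {site: title for site, title in langlinks.items() if site in supported}

# "supported" would be fetched from the requested API instead of hard-coded
supported = {"enwiki", "dewiki", "frwiki"}
langlinks = {"enwiki": "Berlin", "dewiki": "Berlin", "commonswiki": "Berlin"}
print(filter_langlinks(langlinks, supported))  # commonswiki is dropped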


Version: unspecified
Severity: normal

Details

Reference
bz38263

Related Objects

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 1:02 AM
bzimport set Reference to bz38263.
bzimport added a subscriber: Unknown Object (MLST).

All sites that have code="wiki" and are not closed, private or fishbowl. We do not support linking via alternate URLs, neither for ordinary aliases nor for language variants; use the canonical form.

The sitematrix is the source for valid site ids:
http://meta.wikimedia.org/w/api.php?action=sitematrix&smtype=language
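A sketch of how a client could derive the Wikipedia site ids from the sitematrix JSON output, keeping only the code="wiki" group and dropping closed, private and fishbowl wikis as described above; the exact representation of those flag fields is an assumption and may need adjusting:

import requests

def supported_wikipedia_ids():
    """Fetch the sitematrix and collect the ids of open Wikipedia sites."""
    resp = requests.get(
        "http://meta.wikimedia.org/w/api.php",
        params={"action": "sitematrix", "smtype": "language", "format": "json"},
    )
    matrix = resp.json()["sitematrix"]
    ids = set()
    for key, entry in matrix.items():
        if key in ("count", "specials"):  # skip the counter and the special wikis
            continue
        for site in entry.get("site", []):
            # keep only the Wikipedia group, and skip closed/private/fishbowl wikis
            if site.get("code") == "wiki" and not any(
                flag in site for flag in ("closed", "private", "fishbowl")
            ):
                ids.add(site["dbname"])
    return ids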

You can also get the list used by individual modules with this URL:
http://wikidata-test-repo.wikimedia.de/w/api.php?action=paraminfo&modules=wbgetitems

Note that the latter list will contain all valid site ids and makes no distinction as to which group they belong to. Sitelinks mimicking interlanguage links on Wikipedia should only come from the group identified by code="wiki".
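A sketch of reading the same list from the paraminfo self-documentation; it assumes the sites parameter of wbgetitems exposes its allowed values as a list in the type field (this may differ between MediaWiki versions) and uses the test-repo URL from this thread:

import requests

def wbgetitems_site_ids(api_url="http://wikidata-test-repo.wikimedia.de/w/api.php"):
    """Read the allowed 'sites' values from the wbgetitems self-documentation."""
    resp = requests.get(
        api_url,
        params={"action": "paraminfo", "modules": "wbgetitems", "format": "json"},
    )
    for module in resp.json()["paraminfo"]["modules"]:
        for param in module.get("parameters", []):
            if param["name"] == "sites" and isinstance(param.get("type"), list):
                return set(param["type"])
    return set()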

Note especially that some site ids are redirected aliases.

For example, "nowiki" (aka "no") will work when adding links, but "nbwiki" (aka "nb") will not. The aliases are visible at Special:SiteMatrix as red links:
http://meta.wikimedia.org/wiki/Special:SiteMatrix

There could be sitelinks to additional groups, or even sites outside Wikimedia, but for now only Wikipedia is supported.

Note also that there are a small number of closed wikis.

There could be others that should not be listed.

commonswiki, specieswiki and incubatorwiki langlinks also link to Wikipedia wikis, but only unidirectionally, and one day Wikidata must support them. I personally think it is always better to fetch this configuration from the API instead of hardcoding it in the source code of every client.

The paraminfo parameter value list would solve my problem if the result were usable.

Fetching all open wikipedia language projects is not the problem.

The sitematrix *is* the source for the sites entered in the database table, and the database table *is* the source used for validation of sites. If something does not validate, it is because it is not included for the moment.

In this project there are some constraints on what can (and should) be supported for the moment, and that is interlanguage links between Wikipedia projects. Projects that have unidirectional links to Wikipedia projects can use the sitelinks. Projects that want to be included in the sitelinks must be able to provide a page that identifies the same entity as the Wikipedia pages, and must also be accepted by the community.

For now we only provide the site ids necessary to do so and leave the rest to the future community. Those that have code="wiki" will work, other codes may work, and those not included in the sitematrix (and the paraminfo output) will fail.

Re-opening.

The list of sites returned by the API's self-documentation should be the list of sites that can actually be used with setsitelink and setitem. Since we currently only support links to Wikipedias, the list of sites should only be Wikipedias. That is, it must only return wikis in the group we actually support for linking.

Merl said on IRC: commonswiki and wikispecies belong to the wikipedia langlink group; dewikisource langlinks do not link to wikipedia.

I guess all wikis whose langlinks link to wikipedia should be in the wikipedia group, even if they are not Wikipedias. Makes sense?

Commons and wikispecies are special in that they can read from Wikidata, but not write to Wikidata, so probably that should be addressed somehow.

Should be fixed in master: only wikipedia sites are listed by the documentation, and only wikipedia sites are allowed as link targets.

Discussion about edge cases like commons and wikispecies should take place somewhere else; it is a configuration and conceptual issue.

Verified in Wikidata demo time for sprint 15