
Provide commonswiki, wikidatawiki and centralauth on all clusters and document their usage
Closed, ResolvedPublic

Description

Currently, commonswiki_p.page is accessible and JOINable as commonswiki_f_p.page from within - for example - enwiki_p and dewiki_p. This should be extended to:

  1. all target databases (for example, zhwiki_p has no link commonswiki_f_p),
  2. all source databases (not just Commons), and
  3. all source tables in each of the source databases.

Once this is set up and automated, its usage and limitations (i.e. JOIN only on indexes of the source table, etc.) need to be documented.
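For illustration, the kind of cross-database JOIN described above can be sketched as follows. This is a minimal sketch, not a tested query against the replicas: the helper name `cross_wiki_file_query` is hypothetical, and it assumes the standard MediaWiki `page` columns (`page_namespace`, `page_title`) with File pages in namespace 6. It only builds the SQL text and joins on namespace and title, which are indexed together in the `page` table.

```python
def cross_wiki_file_query(local_db: str = "enwiki_p",
                          commons_db: str = "commonswiki_f_p") -> str:
    """Build SQL listing local File: pages whose title also exists on Commons.

    Both database names are parameters because the federated name
    (commonswiki_f_p) is only an assumption taken from this report.
    """
    return (
        "SELECT l.page_title\n"
        f"FROM {local_db}.page AS l\n"
        f"JOIN {commons_db}.page AS c\n"
        # Join on (page_namespace, page_title) so the source table's index can be used.
        "  ON c.page_namespace = l.page_namespace\n"
        " AND c.page_title = l.page_title\n"
        "WHERE l.page_namespace = 6\n"
        "LIMIT 10"
    )


if __name__ == "__main__":
    # Print the query for dewiki_p; running it needs a replica connection.
    print(cross_wiki_file_query("dewiki_p"))
```

The resulting string would be run from within the target database (for example enwiki_p or dewiki_p) through whatever MySQL client the tool already uses.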


Version: unspecified
Severity: blocker

Details

Reference
bz57876

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 2:39 AM
bzimport added a project: Toolforge.
bzimport set Reference to bz57876.

There is no plan to support federation of all databases; only commons and wikidata are planned.

Is there a use case for other combinations that are not simply hypothetical and not better served by application logic?

(In reply to comment #1)

There is no plan to support federation of all databases; only commons and wikidata are planned.

Is there a use case for other combinations that are not simply hypothetical and not better served by application logic?

Answering the first question: I'm not aware of any; Commons and Wikidata were what was available at the Toolserver and in use there.

*** Bug 58802 has been marked as a duplicate of this bug. ***

Joining a foreign database with a wiki database is highly needed if its data is also used by the MediaWiki installation.

These databases are currently commonswiki, wikidatawiki and centralauth. The last one was requested on the Toolserver but never implemented; it would be useful to have it on Labs.

commonswiki: needed because of included images

wikidatawiki: needed because of languagelinks and included properties

centralauth: needed to get all rights of a user on a local wiki (combining local and global user groups)

*** Bug 52559 has been marked as a duplicate of this bug. ***

Closed bug #52559 as a duplicate of this one, gave higher priority.

Talked to Tim: I'll rename this bug to reflect the 3 databases commonswiki, wikidatawiki and centralauth.

According to Marc-André Pelletier, this will take at most two weeks after the migration of Labs from pmtpa to eqiad is done (not because the work itself takes that long, but because of the general workload).

(In reply to Tim Landscheidt from comment #0)

Currently, commonswiki_p.page is accessible and JOINable as commonswiki_f_p.page from within - for example - enwiki_p and dewiki_p.

Is this documented anywhere? I can't find any info whatsoever on federated tables, joins of commonswiki/centralauth, commonswiki_f_p or any other keyword I could think of.

(In reply to Nemo from comment #8)

(In reply to Tim Landscheidt from comment #0)

Currently, commonswiki_p.page is accessible and JOINable as commonswiki_f_p.page from within - for example - enwiki_p and dewiki_p.

Is this documented anywhere? I can't find any info whatsoever on federated tables, joins of commonswiki/centralauth, commonswiki_f_p or any other keyword I could think of.

The original summary of this bug was "Automate creation of federated tables and document their usage". I've added it to the summary again.

ETA according to IRC office hour: April 6th.

Change 123916 had a related patch set uploaded by coren:
Labs: automate federated table maintenance

https://gerrit.wikimedia.org/r/123916

Change 123916 merged by coren:
Labs: automate federated table maintenance

https://gerrit.wikimedia.org/r/123916

There are now centralauth_f_p, wikidatawiki_f_p and commonswiki_f_p databases on every shard where the corresponding real database does not exist, and those databases now federate all the public tables by default.

Basic documentation (including the caveats) is at:

https://wikitech.wikimedia.org/wiki/Nova_Resource:Tools/Help#Joins_between_commons.2C_centralauth_and_wikidata_and_other_project_databases
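
For tools that run against several shards, the naming rule from this comment (the `_f_p` database is provided only where the real `_p` database is absent) can be sketched in Python. This is a hypothetical helper, not part of the merged change: the function name `source_db_name` is made up for illustration, and how a tool learns which databases exist on its shard is left to the caller.

```python
from typing import Iterable

# The three source databases named in this task.
FEDERATED_SOURCES = ("centralauth", "wikidatawiki", "commonswiki")


def source_db_name(source: str, dbs_on_shard: Iterable[str]) -> str:
    """Pick the database name to use in a JOIN for one source database.

    Returns the real "<source>_p" database when it exists on the shard,
    otherwise the federated "<source>_f_p" database.
    """
    if source not in FEDERATED_SOURCES:
        raise ValueError(f"{source!r} is not federated")
    real = f"{source}_p"
    return real if real in set(dbs_on_shard) else f"{source}_f_p"


if __name__ == "__main__":
    # Made-up shard contents, for illustration only.
    print(source_db_name("commonswiki", ["zhwiki_p", "dewiki_p"]))
    print(source_db_name("commonswiki", ["commonswiki_p", "zhwiki_p"]))
```

The list of databases on a shard could come from something like `SHOW DATABASES`; that step is an assumption and not shown here.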