
Regression: Section "Global usage" is not shown on Commons file page
Closed, Resolved, Public

Description

Regression after yesterday's scap:

The section "Global usage" is not shown on Commons file description pages, only on the local projects.

Compare
http://commons.wikimedia.org/wiki/File:Frankfurt_am_Main_Messeturm.jpg
with
http://de.wikipedia.org/wiki/Datei:Frankfurt_am_Main_Messeturm.jpg


Version: unspecified
Severity: normal

Details

Reference
bz23136

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 11:01 PM
bzimport added a project: GlobalUsage.
bzimport set Reference to bz23136.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #1)

I see it just fine.

Ignore me, I'm an idiot. I was looking in the wrong place.

I do not see it either on commons.

Looking in the code I find in function hasResults in
http://svn.wikimedia.org/viewvc/mediawiki/trunk/extensions/GlobalUsage/SpecialGlobalUsage.php?revision=64450&view=markup#l230

if ( !$file->exists() || $file->getRepoName() == 'local' )
    return false;

All files on Commons are local, so I think hasResults returns false there. And the
code that displays global usage only displays it if hasResults is true.
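
To make the failure mode concrete, here is a minimal Python model of the check quoted above. It is not the extension's code: `FakeFile` and `has_results` are hypothetical stand-ins, the repo name 'shared' is made up, and the final `usage_rows > 0` test is only an assumption about what the rest of hasResults does.

```python
from dataclasses import dataclass


@dataclass
class FakeFile:
    """Hypothetical stand-in for a MediaWiki file object."""
    exists: bool
    repo_name: str  # 'local' when the file lives on the wiki being viewed


def has_results(file: FakeFile, usage_rows: int) -> bool:
    """Mirror the quoted check: missing or local files never report usage."""
    if not file.exists or file.repo_name == "local":
        return False
    # Assumption: otherwise the result depends on whether any usage was found.
    return usage_rows > 0


# On a client wiki the file comes from a foreign repo, so usage can be shown.
on_dewiki = FakeFile(exists=True, repo_name="shared")
# On Commons the very same file is local, so the check short-circuits.
on_commons = FakeFile(exists=True, repo_name="local")

print(has_results(on_dewiki, usage_rows=12))
print(has_results(on_commons, usage_rows=12))
```

The second call returns False no matter how many usage rows exist, which matches the missing section on Commons.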

The issue is that the "Global usage" tab has replaced the on-page global usage listing. This is, to put it mildly, less than useful. If the on-page listing could be restored, that would be ideal, but at least make it a preferences option to have the on-page listings.

Bryan.TongMinh wrote:

(In reply to comment #4)

The issue is that the "Global usage" tab has replaced the on-page global usage
listing. This is, to put it mildly, less than useful. If the on-page listing
could be restored, that would be ideal, but at least make it a preferences
option to have the on-page listings.

It's a bug, not a feature.

(In reply to comment #5)

It's a bug, not a feature.

Thanks Bryan for the clarification.

Ugh... yes, I'd recommend reverting/undeploying r60850 until we have a proper fix for this.

Alas, I don't see any obvious config variable that would tell us unambiguously that this wiki is acting as a global file repository for some other wikis. Besides, that wouldn't even be enough if someone had a weird setup with multiple global repos... although, does the current globalimagelinks table structure even support that properly?

The "right" way to fix this would be to store the wiki ID of the file in the globalimagelinks table (call it, say, gil_file_wiki) and only show the records where both gil_to and gil_file_wiki match.

The quick and dirty way to fix it would be to add a config variable like $wgGlobalUsageSourceWikis = array( 'commonswiki' ) and skip the $file->getRepoName() == 'local' check if we're on one of the listed wikis.
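
A rough sketch of that quick fix, modelled in Python rather than in the extension's PHP. The list mirrors the proposed $wgGlobalUsageSourceWikis; the function `should_hide_usage` and its parameters are hypothetical names used only for illustration.

```python
# Models the proposed $wgGlobalUsageSourceWikis setting.
GLOBAL_USAGE_SOURCE_WIKIS = ["commonswiki"]


def should_hide_usage(file_exists, repo_name, current_wiki,
                      source_wikis=GLOBAL_USAGE_SOURCE_WIKIS):
    """Return True when the global usage section should be suppressed."""
    if not file_exists:
        return True
    if current_wiki in source_wikis:
        # Listed wikis act as shared repos: skip the 'local' check entirely.
        return False
    return repo_name == "local"


print(should_hide_usage(True, "local", "commonswiki"))  # shown on Commons
print(should_hide_usage(True, "local", "dewiki"))       # hidden for purely local files
```

The trade-off is the one noted above: the list has to be maintained by hand, and it says nothing about which client wikis actually resolve a given file name to this repo.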

Bryan.TongMinh wrote:

(In reply to comment #7)

Alas, I don't see any obvious config variable that would tell us unambiguously
that this wiki is acting as a global file repository for some other wikis.
Besides, that wouldn't even be enough if someone had a weird setup with
multiple global repos... although, does the current globalimagelinks table
structure even support that properly?

The "right" way to fix this would be to store the wiki ID of the file in the
globalimagelinks table (call it, say, gil_file_wiki) and only show the records
where both gil_to and gil_file_wiki match.

You're absolutely right about that. Unfortunately I did not take this into account when writing this extension.

The quick and dirty way to fix it would be to add a config variable like
$wgGlobalUsageSourceWikis = array( 'commonswiki' ) and skip the
$file->getRepoName() == 'local' check if we're on one of the listed wikis.

The current fix that I am testing is basically wfGetDB(DB_SLAVE)->getDBname() == $wgGlobalUsageDatabase, but that is a less than ideal fix.

Bryan.TongMinh wrote:

(In reply to comment #8)

The quick and dirty way to fix it would be to add a config variable like
$wgGlobalUsageSourceWikis = array( 'commonswiki' ) and skip the
$file->getRepoName() == 'local' check if we're on one of the listed wikis.

The current fix that I am testing is basically wfGetDB(DB_SLAVE)->getDBname()
== $wgGlobalUsageDatabase, but that is a less than ideal fix.

Committed with r64889.

Bryan.TongMinh wrote:

Deployed on Wikimedia; leaving this open for proper fix.

Bryan.TongMinh wrote:

(In reply to comment #7)

Ugh... yes, I'd recommend reverting/undeploying r60850 until we have a proper
fix for this.

Alas, I don't see any obvious config variable that would tell us unambiguously
that this wiki is acting as a global file repository for some other wikis.
Besides, that wouldn't even be enough if someone had a weird setup with
multiple global repos... although, does the current globalimagelinks table
structure even support that properly?

The problem with this is that the Commons repository does not know in which order a local wiki has its $wgForeignFileRepos. Take the following example:

  • Wiki A has the following ForeignFileRepos, in this order: wiki B, then wiki C.
  • File Example.jpg exists on both wiki B and wiki C.
  • File Example.jpg is used on wiki A. It will use wiki B's version, so globalimagelinks will point to wiki B.
  • Wiki B deletes file Example.jpg. Wiki A will now automatically use wiki C's version, however, this is not visible to wiki B and the globalimagelinks table will not be updated.
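
The scenario in the list above can be replayed with a few lines of Python. This is only a toy model: `resolve_repo`, the dictionary of repo contents and the wiki names are invented for illustration and do not correspond to MediaWiki APIs.

```python
def resolve_repo(filename, repo_order, repo_files):
    """Return the first repo in search order that has the file, else None."""
    for repo in repo_order:
        if filename in repo_files.get(repo, set()):
            return repo
    return None


# Which files exist on which shared repo.
repo_files = {"wikiB": {"Example.jpg"}, "wikiC": {"Example.jpg"}}
# Wiki A's foreign repo search order.
wiki_a_repos = ["wikiB", "wikiC"]

# What gets recorded in globalimagelinks when wiki A renders the page.
recorded_link = resolve_repo("Example.jpg", wiki_a_repos, repo_files)

# Wiki B deletes the file; wiki A is not re-rendered, so the record stays.
repo_files["wikiB"].discard("Example.jpg")
actual_source = resolve_repo("Example.jpg", wiki_a_repos, repo_files)

print("recorded:", recorded_link, "actual:", actual_source)
```

After the deletion the recorded link still names wiki B while the file is really served from wiki C, which is exactly the stale state described in the last bullet.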

A possible fix for this could be fixing bug 22390.

Another option is waiting for wiki A to rerender the pages where the file is used. There may of course be an arbitrarily long time interval between the actual deletion and the update of globalimagelinks.

Hadn't thought of that. :( I guess one way to work around the issue would be to _tell_ GlobalUsage which shared repos each wiki is using (and in what order), either through an explicit config variable or simply by recording that information in a separate table while we update the existing globalimagelinks table. Although then the extension would also have to query local image tables for existence checks, which could be a performance killer... unless we saved that information in a global table too...

Yeah, that could work. We'd need three global tables, say: globalimagelinks, globalimages and globalimagerepos. The globalimages table would just store existence data, basically two columns (wiki ID and file name). The globalimagerepos would store the filerepo search order for each wiki (three columns: wiki ID, repo ID and sequence number). Then a query to find all uses of a file would look something like:

SELECT gil_wiki, gil_namespace_text, gil_title
  FROM globalimagelinks, globalimagerepos AS repo1
 WHERE gil_image = ?
   AND repo1.gir_wiki = gil_wiki
   AND repo1.gir_repo = ?
   AND NOT EXISTS (
       SELECT 1 FROM globalimages, globalimagerepos AS repo2
        WHERE repo2.gir_wiki = repo1.gir_wiki
          AND repo2.gir_priority < repo1.gir_priority
          AND gi_repo = repo2.gir_repo
          AND gi_image = gil_image )
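
As a sanity check, the query can be run against an in-memory SQLite database. Everything around the query is an assumption made for the sketch: the column lists are the minimum the query needs (the real globalimagelinks table has more columns), the sample rows are invented, and SQLite merely stands in for the production database.

```python
import sqlite3

QUERY = """
SELECT gil_wiki, gil_namespace_text, gil_title
  FROM globalimagelinks, globalimagerepos AS repo1
 WHERE gil_image = ?
   AND repo1.gir_wiki = gil_wiki
   AND repo1.gir_repo = ?
   AND NOT EXISTS (
       SELECT 1 FROM globalimages, globalimagerepos AS repo2
        WHERE repo2.gir_wiki = repo1.gir_wiki
          AND repo2.gir_priority < repo1.gir_priority
          AND gi_repo = repo2.gir_repo
          AND gi_image = gil_image )
"""


def make_db():
    """Build the three hypothetical global tables with sample data."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE globalimagelinks (
            gil_wiki TEXT, gil_namespace_text TEXT, gil_title TEXT, gil_image TEXT);
        CREATE TABLE globalimages (gi_repo TEXT, gi_image TEXT);
        CREATE TABLE globalimagerepos (
            gir_wiki TEXT, gir_repo TEXT, gir_priority INTEGER);
        -- wiki A searches wiki B first, then wiki C; wiki D only uses wiki C.
        INSERT INTO globalimagerepos VALUES
            ('wikiA', 'wikiB', 1), ('wikiA', 'wikiC', 2), ('wikiD', 'wikiC', 1);
        INSERT INTO globalimages VALUES
            ('wikiB', 'Example.jpg'), ('wikiC', 'Example.jpg');
        INSERT INTO globalimagelinks VALUES
            ('wikiA', '', 'Main_Page', 'Example.jpg'),
            ('wikiD', '', 'Sandbox', 'Example.jpg');
    """)
    return conn


def uses_of(conn, image, repo):
    """Run the query for one file name as seen from one repo."""
    return sorted(conn.execute(QUERY, (image, repo)).fetchall())


conn = make_db()
print(uses_of(conn, "Example.jpg", "wikiB"))  # uses that resolve to wiki B
print(uses_of(conn, "Example.jpg", "wikiC"))  # uses that resolve to wiki C
conn.execute("DELETE FROM globalimages WHERE gi_repo = 'wikiB'")
print(uses_of(conn, "Example.jpg", "wikiC"))  # wiki A now falls through to wiki C
```

With this data wiki A's use is attributed to wiki B until the file disappears there, after which it shows up under wiki C without any change to globalimagelinks.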

The same could be done without subqueries, using multiple queries and some client-side processing instead, like this:

  1. Find repos where the file exists:

    SELECT gi_repo FROM globalimages WHERE gi_image = ?

  2. Find wikis where the file is used:

    SELECT DISTINCT gil_wiki FROM globalimagelinks WHERE gil_image = ?

  3. Combine to get list of repos that could provide the file for each wiki:

    SELECT gir_wiki, gir_repo, gir_priority FROM globalimagerepos WHERE gir_repo IN (?) AND gir_wiki IN (?) ORDER BY gir_wiki, gir_priority

  4. Filter out wikis where the current repo is not listed first in the result of the previous query.

  5. Finally get list of uses on the remaining wikis:

    SELECT gil_wiki, gil_namespace_text, gil_title FROM globalimagelinks WHERE gil_image = ? AND gil_wiki IN (?)
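
The client-side filtering step in the list above might look roughly like this in Python. The function name `wikis_served_by` and the sample rows are hypothetical; the input is assumed to be the (gir_wiki, gir_repo, gir_priority) rows returned by the globalimagerepos query, already restricted to repos that have the file.

```python
def wikis_served_by(current_repo, candidate_rows):
    """Keep wikis whose highest-priority candidate repo is current_repo.

    candidate_rows: iterable of (gir_wiki, gir_repo, gir_priority) tuples,
    where a lower priority number means the repo is searched earlier.
    """
    best = {}  # wiki -> (priority, repo) of the first repo in search order
    for wiki, repo, priority in candidate_rows:
        if wiki not in best or priority < best[wiki][0]:
            best[wiki] = (priority, repo)
    return sorted(wiki for wiki, (_, repo) in best.items() if repo == current_repo)


# Invented sample rows: wiki A prefers wiki B over wiki C, wiki D only uses wiki C.
rows = [("wikiA", "wikiB", 1), ("wikiA", "wikiC", 2), ("wikiD", "wikiC", 1)]

print(wikis_served_by("wikiC", rows))
print(wikis_served_by("wikiB", rows))
```

The returned wiki IDs are what would be bound to `gil_wiki IN (?)` in the final query. The function does not rely on the ORDER BY of the previous query, so it also works if the rows arrive unsorted.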

Hmm, looks nice. Now someone just needs to code it. :)

Alternatively, the query in step 2 could be:

SELECT DISTINCT gil_wiki FROM globalimagelinks, globalimagerepos
 WHERE gil_image = ? AND gil_wiki = gir_wiki AND gir_repo = ?

to filter out in advance any wikis that aren't using the repo we're interested in to begin with. This could be useful e.g. when looking at local files sharing their name with a file from Commons.

Bryan.TongMinh wrote:

I'm personally more in favour of a push or notification mechanism, where the shared repo notifies all slaves about a change. That would also solve bug 22390 and could in the future be used for cross wiki notification of deletions and such.
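
Purely as an illustration of the push idea, here is a toy Python model in which a shared repo tells its client wikis about a deletion. Nothing here exists in MediaWiki or GlobalUsage: the classes `SharedRepo` and `ClientWiki`, the `subscribe` call and the `stale` list are all hypothetical, and a real mechanism would need cross-wiki jobs or similar rather than in-process calls.

```python
class ClientWiki:
    """Hypothetical wiki that uses one or more shared repos."""

    def __init__(self, name):
        self.name = name
        self.stale = []  # (repo, filename) pairs whose usage must be refreshed

    def on_file_deleted(self, repo_name, filename):
        # A real client would re-render pages and update globalimagelinks.
        self.stale.append((repo_name, filename))


class SharedRepo:
    """Hypothetical shared file repo that pushes changes to its clients."""

    def __init__(self, name):
        self.name = name
        self.files = set()
        self.clients = []

    def subscribe(self, client):
        self.clients.append(client)

    def delete(self, filename):
        self.files.discard(filename)
        for client in self.clients:
            client.on_file_deleted(self.name, filename)


wiki_b = SharedRepo("wikiB")
wiki_b.files.add("Example.jpg")
wiki_a = ClientWiki("wikiA")
wiki_b.subscribe(wiki_a)

wiki_b.delete("Example.jpg")
print(wiki_a.stale)
```

The `subscribe` step is where the open question lives: the repo has to learn which wikis are its clients from somewhere.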

OK, that could work too. Although then we're back to needing a config variable to specify which wikis are acting as global repos. :)

(Unless we pull that info from the client wikis themselves, which would again require something like my globalimagerepos table.)

Looks fixed, because a section "File usage on other wikis" is shown for the example on commons and on de.wp.

(In reply to comment #16)

Looks fixed, because a section "File usage on other wikis" is shown for the
example on commons and on de.wp.

Correct. Somehow fixed in the past.

Gilles raised the priority of this task from Medium to Unbreak Now!. Dec 4 2014, 10:12 AM
Gilles moved this task from Untriaged to Done on the Multimedia board.
Gilles lowered the priority of this task from Unbreak Now! to Medium. Dec 4 2014, 11:22 AM