
CirrusSearch: The method for preventing duplicates of local and commons file results has a couple of flaws
Closed, Resolved (Public)

Description

I was thinking about this last night:

  1. If a file is updated on Commons, we remove all the flags that prevent duplicates from showing up. We should certainly fix this quickly.
  2. If a file is on a wiki and is then uploaded to Commons, it won't be marked with the flag until we perform a no-op edit on the wiki that used to have it.

One option we have to prevent both of these is to have commons search _all_ other wikis after a file is updated and mark the duplicates. That seems like too much overhead.

An option for preventing just #1 is to have Commons know that it is special and needs to not clear those flags on update.
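A rough sketch of what "not clearing the flags on update" could look like, assuming the flags live in a single field on the Commons search document. The field name `wikis_with_duplicate` and the function `build_commons_update` are hypothetical, made up for illustration; this is not the actual CirrusSearch code.

```python
from typing import Optional

# Hypothetical field name for the duplicate flags; an assumption for this sketch.
DUPE_FLAG_FIELD = "wikis_with_duplicate"


def build_commons_update(existing_doc: Optional[dict], rebuilt_doc: dict) -> dict:
    """Return the document to index, carrying the duplicate flags forward.

    existing_doc is whatever is currently in the index (None if nothing is),
    rebuilt_doc is the freshly built document, which knows nothing about flags.
    """
    merged = dict(rebuilt_doc)
    if existing_doc is not None and DUPE_FLAG_FIELD in existing_doc:
        # Copy the flags over instead of letting the full rebuild drop them.
        merged[DUPE_FLAG_FIELD] = list(existing_doc[DUPE_FLAG_FIELD])
    return merged
```

This only covers #1: a file that was never flagged still stays unflagged after the update.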

Another option for preventing just #1 is to move those flags into the Commons database somehow and pull them out when building the Commons page. If we integrate Commons with some Wikidata-like thing then this might be a good option.

An option for masking #2 is to remove the duplicate search results on the client side if they would appear on the same page. It is a parlour trick, but might be worth it.
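The client-side masking could be sketched roughly like this, assuming each result on a page carries a `title` and the `wiki` it came from. Those keys, the function name `mask_duplicates_on_page`, and the choice to keep the local copy over the Commons one are all assumptions for illustration.

```python
def mask_duplicates_on_page(results: list, local_wiki: str) -> list:
    """Drop non-local hits whose title also appears as a local hit on this page.

    Only duplicates that land on the same page of results are caught, which
    is why this masks the problem rather than fixing it.
    """
    local_titles = {r["title"] for r in results if r["wiki"] == local_wiki}
    return [
        r for r in results
        if r["wiki"] == local_wiki or r["title"] not in local_titles
    ]
```

For example, with a page holding `File:A.jpg` from both `enwiki` and `commonswiki`, calling `mask_duplicates_on_page(page, "enwiki")` keeps only the `enwiki` copy. If the two copies fall on different pages, both still show up, and the page that lost a result comes up one short.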


Version: unspecified
Severity: normal

Details

Reference
bz58508

Event Timeline

bzimport raised the priority of this task to High. (Nov 22 2014, 2:24 AM)
bzimport added a project: CirrusSearch.
bzimport set Reference to bz58508.

Marking as resolved because https://gerrit.wikimedia.org/r/#/c/102967/ is merged and handles #1. #2 I think we can live with for now.