CirrusSearch: When there isn't much data in the wiki result sorting can be wildly wrong
Closed, ResolvedPublic

Description

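When there isn't much data in the index, result sorting can be wildly wrong. With the default query_then_fetch search type, each shard scores documents using only its own term statistics, and on a small index those statistics can differ a lot from shard to shard. The simplest way to fix this is to use a single shard for really small wikis. We could also fix it by adding an extra query parameter (search_type=dfs_query_then_fetch) to searches for small wikis. Read more here: http://www.elasticsearch.org/blog/understanding-query-then-fetch-vs-dfs-query-then-fetch/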
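
To illustrate why sharding matters on a tiny index, here is a rough sketch that compares per-shard inverse document frequency (IDF) with the index-wide value. The shard contents are made up, and the formula is the classic Lucene-style IDF, assumed here for illustration only; it is not taken from CirrusSearch.

```python
import math


def idf(doc_count, doc_freq):
    # Classic Lucene-style IDF, assumed here for illustration.
    return 1.0 + math.log(doc_count / (doc_freq + 1.0))


def doc_freq(docs, term):
    # Number of documents that contain the term.
    return sum(1 for doc in docs if term in doc)


def local_idfs(shards, term):
    # What each shard computes on its own (query_then_fetch behaviour).
    return [idf(len(docs), doc_freq(docs, term)) for docs in shards]


def global_idf(shards, term):
    # What a pre-query statistics round would give (dfs_query_then_fetch).
    all_docs = [doc for docs in shards for doc in docs]
    return idf(len(all_docs), doc_freq(all_docs, term))


# Hypothetical six-page wiki split over two shards; each page is a set of terms.
shards = [
    [{"cat"}, {"cat"}, {"cat"}, {"cat", "dog"}],
    [{"dog"}, {"dog", "cat"}],
]

if __name__ == "__main__":
    print("per-shard IDF for 'dog':", local_idfs(shards, "dog"))
    print("global IDF for 'dog':   ", global_idf(shards, "dog"))
```

In this toy data "dog" looks rare on the first shard and common on the second, so identical matches get different scores depending on where they happen to live. With a single shard, or with global statistics, the term weight is the same for every document.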


Version: unspecified
Severity: normal

Details

Reference
bz53039

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:58 AM
bzimport added a project: CirrusSearch.
bzimport set Reference to bz53039.
bzimport added a subscriber: Unknown Object (MLST).

Note that this shouldn't be a problem in production, because there will be "enough" data there, but it makes tests fail spuriously.

Assigning high because it causes my tests to fail when they shouldn't, making me waste time.

Change 79791 had a related patch set uploaded by Demon:
Default to more accurate but slower search_type.

https://gerrit.wikimedia.org/r/79791

Change 79791 merged by jenkins-bot:
Default to more accurate but slower search_type.

https://gerrit.wikimedia.org/r/79791
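
For reference, a minimal sketch of what requesting the slower but more accurate search type looks like against the Elasticsearch REST API. This is not the merged patch; the host, index name, and helper function are hypothetical, and only the search_type parameter and its dfs_query_then_fetch value come from Elasticsearch itself.

```python
from urllib.parse import urlencode


def search_url(base_url, index, query, search_type="dfs_query_then_fetch"):
    # Hypothetical helper: build a URI search request for the given index.
    # search_type=dfs_query_then_fetch asks Elasticsearch to collect term
    # statistics from all shards before scoring.
    params = {"q": query, "search_type": search_type}
    return f"{base_url}/{index}/_search?{urlencode(params)}"


if __name__ == "__main__":
    # Assumed local test cluster and index name.
    print(search_url("http://localhost:9200", "testwiki", "cats"))
```

The extra round trip for statistics is what makes this search type slower, which matters less on the small test wikis where the sorting problem shows up.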