
Fix rebuildTermSearchKey script.
Closed, Resolved (Public)

Description

The rebuildTermSearchKey script needs fixes for several issues. Note that the core of the functionality is implemented in TermSqlCache::rebuildSearchKey().

The most important issues are:

  • The batch size is hardcoded to 10. It should be configurable (ideally, using a command line option, passed as a parameter to rebuildSearchKey). The default should be at least 100, maybe 1000 (ask Asher?).
  • The continuation between batches is ambiguous. We have term_row_id now, use that instead of trying to sort and select by *all* fields.
  • Use a logging callback to report progress (e.g. a dot per batch, or some such).
  • To allow for nicer handling of options, logging, etc., rebuildSearchKey() could be factored out into a separate class, so that these things can be injected via the constructor instead of being passed as function arguments.
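
The points above can be sketched roughly as follows. This is an illustration in Python against an in-memory SQLite table, not the actual Wikibase PHP code: the class name `SearchKeyRebuilder`, the `report` callback, the `wb_terms` table layout, and the lower-casing used as the "search key" are all assumptions made for the example. The part that matters is the continuation: each batch selects rows with `term_row_id` greater than the last one handled, so the position between batches is unambiguous.

```python
import sqlite3


class SearchKeyRebuilder:
    """Hypothetical stand-in for a class factored out of rebuildSearchKey()."""

    def __init__(self, conn, batch_size=100, report=None):
        if batch_size < 1:
            raise ValueError("batch_size must be positive")
        self.conn = conn
        self.batch_size = batch_size
        # Progress callback injected via the constructor; defaults to a no-op.
        self.report = report or (lambda message: None)

    def rebuild(self, from_row_id=0):
        """Process rows with term_row_id > from_row_id; return the last id handled."""
        last_id = from_row_id
        while True:
            # Continue strictly after the last processed row id.
            rows = self.conn.execute(
                "SELECT term_row_id, term_text FROM wb_terms "
                "WHERE term_row_id > ? ORDER BY term_row_id LIMIT ?",
                (last_id, self.batch_size),
            ).fetchall()
            if not rows:
                return last_id
            # Assumption: lower-casing stands in for the real key normalization.
            self.conn.executemany(
                "UPDATE wb_terms SET term_search_key = ? WHERE term_row_id = ?",
                [(text.lower(), row_id) for row_id, text in rows],
            )
            self.conn.commit()
            last_id = rows[-1][0]
            self.report(f"batch done, continue after term_row_id {last_id}")


def demo():
    """Build a small throwaway table and rebuild it in batches of three."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wb_terms ("
        "term_row_id INTEGER PRIMARY KEY, term_text TEXT, term_search_key TEXT)"
    )
    conn.executemany(
        "INSERT INTO wb_terms (term_text) VALUES (?)",
        [(f"Term {i}",) for i in range(1, 8)],
    )
    messages = []
    last_id = SearchKeyRebuilder(conn, batch_size=3, report=messages.append).rebuild()
    return conn, last_id, messages
```

Because `rebuild()` returns the last `term_row_id` it handled and reports it after every batch, an interrupted run could be resumed by passing that value back in as `from_row_id`.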

There also seems to be a fatal error related to the issue addressed in Idbf67b96, but I'm not sure how that is related.


Version: unspecified
Severity: major

Details

Reference
bz45234

Event Timeline

bzimport raised the priority of this task to High. (Nov 22 2014, 1:14 AM)
bzimport set Reference to bz45234.
bzimport added a subscriber: Unknown Object (MLST).

One more thing: the script needs to take a continuation ID from the command line, so that it can be restarted from where it stopped if it breaks.
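
A minimal sketch of what such command-line handling might look like, written in Python rather than as a MediaWiki maintenance script. The option names `--batch-size` and `--start-row-id` are hypothetical, and the default of 100 is simply the lower bound suggested in the description.

```python
import argparse


def positive_int(value):
    """argparse type that accepts only integers >= 1."""
    number = int(value)
    if number < 1:
        raise argparse.ArgumentTypeError("must be a positive integer")
    return number


def parse_options(argv):
    """Parse the (hypothetical) options of a rebuild script."""
    parser = argparse.ArgumentParser(description="Rebuild term search keys in batches.")
    parser.add_argument(
        "--batch-size", type=positive_int, default=100,
        help="number of rows to update per batch",
    )
    parser.add_argument(
        "--start-row-id", type=int, default=0,
        help="continue after this term_row_id (use the last id reported by a broken run)",
    )
    return parser.parse_args(argv)
```

For example, `parse_options(["--batch-size", "1000", "--start-row-id", "4200"])` would resume after row 4200 with batches of 1000, while an empty argument list starts from the beginning with the default batch size.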

Change I4d2b9fca: Rewrite of rebuildTermSearchKey

Change I9eb2735b: rebuildTermSearchKey should wait for slaves.

Both changes are merged; together they fix the rebuildTermSearchKey script.

Verified in Wikidata demo sprint 35-2