
rebuildtextindex.php doesn't remove wrong data in the searchindex table
Closed, ResolvedPublic

Description

When you have invalid data in the searchindex table, the maintenance script rebuildtextindex.php is there to fix that. However, this currently does not work:

The script currently updates only the _SQL index_, not the _rows_ of the searchindex table. Rows that should not be there (e.g. because they point to a deleted page) therefore _stay_(!) in the table, and the index is built from wrong data.

rebuildtextindex.php should first rebuild the _actual table data_, before it recreates the SQL index (based on that data).
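To check whether a wiki is affected, one can look for searchindex rows whose si_page no longer matches any page.page_id. The following is only a minimal sketch: it uses an in-memory SQLite database with a simplified two-table schema as a stand-in for the real MySQL tables, and the helper name find_orphaned_index_rows and the sample rows are hypothetical.

```python
import sqlite3


def find_orphaned_index_rows(conn):
    """Return si_page values in searchindex that have no matching page row."""
    rows = conn.execute(
        "SELECT si.si_page FROM searchindex AS si "
        "LEFT JOIN page AS p ON p.page_id = si.si_page "
        "WHERE p.page_id IS NULL "
        "ORDER BY si.si_page"
    ).fetchall()
    return [row[0] for row in rows]


# Simplified stand-in schema (assumption: only the columns needed here).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
    CREATE TABLE searchindex (si_page INTEGER, si_title TEXT, si_text TEXT);
    INSERT INTO page VALUES (1, 'Main_Page');
    INSERT INTO searchindex VALUES (1, 'main page', 'welcome');
    -- Page 2 was deleted, but its search row was left behind.
    INSERT INTO searchindex VALUES (2, 'deleted page', 'old text');
""")

orphans = find_orphaned_index_rows(conn)
print(orphans)  # → [2]
```

On a real installation the same LEFT JOIN would have to be run against the wiki's own database, with any table prefix taken into account.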


Version: 1.22.3
Severity: normal

Event Timeline

bzimport raised the priority of this task to Medium. Nov 22 2014, 3:01 AM
bzimport set Reference to bz62276.
bzimport added a subscriber: Unknown Object (MLST).

rebuildtextindex.php works correctly when you truncate the searchindex table before you run rebuildtextindex.php. Adding this step to rebuildtextindex.php should solve the issue.
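Why truncating first matters can be illustrated with a small simulation. This is a sketch under stated assumptions, not the script's actual code: SQLite stands in for MySQL, the schema is reduced to two columns per table, and rebuild, indexed_pages and make_db are hypothetical helpers. The rebuild only touches rows for pages that still exist, so a row for a deleted page survives unless the table is emptied beforehand.

```python
import sqlite3


def make_db():
    """Create a toy database with one live page and one stale search row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT);
        CREATE TABLE searchindex (si_page INTEGER, si_title TEXT);
        INSERT INTO page VALUES (1, 'Main_Page');
        INSERT INTO searchindex VALUES (1, 'main page');
        INSERT INTO searchindex VALUES (2, 'deleted page');
    """)
    return conn


def rebuild(conn, clear_first):
    """Repopulate searchindex from page, optionally emptying it first."""
    if clear_first:
        conn.execute("DELETE FROM searchindex")
    pages = conn.execute("SELECT page_id, page_title FROM page").fetchall()
    for page_id, title in pages:
        # Replace the row for each existing page; other rows are untouched.
        conn.execute("DELETE FROM searchindex WHERE si_page = ?", (page_id,))
        conn.execute(
            "INSERT INTO searchindex (si_page, si_title) VALUES (?, ?)",
            (page_id, title),
        )


def indexed_pages(conn):
    """Return the page ids currently present in searchindex."""
    cursor = conn.execute("SELECT si_page FROM searchindex ORDER BY si_page")
    return [row[0] for row in cursor]


without_clear = make_db()
rebuild(without_clear, clear_first=False)
print(indexed_pages(without_clear))  # → [1, 2]  (stale row for page 2 remains)

with_clear = make_db()
rebuild(with_clear, clear_first=True)
print(indexed_pages(with_clear))  # → [1]
```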

Thanks for taking the time to report this!

If you are interested in providing a patch, see https://www.mediawiki.org/wiki/Developer_access

If you could do that, it would be amazing!

Your patch adds

$this->clearSearchIndex();

to the "mysql" condition in the execute() function of maintenance/rebuildtextindex.php.

This solves the issue for me!

gerritbot subscribed.

Change 186973 had a related patch set uploaded (by Addshore):
Run clearSearchIndex when mysql in rebuildTextIndex

https://gerrit.wikimedia.org/r/186973

Patch-For-Review

Change 186973 merged by jenkins-bot:
Run clearSearchIndex when mysql in rebuildTextIndex

https://gerrit.wikimedia.org/r/186973

hoo subscribed.

Thanks a bunch, guys!

Can this bugfix still be backported to the supported branches? 1.24 and especially the 1.23 LTS branch will still be around for some time.

The current LTS is 1.23. 1.24 is not an LTS release but will be supported until November 2015.

@MarkAHershberger may decide. It should apply cleanly to those branches.

Thanks for pointing this out; it is not much of an effort and it would solve the bug.

CC'ing @greg as this is a backport request for tarballs. (Wondering whether a dedicated "backport request" ticket against the MediaWiki-Tarball-Backports project would not be better, in theory. Whatever workflow works.)