
Exclude redirects from search results
Closed, Resolved (Public)

Description

Author: dangrey101

Description:
Some searches, e.g.
http://en.wikinews.org/wiki/Special:Search?search=london+underground&go=Go ,
return a large number of redirects. It would be helpful if they could be
filtered from search results so that they do not bury actual articles.

Perhaps redirects could be excluded by default, with an option on the search page saying
something like "include redirects in search?".


Version: unspecified
Severity: enhancement

Details

Reference
bz3174

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 8:45 PM
bzimport added a project: MediaWiki-Search.
bzimport set Reference to bz3174.
bzimport added a subscriber: Unknown Object (MLST).

erjohan wrote:

Found the same bug today when searching for venice. Perhaps results that have
redirects should climb in the result list for every redirect that matches the query?

wikimedia_bugzilla wrote:

This happens when searching for a number of items on Wikipedia; I've come across
the problem frequently over the last couple of weeks. (e.g. searching for
"graviton" brings up "Gravitons" [redirect to "Graviton"], then "Graviton".)

IMO, when a redirect matches, MediaWiki should follow it and display the
article it points to instead of the redirect itself (while noting that the
article was found through a redirect rather than by matching the search terms
directly). It should also check whether that article is already in the search
results, so that each article is displayed only once. In most cases the
article will already be listed, which makes the redirect note irrelevant, but
there will be times when it isn't, and then the note will be useful.
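
The proposal above could be sketched roughly as follows. This is only an illustration, assuming search hits arrive as a ranked list of titles and that a map from redirect title to target title is available. The names `collapse_redirects`, `hits`, and `redirects` are hypothetical and are not MediaWiki APIs, and only single-hop redirects are handled.

```python
from typing import Dict, List, Optional, Tuple


def collapse_redirects(
    hits: List[str], redirects: Dict[str, str]
) -> List[Tuple[str, Optional[str]]]:
    """Replace redirect hits with their targets and list each article once.

    Returns (article, via) pairs in ranked order. `via` is the redirect
    title through which the article was found, or None for a direct match.
    """
    order: List[str] = []
    via: Dict[str, Optional[str]] = {}
    for title in hits:
        target = redirects.get(title)  # None means `title` is a real article
        article = target if target is not None else title
        if article not in via:
            # First time this article appears: keep its rank position.
            order.append(article)
            via[article] = title if target is not None else None
        elif target is None:
            # The article also matched directly, so drop the redirect note.
            via[article] = None
    return [(article, via[article]) for article in order]
```

With the "graviton" example from this comment, `collapse_redirects(["Gravitons", "Graviton"], {"Gravitons": "Graviton"})` gives a single entry for "Graviton" with no redirect note, since the article also matched directly.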

spam02 wrote:

I know it is technically difficult to have the proper URL after redirection AND
keep the information, but there's a solution to this problem:

Add <meta name="robots" content="noindex,follow"> to redirected pages (those with
the wrong URL and content taken from another page).

This will exclude duplicates from Google and the like, and since there's a "follow"
directive, the link to the original page will be followed and the proper URL will be indexed.

ayg wrote:

Above comment was meant to refer to external search engine spiders, not internal
search (which this bug deals with). See bug 7042.

*** Bug 7812 has been marked as a duplicate of this bug. ***

ayg wrote:

*** Bug 8850 has been marked as a duplicate of this bug. ***

ayg wrote:

*** Bug 9320 has been marked as a duplicate of this bug. ***

rainman wrote:

Fixed in Lucene Search 2. Same-namespace redirects are excluded from search results, while their names are indexed alongside the article title they point to.
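
For illustration only, the idea of indexing redirect names alongside the target article might look something like the toy sketch below. It is not the Lucene Search 2 implementation. `build_index` and `search` are hypothetical names, namespaces are ignored, and matching is a plain word lookup on titles.

```python
from collections import defaultdict
from typing import Dict, List, Set


def build_index(articles: List[str], redirects: Dict[str, str]) -> Dict[str, Set[str]]:
    """Map each lowercase title word to the articles it should find."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for title in articles:
        for word in title.lower().split():
            index[word].add(title)
    for source, target in redirects.items():
        # A redirect is never a result itself; its words point at the target.
        for word in source.lower().split():
            index[word].add(target)
    return dict(index)


def search(index: Dict[str, Set[str]], query: str) -> List[str]:
    """Return articles matching every word of the query, sorted by title."""
    words = query.lower().split()
    if not words:
        return []
    matches = set.intersection(*(index.get(word, set()) for word in words))
    return sorted(matches)
```

Under these assumptions, an index built from the article "London Underground" and a redirect "London Tube" pointing to it returns only "London Underground" for the query "tube", and the redirect never appears as a separate result.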

There are still some issues with the redirect table not containing all redirects, so results are not always as expected, but this should be resolved by running the updateLinks.php maintenance script.

Also, I think the search results could be a bit more verbose, showing the redirects that point to the page when they satisfy the query, but that's an issue for the search term highlighter.