Searching for IP Addresses in page content unsuccessful
Closed, Resolved, Public

Description

Author: cameronem

Description:
We are using MediaWiki for technical documentation. As part of this we keep device info in a table, including the IP addresses configured on each device.

Unfortunately, searches do not pick up the IP addresses entered, even though they definitely exist on pages. If I enter the IP, the search finds no hits.

We are running MediaWiki 1.12.0, PHP 5.1.2, MySQL 5.0.26

Below is an excerpt of part of a table in use (IPs changed). E.g., if "10.10.10.1" is entered in the search box and Search is clicked, it will find no results.

{| style="width:75%; text-align:left" border="1"
! width="30%" bgcolor="#EEEEEE"| ''Configuration Item''
! bgcolor="#EEEEEE"| ''Setting''
|-
! Device Name
| {{PAGENAMEE}}
|-
| Management IP Address
| 10.10.10.1
|-
|}

Version: 1.12.x
Severity: normal
OS: Linux

Details

Reference
bz15027

Event Timeline

bzimport raised the priority of this task to Medium (Nov 21 2014, 10:18 PM).
bzimport added a project: MediaWiki-Search.
bzimport set Reference to bz15027.

The index field markup munging probably needs some love to make sure common constructs like this are handled cleanly.

Yes, but in the meantime this kind of use is actually the point of Semantic MediaWiki. Things like that with special importance should be tagged. The information gathered from it would be far more useful than that of a simple search.

That's nice, but irrelevant to the real world where people use straight text. :)

Taking the liberty of assigning this to Brion. He obviously knows what this is about.

tathagata.dasgupta wrote:

This bug put me in an awkward spot a few weeks back, after I had evangelised moving to MediaWiki for all our system maintenance documentation. As I was demonstrating the migrated documents to my senior, a search for the IP of a server or the name of a feed file returned no results, even though both were clearly in the page, and the older documentation method (abominable .doc files) handled such searches fine.
Nobody showed any interest in wikifying "everything" after this :(

cameronem wrote:

Also occurs on 1.13 since upgrading.

Is anybody aware of any method we can use to make the searches work? I have tried putting the IP in "quotes" etc., but it made no difference.

Otherwise, is there a realistic estimate of when it can/will be fixed? Sorry, I just need to give justification to the higher-up people...

Ok, the problem here appears to be that with the default MySQL search backend the search terms get broken up at the periods...

Thus "192.168.1.1" would become the boolean mode search: "+192 +168 +1 +1"
All of the terms are shorter than MySQL's default minimum indexed word length (ft_min_word_len, 4 characters), so the search fails to return any results.
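For illustration, the failure mode can be sketched in a few lines of Python. This is not MediaWiki's code: it assumes the term is split at every non-word character and that the backend ignores words shorter than 4 characters (MySQL's default `ft_min_word_len`). Both function names are hypothetical.

```python
import re

# Assumption: MySQL's default minimum full-text word length (ft_min_word_len).
MIN_WORD_LEN = 4


def to_boolean_query(term: str) -> str:
    """Split a search term at non-word characters and require every piece."""
    words = [w for w in re.split(r"\W+", term) if w]
    return " ".join("+" + w for w in words)


def searchable_words(term: str) -> list:
    """Return only the pieces long enough to be present in the index."""
    return [w for w in re.split(r"\W+", term) if len(w) >= MIN_WORD_LEN]


print(to_boolean_query("192.168.1.1"))  # +192 +168 +1 +1
print(searchable_words("192.168.1.1"))  # []
```

Every required word is missing from the index, so no page can satisfy the query.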

In current 1.14 development trunk, we now pad the short words in the index, so they can be found:
"+192U800 +168U800 +1U800 +1U800"

This is a slight improvement, in that the search will work and return matches. But it's still pretty crappy, since the loss of the periods loses you context; you can easily end up with false positives, and the result highlighting doesn't show you what you wanted either.
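A rough Python sketch of why padding alone still gives false positives. It assumes short words simply get the `U800` suffix seen in the query above and that a boolean-mode match only requires every word to occur somewhere on the page; the real padding rule in trunk may differ, and the function names are made up for this example.

```python
import re

MIN_WORD_LEN = 4  # assumed MySQL default (ft_min_word_len)
PAD = "U800"      # suffix taken from the padded query quoted above


def pad_short_words(text: str) -> list:
    """Split on non-word characters and pad words too short to be indexed."""
    words = [w for w in re.split(r"\W+", text) if w]
    return [w + PAD if len(w) < MIN_WORD_LEN else w for w in words]


def matches(query: str, page_text: str) -> bool:
    """Boolean-mode style match: every query word must occur on the page."""
    indexed = set(pad_short_words(page_text))
    return all(w in indexed for w in pad_short_words(query))


print(pad_short_words("192.168.1.1"))
# ['192U800', '168U800', '1U800', '1U800']
print(matches("192.168.1.1", "Gateway 1.168.192.7"))  # True (false positive)
```

Because the periods are gone, word order and adjacency are lost, so a page containing the same numbers in a different address still matches.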

If those periods don't get stripped out of the input, it might work a little better, but the index may need adjustment as well...

Ok, fixed on 1.14 dev trunk in r44791.

Now allowing periods through to the indexed text, and encoding periods that appear within a compound word so they get caught more cleanly.
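Roughly, the idea looks like the Python sketch below. It is only an illustration, not the code from r44791: the placeholder `U82e` and both function names are assumptions chosen for the example.

```python
import re

# Hypothetical placeholder standing in for a period inside a compound word.
ENCODED_DOT = "U82e"


def encode_inner_periods(text: str) -> str:
    """Replace periods that sit between two word characters with a placeholder."""
    return re.sub(r"(?<=\w)\.(?=\w)", ENCODED_DOT, text)


def index_words(text: str) -> list:
    """Encode inner periods, then split on the remaining non-word characters."""
    return [w for w in re.split(r"\W+", encode_inner_periods(text)) if w]


print(encode_inner_periods("192.168.1.1"))  # 192U82e168U82e1U82e1
print(index_words("Ping 10.10.10.1."))      # ['Ping', '10U82e10U82e10U82e1']
```

The whole address becomes one long index word, so it is no longer too short and keeps its internal order, while a sentence-ending period is still treated as punctuation.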

Also made a tweak so highlighting works a bit better on word boundaries -- eg "192.168.1.1" no longer hits a highlight match for "192.168.1.100". However it's still not 100% handling some cases with the periods. Sigh.
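To illustrate the word-boundary idea (not the actual highlighter code), here is a small Python sketch. The function name and the exact lookaround rules are assumptions: a hit is rejected when the term is embedded in a longer word or a longer dotted number.

```python
import re


def highlight_pattern(term: str) -> re.Pattern:
    """Match the term only when it is not part of a longer word or dotted number."""
    return re.compile(
        r"(?<!\w)(?<!\w\.)"   # not preceded by a word char, or by a word char + period
        + re.escape(term)
        + r"(?!\.?\w)"        # not followed by a word char, or by a period + word char
    )


pattern = highlight_pattern("192.168.1.1")
print(bool(pattern.search("host 192.168.1.1 is up")))    # True
print(bool(pattern.search("host 192.168.1.100 is up")))  # False
print(bool(pattern.search("Use 192.168.1.1.")))          # True
```

A trailing sentence period is still allowed, which is one of the awkward cases with periods: the pattern has to tell punctuation apart from a separator inside a longer address.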