
aborted search requests - can't add item to statement
Closed, ResolvedPublic

Description

It isn't possible to add a statement with some items as values. Discussion excerpt from http://www.wikidata.org/wiki/Wikidata:Contact_the_development_team#Slow_ajax.2Fsearch_code.3F :


From my Firebug:
GET http://www.wikidata.org/w/api.php?callback=jQuer...ities&format=json&language=nb&type=item&search=W 200 OK 2.35s
GET http://www.wikidata.org/w/api.php?callback=jQuer...ies&format=json&language=nb&type=item&search=Wik Aborted
GET http://www.wikidata.org/w/api.php?callback=jQuer...es&format=json&language=nb&type=item&search=Wiki Aborted
GET http://www.wikidata.org/w/api.php?callback=jQuer...s&format=json&language=nb&type=item&search=Wikip Aborted
GET http://www.wikidata.org/w/api.php?callback=jQuer...&format=json&language=nb&type=item&search=Wikipe Aborted
GET http://www.wikidata.org/w/api.php?callback=jQuer...language=nb&type=item&search=Wikipedia-pekerside Aborted

The code seems to have no qualms about launching two (or more?) GET requests at the same time, in other words searching for "Wiki" and "Wikip" simultaneously. This can't be good for performance. And most of the requests seem to time out; I suppose this is because the server is overloaded because the server-side code needs some optimizing? In any case, right now it's impossible for me to add "is a" + "Wikipedia disambiguation" to an item, because all the GET requests (beyond searching for the single letter "W") time out. I had the same problem yesterday, so it seems like a chronic problem. - Soulkeeper (talk) 10:52, 7 February 2013 (UTC)

To reproduce, go to Q1176386. Select interface "Norsk (bokmål)". Click "Legg til" (add) under "Utsagn" (Statements). Type "er" (is a). When the other (value?) field shows up, try to save it as "Wikipedia-pekerside". It's next to impossible, because the ajax calls tend to be aborted before they return anything. - Soulkeeper (talk) 11:09, 7 February 2013 (UTC)

I can reproduce this also with language set to English.


Version: master
Severity: major
Whiteboard: u=dev c=backend p=0

Details

Reference
bz44746

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:39 AM
bzimport set Reference to bz44746.
bzimport added a subscriber: Unknown Object (MLST).

I can confirm that this bug still exists. A timeout was implemented in the UI a while ago so that a request is not triggered instantly for every typed character. However, when typing slowly enough, requests tend to time out (one problem), and the UI evaluates older requests that are answered after newer ones, messing up the list of suggestions (second problem).

Setting this to normal as it is no longer as bad with the timeout.
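To illustrate the second problem, here is a minimal sketch of how a suggester can debounce keystrokes and discard stale responses. This is not the actual Wikibase widget code; the function names, the 300 ms delay, and the hard-coded request parameters are assumptions made up for the example, which only assumes jQuery and the wbsearchentities module.

```
// Sketch only, not the actual Wikibase widget code: debounce keystrokes and
// drop out-of-date responses so an older answer can no longer overwrite a newer one.
var latestRequestId = 0;
var debounceTimer = null;

function fetchSuggestions( term ) {
	var requestId = ++latestRequestId;

	$.getJSON( '/w/api.php', {
		action: 'wbsearchentities',
		format: 'json',
		language: 'nb',
		type: 'item',
		search: term
	} ).done( function ( data ) {
		if ( requestId !== latestRequestId ) {
			// A newer request was started in the meantime; ignore this answer.
			return;
		}
		renderSuggestions( data.search );
	} );
}

function renderSuggestions( results ) {
	// Placeholder for the real suggestion list rendering.
	console.log( results );
}

function onKeyUp( term ) {
	// Debounce: only query once the user has paused typing, so intermediate
	// terms ("Wik", "Wiki", "Wikip", ...) never reach the API at all.
	clearTimeout( debounceTimer );
	debounceTimer = setTimeout( function () {
		fetchSuggestions( term );
	}, 300 );
}
```

An alternative to the request counter is to keep the jqXHR of the previous request and call .abort() on it before starting the next one.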

Change 114165 had a related patch set uploaded by Thiemo Mättig (WMDE):
SearchEntities API call refactored for performance

https://gerrit.wikimedia.org/r/114165

Change 114165 merged by Addshore:
SearchEntities API call refactored for performance

https://gerrit.wikimedia.org/r/114165

I can reproduce this. It's as if consecutive API calls (e.g. https://www.wikidata.org/w/api.php?callback=...&action=wbsearchentities&format=json&language=de&type=item&continue=0&search=wik&_=...) randomly fail with a timeout. Having too many requests should not be the issue, since this also happens if I wait a full second between each key press.

I wonder why the API calls in my test (in Firefox) are aborted after only 150 ms. Where is this short timeout set?

When looking at SearchEntities I was wondering why it queries so much and immediately throws away most of it. That's what my change above is about. I hope it helps, even if it does not fix the issue.

First of all, this probably shouldn't use numerical offsets (if that is possible... I'm not sure it is, due to the way it works). Also, we should avoid implementing our offsets by loading a lot of data first and then discarding most of it in PHP, which can get quite slow (again, not sure this is avoidable).

If the above isn't possible, there are still some ways to improve performance. Also, I think we should limit the number of entities that can be returned for a single term (limit + offset) to a value that is reasonable for UI use (100?).
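For illustration, this is roughly what a capped request from the UI could look like; the limit of 20 is an arbitrary value picked for the example, not what the widget actually sends.

```
// Illustration only: request a small, UI-sized page of results instead of
// relying on the backend to fetch thousands of matches and discard most of them.
$.getJSON( '/w/api.php', {
	action: 'wbsearchentities',
	format: 'json',
	language: 'en',
	type: 'item',
	search: 'Wikipedia disambiguation',
	limit: 20 // cap the result set to what the suggestion list can actually show
} ).done( function ( data ) {
	console.log( data.search );
} );
```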

(In reply to Marius Hoch from comment #6)

this probably shouldn't use numerical offsets

There is already a hard limit of 5000 in TermSqlIndex::getMatchingIDs(), which is fine in my opinion. Decreasing this to something like 1000 would not help much.

Also, please note that none of the use cases we are talking about here uses the "continue" parameter. So the limit and continuation really are not the problem.

Change 145364 had a related patch set uploaded by Hoo man:
Math is sooo confusing ;-)

https://gerrit.wikimedia.org/r/145364

Change 145365 had a related patch set uploaded by Hoo man:
Math is sooo confusing ;-)

https://gerrit.wikimedia.org/r/145365

Change 145364 merged by jenkins-bot:
Math is sooo confusing ;-)

https://gerrit.wikimedia.org/r/145364

Change 145365 merged by jenkins-bot:
Math is sooo confusing ;-)

https://gerrit.wikimedia.org/r/145365