
Inconsistent machine translation with yandex
Closed, Invalid

Description

Author: gmaruzz

Description:
I'm using Yandex with the latest released MediaWiki and the latest released MLEB.

When I open a page for translation, only some of the elements get the translation aid (e.g. "loading", followed by the text from Yandex).

Many of them do not get the "loading" at all.

Seems that the http request to Yandex is not made for all elements.

Also, it seems to be random which ones get the HTTP request (i.e. show "loading").


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=50861

Details

Reference
bz55633

Event Timeline

bzimport raised the priority of this task to Lowest. Nov 22 2014, 2:28 AM
bzimport set Reference to bz55633.
bzimport added a subscriber: Unknown Object (MLST).

(In reply to comment #0)

Seems that the http request to Yandex is not made for all elements.

How did you determine this?
We've had problems with Yandex lately, Niklas said there were timeouts on their end: bug 50861.

gmaruzz wrote:

Because when the request to Yandex is made, a "loading" indicator appears in the box.

It is not a timeout problem, nor a flooding problem. I've been able to flood Yandex with requests, and it works.

There are no DNS problems either.

E.g. with a script making more or less the same request to Yandex, I have no problems.

It seems more like a problem in the GUI not requesting the aid (e.g. something in the Ajax code, or whatever).

I was not able to trace it, nor to debug or profile it...

How can I trace how the Yandex requests work?

(In reply to comment #2)

How can I trace how the Yandex requests work?

Niklas a while ago added some [[mw:profiling]] (read for instructions): https://git.wikimedia.org/blobdiff/mediawiki%2Fextensions%2FTranslate/42268a4bba7a30b22d316748957c62aa7bd71daf/utils%2FTranslationHelpers.php

gmaruzz wrote:

Using that instrumentation I was able to profile the requests, and they're all under one second.

Unsurprisingly, like when you go to the dentist, with profiling enabled it worked for most elements; only very few lacked the Yandex aid.

After disabling both debugging and profiling, it still works for the vast majority of elements (when I reported the bug, it was working only for a small minority of elements).

So, I would ask Niklas to have a look into it, because a minority of elements still do not get their aids.

But now it is markedly usable for me. Let's hope it will not go back to only a minority of elements being served by aids...

Thanks Nemo

gmaruzz wrote:

eg:

TranslateWebServiceRequest-Yandex 1 378.126 378.126 73.972% 31480 ( 378.126 - 378.126) [0]
TranslateWebServiceRequest-Yandex 1 850.506 850.506 92.564% 32400 ( 850.506 - 850.506) [0]
TranslateWebServiceRequest-Yandex 1 405.096 405.096 28.516% 31744 ( 405.096 - 405.096) [0]
TranslateWebServiceRequest-Yandex 1 478.253 478.253 83.835% 31720 ( 478.253 - 478.253) [0]
TranslateWebServiceRequest-Yandex 1 421.888 421.888 38.762% 31944 ( 421.888 - 421.888) [0]
TranslateWebServiceRequest-Yandex 1 404.257 404.257 86.350% 31728 ( 404.257 - 404.257) [0]
TranslateWebServiceRequest-Yandex 1 400.511 400.511 42.294% 31504 ( 400.511 - 400.511) [0]
TranslateWebServiceRequest-Yandex 1 541.835 541.835 88.234% 31600 ( 541.835 - 541.835) [0]
TranslateWebServiceRequest-Yandex 1 543.182 543.182 44.004% 31840 ( 543.182 - 543.182) [0]
TranslateWebServiceRequest-Yandex 1 417.529 417.529 87.692% 31592 ( 417.529 - 417.529) [0]
TranslateWebServiceRequest-Yandex 1 524.851 524.851 48.962% 31896 ( 524.851 - 524.851) [0]
TranslateWebServiceRequest-Yandex 1 412.205 412.205 85.592% 31632 ( 412.205 - 412.205) [0]
TranslateWebServiceRequest-Yandex 1 380.788 380.788 40.686% 31520 ( 380.788 - 380.788) [0]
TranslateWebServiceRequest-Yandex 1 672.832 672.832 91.783% 31976 ( 672.832 - 672.832) [0]
TranslateWebServiceRequest-Yandex 1 586.617 586.617 41.860% 31824 ( 586.617 - 586.617) [0]
TranslateWebServiceRequest-Yandex 1 525.063 525.063 89.640% 31904 ( 525.063 - 525.063) [0]
TranslateWebServiceRequest-Yandex 1 583.270 583.270 47.132% 31824 ( 583.270 - 583.270) [0]
TranslateWebServiceRequest-Yandex 1 1030.991 1030.991 91.273% 32672 ( 1030.991 - 1030.991) [0]
TranslateWebServiceRequest-Yandex 1 898.897 898.897 92.879% 32456 ( 898.897 - 898.897) [0]
TranslateWebServiceRequest-Yandex 1 560.820 560.820 35.811% 31864 ( 560.820 - 560.820) [0]
TranslateWebServiceRequest-Yandex 1 461.466 461.466 88.955% 31768 ( 461.466 - 461.466) [0]
TranslateWebServiceRequest-Yandex 1 512.157 512.157 46.494% 31912 ( 512.157 - 512.157) [0]
TranslateWebServiceRequest-Yandex 1 644.665 644.665 91.503% 31800 ( 644.665 - 644.665) [0]
TranslateWebServiceRequest-Yandex 1 408.716 408.716 34.412% 31736 ( 408.716 - 408.716) [0]
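
Profiler lines like the ones above can be summarized with a short script rather than read by eye. The following is a minimal Python sketch, not part of MediaWiki or the Translate extension. It assumes the columns are: name, calls, total ms, ms per call, percent of request, memory, (min - max), [overhead]. The `summarize` helper and its `threshold_ms` parameter are hypothetical names introduced for illustration.

```python
import re

# Assumed column layout (not confirmed in this thread):
# name, calls, total ms, ms per call, % of request, memory, (min - max), [overhead]
LINE_RE = re.compile(
    r"^(?P<name>\S+)\s+(?P<calls>\d+)\s+(?P<total>[\d.]+)\s+(?P<each>[\d.]+)\s+"
    r"(?P<percent>[\d.]+)%\s+(?P<mem>\d+)\s+"
    r"\(\s*(?P<min>[\d.]+)\s*-\s*(?P<max>[\d.]+)\)\s+\[(?P<overhead>\d+)\]$"
)


def summarize(lines, threshold_ms=1000.0):
    """Return count, mean, max and the number of calls slower than threshold_ms."""
    durations = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:  # silently skip lines that do not look like profiler rows
            durations.append(float(match.group("each")))
    if not durations:
        return {"count": 0, "mean_ms": 0.0, "max_ms": 0.0, "slow": 0}
    return {
        "count": len(durations),
        "mean_ms": sum(durations) / len(durations),
        "max_ms": max(durations),
        "slow": sum(1 for d in durations if d > threshold_ms),
    }


sample = [
    "TranslateWebServiceRequest-Yandex 1 378.126 378.126 73.972% 31480 ( 378.126 - 378.126) [0]",
    "TranslateWebServiceRequest-Yandex 1 1030.991 1030.991 91.273% 32672 ( 1030.991 - 1030.991) [0]",
]
print(summarize(sample))
```

Under that column assumption, the slowest entry in the pasted output is 1030.991 ms, marginally over one second; all others are below it.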

lockalsash wrote:

It would be nice to see the messages which are under one second. Are they too long, do they contain untranslatable wiki/HTML markup, or are they just normal text? What are the language pairs for these timeouts? Maybe it is a bug in the Yandex.Translate algorithm, in which case it should be reported via http://feedback2.yandex.com/translate/other/ . If there are problems only with non-Slavic languages, then translatewiki could disable the Yandex translation helper for them.

Also, don't forget that the Yandex.Translate service IP addresses are located in Russia. For this reason, the Yandex translator works nicely for my website https://translateblender.ru , while Microsoft's translation helper almost always stays in an auto-disabled state (which was one of the reasons to create YandexTranslateHelper).

One good long-term solution is to rewrite the TranslationHelper system to use asynchronous requests. This means running all the curl requests in parallel, so the total time would be max(t(yandex), t(microsoft), etc) instead of t(yandex)+t(microsoft)+etc.
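
The max-versus-sum point can be illustrated with a small Python sketch. This is not the Translate extension's code (which is PHP and uses curl). Here `fake_service` is a hypothetical stand-in that sleeps instead of making an HTTP request, and the delays are made-up values.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_service(name, delay_s):
    """Hypothetical stand-in for a remote translation call; sleeps instead of doing HTTP."""
    time.sleep(delay_s)
    return f"{name}: suggestion"


def fetch_sequential(services):
    # One request after another: total time is roughly the sum of the delays.
    return [fake_service(name, delay) for name, delay in services]


def fetch_parallel(services):
    # All requests start at once: total time is roughly the slowest delay.
    with ThreadPoolExecutor(max_workers=len(services)) as pool:
        futures = [pool.submit(fake_service, name, delay) for name, delay in services]
        return [future.result() for future in futures]


def timed(fn, services):
    """Run fn(services) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = fn(services)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    services = [("yandex", 0.3), ("microsoft", 0.2)]  # made-up delays in seconds
    _, t_seq = timed(fetch_sequential, services)
    _, t_par = timed(fetch_parallel, services)
    print(f"sequential: {t_seq:.2f}s, parallel: {t_par:.2f}s")
```

With these delays the sequential run should take about 0.5 s and the parallel run about 0.3 s, matching t(yandex)+t(microsoft) versus max(t(yandex), t(microsoft)).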

This is not about translatewiki.net as far as I understand.

The web services get automatically suspended if there are too many failures in given time period. After a cooldown they will be tried again and resume if the service works, or kept suspended if it still fails.
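
The suspend-and-retry behaviour described above resembles a circuit breaker. Here is a minimal Python sketch of that pattern, not the Translate extension's actual implementation. The class name `ServiceSuspender`, its parameters, and the default thresholds are all hypothetical.

```python
import time


class ServiceSuspender:
    """Suspend a service after too many failures in a time window; retry after a cooldown."""

    def __init__(self, max_failures=3, window_s=60.0, cooldown_s=300.0, clock=time.monotonic):
        self.max_failures = max_failures  # hypothetical threshold
        self.window_s = window_s          # period in which failures are counted
        self.cooldown_s = cooldown_s      # how long to wait before trying again
        self._clock = clock               # injectable so the logic can be tested
        self._failures = []               # timestamps of recent failures
        self._suspended_at = None

    def is_available(self):
        if self._suspended_at is None:
            return True
        # Once the cooldown has passed, allow a trial request.
        return self._clock() - self._suspended_at >= self.cooldown_s

    def report_success(self):
        # The service works again: resume normal operation.
        self._failures.clear()
        self._suspended_at = None

    def report_failure(self):
        now = self._clock()
        if self._suspended_at is not None:
            # The trial request failed: stay suspended and restart the cooldown.
            self._suspended_at = now
            return
        # Keep only the failures that fall inside the counting window.
        self._failures = [t for t in self._failures if now - t <= self.window_s]
        self._failures.append(now)
        if len(self._failures) >= self.max_failures:
            self._suspended_at = now
```

A caller would check `is_available()` before each request and report the outcome afterwards, so a service that keeps timing out stays suspended between cooldown periods.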

In the past days I've seen a lot of failures for Yandex at translatewiki.net, and it has been suspended for long periods of time because of "http-timed-out".

I suggest you enable the debug logging for translation services:
$wgDebugLogGroups['translationservices'] = 'file in the filesystem';

Nikerabbit claimed this task.

There have been changes to Yandex support, so I believe this issue is now outdated. If this is not the case, please reopen with the details requested in the above comment.