
The source language of a page should be arbitrary
Closed, ResolvedPublic

Assigned To
Authored By
bzimport
Mar 26 2012, 4:09 PM

Description

Author: bdamokos

Description:
Currently the Translate extension assumes that the source language of the translation is the same as the content language of the wiki.
This might be the case in a default wiki, but it makes the extension less useful on wikis with multilingual content (e.g. [[meta:]]; where the default is English), where content might be created in any language and the translation tool used to make it available in the other languages of the wiki.

Todos as of summer 2014: https://www.mediawiki.org/wiki/User:SPQRobin/Page_language


Version: unspecified
Severity: enhancement

Details

Reference
bz35489

Event Timeline

bzimport raised the priority of this task from to Lowest. Nov 22 2014, 12:16 AM
bzimport set Reference to bz35489.

If core had a way to select the content language on a per-page basis, the Translate extension could use that information. I think there is a bug open for that.

Confirmed. The source language of translatable pages has to be the content
language. We're not giving priority to changing this in the short term.

Feature requirements would become very different, as we cannot expect everyone
to read, just as an example of course, Hungarian. If the source units are in
Hungarian, it should for example be possible for translators to choose a
different source language than the original source language (i.e. "en" instead
of "hu") if there are (reviewed? and) translated units in that language.

All in all, there are some very specific hurdles to overcome here, and in the
Wikimedia localisation team, we have prioritised others.

Advice: Always use the content language as the source language for translatable
pages. For now, that's English on Meta wiki, MediaWiki wiki and Wikimania 2012
wiki.

If core introduces the support to set the page content language, I see no reason not to use that information and imho we can do that without providing all the other features you mentioned.

bdamokos wrote:

(In reply to comment #2)

Feature requirements would become very different, as we cannot expect everyone
to read, just as an example of course, Hungarian. If the source units are in
Hungarian, it should for example be possible for translators to choose a
different source language than the original source language (i.e. "en" instead
of "hu") if there are (reviewed? and) translated units in that language.

Isn't this feature already available in some form on Translatewiki, where it is possible to choose a "helper language" in settings?

The translation tool should be available for translators who speak both a language other than English and English and could benefit from the quite usable interface provided by the tool (and the standardized language bar and possibly translation memory) even in cases where there would not be a multitude of target languages.

Advice: Always use the content language as the source language for translatable
pages. For now, that's English on Meta wiki, MediaWiki wiki and Wikimania 2012
wiki.

This is helpful just in cases when the page to be translated is originally created in English.

I appreciate the input, but I reserve the right to decide on priorities.

Any hope for that one?

I certainly can’t measure the hurdles Siebrand mentions, but I would like to second Bence: using as a start page a page in another language than English (French in my case) sounds like a very common use case (I am thinking of Meta here).

Is there no other procedure than translating “manually” from French to English, marking the English version for translation, and pushing the original French chunks into the French translation? It kind of defeats the whole purpose…

Chapters are a very notable user of the Translate extension, but last time I checked I had the impression that the pages translated from a non-English source are a fraction of the thousands of affected pages on Meta. [[m:Meta:Translate_extension#Migration]]

I desire this enhancement a lot but in all frankness I can't confidently say that the current priority is inaccurate...

bdamokos wrote:

I think many non-English to English translations of documents that end up on Meta take place in other venues, partly because this feature is missing. Creating it could increase such activity and reduce Anglo-centrism, as well as make those non-English to English translations easier to create (see the workflow described by Jean-Fred...).

Confirming that this would be useful on Meta - another example: blog posts for the Wikimedia blog. I would just have used the Translate extension to enable the translation of a blog post draft from Portuguese into English if it hadn't been for this bug.

I understand this is not a priority, but what are the blocking issues here?

It’s kind of frustrating to see the Translate extension become more and more awesome, and not being able to use it for translations :)

(In reply to comment #10)

I understand this is not a priority, but what are the blocking issues here?

Setting the content language of the page should be done in MediaWiki core. However, that is non-trivial to implement. See the blocker bug.

It is trivial to change the Translate extension to use the page content language, which already exists as a concept in MediaWiki. Currently the page content language can only be changed via hooks or in some hardcoded cases.

mchidester wrote:

I was hoping to use this extension on an educational wiki to assist in the translation of foreign-language documents to English, and it wasn't until I had sorted out how to make it interact properly with Extension:Proofread Page (the transcription engine developed for Wikisource) that I realized there was no way to set the language of the document. It seems like this isn't going to happen any time soon, but I still want to express how useful this feature would be to my site if it's ever implemented.

mroth wrote:

I would echo the comments on the usefulness of having translations from other languages to English. Right now, we're getting a lot of multilingual content on the Wikimedia Blog:
http://blog.wikimedia.org/tag/multilingual-post/

But almost none of these translations are happening with the translate extension. The post authors are writing in their native language and then giving us their best English translation (which is sometimes clearly just Google Translate) on the blog page on Meta.
https://meta.wikimedia.org/wiki/Wikimedia_Blog/Drafts

I then spend a not insignificant amount of time re-writing the roughly translated English into more natural English. It would be amazing if we can get translations the "other way" with the awesome tool you've built.

maxime.boissonneault wrote:

The source language can already be changed with MediaWiki. In 1.20.4, you can add something like this in LocalSettings.php :

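$wgHooks['PageContentLanguage'][] = 'onPageContentLanguage';

// Pick the page language from the page's categories.
function onPageContentLanguage( $title, &$pageLang ) {
	// Keys are category page names, e.g. "Category:English_documentation".
	$categories = $title->getParentCategories();
	// Default to French unless the page is categorised as English documentation.
	$pageLang = 'fr';
	foreach ( $categories as $category => $value ) {
		if ( strpos( $category, 'English_documentation' ) !== false ) {
			$pageLang = 'en';
			break;
		}
	}
	return true;
}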

In MediaWiki 1.21, from what I understand, this hook has been moved to the ContentHandler class, but I assume the same kind of trick should work.

Then, if you add [[Category:English documentation]] on a page, the source language will be considered as English, and it will be considered as French otherwise.

The only drawback that I've seen so far is that Translate does not use this information quite correctly for the <languages /> tag. While it shows the page as English first, with a possible translation in French, the language box does not show English unless you tell the extension that you want to translate into both English (which is nonsense) and French. If this could be fixed so that the PageContentLanguage always shows in the language box, this would work flawlessly.

I assume someone could easily come up with a tag to add to the page instead of using categories.

maxime.boissonneault wrote:

Nikerabbit in chat solved the issue with the <languages /> tag.

In messagegroups/WikiPageMessageGroup.php, simply change the getSourceLanguage() function so that it returns

$this->getTitle()->getPageLanguage()->getCode();

instead of $wgLanguageCode.
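
To make the before and after of that one-line change concrete, here is a minimal sketch that runs outside MediaWiki. The Language, Title and WikiPageMessageGroup classes below are small stand-ins written for this illustration, not the real classes, and getSourceLanguageOld() is a hypothetical name used only to keep the old behaviour visible next to the new one.

```php
<?php
// Stand-in classes (assumptions for illustration only); the real MediaWiki
// Language and Title classes and Translate's WikiPageMessageGroup are much larger.
class Language {
	private $code;
	public function __construct( $code ) { $this->code = $code; }
	public function getCode() { return $this->code; }
}

class Title {
	private $pageLanguage;
	public function __construct( Language $pageLanguage ) { $this->pageLanguage = $pageLanguage; }
	public function getPageLanguage() { return $this->pageLanguage; }
}

// Wiki-wide content language, as it would be set in LocalSettings.php.
$wgLanguageCode = 'en';

class WikiPageMessageGroup {
	private $title;
	public function __construct( Title $title ) { $this->title = $title; }
	public function getTitle() { return $this->title; }

	// Old behaviour (hypothetical method name): always the wiki content language.
	public function getSourceLanguageOld() {
		global $wgLanguageCode;
		return $wgLanguageCode;
	}

	// New behaviour: the language of the translatable page itself.
	public function getSourceLanguage() {
		return $this->getTitle()->getPageLanguage()->getCode();
	}
}

// A page whose content language is French on a wiki whose content language is English.
$group = new WikiPageMessageGroup( new Title( new Language( 'fr' ) ) );
echo $group->getSourceLanguageOld(), "\n";
echo $group->getSourceLanguage(), "\n";
```

With the stand-ins above, the old method reports the wiki language and the new one reports the page language, which is the whole effect of the change.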

I hope this will get incorporated into Translate soon, and that somebody comes up with a better way than categories to change the source language.

This will however do the trick for our wiki for now.

maxime.boissonneault wrote:

I submitted a patch for this change. Commit number is Ifd838e962bb532c246d41b2322b2b5861cc46102

Maxime: If you add "Bug: 35489" as the last line of the commit message, there will be an automatic notification in Bugzilla - see http://www.mediawiki.org/wiki/Gerrit/Commit_message_guidelines

maxime.boissonneault wrote:

Thanks, I will do that next time. Should I do another commit to add this?

Wonderful to see this getting resolved. Thank you, Maxime.

It would be great to see this getting resolved, or at least have its importance raised from "lowest". At the WMF staff All Hands meeting last month, a discussion took place about ways in which the Foundation succeeds or fails to be a truly international organization; I brought this bug up as an example of the latter (see e.g. https://commons.wikimedia.org/wiki/File:Wikimedia_Foundation_2013_All_Hands_Offsite_-_Day_1_-_Photo_41.jpg - right flip chart, third item from below).

As said before, this is not currently a priority for the language team, and I don't have time to work on this personally.

And, Translate extension basically supports this feature already. I'm hoping someone else will implement the missing piece, page content language selection in MediaWiki core. That someone is not watching this bug, I'm afraid.

I would find this fix enormously useful. All of our shared spaces that are intended to be multilingual have "English" as the default language. That is a convention, and is not meant to force translations to happen to or from English.

So yes: we need per-page language selection. Is there a separate bug for this?
Niklas: who is working on that part of MW core? I take it you don't recommend Maxime's patch, and instead recommend pushing for the update to core?

Having our own Translate extension and being able to organize translation on the wiki is really wonderful. This is the only major drawback of the current process, but it is a significant one: it keeps many people from using Meta for translations at all (where they might once have done so). Tilman gave one example; people also face this in grantmaking, where the original source for reports is rarely English:
https://meta.wikimedia.org/wiki/Grants:IEG/Learning/Round_1_2013/Impact#Tools_for_execution

(In reply to SJ from comment #22)

So yes: we need per-page language selection. Is there a separate bug for
this?

Two even, see blockers.

Niklas: who is working on that part of MW core?

Nobody, currently (as with most areas of core).

Nemo - Thank you. Cause for a core corps?

Kunal - That's wonderful, and I see you have excellent mentors lined up.
Good luck with this project.

kc.kclau wrote:

I am looking into using MediaWiki for a new wiki site to encourage people to submit articles. They may prefer to submit in their native languages and have them translated to English and other major languages if possible. So a fix to this bug is most desirable.

I am using MediaWiki 1.22.5 with the patch already in place, see Comments 15 and 16 above:
In messagegroups/WikiPageMessageGroup.php, simply change the getSourceLanguage() function so that it returns

$this->getTitle()->getPageLanguage()->getCode();

instead of $wgLanguageCode.

But I cannot get it working. I tried with the following change to use User interface language instead:
// return $this->getTitle()->getPageLanguage()->getCode();
return RequestContext::getMain()->getLanguage()->getCode();

as I expect that it would come from the Language setting in the User Preference and can be modified ad-hoc by appending ?uselang=xyz or &uselang=xyz to the current URL. It did not work either.

As I am new to MediaWiki (and PHP), maybe I have got it wrong, and I would appreciate it if someone could point out my mistake.

Anyway, it is great to hear that a fix is upcoming and I look forward to trying it.

Change 135312 had a related patch set uploaded by Nemo bis:
First version of Page Language selector

https://gerrit.wikimedia.org/r/135312

Change 135312 had a related patch set uploaded by Anomie:
First version of Page Language selector

https://gerrit.wikimedia.org/r/135312

Change 136623 had a related patch set uploaded by Kunalgrover05:
Front end for page language selector(DONT MERGE)

https://gerrit.wikimedia.org/r/136623

Change 135312 merged by jenkins-bot:
First version of Page Language selector

https://gerrit.wikimedia.org/r/135312

Change 136623 abandoned by Kunalgrover05:
Front end for page language selector(DONT MERGE)

Reason:
Already merged

https://gerrit.wikimedia.org/r/136623

Hi all, I work as community liaison for Learning and Evaluation, and as such, I wanted to add a dimension to this task that may be underestimated. We work with program leaders from all over the world, and try to make it as easy as possible for them to share what they know about their programs. This is why I encourage submitting content in their original language. This also happens when somebody wants to submit a grant proposal. Further, grant reports are sometimes written both in English and in the grantee's original language. Right now this happens in a very web 1.0 style, with the same text posted twice. I figure that if we could give Meta users the choice of language when they create a page, all content could actually use the translation tool as it was intended. Finally, having tools in their original language is key for program leaders to work with partners in their local context. It would make Meta truly global :-)

+1 @mcruzWMF - This is an important issue for meta users who do not speak English as fluently and would prefer to express themselves in their native language. Any updates about the process or blockers would be really helpful - Thanks!

a dimension to this task that may be underestimated.

Thanks for your comment. I think every single subscriber of this task is painfully aware of the importance of this bug for the reasons you stated, which are worth repeating.

The most proximate blocker is T69223, assigned to Sean Pringle. I suggest that you direct prayers in that direction. :) That probably takes a few minutes on most chapter wikis, while it's going to be more complex on Meta.

Nemo_bis set Security to None.
Nemo_bis added a project: Epic.
Nikerabbit raised the priority of this task from Lowest to Low. Jul 22 2015, 4:28 PM
Nikerabbit raised the priority of this task from Low to Medium. Jul 22 2015, 5:05 PM

Is there something that "normal admins" can do? I offer myself.

Is there something that "normal admins" can do?

Not on Wikimedia wikis. Are you an admin on a non-Wikimedia wiki?

Yes, I am. I'm an admin on some non-Wikimedia wikis with Extension:Translate. I intend to make translation easier from some languages to others without depending on a centralized language. I'm a sysadmin with a little knowledge of coding and part of a technology community that uses MediaWiki. I just want to help in some way because a lot of projects and communities will benefit :)

al, great. Then you can help by enabling the feature on your wiki (it should be enough to give the pagelang permission to your group), using it a bit and commenting on https://www.mediawiki.org/wiki/User:SPQRobin/Page_language about what works and what does not.
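
For admins trying this on their own wiki, the setup might look roughly like the LocalSettings.php fragment below. It is a sketch, not a verified recipe: it assumes MediaWiki 1.24 or later, where the page language selector is switched on with $wgPageLanguageUseDB, and the choice of the sysop group is only an example.

```php
// LocalSettings.php fragment (assumes MediaWiki 1.24 or later).

// Assumption: per-page language stored in the database must be enabled
// before the pagelang right has any effect.
$wgPageLanguageUseDB = true;

// Give the pagelang permission to a group; 'sysop' is just an example group.
$wgGroupPermissions['sysop']['pagelang'] = true;
```

Users in that group should then be able to change a page's language before it is marked for translation, so that Translate can pick it up as the source language.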