pywikibot transliteration should support chinese transliteration
Closed, ResolvedPublic

(For future reference, defining the exact scripts to be supported in a bug request is welcome. If a report is just about "support more", then it can easily become unfixable as comments keep broadening its scope.)

nzmoihue wrote:

#c3

I checked that source but couldn't find the dictionary file. [1] says there is a file named CTLauBig5.tit, but there isn't. Can you be more precise about the dictionary?

[1] http://cpansearch.perl.org/src/KAWASAKI/Lingua-ZH-Romanize-Pinyin-0.23/lib/Lingua/ZH/Romanize/DictZH.pm

nzmoihue wrote:

There is not a one-to-one "dictionary" there; that is why I CC'd the original writer of the transliteration. Also have a look at https://github.com/axgle/pinyin

Created attachment 16327
Python translation of https://github.com/axgle/pinyin

(In reply to [no longer active user] from comment #14)

There is not a one-to-one "dictionary" there; that is why I CC'd the original
writer of the transliteration. Also have a look at
https://github.com/axgle/pinyin

{{done}} Translated the four scripts to Python. See attachment.

Attached:

zhtransliteration.py (8 KB)

Created attachment 16327 [details]
Python translation of https://github.com/axgle/pinyin

However, there is a bug that causes 6651 Chinese characters to be transliterated as 'zuo'.
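
A problem like this can be spotted by counting how many characters share each romanization. The following is only a sketch: the `suspicious_values` helper, the threshold, and the sample table are hypothetical and are not taken from the attachment.

```python
from collections import Counter


def suspicious_values(table, threshold):
    """Return (romanization, count) pairs shared by more than `threshold` characters."""
    counts = Counter(table.values())
    return [(value, n) for value, n in counts.most_common() if n > threshold]


# Hypothetical sample table; a real table would hold thousands of entries.
SAMPLE = {"中": "zhong", "文": "wen", "维": "zuo", "基": "zuo", "百": "zuo"}

print(suspicious_values(SAMPLE, threshold=2))
```

The threshold has to be chosen with the real data in mind, since many characters legitimately share a syllable once tones are dropped.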

I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

Change 157498 had a related patch set uploaded by Zhuyifei1999:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498

(In reply to Merlijn van Deen from comment #17)

I suppose we can add this, but what's the intended use case? We support full
unicode console output (and input, but transliteration is output-only) on
all systems.

Yes, that is indeed a hard question. Why do we still have transliteration.py?

Change 157498 merged by jenkins-bot:
Improving transliteration support for Chinese

https://gerrit.wikimedia.org/r/157498

Reopen if there is more to be done.

(In reply to John Mark Vandenberg from comment #21)

Reopen if there is more to be done.

zh-hant (Traditional Chinese) needed.

I will repeat my question:

I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

/why/ is it needed?

(In reply to Merlijn van Deen from comment #23)

I will repeat my question:

I suppose we can add this, but what's the intended use case? We support full unicode console output (and input, but transliteration is output-only) on all systems.

/why/ is it needed?

I suppose that when the output is somehow ASCII-limited (for some reason the log files written by the grid engine on Tool Labs are an example of this), transliterated output could be more useful than a pile of question marks or other unreadable codes.
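
To illustrate the difference, here is a minimal sketch in plain Python. It does not use Pywikibot's own transliteration module; the `PINYIN` table and the `to_ascii` helper are made up for this example.

```python
# Hypothetical two-entry table, for illustration only.
PINYIN = {"中": "zhong", "文": "wen"}


def to_ascii(text, table=PINYIN, default="?"):
    """Keep ASCII as-is and replace other characters via the table."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
        else:
            # Characters missing from the table fall back to `default`.
            out.append(table.get(ch, default))
    return "".join(out)


# What an ASCII-only sink produces when unencodable characters are replaced.
plain = "中文 wiki".encode("ascii", errors="replace").decode("ascii")

print(plain)                  # ?? wiki
print(to_ascii("中文 wiki"))  # zhongwen wiki
```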

Some transliteration data exist at https://www.wikidata.org/wiki/MediaWiki:Gadget-SimpleTransliterate.js

Liuxinyu970226 unsubscribed. Mar 3 2015, 12:04 AM

Liuxinyu970226 subscribed. Mar 9 2015, 7:37 AM

Liuxinyu970226 set Security to None. Sep 18 2015, 11:34 AM

Restricted Application added subscribers: revi, Aklapper. Sep 18 2015, 11:34 AM

Shizhao added a project: Chinese-Sites. Dec 27 2016, 8:30 AM

Liuxinyu970226 updated the task description. Dec 29 2016, 1:25 AM

Liuxinyu970226 mentioned this in T75410: Use an external package for transliteration.

Aklapper added a project: Pywikibot. Jun 29 2018, 3:19 PM

In T58524#631881, @zhuyifei1999 wrote:

Some transliteration data exist at https://www.wikidata.org/wiki/MediaWiki:Gadget-SimpleTransliterate.js

Most of the useful ones have been added since 603ab87d916f73c5981b03ae6deb269795306276.

VulpesVulpes825 moved this task from Backlog to Extensions/Skins on the Chinese-Sites board. Jun 21 2020, 7:39 AM

Is there anything left to do here?

Shizhao moved this task from Extensions/Skins to Apps/Tools/Services on the Chinese-Sites board. Jul 5 2021, 2:40 AM

I close this now. Please feel free to reopen it if there is something left to do.

Shizhao moved this task from Apps/Tools/Services to Closed on the Chinese-Sites board. Oct 19 2021, 3:53 AM

I believe we need some further work here:

More character support: the current list is still incomplete.
Handling Chinese polyphone (heteronym) characters: for example, 行 has two pronunciations, hang and xing. I have no idea how to handle such cases, as this is a context-free process.

It will benefit a gadget on Wikidata.

Restricted Application added a subscriber: Ericliu1912. Jun 17 2022, 2:28 PM

Stang moved this task from Closed to Apps/Tools/Services on the Chinese-Sites board. Jun 17 2022, 2:28 PM

Stang removed a project: Pywikibot-General.

In T58524#8011766, @Stang wrote:

More character support - current list is still incomplete

Can you give an example, a description, or an implementation in any language that could be adopted? Without further information this cannot be completed.

Handling Chinese polyphone (heteronym) characters: for example, 行 has two pronunciations, hang and xing. I have no idea how to handle such cases, as this is a context-free process.

Polyphone characters cannot be handled in Pywikibot. Transliteration is applied to arbitrary fragments of output text, and there is no context or semantic processing. The only way could be to look at a character in combination with the characters around it. 行 is already transliterated as xing.
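
As a rough sketch of what "in combination with other characters" could mean: try a longest-match lookup in a phrase table first, then fall back to a per-character table. Pywikibot does not do this; the tables and the `transliterate` function below are hypothetical and contain only a few entries for illustration.

```python
# Hypothetical tables, for illustration only.
CHAR_TABLE = {"行": "xing", "银": "yin", "人": "ren"}
PHRASE_TABLE = {"银行": "yinhang"}  # here 行 is read "hang"


def transliterate(text, phrases=PHRASE_TABLE, chars=CHAR_TABLE, default="?"):
    """Longest-match phrase lookup with a per-character fallback."""
    max_len = max((len(p) for p in phrases), default=1)
    out = []
    i = 0
    while i < len(text):
        # Try the longest possible phrase first, down to two characters.
        for size in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + size]
            if chunk in phrases:
                out.append(phrases[chunk])
                i += size
                break
        else:
            # No phrase matched: handle a single character.
            ch = text[i]
            out.append(ch if ord(ch) < 128 else chars.get(ch, default))
            i += 1
    return "".join(out)


print(transliterate("银行"))  # yinhang
print(transliterate("行人"))  # xingren
```

This only helps when the neighbouring characters are part of the same output fragment, and it still needs a phrase table from somewhere.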

I close this now as invalid because there is no description of what to do with this task. The implementation needs a list of Unicode characters and their transliterations. Please feel free to reopen it if there is something left to do and you have such a table or any other implementation in any language.
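
For anyone who wants to supply such a table, a small coverage check along these lines could produce the list of characters that still lack an entry. This is a sketch only: `missing_characters` and the sample table are hypothetical, not part of Pywikibot.

```python
def missing_characters(text, table):
    """Return non-ASCII characters in `text` that have no entry in `table`,
    in order of first appearance and without duplicates."""
    missing = []
    for ch in text:
        if ord(ch) >= 128 and ch not in table and ch not in missing:
            missing.append(ch)
    return missing


# Hypothetical sample table, for illustration only.
TABLE = {"中": "zhong", "文": "wen"}

for ch in missing_characters("中文维基 wiki 维基", TABLE):
    # Print the code point next to the character so it can be added to a table.
    print(f"U+{ord(ch):04X} {ch}")
```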

Stang moved this task from Apps/Tools/Services to Closed on the Chinese-Sites board. Aug 21 2022, 12:01 PM

Ericliu1912 unsubscribed. Aug 24 2022, 8:43 AM
