
Change default font for EasyTimeline on zh projects to something that actually has glyphs for Chinese characters
Closed, Resolved | Public | PRODUCTION ERROR

Description

All Chinese characters in the timeline are shown as "□". It seems that "FreeSan.ttf" does not contain glyphs for the full Unicode character set. Changing the font to "ArialMSUnicode.ttf" or something similar might solve this.

Regards.


Version: unspecified
Severity: major

Event Timeline

Since fonts-arphic-uming is installed in servers, can't we just use uming.ttc for EasyTimeline (zh-TW/zh-HK)?

Unrelated to the EasyTimeline codebase (it's configuration only, see T22825#835162), hence removing that project again from this task.

Since fonts-arphic-uming is installed in servers, can't we just use uming.ttc for EasyTimeline (zh-TW/zh-HK)?

Because the extension is configured for the whole site, it may not pass the variant parameter that would be needed to change the font.

T123223 It seems that some new free fonts have been installed on the servers. Can they be used for this issue? @Aklapper

fonts-noto-cjk is not a unified font but a set of region-based fonts for China/Hong Kong/Taiwan/Korea. Will it cause problems with a cross-language EasyTimeline setting?

T123223 It seems that some new free fonts have been installed on the servers. Can they be used for this issue? @Aklapper

I do not know enough about EasyTimeline's configuration to answer this, sorry. :(
Let's hope to receive an answer in T84777 where you also asked this.

@Roytam1 There is no such thing as a unified font, at least as of now, and fonts are not supposed to be unified. The alternative to a region-based font would be a font designed according to one particular regional standard or according to the font developer's habits. See Unicode's FAQ about CJK for further detail. I am not familiar with the server environment or the software's settings, but in a home environment, when an application uses a multi-region font without specifying a region, the result often defaults to China's standard.

The subtask has been resolved. Font packages are now being installed on all appservers; that should be done in a couple of minutes. Please check whether this has fixed the problem.

fatalmonitor yields:

125 not find/open font (unifont-5.1.20080907)

I guess it is transient / pending puppet to run on all app servers.

Related URL: https://gerrit.wikimedia.org/r/64205 (Gerrit Change I3c03bf9b2352a4e577f94ad92d2d38021cf12968)

@hashar

Is this patch working? Is the problem a wrong font file name?

Or is the problem the ttf versus ttc file type? It seems that a ttc file is a collection of ttf fonts, and some programs cannot use it normally. Can the program use a ttc file?

It does not seem to be working in production:
https://zh.wikipedia.org/wiki/2016%E5%B9%B4%E5%A4%AA%E5%B9%B3%E6%B4%8B%E9%A2%B1%E9%A2%A8%E5%AD%A3#.E9.A2.A8.E6.9A.B4.E6.99.82.E9.96.93.E8.A1.A8

I have seen that.

timeline/font system

T84777 made the web servers include the Debian packages that provide fonts (in Puppet: include ::mediawiki::packages::fonts). The fonts-wqy-zenhei package is indeed installed and provides the font:

$ dpkg -L fonts-wqy-zenhei
...
/usr/share/fonts/truetype/wqy/wqy-zenhei.ttc
...

However that font is NOT going to be used by EasyTimeline!

EasyTimeline generates the image using a program called ploticus. The extension passes a font name to the command using the -f option. That is set in the MediaWiki config using:

wmf-config/CommonSettings.php
if ( $wmgUseTimeline ) {
  wfLoadExtension( 'timeline' );
  if ( $lang == 'zh' ) {  // comparison, not assignment
    $wgTimelineFontFile = 'unifont-5.1.20080907.ttf';
  }
}

For the zh language we end up with the following call chain (summarized):

EasyTimeline.pl -f $wgTimelineFontFile
ploticus --font $wgTimelineFontFile

Ploticus looks up the given file name in the path set via GDFONTPATH, which is defined in our configuration as:

wmf-config/CommonSettings.php
putenv( "GDFONTPATH=/srv/mediawiki/fonts" );

The fonts are thus copied in our /fonts directory and fonts installed via Debian packages are never used!

Font selection

As part of this task, the font for the zh language was changed from wqy-zenhei.ttc to unifont-5.1.20080907.ttf in June 2014:

(and see [[Kochi font]] for those Japanese fonts)

I'm submitting a patch to use unifont-5.1.20080907.ttf. This seems like a language-neutral font.

        } elseif ( $lang == 'zh' ) {
-               $wgTimelineSettings->fontFile = 'wqy-zenhei.ttc';
+               $wgTimelineSettings->fontFile = 'unifont-5.1.20080907.ttf';
        }

Potential fix

Debian has a unifont package https://packages.debian.org/jessie/unifont with the description:

font with a glyph for each visible Unicode Plane 0 character

GNU Unifont was designed to render something besides an empty box for each visible Unicode character in the Basic Multilingual Plane (Plane 0). Plane 0 contains most of the world's modern writing scripts. This font looks best at 12pt.

I guess ploticus/gd needs a ttf version of it, and Debian has such a package: https://packages.debian.org/jessie/ttf-unifont . The .ttf files provided are:

/usr/share/fonts/truetype/unifont/unifont.ttf
/usr/share/fonts/truetype/unifont/unifont_csur.ttf
/usr/share/fonts/truetype/unifont/unifont_sample.ttf
/usr/share/fonts/truetype/unifont/unifont_upper.ttf
/usr/share/fonts/truetype/unifont/unifont_upper_csur.ttf

Which are described in the README as:

unifont: Basic Multilingual Plane (BMP, Plane 0)

unifont_csur: Fonts containing Plane 0 Unifont glyphs plus glyphs for Michael Everson's ConScript Unicode Registry (CSUR) for the Plane 0 Private Use Area

unifont_upper: Fonts containing glyphs from Unicode Plane 1 through Plane 14, inclusive.

I am not sure whether glyphs are in Plane 0 or another plane such as Plane 2 ''Supplementary Ideographic Plane'' (CJK characters).

Maybe we will have to create our own TrueType font that combines both Plane 0 and Plane 2 if that is at all possible.
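The plane question can be checked for any specific character, since the plane number is simply the code point shifted right by 16 bits. A minimal sketch follows; `unicode_plane` is a hypothetical helper name, and U+98B1 is the character 颱 decoded from the zh.wikipedia.org URL quoted earlier in this thread. The main CJK Unified Ideographs block (U+4E00 to U+9FFF) lies in Plane 0, while Plane 2 starts at U+20000 and holds the rarer extension ideographs, so a Plane 0 font plausibly covers most everyday Chinese text.

```shell
# Print the Unicode plane (0-16) of a code point given as a hex literal.
unicode_plane() {
  echo $(( $1 >> 16 ))
}

unicode_plane 0x98B1    # 颱, from the test page title: prints 0
unicode_plane 0x20000   # first code point of CJK Extension B: prints 2
```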

Once we have a font that is known to work for the zh use case, we can put it on the Wikimedia servers under /srv/mediawiki-staging/fonts/, then patch the config to point $wgTimelineFontFile to it. Finally, we can invalidate the previously generated files by bumping $wgTimelineEpochTimestamp = '20130601000000';.

I created a more basic use case that does not even have any unicode characters: https://zh.wikipedia.org/w/index.php?title=User:Hashar/T22285

<timeline>
ImageSize  = width:900 height:300
PlotArea   = top:10 bottom:60 right:20 left:20
AlignBars  = early
DateFormat = dd/mm/yyyy
Period     = from:01/01/2016 till:31/12/2016
TimeAxis   = orientation:horizontal
ScaleMinor = grid:black unit:month increment:1 start:01/01/2016

BarData =
  barset:Hurricane
PlotData=
  barset:Hurricane width:10 align:left
  from:26/05/2016 till:27/05/2016 text:"ae"
  from:03/07/2016 till:11/07/2016 text:"eeeb"
  from:17/07/2016 till:17/07/2016 text:"c"
</timeline>

That yields on the server side:

Nov 14 15:39:38 mw1208:  Could not find/open font (unifont-5.1.20080907)
Nov 14 15:39:38 mw1208:  Could not find/open font (unifont-5.1.20080907)
Nov 14 15:39:38 mw1208:  Could not find/open font (unifont-5.1.20080907)
Nov 14 15:39:38 mw1208:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1208:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1208:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1208:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1208:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1208:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1275:  Could not find/open font (unifont-5.1.20080907)
Nov 14 15:39:38 mw1275:  Could not find/open font (unifont-5.1.20080907)
Nov 14 15:39:38 mw1275:  Could not find/open font (unifont-5.1.20080907)
Nov 14 15:39:38 mw1275:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1275:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1275:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1275:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1275:  Could not find/open font (unifont-5.1.20080907) (width calc)
Nov 14 15:39:38 mw1275:  Could not find/open font (unifont-5.1.20080907) (width calc)

Not sure why it is rendered on two servers. I am pretty sure the font name comes from our zhwiki setting $wgTimelineFontFile = 'unifont-5.1.20080907.ttf';.

The file is definitely on those servers as /srv/mediawiki/fonts/unifont-5.1.20080907.ttf.

I reproduced it on my machine with ploticus version 2.42. The code strips the '.ttf' suffix:

fontpath = getenv( "GDFONTPATH" );
if( fontpath == NULL ) Eerr( 12358, "warning: environment var GDFONTPATH not found. See ploticus fonts docs.", "" );
if( strcmp( &s[ strlen(s) - 4 ], ".ttf" )==0 ) s[ strlen( s)-4 ] = '\0'; /* strip off .ttf ending - scg 1/26/05 */
strcpy( GFTfont, s );

So I guess we can rename/symlink our font file and adjust the mediawiki config for consistency.

- $wgTimelineFontFile = 'unifont-5.1.20080907.ttf';
+ $wgTimelineFontFile = 'unifont-5.1.20080907';
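To see which name ploticus ends up with for a given setting, the suffix handling in the quoted C code can be mimicked in shell. This is only a sketch of that one line of ploticus, not of the full font lookup; `strip_ttf` is a hypothetical helper name.

```shell
# Mimic the quoted ploticus code: drop a trailing ".ttf" if present,
# leave any other name (including ".ttc") untouched.
strip_ttf() {
  printf '%s\n' "${1%.ttf}"
}

strip_ttf 'unifont-5.1.20080907.ttf'   # prints unifont-5.1.20080907
strip_ttf 'wqy-zenhei.ttc'             # prints wqy-zenhei.ttc
```

The first result matches the name in the "Could not find/open font (unifont-5.1.20080907)" log lines above.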

On our production test server I created a symlink:

$ cd /srv/mediawiki/fonts
$ ln -s unifont-5.1.20080907.ttf unifont-5.1.20080907

My test case https://zh.wikipedia.org/w/index.php?title=User:Hashar/T22285 suddenly rendered some text.

f4ff760ed86eb5c679ee9c1486b480fb.png (300×900 px, 1 KB)
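The manual symlink could be generalized to every .ttf file in a font directory. The sketch below is an illustration only and is not the content of the "Symlink fonts for ploticus" change; `link_fonts_without_suffix` is a hypothetical name, and the demo runs in a temporary directory rather than /srv/mediawiki/fonts.

```shell
# Create a suffix-less relative symlink next to every *.ttf file in a directory.
link_fonts_without_suffix() {
  dir=$1
  for font in "$dir"/*.ttf; do
    [ -e "$font" ] || continue                 # glob matched nothing
    base=$(basename "$font" .ttf)
    # Skip names that already exist (as a file or as a symlink).
    [ -e "$dir/$base" ] || [ -L "$dir/$base" ] || ln -s "$base.ttf" "$dir/$base"
  done
}

# Demo in a scratch directory with an empty stand-in font file.
tmp=$(mktemp -d)
: > "$tmp/unifont-5.1.20080907.ttf"
link_fonts_without_suffix "$tmp"
ls -l "$tmp"
```

The links are relative, so they stay valid if the whole directory is synced to another path.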

Change 321493 had a related patch set uploaded (by Hashar):
Move EasyTimeline config to its own file

https://gerrit.wikimedia.org/r/321493

Let's say I am de facto working on it. I will get the font file renamed and see what happens, then probably delete the whole cache of EasyTimeline files (by changing the epoch).

Change 321558 had a related patch set uploaded (by Hashar):
Test for $wgTimelineFontFile values

https://gerrit.wikimedia.org/r/321558

Change 321560 had a related patch set uploaded (by Hashar):
Symlink fonts for ploticus

https://gerrit.wikimedia.org/r/321560

Change 321561 had a related patch set uploaded (by Hashar):
Drop '.ttf' from $wgTimelineFontFile settings

https://gerrit.wikimedia.org/r/321561

I think I have all the patch sets more or less figured out. I even wrote a small test to ensure that $wgTimelineFontFile is not set to a file with a .ttf suffix and that the file exists in /fonts/.
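The two conditions that test checks can be sketched in shell as follows. This is not the actual test from the Gerrit change, whose code is not shown here; `check_timeline_font` is a hypothetical name, and the demo uses a temporary directory in place of /fonts/.

```shell
# Succeed only if the configured name has no .ttf suffix
# and a file of that name exists in the font directory.
check_timeline_font() {
  name=$1
  dir=$2
  case $name in
    *.ttf) echo "font name must not end in .ttf: $name" >&2; return 1 ;;
  esac
  [ -e "$dir/$name" ] || { echo "missing font file: $dir/$name" >&2; return 1; }
}

# Demo with an empty stand-in font file.
fontdir=$(mktemp -d)
: > "$fontdir/unifont-5.1.20080907"
check_timeline_font 'unifont-5.1.20080907' "$fontdir" && echo "ok"
```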

Will complete tomorrow then we can schedule it for testing / deployment.

+ EasyTimeline: it is quite helpful to have this task on the project board.

This comment was removed by Cwek.

It is not deployed yet. The few patches I have written need to be polished/reviewed and can then be deployed. You can find all of them via https://gerrit.wikimedia.org/r/#/q/bug:22825

Have you tested the CJK case? That is the important point that needs fixing.

Change 321560 merged by Hashar:
Symlink fonts for ploticus

https://gerrit.wikimedia.org/r/321560

Change 321493 merged by jenkins-bot:
Move EasyTimeline config to its own file

https://gerrit.wikimedia.org/r/321493

Mentioned in SAL (#wikimedia-operations) [2016-12-05T14:21:54Z] <hashar@tin> Synchronized wmf-config/timeline.php: Move EasyTimeline config to its own file - T22825 (duration: 00m 44s)

Mentioned in SAL (#wikimedia-operations) [2016-12-05T14:23:24Z] <hashar@tin> Synchronized wmf-config/CommonSettings.php: Move EasyTimeline config to its own file - T22825 (duration: 00m 44s)

Change 321561 merged by jenkins-bot:
Drop '.ttf' from $wgTimelineFontFile bump epoch

https://gerrit.wikimedia.org/r/321561

Mentioned in SAL (#wikimedia-operations) [2016-12-05T14:26:15Z] <hashar@tin> Synchronized fonts: For T22825 (duration: 00m 47s)

Mentioned in SAL (#wikimedia-operations) [2016-12-05T14:30:06Z] <hashar@tin> Synchronized wmf-config/timeline.php: Drop ttf from $wgTimelineFontFile and bump epoch - T22825 (duration: 00m 47s)

Change 321558 merged by jenkins-bot:
Test for $wgTimelineFontFile values

https://gerrit.wikimedia.org/r/321558

Should be good now, based on a dummy test page at https://zh.wikipedia.org/wiki/User:Hashar/T22285 . Note that pages will need to be purged though (via action=purge).

Note that I really discourage using EasyTimeline; we should aim at migrating to the Graph extension https://www.mediawiki.org/wiki/Extension:Graph

The axis text and the legend overlap.
https://zh.wikipedia.org/wiki/2016%E5%B9%B4%E5%A4%AA%E5%B9%B3%E6%B4%8B%E9%A2%B1%E9%A2%A8%E5%AD%A3#.E9.A2.A8.E6.9A.B4.E6.99.82.E9.96.93.E8.A1.A8

@Roytam1 looking at the code, the months are in a bar that is shifted down by 20 pixels:
bar:Month width:5 align:center fontsize:S shift:(0,-20) anchor:middle color:canvas

If you remove the shift:(0,-20), the months are drawn higher, on the line, and no longer overlap the legend, though that is not ideal.

The legend position is hard-coded with:

Legend     = columns:2 left:30 top:59 columnwidth:270

Changing top:59 to top:39 moves the legend down by 20 pixels, so you can keep the month bar shifted down by 20 pixels. I tried it and it works fine.

One sure thing, it is unrelated to this task :}

TTO subscribed.

One sure thing, it is unrelated to this task :}

Not necessarily; the new font seems to have slightly different metrics. It doesn't seem like a big problem though, so I'm marking this "resolved".

Crosslinking with T156191 and T127683 - might be related, or addressed in the same way?

mmodell changed the subtype of this task from "Task" to "Production Error". Aug 28 2019, 11:12 PM