Generating PDFs for books with more than one page fails horribly at pl.wikisource
Closed, ResolvedPublic

Description

Generating PDFs for books with more than one page fails horribly at pl.wikisource. Example book: https://pl.wikisource.org/w/index.php?title=Wiki%C5%BAr%C3%B3d%C5%82a:Ksi%C4%85%C5%BCki/Copp%C3%A9e_Fran%C3%A7ois_-_Henryka&rcid=450937 - click "Pobierz PDF" and watch the havoc unfold.

RuntimeError: command failed with returncode 256: ['mw-render', '-w', 'rl', '-c', 'cache/87/87f4f0452d9a266c/collection.zip', '-o', 'cache/87/87f4f0452d9a266c/output.rl', '--status', 'qserve://localhost:14311/87f4f0452d9a266c:render-rl', '--template-blacklist', 'MediaWiki:PDF Template Blacklist', '--template-exclusion-category', u'Omi\u0144 w druku', '--print-template-prefix', 'Drukuj', '--print-template-pattern', '$1/Wydruk', '--language', 'pl']
Last Output:

2013-04-23T21:30:16 mwlib.options.warn >> Both --print-template-pattern and --print-template-prefix (deprecated) specified. Using --print-template-pattern only.
MISSING FONTS: 'Gujarati'
1%  reading /tmp/tmp-mw-renderkA0Rg1/tmp_GQQMz/revisions-1.txt
set locale to 'pl_PL.UTF-8' based on the language 'pl'
1% laying out 0% error removing '/tmp/tmp-mw-renderkA0Rg1/tmp_GQQMz'
Traceback (most recent call last):
  File "/home/pp/local/bin/mw-render", line 9, in <module>
    load_entry_point('mwlib==0.15.7', 'console_scripts', 'mw-render')()
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/apps/render.py", line 241, in main
    return Main()()
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/apps/render.py", line 206, in __call__
    writer(env, output=tmpout, status_callback=self.status, **writer_options)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/rl/rlwriter.py", line 2193, in writer
    r.writeBook(output=output, coverimage=coverimage, status_callback=status_callback)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/rl/rlwriter.py", line 471, in writeBook
    art = self.buildArticle(item)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/rl/rlwriter.py", line 360, in buildArticle
    revision=item.revision)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/nuwiki.py", line 430, in getParsedArticle
    return uparser.parseString(title=title, raw=raw, wikidb=self, lang=self.siteinfo["general"]["lang"], expandTemplates=expandTemplates)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/refine/uparser.py", line 62, in parseString
    a = compat.parse_txt(input, title=title, wikidb=wikidb, nshandler=nshandler, lang=lang, magicwords=magicwords, uniquifier=uniquifier, expander=te)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/refine/compat.py", line 193, in parse_txt
    sub = core.parse_txt(raw, **kwargs)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/refine/core.py", line 1037, in parse_txt
    combined_parser(parsers)(tokens, xopts)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/refine/core.py", line 646, in __call__
    p(x, xopts)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/refine/core.py", line 762, in __init__
    tokens[i] = m(name, vlist, inner or u"", xopts)
  File "/home/pp/local/lib/python2.6/site-packages/mwlib/refine/core.py", line 879, in create_pages
    pages = expander.db.select(s, e)
AttributeError: 'DictDB' object has no attribute 'select'
 in function system, file /home/pp/local/lib/python2.6/site-packages/mwlib/nslave.py, line 64

Version: master
Severity: major
URL: https://pl.wikisource.org/w/index.php?title=Wiki%C5%BAr%C3%B3d%C5%82a:Ksi%C4%85%C5%BCki/Copp%C3%A9e_Fran%C3%A7ois_-_Henryka&rcid=450937
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=52108
https://bugzilla.wikimedia.org/show_bug.cgi?id=65298

Details

Reference
bz47575

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:31 AM
bzimport added a project: Collection.
bzimport set Reference to bz47575.
bzimport added a subscriber: Unknown Object (MLST).

This is a server-side problem, so it lies with PediaPress to fix it.

CC'ing Volker - Volker, if there are better people to CC please tell me. :)

General comments:

When I use "Pobierz jako PDF" in the left pane, I get through to the download section for your link:
"Dokument został wygenerowany. Pobierz plik na swój komputer."
The PDF is a bit too empty for my taste, but that's likely another bug that I have in mind and cannot find right now.

"Pobierz jako PDF" links to:
https://pl.wikisource.org/w/index.php?title=Specjalna:Ksi%C4%85%C5%BCka&bookcmd=render_article&arttitle=Wiki%C5%BAr%C3%B3d%C5%82a%3AKsi%C4%85%C5%BCki%2FCopp%C3%A9e+Fran%C3%A7ois+-+Henryka&oldid=443913&writer=rl

If I use "Pobierz PDF" from

[ Przygotuj książkę ] [ Pobierz PDF ] [ Pobierz EPUB ] [ Pomoc ]

I can reproduce the problem.

"Pobierz PDF" links to:
https://pl.wikisource.org/w/index.php?title=Specjalna:Ksi%C4%85%C5%BCka/render_collection/&colltitle=Wiki%C5%BAr%C3%B3d%C5%82a:Ksi%C4%85%C5%BCki/Copp%C3%A9e_Fran%C3%A7ois_-_Henryka
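To make the difference between the two links easier to see, here is a small sketch that decodes their query strings with Python's standard `urllib.parse`. The helper name `query_params` is made up for this illustration; the URLs are the two quoted in this comment. It only shows which parameters each link carries and makes no claim about how the Collection extension handles them.

```python
from urllib.parse import urlsplit, parse_qs

# "Pobierz jako PDF" (left pane): renders a single article and works.
SINGLE_PAGE_URL = (
    "https://pl.wikisource.org/w/index.php?title=Specjalna:Ksi%C4%85%C5%BCka"
    "&bookcmd=render_article"
    "&arttitle=Wiki%C5%BAr%C3%B3d%C5%82a%3AKsi%C4%85%C5%BCki%2FCopp%C3%A9e+Fran%C3%A7ois+-+Henryka"
    "&oldid=443913&writer=rl"
)

# "Pobierz PDF" (book toolbar): renders the whole collection and fails.
COLLECTION_URL = (
    "https://pl.wikisource.org/w/index.php?title=Specjalna:Ksi%C4%85%C5%BCka/render_collection/"
    "&colltitle=Wiki%C5%BAr%C3%B3d%C5%82a:Ksi%C4%85%C5%BCki/Copp%C3%A9e_Fran%C3%A7ois_-_Henryka"
)


def query_params(url):
    """Return the percent-decoded query parameters of a URL as a flat dict."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


single = query_params(SINGLE_PAGE_URL)
collection = query_params(COLLECTION_URL)

print(sorted(single))      # parameter names of the working link
print(sorted(collection))  # parameter names of the failing link
```

The working link passes `bookcmd=render_article` with an explicit `oldid` and `writer`, while the failing link encodes `render_collection` in the special page title and passes only `colltitle`.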

Out of curiosity, where does <table class="plainlinks"> come from? I've tried Wikisources in other languages but don't get it displayed there.

Are there other examples for this?

ralf_wikimedia wrote:

Thank you for the bug report.

It looks like this was broken by our recent switch to expanding
templates via api.php. This change was necessary in order to support
the Scribunto extension.

Unfortunately we have two problems here:

  1. The "AttributeError: 'DictDB' object has no attribute 'select'"
     error is caused by our software using a stripped down 'DictDB'
     instance as a "wiki database". We would be able to fix that.

  2. Since we now expand templates when fetching articles, we do *not*
     fetch the templates being used by an article. Since api.php does not
     expand the <pages> tag, the information for expanding the pages tag is
     not even available.
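The first problem can be illustrated with a minimal sketch. This is not mwlib code: the classes `StrippedDictDB` and `SelectableDictDB`, the `get_page` method, and the behaviour of `select(start, end)` are assumptions made up for the illustration. Only the call shape `db.select(s, e)` is taken from the traceback above.

```python
class StrippedDictDB:
    """Hypothetical minimal wiki database: title lookup only, no select()."""

    def __init__(self, pages):
        self.pages = dict(pages)

    def get_page(self, title):
        return self.pages.get(title)


class SelectableDictDB(StrippedDictDB):
    """Same store plus an assumed select(start, end) range lookup."""

    def select(self, start, end):
        # Plain string comparison of titles; real page ranges would need
        # numeric ordering (e.g. "Page/10" sorts before "Page/2" here).
        return [self.pages[t] for t in sorted(self.pages) if start <= t <= end]


def create_pages(db, start, end):
    # Mirrors the failing call in the traceback: expander.db.select(s, e)
    return db.select(start, end)


pages = {"Page/1": "first", "Page/2": "second", "Page/3": "third"}

try:
    create_pages(StrippedDictDB(pages), "Page/1", "Page/2")
    error = None
except AttributeError as exc:
    error = str(exc)  # the stripped-down object has no select()

selected = create_pages(SelectableDictDB(pages), "Page/1", "Page/2")
print(error)
print(selected)
```

Adding the missing method only removes the AttributeError. As the second point says, the page data needed to expand the <pages> tag still has to be fetched from somewhere.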

My suggestion for fixing this issue would be to fix the expand
templates functionality, but that is not possible according to
https://bugzilla.wikimedia.org/show_bug.cgi?id=46115#c7.

The other way to fix the issue would be to work with MediaWiki's
HTML output. But that's a lot of work, which we can't afford to do
in order to fix Wikisource.

ralf_wikimedia wrote:

I've opened a bug about using MediaWiki's HTML output:
https://bugzilla.wikimedia.org/show_bug.cgi?id=47867

ralf_wikimedia wrote:

*** Bug 52108 has been marked as a duplicate of this bug. ***