PDF output doesn't render ProofreadPage's <pages /> tag
Closed, Declined (Public)

Description

Odissea stripped down

The string
<pages index="Odissea (Pindemonte).djvu" from=8 to=8 />
is not rendered correctly; see the attachment for a minimal test case. In the attached PDF, at least the image is missing, although it is visible at https://it.wikisource.org/wiki/Odissea?oldid=1460987


Version: unspecified
Severity: major
URL: https://it.wikisource.org/w/index.php?title=Speciale:Libro&bookcmd=render_article&arttitle=Odissea&oldid=1460988&writer=rdf2latex

Details

Reference
bz71822

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:56 AM
bzimport added a project: OCG-PDF-renderer.
bzimport set Reference to bz71822.

Created attachment 16717
Odissea PDF as in URL (standard state)

(In reply to Erasmo Barresi from comment #2)

This is a duplicate of bug 66597.

Thanks for pointing that out. However, Collection switched to OCG after you filed bug 66597, so we have two different issues with an identical result.

@Nemo_bis wrote:

The string
<pages index="Odissea (Pindemonte).djvu" from=8 to=8 />
is not rendered correctly, see attachment for a minimal test case.
In the attached PDF, at least the image is missing, which is visible
at https://it.wikisource.org/wiki/Odissea?oldid=1460987
Odissea PDF as in URL (standard state)

It's not being detected (added to Immagini) because the work's structure (or document outline, if you like) does not begin to establish any heading-level rank (<h4><span class="mw-headline" id="Indice">Indice</span></h4>) until after p. 1. To bookCmd, this is like 'putting the cart before the horse', so it skips it and goes on trying to make sense of the HTML innards as they relate to structure & flow.

Add to that the fact that the image is not hosted where the other successfully detected .svgs are (on Commons) but locally (it.wikisource), trying to render a stored history of the page instead of one based on the "current" page ID, wiki-mark-up's incessant fiddling with heading elements by design -- plus the transclusion in from the Page namespace probably having some small role in all that too -- and it's no wonder the operation gives up & dies... or gracefully renders little if anything at all in the end.

Eliminate a few of those possible factors by generating a PDF "closer" to the static content (from the Page namespace, which means no transclusion is involved, the H# rank is easy to detect on such "short" pages, the current rev ID is used, etc.) and bookCmd can manage the remaining factors, generating the content with the previously missing .png successfully.

No argument - the collection extension isn't great for much if anything, but well-entrenched, sub-par user practices &/or policies play a role in the success or failure in this drama too.
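One way to check the heading-order hypothesis above against a page's rendered HTML is to list the images that occur before the first heading element. This is only a rough sketch: it does not reproduce what bookCmd or the PDF backend actually does, and the function name, class name, and sample file names are made up for illustration.

```python
from html.parser import HTMLParser

HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class OrderScanner(HTMLParser):
    """Record headings and images in document order (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.events = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        if tag in HEADINGS:
            self.events.append(("heading", tag))
        elif tag == "img":
            self.events.append(("img", dict(attrs).get("src")))


def images_before_first_heading(html):
    """Return the src of every <img> that precedes the first heading."""
    scanner = OrderScanner()
    scanner.feed(html)
    scanner.close()
    found = []
    for kind, value in scanner.events:
        if kind == "heading":
            break
        found.append(value)
    return found


# Made-up sample: one image before the "Indice" heading, one after it.
sample = (
    '<p><img src="frontespizio.png"></p>'
    '<h4><span class="mw-headline" id="Indice">Indice</span></h4>'
    '<p><img src="fregio.svg"></p>'
)
print(images_before_first_heading(sample))
```

A non-empty result would only show that the page has the shape described in the comment; it would not prove that this shape is why the image was dropped.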

This bug is about <pages />. If you test without <pages />, I'm not sure how your comment is relevant to this bug.

But you're right, further test cases should be used: for instance, page 10 alone (https://it.wikisource.org/w/index.php?title=Speciale:Libro&bookcmd=render_article&arttitle=Odissea&oldid=1504847&writer=rdf2latex) is correctly transcluded. Further tests are needed to isolate whether there is an issue with local images etc., as you speculated.
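For anyone assembling further test cases, the render URLs quoted in this report all follow one pattern, so they can be generated per revision. A minimal sketch, assuming only the query parameters visible in the URLs above (title, bookcmd, arttitle, oldid, writer); the helper name render_url is made up, and the code only builds strings without fetching anything.

```python
from urllib.parse import urlencode

BASE = "https://it.wikisource.org/w/index.php"


def render_url(arttitle, oldid, writer="rdf2latex"):
    """Build a render_article URL for one revision (hypothetical helper)."""
    params = {
        "title": "Speciale:Libro",
        "bookcmd": "render_article",
        "arttitle": arttitle,
        "oldid": oldid,
        "writer": writer,
    }
    # safe=":" keeps the namespace colon readable, as in the URLs quoted here.
    return BASE + "?" + urlencode(params, safe=":")


# The two revisions mentioned in this report.
for oldid in (1460988, 1504847):
    print(render_url("Odissea", oldid))
```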

As already announced in Tech News, OfflineContentGenerator (OCG) will not be used anymore after October 1st, 2017 on Wikimedia sites. OCG will be replaced by Electron. You can read more on mediawiki.org.

Aklapper removed cscott as the assignee of this task.

If I understand correctly, then this is fixed on Wikimedia servers nowadays:
We nowadays link to WS Export in the sidebar for PDF files (which includes the image and all the subpages). When using "Versione stampabile" the image is also included. I'm not sure how <pages index="Odissea (Pindemonte).djvu" from=8 to=8 /> would be rendered correctly though. :-/

I'm declining this task as it was reported when we still used the OfflineContentGenerator backend to create PDFs. It has been dead for years and has been superseded by Proton and Electron-PDFs on Wikimedia servers (or, in the case of Wikisource, by WS Export).