
Not possible to transclude to single paragraph over multiple instances of <pages />
Closed, Resolved (Public)

Description

Author: alex.farlie

Description:
http://en.wikisource.org/w/index.php?title=User:Sfan00_IMG/Sandbox&oldid=3732857

In the section CAP III, there is a paragraph/line break between '...hath' and
'had ...'.

In checking the HTML source it seems that a paragraph (or even div) break is being created for each call to <pages />

This apparently precludes the use of <pages /> to extract sections from
multiple pages and combine them into a single paragraph.

My suggestion would be that either the documentation needs updating to say WHY the intended usage isn't possible, or the extension needs to be modified to
cope with combining multiple <pages /> usages into what would be a single paragraph.


Version: unspecified
Severity: enhancement

Details

Reference
bz35757

Event Timeline

bzimport raised the priority of this task to Low. (Nov 22 2014, 12:14 AM)
bzimport added a project: ProofreadPage.
bzimport set Reference to bz35757.

sumanah wrote:

For clarity's sake:

<Qcoder00> The particular issue here is to do with a 'continuation'

alex.farlie wrote:

This particular usage arose because the original document uses a
multi-column format, for which I was using tables on the original pages.

This means that a normal complete-page transclude wasn't possible, as
effectively I needed to be able to 'transclude' and link up the text
contained in individual cells in the tables concerned.

The pages used are:

http://en.wikisource.org/wiki/Page:Ruffhead_-_The_Statutes_at_Large,_1763.djvu/60

http://en.wikisource.org/wiki/Page:Ruffhead_-_The_Statutes_at_Large,_1763.djvu/61
respectively.

cap3c is a continuation of the text begun in cap3, and thus it should link up seamlessly with it.

sumanah wrote:

Qcoder (the bug reporter) reiterates that "A straightforward transclude wasn't possible because of multi-column issues. Hence the need to combine multiple 'sections' into one paragraph." Also:

<Qcoder00> Sumanah: Fixing the 'bug' (although I suspect I'm pushing what Proofread pages was designed to do) would help greatly for some documents...
<Qcoder00> which might have side by side versions for example ...

beau wrote:

Let's assume I have a page "Test page" with the following content:
<section begin="cap3"/>Some text here... <section end="cap3"/>
Unimportant blah blah blah...
<section begin="cap3"/>and there...<section end="cap3"/>

And then on another page I enter:
{{#lst:Test page|cap3}}
MediaWiki outputs:
Some text here... and there...

The only issue I can see is that you need to list every page you want to transclude from. The <pages> tag could be extended with a 'section' parameter, which would extract text only from the specified section on all pages.
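
The concatenation behaviour described above can be modelled with a short Python sketch. This is a toy illustration only, not the code of the extension behind #lst: the function name `extract_section` and the regular expression are assumptions made for the example, and real section parsing handles more syntax than this.

```python
import re


def extract_section(wikitext: str, name: str) -> str:
    """Toy model: join every span between <section begin=NAME/> and
    <section end=NAME/>, in document order, with nothing added in between."""
    label = re.escape(name)
    pattern = re.compile(
        r'<section\s+begin="?' + label + r'"?\s*/>'   # opening marker
        r'(.*?)'                                      # section body (lazy)
        r'<section\s+end="?' + label + r'"?\s*/>',    # closing marker
        re.DOTALL,
    )
    return "".join(pattern.findall(wikitext))


# The "Test page" content from the comment above.
test_page = (
    '<section begin="cap3"/>Some text here... <section end="cap3"/>\n'
    'Unimportant blah blah blah...\n'
    '<section begin="cap3"/>and there...<section end="cap3"/>'
)

print(extract_section(test_page, "cap3"))  # Some text here... and there...
```

The point of the sketch is that the two spans are joined without any separator, so no paragraph break appears between them as long as they live on the same page.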

alex.farlie wrote:

Beau, the issue here is that the two portions of cap3 (cap3/cap3c) are NOT on the same page.

My suggestion would be for something like this on the first page:

<section begin="cap3" /> This is a.. <section end="cap3" continue=page2#cap3c>
(Page1)

on the second page (page2)
<section begin="cap3c" />.. sentence broken over a page <section end="cap3" />

And then when doing the transclude ...
<pages index="blah" from=1 to=1 fromsection=cap3 tosection=cap3 flow=yes />
which would combine the two sections on either side of the page break into a
single paragraph as requested.

To get "This is .... sentence broken over a page"

Not sure how to implement this in templates though, but the code that is used for
doing 'page' transclusion seems to be able to cope with combining stuff like this already.

beau wrote:

So with my proposal of extending the <pages> tag, the following example:
Page 1:
Blah<section begin="cap3" /> This is a.. <section end="cap3" />Blah
Page 2:
Blah<section begin="cap3" />.. sentence broken over a page <section end="cap3" />Blah
Example usage:
<pages index="blah" from=1 to=2 section=cap3 />

would work the way you wanted, right?
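
As a rough illustration of what the proposed parameter would do, here is a toy Python model. It is a sketch under stated assumptions, not the ProofreadPage implementation: `extract_section`, `transclude_section` and the `pages` mapping are hypothetical names, and the `section` parameter itself is only a proposal at this point in the thread.

```python
import re
from typing import Mapping


def extract_section(wikitext: str, name: str) -> str:
    """Toy model: join every span marked with the given section name."""
    label = re.escape(name)
    pattern = re.compile(
        r'<section\s+begin="?' + label + r'"?\s*/>'
        r'(.*?)'
        r'<section\s+end="?' + label + r'"?\s*/>',
        re.DOTALL,
    )
    return "".join(pattern.findall(wikitext))


def transclude_section(pages: Mapping[int, str], start: int, end: int, name: str) -> str:
    """Hypothetical model of <pages from=START to=END section=NAME />:
    take only the named section from each page in the range and join the
    pieces with no separator, so no paragraph break is introduced."""
    return "".join(extract_section(pages[n], name) for n in range(start, end + 1))


# Page 1 and Page 2 from the example above.
pages = {
    1: 'Blah<section begin="cap3" /> This is a.. <section end="cap3" />Blah',
    2: 'Blah<section begin="cap3" />.. sentence broken over a page <section end="cap3" />Blah',
}

result = transclude_section(pages, 1, 2, "cap3")
print(repr(result))
```

In this model the surrounding "Blah" text is dropped and the two halves of the sentence end up in one continuous string, which is the behaviour being asked for.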

alex.farlie wrote:

Yes, but I would suggest a different parameter name for the section param, as fromsection and tosection are already used and it could get confusing.

How about onlysections=cap3, which makes the intent clearer :) ?

alex.farlie wrote:

Using #lst: directly worked :) so this is now less of an immediate priority;
it's now a feature request.

alex.farlie wrote:

Curious:

#lst: does NOT seem to work all the time...

http://en.wikisource.org/wiki/The_Statutes_at_Large_%28Ruffhead%29/Volume1/Statutes_of_Merton

compared to

http://en.wikisource.org/wiki/User:Sfan00_IMG/Sandbox

#lst: works on the latter (seemingly) but not on the former.

beau wrote:

Can you reduce the test case? There is a lot of code and you did not write what you get (erroneous output) and what you expect to get.

alex.farlie wrote:

Noted.

I also don't see why using #lst: directly would work in one namespace,
but not another.

alex.farlie wrote:

Very strange -

I reduced BOTH pages down to ONLY the #lst: invocations and suddenly BOTH
pages work properly.

http://en.wikisource.org/w/index.php?title=User:Sfan00_IMG/Sandbox&oldid=3733390
http://en.wikisource.org/w/index.php?title=The_Statutes_at_Large_(Ruffhead)/Volume1/Statutes_of_Merton&oldid=3733392

which is somewhat confusing, as I would not expect something as simple as putting code inside a table cell to cause it to behave incorrectly.

http://en.wikisource.org/w/index.php?title=The_Statutes_at_Large_%28Ruffhead%29/Volume1/Statutes_of_Merton&oldid=3733386
is the intended format of the page

It would be much appreciated if you could check whether your code (and that for {{#lst:}}) behaves consistently.

alex.farlie wrote:

Placing the transcludes using #lst: inside a table appears to cause the breakage: http://en.wikisource.org/w/index.php?title=The_Statutes_at_Large_(Ruffhead)/Volume1/Statutes_of_Merton&oldid=3733394

I do not understand why doing something 'simple' like putting two calls to #lst: in a table cell would cause it to 'glitch'.

The text '..hath '' had ...' should run seamlessly, which it seems to do when the
#lst: calls are NOT inside other formatting (like tables).

beau wrote:

When using tables and templates you need to pay attention to what the template returns, because the MediaWiki parser may interpret that differently. I have recently reported a bug about paragraphs - bug #35789.

Take a look at:
https://en.wikisource.org/w/index.php?diff=3733398&oldid=3733394

The original bug appears to be a special case of a general problem that has been well known for some time (more than a year) but apparently doesn't have a bug report: skipping sections generally. <pages /> doesn't work if you skip either sections or pages, whereas {{page}} works fine for full pages precisely because you have to list out every page. Although this is a pain if the issue occurs frequently, that's the solution ThomasV had recommended previously. It is, however, a common problem whenever one encounters an intentionally blank page, or has a book in which only the odd pages are in English, for example, and only those are being transcluded. Solutions to the general problem have been discussed on wiki, and it can probably be dealt with using templates based on {{page|}}. We may have even done it and I've forgotten where they are. Would Beau's solution to extend <pages /> also handle the sister case of odd or even pages?

Part of the issue in the very specific case used in the bug report was created by choosing non-standard methods of formatting a complex work. There are viable workarounds that were discussed on IRC. But the general case of the bug is valid.

beau wrote:

I have submitted a change Ieb8b2c0a6443ca67090a7934084fa24d7bd1c77e for review.

(In reply to comment #17)

> I have submitted a change Ieb8b2c0a6443ca67090a7934084fa24d7bd1c77e for review.

Can you tell us what it does or link us to the revision?