Page MenuHomePhabricator

DOM post processors should not add HTML elements in fosterable positions
Closed, ResolvedPublic

Description

Those meta placeholders are foster-parented out on the client, which triggers a DOM diff and throws off selser with the out-of-order dsr value.

In edit mode, we should probably not be inserting information like this at all. So maybe don't run findBuilderCorrectedTags in edit mode and instead replace it with a pass that simply strips the related meta tags? We might also be able to avoid inserting those metas altogether in meta mode.

Minimal test case extracted from http://en.wikipedia.org/wiki/Indo-Pakistani_War_of_1965:

echo -ne '{{col-begin}}\n{{col-3}}\nfoo' | node parse
<body data-parsoid='{"dsr":[0,27,0,0]}'><span about="#mwt1" typeof="mw:Object/Template" data-mw='{"parts":[{"template":{"target":{"wt":"col-begin"},"params":{}}},"\n",{"template":{"target":{"wt":"col-3"},"params":{}}},"\nfoo"]}' data-parsoid='{"tsr":[0,13],"src":"{{col-begin}}\n{{col-3}}\nfoo","dsr":[0,27,null,null]}'>
</span><p data-parsoid='{"stx":"html"}' about="#mwt1"></p><table class=" multicol" style="border-collapse: collapse; padding: 0px; border: 0px; background:transparent; width:100%;" about="#mwt1" data-parsoid='{"dsr":[null,27,2,2]}'>
<meta typeof="mw:Placeholder" data-parsoid='{"tsr":[14,23],"src":"{{col-3}}","dsr":[14,23,null,null]}'>
<tbody data-parsoid='{"dsr":[null,25,0,0]}'><tr data-parsoid='{"dsr":[null,25,0,0]}'><td style="width: 33.33%;" align="left" valign="top" data-parsoid='{"dsr":[null,25,null,0]}'>
<p data-parsoid='{"dsr":[24,25,0,0]}'>foo</p></td></tr></tbody></table></body>


Version: unspecified
Severity: normal

Details

Reference
bz48231

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 1:19 AM
bzimport added a project: Parsoid-DOM.
bzimport set Reference to bz48231.

Related URL: https://gerrit.wikimedia.org/r/62742 (Gerrit Change I9eefd9005c3552eba59c365fc3ddaae9166d2885)

This pass has to run even in edit mode for accuracy of DSR values. So, we will have to fix any problems with it.

We should then at least try to remove metas and spans in foster-parent position before they are foster-parented at the client.

We could also check the foster-parenting behavior of the latest HTML5 library version. If foster-parenting is fixed there then the template start meta will already be fostered out before encapsulation, which is both good and bad for dsr calculations. It would at least give us control about the situation and avoid client-side foster-parenting and the resulting DOM diff.

Related URL: https://gerrit.wikimedia.org/r/63444 (Gerrit Change I7e7fcef2f769c4d50c36384b264e212a833fe88b)

https://gerrit.wikimedia.org/r/62742 (Gerrit Change I9eefd9005c3552eba59c365fc3ddaae9166d2885) | change ABANDONED [by GWicke]

https://gerrit.wikimedia.org/r/63444 (Gerrit Change I7e7fcef2f769c4d50c36384b264e212a833fe88b) | change APPROVED and MERGED [by jenkins-bot]

The patch above fixes a large part of this issue.

There are however still some situations where a meta can end up in foster-parent position, which will then be foster-parented out at the client:

echo -ne '<table>{{echo|NOTOC}}<tr><td>foo' | node parse

<body data-parsoid='{"dsr":[0,36,0,0]}'><table data-parsoid='{"tsr":[0,7],"stx":"html","autoInsertedEnd":true,"dsr":[0,36,7,0]}'><meta property="mw:PageProp/notoc" data-parsoid='{"tsr":[7,25],"src":"{{echo|NOTOC}}","dsr":[7,25,null,null]}' about="#mwt1" typeof="mw:Object/Template" data-mw='{"target":{"wt":"echo"},"params":{"1":{"wt":"NOTOC"}}}'><tbody data-parsoid='{"dsr":[25,36,0,0]}'><tr data-parsoid='{"tsr":[25,29],"stx":"html","autoInsertedEnd":true,"dsr":[25,36,4,0]}'><td data-parsoid='{"tsr":[29,33],"stx":"html","autoInsertedEnd":true,"dsr":[29,36,4,0]}'>foo</td></tr></tbody></table></body>

DOMPostProcessor has 3 instances where <span>s are inserted into the DOM during tpl encapsulation without checking if those spans might get fostered out.

id/Pengguna%3AWilliam_Surya_Permana%2FKontribusi has an example where such a span is added to wrap whitespace content and then gets fostered out when it hits the visual editor and introduces a RT diff on serialization.

The <span> instance in dom.wrapTemplates.js is checked for in, 1dba828d7eddf9f00aea6b47242ecfc4aedcdd07.

And the current output of echo above no longer has the meta in a fosterable position,

<body data-parsoid='{"dsr":[0,36,0,0]}'><meta property="mw:PageProp/notoc" data-parsoid='{"magicSrc":"NOTOC","fostered":true,"dsr":[0,36,null,null],"pi":[[{"k":"1","spc":["","","",""]}]]}' about="#mwt2" typeof="mw:Transclusion" data-mw='{"parts":["<table>",{"template":{"target":{"wt":"echo","href":"./Template:Echo"},"params":{"1":{"wt":"NOTOC"}},"i":0}},"<tr><td>foo"]}'><table data-parsoid='{"stx":"html","autoInsertedEnd":true,"dsr":[0,null,7,0]}' about="#mwt2"><tbody data-parsoid='{"dsr":[25,null,0,0]}'><tr data-parsoid='{"stx":"html","autoInsertedEnd":true,"dsr":[25,null,4,0]}'><td data-parsoid='{"stx":"html","autoInsertedEnd":true,"dsr":[29,null,4,0]}'>foo</td></tr></tbody></table></body>

Can you verify output on that page in #c8 and close this ticket as necessary?

For the record, foster-parenting behaviour in the HTML5 library was fixed in 0d6c4aaf4e3621d576c7a15a84b243af71af793b.

Also, all 3 instances where <span>s are inserted in dom.wrapTemplates.js are now checked for DU.isFosterablePosition().

Further, I verified that the rt diff in #c8 doesn't appear to contain any fostered <span>s.

Closing.