Many DOM passes depend on accurate identification of foster-parented content (see http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#foster-parent). We have already implemented some detection in dom.markFosteredContent.js, but we still depend on a hack (formerly a convenient bug) in the HTML5 treebuilder that disables fostering for meta tags.
It would be great if we could generalize and improve the existing algorithm so that
- it can be run as a first pass on the DOM,
- its marking of fostered content can be relied upon by all other DOM passes,
- it properly detects fostering of pure text, and
- we can remove the no-fostering-for-metas hack from the HTML5 treebuilder.
Fostered text detection can probably be addressed with this trick:
For each <table> TagTk, we can prepend a <meta typeof="mw:FosterMarker"> SelfclosingTagTk just before assigning tagId sequence numbers and feeding the tokens to the treebuilder. This will then create a 'fostering box' in the DOM:
content..
<meta typeof="mw:FosterMarker" data-parsoid="{tagId: 3}"/>
potentially fostered content
<table data-parsoid="{tagId: 4}">..</table>
Fostered element content will have higher tagIds than both the meta and the table.
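The token-level half of the trick could look roughly like the sketch below. It is not the existing implementation: the token objects are simplified stand-ins for the real TagTk / SelfclosingTagTk classes, and the function name addFosterMarkers and the dataAttribs field are assumptions made for illustration. Text is modeled as plain strings.

```javascript
// Sketch only: simplified token shapes, not the real tokenizer classes.
// addFosterMarkers is a hypothetical helper name.
function addFosterMarkers(tokens) {
  const out = [];
  for (const tok of tokens) {
    // Prepend a marker meta before every <table> start tag.
    if (tok.type === 'TagTk' && tok.name === 'table') {
      out.push({
        type: 'SelfclosingTagTk',
        name: 'meta',
        attribs: { typeof: 'mw:FosterMarker' },
        dataAttribs: {},
      });
    }
    // Copy non-text tokens so the caller's input is left untouched.
    out.push(typeof tok === 'string'
      ? tok
      : { ...tok, dataAttribs: { ...tok.dataAttribs } });
  }

  // Assign tagIds only after the markers are in place, so that each
  // marker gets a lower tagId than the table it precedes.
  let nextId = 1;
  for (const tok of out) {
    if (tok.type === 'TagTk' || tok.type === 'SelfclosingTagTk') {
      tok.dataAttribs.tagId = nextId++;
    }
  }
  return out;
}
```

Because the marker is inserted before the ids are handed out, the meta and its table end up with consecutive tagIds, and any element token that follows the table start tag gets a higher one.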
A complication we should ignore for now is nesting such as <table><meta><table>..; let's tackle those rare edge cases later.
The goal is to mark all fostered content with data.parsoid.fostered. Fostered text nodes need to be wrapped in a span for this. The extra meta tags for fostering detection should be stripped so that they don't interfere with later passes.
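A minimal sketch of such a DOM pass follows. It works on a toy tree of plain objects rather than the real DOM, so the node shape ({type, name, attrs, dataParsoid, children}) and the function names are assumptions for illustration only. It treats everything between a marker meta and the next sibling table as fostered, wraps text in a span, and drops the marker. It ignores the nested <table><meta><table> case and does not use the tagId ordering as a cross-check.

```javascript
// Sketch only: a toy tree of plain objects stands in for the real DOM.
const MARKER_TYPE = 'mw:FosterMarker';

function isMarker(node) {
  return node.type === 'element' && node.name === 'meta' &&
    node.attrs.typeof === MARKER_TYPE;
}

function isTable(node) {
  return node.type === 'element' && node.name === 'table';
}

// Flag an element as fostered; wrap a text node in a flagged span.
function asFostered(node) {
  if (node.type === 'text') {
    return {
      type: 'element', name: 'span', attrs: {},
      dataParsoid: { fostered: true }, children: [node],
    };
  }
  node.dataParsoid = { ...node.dataParsoid, fostered: true };
  return node;
}

function markFosteredContent(parent) {
  const kids = parent.children;
  const result = [];
  for (let i = 0; i < kids.length; i++) {
    const node = kids[i];
    if (!isMarker(node)) {
      result.push(node);
      continue;
    }
    // Marker found: look ahead for the table that closes the box.
    let end = i + 1;
    while (end < kids.length && !isTable(kids[end]) && !isMarker(kids[end])) {
      end++;
    }
    if (end < kids.length && isTable(kids[end])) {
      for (let j = i + 1; j < end; j++) {
        result.push(asFostered(kids[j]));
      }
      i = end - 1; // the table itself is pushed by the next iteration
    }
    // The marker is never pushed, so it is stripped in every case.
  }
  parent.children = result;
  for (const child of result) {
    if (child.type === 'element') {
      markFosteredContent(child);
    }
  }
  return parent;
}
```

If no table follows a marker among its siblings, the sketch strips the marker and marks nothing, which errs on the side of not flagging content as fostered.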
Version: unspecified
Severity: normal