Images: wikitext tables inside image captions are lost
Closed, Resolved (Public)

Description

When editing a page that has an image with a table in its caption, the table is dropped (only its closing tag is kept). The rest of the caption is dropped as well.

Test edit: https://hu.wikipedia.org/w/index.php?title=Vall%C3%A1s&diff=13696305&oldid=13696296

You don't need to interact with the image in any way for this to happen; just edit the page, change anything, and save.


Version: unspecified
Severity: major
URL: https://hu.wikipedia.org/wiki/Vall%C3%A1s?oldid=13696307
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=61064

Details

Reference
bz49942

Event Timeline

bzimport raised the priority of this task to High. (Nov 22 2014, 1:47 AM)
bzimport added a project: Parsoid.
bzimport set Reference to bz49942.

A horrible, relatively rare case for Parsoid is the use of wikitext tables inside image captions. Currently, the following wikitext:

[Start]
[[Image:Foo.jpg|thumb|200px|This is an example image thumbnail caption with a table

{|
! Foo !! Bar
|-
| Foo1 || Bar1
|-
| Foo2 || Bar2
|}

… and some more text.]]
[End]

… turns into the following output from Parsoid:

[Start]
<figure typeof="mw:Image/Thumb" data-parsoid="…">

<a href="./File:Foo.jpg" data-parsoid="…">
  <img resource="./File:Foo.jpg" src="http://upload.wikimedia.org/wikipedia/commons/thumb/0/06/Foo.jpg/200px-Foo.jpg"
    height="131" width="200" data-parsoid="…">
</a>
<figcaption class="mw-figcaption" data-parsoid="…">}

… and some more text.

</figcaption>

</figure>
[End]

… whereas nominally it should be the following (generated using an HTML table rather than a wikitext one, which seems to work fine, without DOM diffs from the VE end):

[Start]
<figure typeof="mw:Image/Thumb" data-parsoid="…">

<a href="./File:Foo.jpg" data-parsoid="…">
  <img resource="./File:Foo.jpg" src="http://upload.wikimedia.org/wikipedia/commons/thumb/0/06/Foo.jpg/200px-Foo.jpg"
    height="131" width="200" data-parsoid="…">
</a>
<figcaption class="mw-figcaption" data-parsoid="…">This is an example image thumbnail caption with a table

  <table data-parsoid="…">
    <tbody data-parsoid="…">
      <tr data-parsoid="…">
        <th data-parsoid="…">Foo</th>
        <th data-parsoid="…">Bar</th>
      </tr>
      <tr data-parsoid="…">
        <td data-parsoid="…">Foo1</td>
        <td data-parsoid="…">Bar1</td>
      </tr>
      <tr data-parsoid="…">
        <td data-parsoid="…">Foo2</td>
        <td data-parsoid="…">Bar2</td>
      </tr>
    </tbody>
  </table>

… and some more text.

</figcaption>

</figure>
[End]

This is a tokenizer issue. The table is not tokenized as such, likely because syntax stops prevent a pipe from being matched.
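
To illustrate why the caption ends up as just "}" plus the trailing text, here is a minimal Python sketch. It is not Parsoid's tokenizer (which is grammar-based); the functions `split_options_naive` and `split_options_table_aware` are hypothetical names made up for this example, and the sample assumes the table in the caption uses standard `{|` … `|}` wikitext markup. The first function splits the image options on every pipe, so the table's own pipes are treated as option separators; the second only treats a pipe as a separator when it is outside a table.

```python
from typing import List

# Inner text of the [[...]] image link from the report.
# Assumption: the table uses standard {| ... |} wikitext markup.
INNER = (
    "Image:Foo.jpg|thumb|200px|This is an example image thumbnail caption with a table\n"
    "\n"
    "{|\n"
    "! Foo !! Bar\n"
    "|-\n"
    "| Foo1 || Bar1\n"
    "|-\n"
    "| Foo2 || Bar2\n"
    "|}\n"
    "\n"
    "… and some more text."
)


def split_options_naive(inner: str) -> List[str]:
    """Split on every pipe; table pipes are mistaken for option separators."""
    return inner.split("|")


def split_options_table_aware(inner: str) -> List[str]:
    """Split on pipes only when outside a {| ... |} table (hypothetical helper)."""
    parts: List[str] = []
    buf: List[str] = []
    depth = 0              # nesting depth of open wikitext tables
    at_line_start = True   # table start/end markers only count at line start
    i = 0
    while i < len(inner):
        two = inner[i:i + 2]
        if at_line_start and two == "{|":
            depth += 1
            buf.append(two)
            i += 2
            at_line_start = False
            continue
        if depth > 0 and at_line_start and two == "|}":
            depth -= 1
            buf.append(two)
            i += 2
            at_line_start = False
            continue
        ch = inner[i]
        if ch == "|" and depth == 0:
            # A pipe outside any table separates image options.
            parts.append("".join(buf))
            buf = []
        else:
            buf.append(ch)
        at_line_start = ch == "\n"
        i += 1
    parts.append("".join(buf))
    return parts


if __name__ == "__main__":
    print(repr(split_options_naive(INNER)[-1]))
    print(len(split_options_table_aware(INNER)))
```

With the naive split, the last segment (the one taken as the caption) is only the "}" followed by "… and some more text.", which matches the figcaption content in the Parsoid output above. The table-aware variant yields four options, with the whole table kept inside the caption. The sketch ignores indented table markers, templates, and nested links.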

If this image is not modified by the user it should still round-trip correctly, but on modification the table would be lost.

(In reply to comment #2)
> If this image is not modified by the user it should still round-trip
> correctly, but on modification the table would be lost.

The table is lost even if the user does not interact with the image in any way. See [[hu:Vallás]] - the image caption gets garbled as soon as you open the article for editing.

(In reply to comment #3)
> (In reply to comment #2)
> > If this image is not modified by the user it should still round-trip
> > correctly, but on modification the table would be lost.
>
> The table is lost even if the user does not interact with the image in any
> way. See [[hu:Vallás]] - the image caption gets garbled as soon as you open
> the article for editing.

The corruption that happens even if you don't touch the image is a bug in VisualEditor related to munging the captions of images, which we're fixing today; sorry for that.

[Parsoid component reorg by merging JS/General and General. See bug 50685 for more information. Filter bugmail on this comment. parsoidreorg20130704]

Change 107172 had a related patch set uploaded by Subramanya Sastry:
(Bug 49942) Tables can show up in captions

https://gerrit.wikimedia.org/r/107172

Change 107172 merged by Subramanya Sastry:
(Bug 49942) Tables can show up in image captions

https://gerrit.wikimedia.org/r/107172