
Parsoid removed linebreaks in wikitable; <nowikied> some resulting in-cell pipes in a following edit
Closed, Resolved; Public

Description

In this series of edits Parsoid exhibited strange behaviour:

  1. Parsed everything correctly, no problems at all. https://en.wikipedia.org/w/index.php?title=2013_US_Open_%28tennis%29&diff=570482557&oldid=570478151
  2. Removed line breaks in the table, changing:
|{{n/a}}
|{{n/a}}
|{{n/a}}

to:

|{{n/a}}|{{n/a}}|{{n/a}}

https://en.wikipedia.org/w/index.php?title=2013_US_Open_%28tennis%29&diff=570483444&oldid=570482557
  3. Nowikied the pipes it moved in the previous edit.
https://en.wikipedia.org/w/index.php?title=2013_US_Open_%28tennis%29&diff=570484857&oldid=570484805

As far as I can tell, nothing that should make any difference happened between edits 1 and 2 (they were consecutive edits to the page).
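
To spot this kind of damage in a diff without reading it by eye, something like the following rough sketch could help. It is not part of Parsoid or VE; the function name `find_merged_lines` and the sample strings are made up for illustration. It reports every line in the new wikitext that is exactly the concatenation of two or more consecutive lines from the old wikitext.

```python
from typing import List


def _is_concatenation(line: str, old_lines: List[str]) -> bool:
    """True if `line` equals two or more consecutive old lines joined together."""
    for start in range(len(old_lines)):
        joined = old_lines[start]
        for nxt in old_lines[start + 1:]:
            joined += nxt
            if joined == line:
                return True
            if not line.startswith(joined):
                break  # this run of old lines cannot match any more
    return False


def find_merged_lines(before: str, after: str) -> List[str]:
    """Return lines of `after` that look like merged lines of `before`."""
    old_lines = [ln for ln in before.splitlines() if ln]
    unchanged = set(old_lines)
    return [
        line
        for line in after.splitlines()
        if line and line not in unchanged and _is_concatenation(line, old_lines)
    ]


# Hypothetical two-cell table, before and after the line break is lost.
BEFORE = "{|\n|{{n/a}}\n|{{n/a}}\n|}"
AFTER = "{|\n|{{n/a}}|{{n/a}}\n|}"

if __name__ == "__main__":
    print(find_merged_lines(BEFORE, AFTER))
```

An empty result means no line of the new text is a plain merge of old lines; it does not rule out other kinds of whitespace changes.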


Version: unspecified
Severity: normal

Details

Reference
bz53468

Event Timeline

bzimport raised the priority of this task to High. Nov 22 2014, 1:57 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz53468.

The real error here is the loss of line breaks in step #2. The nowiki-ing in step #3 is expected once the pipes have ended up inside a cell. So, if #2 is fixed, this will be fine.

This does not seem to be a simple Parsoid issue; a plain wt2wt round trip keeps the line breaks:

echo -ne '{|\n|{{n/a}}\n|{{n/a}}\n|}' | node parse --wt2wt
{|

|{{n/a}}
|{{n/a}}
|}

Maybe VE dropped the newlines in HTML?

This does actually seem to be a selser (selective serialization) issue, probably in the separator handling:

in.txt:
{|

|{{n/a}}
|{{n/a}}
|}

node parse < in.txt > out.html

cat out.html | sed s/a/b/ | node parse --oldtextfile in.txt --oldhtmlfile /tmp/.html --selser --html2wt
{|

|{{n/a}}|{{n/a}}
|}

Change 81611 had a related patch set uploaded by GWicke:
Bug 53468: Don't consume trailing IEW when skipping about siblings

https://gerrit.wikimedia.org/r/81611

Change 81611 merged by jenkins-bot:
Bug 53468: Don't consume trailing IEW when skipping about siblings

https://gerrit.wikimedia.org/r/81611
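
For readers unfamiliar with the patch title, here is a toy model of what "consuming trailing IEW when skipping about siblings" could look like. This is only a sketch under my own assumptions, not the actual Parsoid code: the `Node` class, the `serialize` function, the `consume_trailing_iew` flag and the `#mwt1` / `#mwt2` ids are all invented for illustration. The idea is that nodes generated by one template share an "about" id, the serializer emits the template source once and skips the remaining siblings, and the bug is that the skip also swallows the inter-element whitespace (IEW) that follows them.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    src: str                     # wikitext this node serializes to
    about: Optional[str] = None  # shared id of nodes produced by one template
    is_whitespace: bool = False  # inter-element whitespace (IEW)


def serialize(nodes: List[Node], consume_trailing_iew: bool) -> str:
    """Serialize siblings, emitting each template group only once."""
    out = []
    i = 0
    while i < len(nodes):
        node = nodes[i]
        out.append(node.src)
        i += 1
        if node.about is not None:
            # Skip the other siblings generated by the same template.
            while i < len(nodes) and nodes[i].about == node.about:
                i += 1
            if consume_trailing_iew:
                # Modelled bug: the whitespace after the group is dropped too.
                while i < len(nodes) and nodes[i].is_whitespace:
                    i += 1
    return "".join(out)


# Hypothetical sibling list for a table with two template-generated cells.
NODES = [
    Node("{|"),
    Node("\n", is_whitespace=True),
    Node("|{{n/a}}", about="#mwt1"),
    Node("", about="#mwt1"),
    Node("\n", is_whitespace=True),
    Node("|{{n/a}}", about="#mwt2"),
    Node("", about="#mwt2"),
    Node("\n", is_whitespace=True),
    Node("|}"),
]

if __name__ == "__main__":
    print(repr(serialize(NODES, consume_trailing_iew=True)))
    print(repr(serialize(NODES, consume_trailing_iew=False)))
```

In this model, leaving the trailing whitespace alone gives back the original text, while consuming it joins the cells onto one line. The real serializer is more involved, so treat this only as an illustration of the patch title.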