Page MenuHomePhabricator

VisualEditor: Sometimes it's somehow possible for VE/Parsoid to insert a <br> at the start of a heading
Closed, ResolvedPublic

Description

See:
http://en.wikipedia.org/w/index.php?title=User%3ACananian&diff=564446937&oldid=564446873

I was adding a heading (after a <!-- ... --> comment) and the diff stuck a <br> inside the ==heading==.

I'm assigning this bug to VE for the moment, but it could be a parsoid bug. I need to see the intermediate parsoid HTML generated for the change to tell one way or the other.


Version: unspecified
Severity: minor

Details

Reference
bz51444

Event Timeline

bzimport raised the priority of this task from to Low.Nov 22 2014, 2:02 AM
bzimport set Reference to bz51444.

AzaToth made a screencast at http://youtu.be/G_xXETTVgO0 showing the issue.

Gwicke filed bug 50683 on the parsoid side -- we shouldn't allow newlines in a heading when we serialize to wikitext.

But there's still a VE issue -- where is the ↵ coming from (visible in AzaToth's screencast). VE shouldn't be putting the newline inside the heading.

  • Bug 52021 has been marked as a duplicate of this bug. ***

Does anyone have any steps to reproduce? I can't reproduce in Firefox or Chrome whilst playing around…

The steps are clear in AzaToth's video, but I can't reproduce that either: have you fixed something about ↵ being displayed, this week?

(BTW, I really couldn't play that vid on YouTube.)

Jdforrester-WMF claimed this task.

No reproduction steps in > 18 months. If you can give some more details, please re-open.

Elitre reopened this task as Open.EditedMar 6 2015, 2:23 PM

Providing steps.

  1. Start with a new or a blank page.
  2. In wikitext mode, write something on the first line, leave a blank line, write a hidden comment, leave a blank line, write something else, then Save.
  3. Now enter VE mode. Notice the hidden comment is correctly displayed as an exclamation mark; there is now a carriage return sign at the beginning of the last line.
  4. Put the cursor in the blank line between the comment and the last written line. You can now write any word, select it and turn it into a heading or a subheading; notice the entire line below, including the carriage return sign, also becomes part of said heading/subheading.
  5. Save. You now have wikitext for the heading/subheading you just made, showing up in plain sight. You can also notice that the carriage return character was turned into a br tag.

Reproducible: Always.
Tested in Win 8 with the latest versions of Chrome, Firefox, IE, Opera (the last two while logged out).

Elitre renamed this task from VisualEditor: Sometimes it's somehow possible for VE/Parsoid to insert a <br/> at the start of a heading to VisualEditor: Sometimes it's somehow possible for VE/Parsoid to insert a <br> at the start of a heading.Mar 31 2015, 12:44 PM
Elitre updated the task description. (Show Details)
Elitre set Security to None.

The page is nominated for deletion, but here's another example.

The original document was serialized with the <br/>:

$ (echo 'foo' ; echo; echo '<!--comment-->'; echo; echo 'bar') | tests/parse.js  --normalize=parsoid

<p>foo</p>
<!--comment-->
<p><br/> bar</p>

I wonder if we should be generating <p><!--comment--></p> for that case, instead of the <br/> embedded inside the next paragraph. Our current behavior is consistent with PHP:

$ (echo 'foo' ; echo; echo '<!--comment-->'; echo; echo 'bar') | php maintenance/parse.php --quiet
<p>foo
</p><p><br />
bar
</p>

...but it's certainly not ideal from an editing perspective. Maybe PHP should change here as well, although I'm guessing the reason it is putting the <br/> inside the <p> is because otherwise tidy would strip the "empty" <p> tag.

In any case, given this input DOM, VE's behavior is strange but understandable -- when you put your cursor between the exclamation mark and the last written line, VE is placing its internal insertion point between the <p> and the <br/>, so any heading conversion applies to the entire <p> contents, including the embedded <br/>. The result that VE passes back to parsoid is then unsurprising:

<p id="mwAQ">some text</p> <!-- hidden comments FTW --><h2>Foo<br id="mwAw"> moar text here</h2>

and Parsoid gives a faithful serialization of this as wikitext.

Would fixing the PHP Parser require us to have replaced Tidy first, then?

Maybe worth handling this in the normalization routines.

WYSIWYG is tricky here. In this case, the user clearly saw the entire paragraph, including the newline, being styled as part of the heading. If we handle this during normalization, then the result after saving will not be consistent with what the user saw during editing.

@Jdforrester-WMF It's possible that existing-tidy wouldn't remove <p><!--comment--></p> because of the embedded comment. Currently PHP elides all the comments, so it hands tidy the string <p></p> which tidy happily removes. So it's possible we could finesse this somehow, for example by emitting a stub comment from the PHP parser. Investigation needed.

Unless it advances the consistency of parsing, I am not convinced this is worth a lot of effort beyond normalization during serialization. The WYSIWYG behavior is clearly an edge-case since wikitext does not support multi-line headings. So, if VE folks are with it, I am fine with it.

If you want an example of how that happens on a private wiki, see above line 114 here. It doesn't involve hidden comments.

Change 253050 had a related patch set uploaded (by Subramanya Sastry):
T53444: Strip <br>s from headings via new HTML normalization routine

https://gerrit.wikimedia.org/r/253050

Change 253050 merged by jenkins-bot:
T53444: Strip <br>s from headings via new HTML normalization routine

https://gerrit.wikimedia.org/r/253050

Parsoid now normalizes this HTML to strip <br> from headings - the code is now in production.