
Parsoid doesn't handle unclosed table attributes (because of missing quote) in all scenarios
Closed, Resolved, Public

Description

This seems to be caused by a missing closing quote in an attribute in [[pl:Szablon:Infobox mapa lokalizacyjna/core]] (after "margin:3px;"). The opening tag missing the quote is rendered as text.

This is of course trivial to fix in the template itself, but you guys might want to support this - please either make it work or close as INVALID and I'll be happy :)


Version: unspecified
Severity: normal
See Also:
https://bugzilla.wikimedia.org/show_bug.cgi?id=51043

Details

Reference
bz49839

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 2:08 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz49839.

We do go to some effort to support broken wikitext, but may not handle all kinds of broken wikitext well. See https://git.wikimedia.org/blob/mediawiki%2Fcore.git/7fadd2c9f3fb8914f5ac4aabddbb761d66427c64/tests%2Fparser%2FparserTests.txt#L15230 for some tests we've added.

While we'll fix what we can, please fix the template as well :) -- I think it is a good idea to keep fixing broken templates/wikitext so the need for hacks on our end is reduced.

We'll keep this ticket around to see if we want to do any additional fixes on the Parsoid end, or take a test snippet out of this and close this out.

Thanks for the reply.

For the record, I fixed[1] the template at pl.wp and the issue is no longer visible. (The infobox still looks funny in VE, but that's another thing.)

[1] https://pl.wikipedia.org/w/index.php?title=Szablon:Infobox_mapa_lokalizacyjna/core&diff=36825693&oldid=36435239

*** Bug 51786 has been marked as a duplicate of this bug. ***

Test snippet from bug 51786:

You can reproduce the bug with the following snippet in your sandbox (all the wikitext elements below are necessary for the bug to show up in its nasty form -- with some pieces removed, Parsoid handles it much better).

:{|
|-
! bgcolor="bbbbbb" | Game postponed
|}

{|
|- bgcolor="
| a || b
|}

As a short-term improvement, would it be possible to get syntax like this <nowiki>ed rather than deleted? Nowikis are tracked, so they'll be easier to find, and that will make the markup easier to fix.

If we can tokenize the bad wikitext correctly, we can do whatever we want with it, including fixing it up. To recognize it as bad wikitext and add nowikis, we first have to fix the parser to deal with it, at which point we can fix up the wikitext automatically. So it is not necessarily any easier to wrap it in nowikis.
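
As a rough illustration of the "recognize it first" point, here is a minimal Python sketch that flags table-start and row lines whose attribute text contains an odd number of double quotes. It is not Parsoid's tokenizer: the helper name `find_unclosed_row_attrs` and the line-based heuristic are assumptions made for this example, and it only looks at `{|` and `|-` lines.

```python
import re

# Hypothetical helper, not part of Parsoid: a line-based heuristic only.
# Matches lines that open a table ("{|") or a row ("|-"), optionally
# preceded by indentation colons, and captures the attribute text.
ROW_START = re.compile(r"^\s*:*\s*(\{\||\|-)(.*)$")

def find_unclosed_row_attrs(wikitext):
    """Return (line_number, line) pairs for table-start or row lines
    whose attribute text contains an odd number of double quotes."""
    flagged = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        match = ROW_START.match(line)
        if match is None:
            continue
        attrs = match.group(2)
        if attrs.count('"') % 2 == 1:
            flagged.append((number, line))
    return flagged

# The test snippet from bug 51786, unchanged.
snippet = """:{|
|-
! bgcolor="bbbbbb" | Game postponed
|}

{|
|- bgcolor="
| a || b
|}"""

print(find_unclosed_row_attrs(snippet))  # [(7, '|- bgcolor="')]
```

A check like this only locates candidate lines; deciding what to do with them (nowiki-wrap, auto-fix, or leave alone) is the harder part discussed above.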

That said, we'll consider solutions if we can think of some way to protect this and other forms of broken wikitext. In some cases, we do have enough information to do auto-fixes or protect broken wikitext (See: bug 46705).

https://en.wikipedia.org/w/index.php?title=Condor_Flugdienst&diff=565754331&oldid=562922174 looks like this bug (or possibly 44498), but I can't spot any unclosed attributes, and find-and-replace tells me there is an even number of quotes.
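
One caveat about the find-and-replace check: an even total does not rule out unbalanced quotes, since two lines with one stray quote each also add up to an even number. A small Python sketch of a per-line check follows; the helper name `lines_with_odd_quotes` and the sample text are made up for illustration, and this says nothing about whether quotes were actually the cause of that particular diff.

```python
def lines_with_odd_quotes(text):
    """Return the 1-based numbers of lines holding an odd number of '"'."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.count('"') % 2 == 1
    ]

# Illustrative input: two lines, each missing its closing quote.
sample = '{| class="wikitable\n|- style="color:red\n| cell\n|}'

print(sample.count('"') % 2 == 0)     # True: the total is even
print(lines_with_odd_quotes(sample))  # [1, 2]
```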

Is that edit to Condor_Flugdienst (un)intentional vandalism? I couldn't reproduce the error. However, judging by the diff, the entire header appears to have been deleted. There is also some text that is not in the original page ("flotte CondorAvionEn servicePassagers"). So I don't know what exactly happened there beyond that.

Arlolra set Security to None.
Arlolra claimed this task.
Arlolra subscribed.

We do much better on the test snippet in T51839#585006 these days.