Page MenuHomePhabricator

Incorrect parsing of table headings and cells on the same line
Open, LowestPublic

Description

{|
|-
! 1 || a || b || c
|-
|}

produces:

<table>
<tr>
<th>1</th>
<th>a</th>
<th>b</th>
<th>c</th>
</tr>
</table>

instead of:

<table>
<tr>
<th>1</th>
<td>a</td>
<td>b</td>
<td>c</td>
</tr>
</table>

which is produced by:

{|
|-
! 1
| a || b || c
|-
|}

Details

Reference
bz549

Event Timeline

bzimport raised the priority of this task from to Lowest.Nov 21 2014, 6:57 PM
bzimport set Reference to bz549.
bzimport added a subscriber: Unknown Object (MLST).

qleah wrote:

Patch to correct table parsing

This patch allows captions, cells, and headings to occur on the same line.
Captions may only occur at the beginning of a line; cells and headings may
occur in any order.

attachment th.diff ignored as obsolete

wmahan_04 wrote:

(In reply to comment #1)

Patch to correct table parsing

Could you please provide a diff against CVS HEAD? It doesn't apply
to any recent revision.

avarab wrote:

Added the need-parsertest keyword and removed the patch keyword since it doesn't
apply to any recent revision.

zigger wrote:

*** Bug 38 has been marked as a duplicate of this bug. ***

ayg wrote:

*** Bug 4759 has been marked as a duplicate of this bug. ***

ayg wrote:

Should we fix this so it behaves logically and break stuff, or keep it behaving
oddly forever forever? This is probably pretty low-impact, and for any WMF
project anyone with a DB dump could locate all problem pages for fixing by a bot.

This has been here forever and people seem to know the work-arounds. We may try to make the behavior more consistent in the new parser, but, unless someone steps forward with an urgent use-case, this is really low priority.

*Bulk BZ Change: +Patch to open bugs with patches attached that are missing the keyword*

Someone's got to close this bug...

I think after 10 years it's safe to say this is now considered a feature. Changing this now would break a lot of existing wiki syntax out there, on WMF and on external installations.

(In reply to comment #9)

Someone's got to close this bug...

I think after 10 years it's safe to say this is now considered a feature.
Changing this now would break a lot of existing wiki syntax out there, on WMF
and on external installations.

Is there any evidence for that? I would expect that, as a minimum, a grep of the English WP should be made to see to what extent the incorrect construction is actually being used, rather than basing this on a, possibly incorrect, assumption.

Also, has this been closed by group consensus, or is it on your own initiative? I would say that this is one place where 'be bold' should not apply.

I vote to reopen, until we have some actual data on which to base the decision.

No, just my initiative. I didn't really expect anyone to care about this bug any more. If you would like to reopen, go ahead.

Realistically though, the chance of this getting "fixed" is negligible. Whether you grep enwiki or not, any change to core wikitext (particularly something as basic and widespread as table headers) is going to break stuff for some people somewhere on some wiki. People are going to use the markup that generates the look they want, so we must assume all uses of this ! || || syntax are intended to produce the "erroneous" look.

I would agree with WONTFIXING this. I have some experience with changing behavior to "fix" an old bug. The result wasn't pretty.

I haven't been following the discussions relating to Parsoid for quite a while, but I know that there was general acknowledgement early-on that there are a fairly large number of edge-case situations like this where the current parser does not behave as you might expect, and that in an ideal world these would all be fixed by the new parser. It was also acknowledged that this would, potentially, break a lot of stuff, and so fixing them might not actually be sensible.

I'm sure that a general approach has been adopted by now, so from my point of view this should follow whatever approach has been adopted for other similar issues.

  • If they are all WONTFIXed without any further research, due to potential breakage, then this should also be a WONTFIX.
  • If they are all being kept open, with a plan to resolving them in the new parser, then this should be re-opened.
  • If there is some kind of flag (global or per-feature) to enable old broken behaviour, then this should be re-opened and fixed, with an appropriate flag used to restore the existing broken behaviour.
  • If - as I suspect - they were handled on a case-by-case basis, based on some metric (e.g. prevalence in some corpus) then that same metric should be applied here.

I guess my point is that this shouldn't be handled in isolation by one person (or a couple of people) expressing an opinion. It should be handled formally via the parser team in a manner consistent with other similar issues.

On that principal, I am re-opening. Am happy for it to be re-closed by an appropriately informed person, so long as it is handled consistently with other issues and appropriate justification is given.

You're right that a more consistent approach is needed. I would welcome a "bug sprint" in the Parser category.

There are 440 open parser bugs, and while many of them are totally legitimate bugs needing to be fixed, there are a good few handfuls of bug reports that lack a clear course of action, like this one. A bug sprint would get the same pair of eyes on each of the issues, to judge them equally and knowledgeably.

Adding the bug wrangler since he has interest in bug sprints.

Izno updated the task description. (Show Details)
Izno removed a project: Parser.
Izno updated the task description. (Show Details)
Izno removed a subscriber: wikibugs-l-list.