
Newline as list item terminator is troublesome
Closed, Duplicate (Public)

Description

Author: mediazilla

Description:
Currently, list items are terminated by a newline. I'd like that rule to be
modified so that list items are terminated when the next line begins with some
sort of list specifier, or when the next line is empty (or perhaps only
whitespace). Lines that begin without list specifiers should be treated as
separate paragraphs.

The current rule makes it very troublesome to create list items with multiple
paragraphs, and as far as I can tell, there is no way to embed preformatted text
in a list item. For example:

# This is the first list item, first paragraph
This is the first list item, second paragraph
  Here is some preformatted text with the first list item
  Here is the rest of the preformatted text
# This is the second list item (numbered as 2)

With the current rule, the above is rendered as a single item list, then a plain
text paragraph, then a preformatted text paragraph, then a second single item
list whose number starts with 1. This makes it difficult to create sophisticated
lists. Under my proposed change, the above would be rendered instead as a single
list, where the first item contains two plain text paragraphs and one
preformatted text paragraph.
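
The proposed rule can be sketched in a few lines of Python. This is only an illustration of the request, not parser code: the names `LIST_MARKER` and `group_list_items` are made up here, and the sketch assumes that any run of `#`, `*`, `;` or `:` at the start of a line counts as a list specifier.

```python
import re

# Assumed set of list specifiers; hypothetical, for illustration only.
LIST_MARKER = re.compile(r"^[#*;:]+")

def group_list_items(lines):
    """Group lines into list items under the proposed rule.

    An item starts at a line beginning with a list marker and runs until
    the next marker line or the first blank (whitespace-only) line.
    Returns (items, rest): items is a list of lists of lines, rest holds
    the lines that follow the list.
    """
    items = []
    for index, line in enumerate(lines):
        if not line.strip():
            # A blank line ends the whole list.
            return items, lines[index + 1:]
        if LIST_MARKER.match(line):
            # A marker line starts a new item.
            items.append([LIST_MARKER.sub("", line, count=1).strip()])
        elif items:
            # Any other line continues the current item.
            items[-1].append(line)
        else:
            # Text before any marker is not part of a list.
            return items, lines[index:]
    return items, []
```

Fed the five example lines above, this yields two items, the first holding four lines, instead of two separate single-item lists.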

This change is mostly compatible with current Wiki pages, since most authors
would normally place an empty line after their list before starting non-list
text. One could justify the use of a single newline to start a new paragraph
instead of an empty line because one is in a list context. (i.e. the whole list
is a single paragraph.)

By the way, as a workaround I'm manually numbering my paragraphs:

1. This is the first list item, first paragraph

This is the first list item, second paragraph

  Here is some preformatted text with the first list item
  Here is the rest of the preformatted text

2. This is the second list item (numbered as 2)

Please let me know if you have a better workaround.


Version: unspecified
Severity: enhancement
See Also:
T60429: Translation pages: display outdated translations as such, without removing them

Details

Reference
bz1115

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 8:07 PM
bzimport set Reference to bz1115.
bzimport added a subscriber: Unknown Object (MLST).

tmb wrote:

Severe problems arise when users break an <a href= >.. </a> pair with a newline.
The list item is then closed with </li> before the <a> tag is closed, resulting
in invalid HTML:

Example Wiki Code:
<pre>

[[WikiArticle_with_a_very_long_name_going_right_to_the_end_of_the_line|

Short Description]]
</pre>

Results in the following code:

<pre>
<ol><li> <a
href="/WikiDev/index.php?title=WikiArticle_with_a_very_long_name_going_right_to_the_end_of_the_line&amp;action=edit"
class="new" title ="WikiArticle with a very long name going right to the end of
the line">
</li></ol>
<p>Short Description</a>
</pre>

This code might not be rendered correctly by all available browsers.
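
The mis-nesting can be confirmed mechanically. The following is a rough sketch using Python's standard `html.parser` module; the class `NestingChecker` and the helper `nesting_errors` are names invented for this illustration, and the check is deliberately naive (it ignores void elements and optional end tags).

```python
from html.parser import HTMLParser

class NestingChecker(HTMLParser):
    """Record every end tag that does not close the innermost open tag."""

    def __init__(self):
        super().__init__()
        self.open_tags = []
        self.errors = []

    def handle_starttag(self, tag, attrs):
        self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if self.open_tags and self.open_tags[-1] == tag:
            self.open_tags.pop()
        else:
            # The end tag arrives while another element is still open.
            self.errors.append(tag)

def nesting_errors(html):
    """Return the list of end tags that were found out of order."""
    checker = NestingChecker()
    checker.feed(html)
    checker.close()
    return checker.errors
```

For a shortened version of the output above, `nesting_errors('<ol><li> <a href="/x"></li></ol><p>Short Description</a>')` reports the `li`, `ol` and `a` end tags as out of order, while the same link kept on one line produces no errors.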

casutton wrote:

This seems to be very similar to bug 1584. Probably one of them should be
marked as dup, but not sure which...

*** Bug 2280 has been marked as a duplicate of this bug. ***

jiangxin wrote:

I have fixed this bug, patch is here:
http://meta.wikimedia.org/wiki/User:Jiangxin/Patch_Blankline_As_List_Terminator

wiki code:

#item 1
blah blah blah...

item 1-1 \

continued
blah blah blah...

item 1-2

<pre>
pre text of item 1.2

above is a blank line

  1. in pre block
  2. in pre block

</pre>
#item 2
#:<pre>
blah blah blah...
</pre>

another paragraph.

rendered as:

<ol><li>item 1<br>
blah blah blah...
<ol><li>item 1-1
continued<br>
blah blah blah...
</li><li>item 1-2<br>
<pre>
pre text of item 1.2

above is a blank line

  1. in pre block
  2. in pre block

</pre>
</li></ol>
</li><li>item 2
<dl><dd><pre>
blah blah blah...
</pre>
</dd></dl>
</li></ol>
<p>another paragraph.
</p>

eminencja wrote:

JiangXin patch (comment 4) looks cool. But here is another suggestion:

Let me explicitly start and terminate lists.

A list should be started with #( and terminated with )#.
The #( string should be translated to <OL><LI>;
)# should terminate the list with </OL>.

List items should be entered as usual, with a hash #.

A similar extension should be used for bulleted lists, i.e. *( and )*.
Note that such markup should be fairly easy to parse.

*** Bug 5308 has been marked as a duplicate of this bug. ***

adwojnowski wrote:

This is how to fix it (also filed in bug 5308 at http://bugzilla.wikimedia.org/show_bug.cgi?id=5308):

LET ME EXPLICITLY START AND TERMINATE LISTS.

A list could be started with #( and terminated with )#.

The #( string should be translated to <OL><LI>; )# should terminate the list with </OL>.

List items should be entered as usual, with a hash #.

A similar extension should be used for bulleted lists, i.e. *( and )*.

Note that such markup should be fairly easy to parse. It also does not conflict with existing pages.
Both types of lists styles can co-exist in one document! Editors can continue to use # or * for
simple lists and #( )# or *( )* for more complex layouts.

That would obviously conflict with any list where "(" appeared right
at the start of the line. I'm not sure it looks terribly obvious or
attractive either. Can you provide a sample of what this would look
like in use?

adwojnowski wrote:

Ad. 8

True, in the unlikely scenario that a "(" starts the list item the rendering would change. But the
remedy is easy -- if you really want to start a list item with a "(" -- put a space after the
asterisk/hash, like that

  • (
  • (

BTW, can you find _one_ wiki page that uses list items starting with a "("? I couldn't ;-)

How would that look in real use?

Ok, I want to have a list with three numbered items showing three steps needed to configure
something. I want a screenshot and a paragraph with preformatted text in the list. This is how I would
do it with the #( )# syntax:

Configuring E-mail

#( To use a fancy signature go to the http:/blah url, log in and click the Mail tab. Your screen
should look like in the figure below

[[Image:000.png]]

Select the "Send e-mail in HTML format" checkbox.

Put the following HTML code into the signature field:

<B>Joe Average</B>
<FONT COLOR="blue">Shown in Wiki using preformatted text</FONT>

Click Save when you are done.
)#

That would be rendered as a numbered list, with three items. The list starts with #( and terminates
with )#.
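
To show that such markup is easy to parse, here is a minimal Python sketch of one possible reading of the proposal. It is not taken from any patch: the function name `parse_explicit_list` is hypothetical, and the sketch assumes that `#(` opens the list and its first item, that a line starting with `#` opens the next item, that `)#` on a line of its own closes the list, and that everything else (blank lines included) stays inside the current item.

```python
def parse_explicit_list(lines):
    """Collect the items of the first '#( ... )#' block in lines.

    Returns a list of items, each item being a list of lines.
    Lines before the opening '#(' are skipped.
    """
    items = None
    for line in lines:
        if items is None:
            # Still looking for the opening '#('.
            if line.startswith("#("):
                items = [[line[2:].strip()]]
            continue
        if line.strip() == ")#":
            # Explicit terminator: the list ends here.
            return items
        if line.startswith("#"):
            # A plain hash starts the next item, as in the legacy syntax.
            items.append([line[1:].strip()])
        else:
            # Blank lines, images and preformatted text stay in the item.
            items[-1].append(line)
    if items is None:
        return []
    raise ValueError("list opened with '#(' was never closed with ')#'")
```

Nested `#(` blocks and the bulleted `*( )*` variant are left out to keep the sketch short.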

gangleri wrote:

Is Bug 5121: Virtual lines using a trailing escape character (as "\")
about the same topic?

ayg wrote:

(In reply to comment #8)

That would obviously conflict with any list where "(" appeared right
at the start of the line. I'm not sure it looks terribly obvious or
attractive either.

The consistent thing would be to borrow from table markup. Use {#, {*, {;/{: to
begin the respective lists, and #} *} ;}/:} to end them. A line-initial {#
followed somewhere before the end of the page by a #} is quite unlikely, and
likewise for all the rest.
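
The "quite unlikely" claim could be tested against existing page text before adopting such markup. A rough Python sketch follows; `OPENERS` and `would_change_rendering` are names made up for this example, and the check only looks for a line-initial opener followed anywhere later by its closer.

```python
import re

# Opener/closer pairs from the suggestion above.
OPENERS = {"{#": "#}", "{*": "*}", "{;": ";}", "{:": ":}"}

def would_change_rendering(page_text):
    """Return True if existing wikitext already contains a line-initial
    opener that is followed later on the page by its matching closer."""
    for opener, closer in OPENERS.items():
        pattern = re.compile(r"^" + re.escape(opener), re.MULTILINE)
        for match in pattern.finditer(page_text):
            if closer in page_text[match.end():]:
                return True
    return False
```

A parser function call such as `{{#if: x | y }}` does not trigger the check, because the line starts with `{{` rather than `{#`.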

ayg wrote:

*** Bug 6677 has been marked as a duplicate of this bug. ***

adwojnowski wrote:

(In reply to comment #11)

The consistent thing would be to borrow from table markup. Use {#, {*, {;/{: to
begin the respective lists, and #} *} ;}/:} to end them. A line-initial {#
followed somewhere before the end of the page by a #} is quite unlikely (...)

Ok -- I do not care about the character combination -- as long
as I am able to explicitly close lists I'll be happy :-)

wytzevanderploeg wrote:

Why isn't JiangXin's patch applied in the current versions?

ayg wrote:

Because devs are not always willing to spend their time looking over submitted
patches. This is a volunteer project, remember. The patch will also break
existing behavior by continuing a list item where currently the entire list is
ended.

stijn wrote:

I keep running into this while attempting to use MediaWiki as our helpdesk
procedure reference. It is *very* common in that scenario to want to show some
commands to type using e.g. <pre> in a list item (step 10 of 20). While I can
see that simply applying JiangXin's patch might break existing pages, the ideas
from comment 9 and comment 11 sound like a nice addition to the current markup.
No, I don't have a patch, just commenting to let people know I'd love this
feature too, and I think it's important.

anon.hui wrote:

(In reply to comment #11)

The consistent thing would be to borrow from table markup. Use {#, {*, {;/{: to
begin the respective lists, and #} *} ;}/:} to end them. A line-initial {#
followed somewhere before the end of the page by a #} is quite unlikely, and
likewise for all the rest.

{#, {*, {;, and {: should be unified into only one markup.
MediaWiki should automagically determine the first item inside it, whether it is *, #, ;, or :.

Example (when unified to {* ),

{*

aaa

....

bbb

*}

{*

  • aaa

....

  • bbb

*}

smccandlish wrote:

Bug #1584 is clearly a duplicate of this one, but perhaps more cogent over there.

*** Bug 12449 has been marked as a duplicate of this bug. ***
*** Bug 13642 has been marked as a duplicate of this bug. ***
*** Bug 25587 has been marked as a duplicate of this bug. ***

It's all well and good to mark my bug 25587 as a duplicate, but this issue has been open since 2004 (six years). I believe my approach (in bug 25587) is the simplest yet proposed and should not break existing wiki articles. Most other approaches talk of explicitly opening and closing lists. My approach is simply "continue the previous list."

sumanah wrote:

I am agreeing with S. McCandlish that this issue is a subset of the issues discussed in bug #1584, so I am marking this a duplicate of that.

Also, I'm removing the "need-review" keyword and adding the "reviewed" keyword, because Aryeh reviewed the submitted patch by saying: "The patch will also break existing behavior by continuing a list item where currently the entire list is ended."

This issue also bothers me, so I'm going to add myself to the cc list for bug #1584 and I suggest you do the same, if you haven't already. Brion Vibber and Gabriel Wicke are currently working on rewriting the parser to fix issues like this one, hence the "newparser" keyword on this bug and on bug #1584 -- to follow their work, subscribe to https://lists.wikimedia.org/mailman/listinfo/wikitext-l .

*** This bug has been marked as a duplicate of bug 1584 ***

I'm not sure this is a duplicate; this bug has a more generic summary which also covered some of the issues/solutions which have been duped to it, while bug 1584 is about one specific use case.

*** Bug 8318 has been marked as a duplicate of this bug. ***

If we want backwards compatibility, we cannot abandon these wikitext features:

  • empty lines unconditionally terminate lists, and there is no nesting.
  • lines beginning with "something else" unconditionally terminate lists, and there is no nesting.

I see two ways to deal with that:

  • we add a kind of "list (item) continuation mark" to wikitext which must not be present anywhere yet, not even accidentally, or
  • we use HTML markup for lists in such cases where problems may arise, and if we want to nest lists, wikitext markup can safely be used only on the lowest, innermost, level of nesting. That also implies that transclusions inside wikitext lists are a risk and should generally be avoided.

we add a kind of "list (item) continuation mark" to wikitext which must not be present anywhere yet, not even accidentally, or

Yes, something similar was proposed in https://bugzilla.wikimedia.org/show_bug.cgi?id=9996#c5 ; see also https://meta.wikimedia.org/wiki/Help_talk:List#List-agnostic_markup_insertions

The proposed new syntax with

{#
{*
{;
{:

for the initiators of list items is good.

But the terminators

#}
*}
;}
:}

may cause problems with the existing syntax, because a closing brace may well be part of a legacy list item after its starting prefix (even if they are recognized as list item terminators, rather than as list item initiators, only when they match in context with a list item initiator in the new syntax).

My opinion is that it will be safer if MediaWiki just recognizes the <ol></ol> or <ul></ul> tags and their <li> members, or the <dl></dl> tags and their <dt> or <dd> members (and allows them to be mixed between the legacy wiki syntax and the HTML syntax within the same type of list, including by recognizing a few attributes such as numbering options in ordered lists).

So the following syntax would be valid

<ol start="5">
# item 5
# item 6
<li start="10">
# item 10
<li>item 12

This paragraph is also part of item 12 (and indented just its heading line)
<li>item 13
</li>
# item 14
</ol>

Note that <li>, <dd> and <dt> don't need to be terminated (they are terminated implicitly by the next <li>, <dd> or <dt>, or by the list closure with </ol>, </ul> or </dl>).

Also MediaWiki would keep the current start value but would not increment it by creating an empty list item: the next non-empty list item using the legacy syntax will keep it. It would only preserve the list items (even when they are empty).

So </li>, </dd>, </dt> closing tags would just be ignored/discarded silently in all cases (just like in HTML5, where they are also not needed at all, or in legacy HTML4; they are needed only in XHTML, for conformance with XML).

As we use explicit list initiators/terminators (with HTML tags) there's no requirement to use HTML also for list items in those enclosed lists. In such explicitly enclosed lists, paragraphs and newlines would be recognized but would not cause the list to be terminated before them. A single newline will continue the current list item, two or more newlines will create a new paragraph within the existing list item. We could also use <div>'s in list items.
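
The newline rule for explicitly enclosed lists can be sketched in Python as follows. The function name `split_item_paragraphs` is hypothetical, and the sketch assumes the body of a single list item has already been isolated as one string.

```python
import re

def split_item_paragraphs(item_text):
    """Split the body of one list item into paragraphs.

    A single newline joins lines into the same paragraph; a run of two
    or more newlines starts a new paragraph inside the same item.
    """
    paragraphs = re.split(r"\n{2,}", item_text.strip("\n"))
    # Within a paragraph, a single newline is treated as a space.
    return [" ".join(p.split("\n")) for p in paragraphs if p]
```

For simplicity, lines that contain only spaces are not treated as blank here.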

We could also still embed lists in list items: with the legacy syntax, we have to repeat the prefixes to get to the correct level, but when using explicit HTML tags, such thing is not necessary!

# item 1
<ol>
# item 1.1
# item 1.2
</ol>
# item 2

instead of:

# item 1
## item 1.1
## item 1.2
# item 2
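
The equivalence between the two forms comes down to reading the number of leading hashes as a nesting depth. A small Python sketch of that mapping, with the hypothetical name `nest_numbered_items`, could look like this; unlike MediaWiki, it simply rejects a jump of more than one level.

```python
def nest_numbered_items(lines):
    """Build a nested structure from '#'-prefixed lines.

    Each item is a (text, children) tuple; the number of leading '#'
    characters gives the nesting depth.
    """
    root = []
    stack = [root]  # stack[d] receives the items of depth d + 1
    for line in lines:
        depth = len(line) - len(line.lstrip("#"))
        if depth == 0:
            raise ValueError("not a list line: " + line)
        text = line[depth:].strip()
        # Close every list that is deeper than the current item.
        del stack[depth:]
        if len(stack) < depth:
            raise ValueError("depth jumps by more than one level: " + line)
        children = []
        stack[-1].append((text, children))
        stack.append(children)
    return root
```

Applied to the four legacy lines above, the result has two top-level items, with "item 1.1" and "item 1.2" as children of "item 1", which is the same tree the explicit <ol> form describes.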

It would also simplify the edits in talk pages using ":" (definition lists) for block indentation:

: A says something
:: B replies to A
::: C replies to B
:: D replies to A
: A says something else
<dl>
A says something
  <dl>
  B replies to A
  <dl>C replies to B</dl>
  D replies to A
  </dl>
A says something else.
</dl>

With the HTML syntax, definition lists may be indented, but the indentation does not generate preformatted blocks within talks: to do that you'd need to first terminate the list item explicitly and add a newline before the preformatted block

<dl>
A says something

  preformated code by A
  preformated code by A

A continues
A continues
  <dl>
  B replies to A
  : C replies to B
  <dl> F replies to B</dl>
  D replies to A
  </dl>
A says something else.
</dl>

MediaWiki will infer the generation of <dd> items within <dl></dl> if there's no leading ":" or ";". It will still close the current definition list and start an (un)ordered list if a line starts with * or # or <li> or <ol> or <ul>. If there's no <ul> or <ol> but only a <li>, it will assume an unordered (bulleted) list and will infer a <ul></ul> automatically.

In all cases, list items will not be broken by simple newlines, or even by new paragraphs initiated with two new lines or more within the same list item or definition/talk.