
Leading colon in link label is wrongly removed from wikitext
Open, Low, Public

Description

[subbu@earth:~/work/wmf/parsoid] echo '<a rel="mw:WikiLink" href="./Foobar">:foobar</a>' | php bin/parse.php --html2wt
[[:foobar]]

[subbu@earth:~/work/wmf/parsoid] echo '<a rel="mw:WikiLink" href="./Foobar">:foobar</a>' | php bin/parse.php --html2html --body_only
<p data-parsoid='{"dsr":[0,11,0,0]}'><a rel="mw:WikiLink" href="./Foobar" title="Foobar" data-parsoid='{"stx":"simple","a":{"href":"./Foobar"},"sa":{"href":":foobar"},"dsr":[0,11,2,2]}'>foobar</a></p>

Details

Reference
bz51442

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 22 2014, 2:02 AM
bzimport added a project: Parsoid.
bzimport set Reference to bz51442.

I don't think we should explicitly disallow such a symbol in the label (it might lead to other strange side effects and inconsistencies).

However it does give another use case for bug 51438 (T55973). Having a better way to edit the label might reduce the cases where this happens. Perhaps we could make it possible to put the cursor inside the link (after the last character) and outside the link.

So that one can be in either of these positions:

  • Foo <a>bar|</a>
  • Foo <a>bar</a>|

instead of just the former (perhaps some kind of inline block wrapper around the link, e.g. like Apple does in Mail and Calendar when text contains a reference to a contact. It looks like "Foo { bar } quux." Where bar has a rounded bubble around it when hovered and the cursor can be placed inside or outside the bubble.

(In reply to comment #1)

instead of just the former (perhaps some kind of inline block wrapper around
the link, e.g. like Apple does in Mail and Calendar when text contains a
reference to a contact. It looks like "Foo { bar } quux." Where bar has a
rounded bubble around it when hovered and the cursor can be placed inside or
outside the bubble.

)

http://xkcd.com/859/

Links are no longer getting extended automatically when any text is added at the end, including commas, semicolons, etc. Possibly a result of the fix for bug 51463

(In reply to comment #3)

Links are no longer getting extended automatically when any text is added at
the end, including commas, semicolons, etc. Possibly a result of the fix for
bug 51463

Indeed, this was fixed in that.

I am reopening this since I noticed that colons, at least, still get included in links. Here https://en.wikipedia.org/w/index.php?title=User%3AElitre_%28WMF%29%2FSandbox&diff=573483719&oldid=573483114 the line beginning with "Performance problems: VisualEditor" has the colon included in the wikilink on the latter word; the link works fine, but you don't get to see the colon at all.
Here https://en.wikipedia.org/w/index.php?title=User:Elitre_(WMF)/Sandbox&diff=next&oldid=573483774 the colon is instead included in the wikilink to Issues, and I am not sure that's desirable either. Thanks.

Can you explain how those cases were created? From what I can see, colons don't extend link annotations, but if you select over a colon and use the link tool, you're going to end up with weird stuff like that anyway.

Yes Ed, in both cases I selected the colon and the word and then used the link tool. I know that you can get the first result also in the source editor, but the second one definitely does not produce a working link with the source editor, and anyway I was not really sure if it makes sense having a wikilink which includes a colon and then makes it invisible. If you think none of these behaviours should be prevented or worked on anyway, please just re-close this. Thanks.

About semicolons, we got another piece of feedback that I managed to reproduce; see https://en.wikipedia.org/w/index.php?title=User%3AElitre_%28WMF%29%2FSandbox&diff=573644513&oldid=573644268 .
I wrote "French;" and turned that into a wikilink, including the semicolon.
Then I added some text between the end of the word and the semicolon. The result is that we now have two wikilinks to the same word, with a nowiki tag in the middle.

Can you leave some step-by-step instructions to reproduce the bug for clarification?

This is indeed reproducible, and not just with semicolons, but it is not this bug. I've reported it separately as bug 54332, including how to reproduce it.

I am leaving aside the semicolon issue, covered elsewhere.
About what I reported in Comment 5:

  1. Type ":foobar"
  2. select all of this, from the colon (included) to the final r
  3. click on the linking tool, and make this a link to the article "foobar"
  4. save.

*outcome: the colon is included in the link; the link is working but the colon is invisible (and I think this is certainly not desirable).

Please note that this also happens to ": foobar", that is, if the colon and the typed word are separated by a space.

  1. Type "foobar:"
  2. select all of this, from the f to the colon, including it
  3. click on the linking tool, and make this a link to "foobar"
  4. save.

*outcome: the colon is included in the link, it is visible, the link works fine (and I think this is not good code).

Please note that when a piped link is created instead (for example if foobar is a link to Foobar2000), the colons will be part of the link in every case and they will all be visible.

My point is mainly that VE could easily understand when the punctuation is added by mistake, because the label and the target link will be different. So if the target does not feature a punctuation mark, but the label does, that was probably not wanted.

HTH.

This is a Parsoid bug (the colon part, at least):

<a rel="mw:WikiLink" href="./Foobar">:foobar</a>

serialises to:

[[:foobar]]

which converts back to

<a rel="mw:WikiLink" href="./Foobar">foobar</a>
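The round trip above can be modelled with a toy sketch. This is not Parsoid's serializer: `normalize`, `display_text`, `serialize_naive` and `serialize_safe` are hypothetical helpers. The sketch assumes only that a leading colon in a simple link is link syntax and is not rendered, while the label of a piped link is rendered verbatim. It shows one possible way to keep the label intact, which is to fall back to a piped link when the label starts with a colon.

```python
def normalize(title: str) -> str:
    """Toy title normalisation (assumption): drop a leading colon,
    trim whitespace, treat underscores as spaces, capitalise first letter."""
    t = title[1:] if title.startswith(":") else title
    t = t.replace("_", " ").strip()
    return t[:1].upper() + t[1:]


def display_text(wikitext_link: str) -> str:
    """Toy model of the label a wikilink renders with."""
    inner = wikitext_link[2:-2]  # strip the surrounding [[ and ]]
    if "|" in inner:
        # Piped link: everything after the first pipe is the label.
        return inner.split("|", 1)[1]
    # Simple link: a leading colon is syntax, not part of the label.
    return inner[1:] if inner.startswith(":") else inner


def serialize_naive(target: str, label: str) -> str:
    """Mimics the reported behaviour: emit a simple link whenever
    the label normalises to the target."""
    if normalize(label) == normalize(target):
        return f"[[{label}]]"
    return f"[[{target}|{label}]]"


def serialize_safe(target: str, label: str) -> str:
    """Possible fix (sketch): never use the simple form for a label
    that starts with a colon, because the colon would be lost."""
    if not label.startswith(":") and normalize(label) == normalize(target):
        return f"[[{label}]]"
    return f"[[{target}|{label}]]"


if __name__ == "__main__":
    for serialize in (serialize_naive, serialize_safe):
        wt = serialize("Foobar", ":foobar")
        print(serialize.__name__, wt, repr(display_text(wt)))
```

In this model the naive form loses the colon on the way back, while the piped form `[[Foobar|:foobar]]` keeps it. Whether Parsoid should emit a piped link or escape the colon some other way is left open here.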

(In reply to comment #11)

My point is mainly that VE could easily understand when the punctuation is
added by mistake, because the label and the target link will be different. So
if the target does not feature a punctuation mark, but the label does, that
was probably not wanted.

Unfortunately we have to handle the case when it was intentional, so we can't just blindly auto-correct it. Also the definition of what is punctuation becomes much more complex when you are dealing with thousands of languages.
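To make the difficulty concrete, here is a rough sketch of the heuristic proposed in comment #11. `label_adds_punctuation` is a hypothetical helper, not VisualEditor code, and treating the Unicode general categories starting with "P" as "punctuation" is an assumption made only for illustration. The check can flag a mismatch, but it cannot tell whether the mark was intentional.

```python
import unicodedata


def is_punctuation(ch: str) -> bool:
    """Assumption: 'punctuation' means Unicode general category P*."""
    return unicodedata.category(ch).startswith("P")


def label_adds_punctuation(target: str, label: str) -> bool:
    """Return True if the label starts or ends with a punctuation
    character that does not occur anywhere in the target."""
    edges = [label[:1], label[-1:]]  # empty strings for an empty label
    return any(ch and is_punctuation(ch) and ch not in target for ch in edges)


if __name__ == "__main__":
    print(label_adds_punctuation("foobar", ":foobar"))  # leading colon
    print(label_adds_punctuation("foobar", "foobar:"))  # trailing colon
    print(label_adds_punctuation("Saint", "St."))       # intentional abbreviation
```

All three calls are flagged, including the abbreviation, which is exactly the kind of intentional label that rules out blind auto-correction.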

Arlolra set Security to None.
ssastry lowered the priority of this task from Medium to Low. May 15 2020, 9:52 PM
ssastry updated the task description.
ssastry moved this task from Backlog to Bugs & Crashers on the Parsoid board.
Krinkle renamed this task from leading colon in label not converted to wikitext correctly to Leading colon in link label is wrongly removed from wikitext. May 17 2020, 4:21 PM
Krinkle unsubscribed.