Implement link anchors to line numbers on syntax-highlighted pages (e.g. .css, .js)
Closed, Resolved; Public

Description

When we open a SVN page like this
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/resources/Resources.php?revision=82285&view=markup#l142
we are able to add #l___ to the url to create links pointing to a specific part of the code.

On [[MediaWiki:Common.js]] and other [[*.js]] and [[*.css]] pages, there is no way to do this at the moment.

It would be great to have such a feature, mainly for long script pages.

Similarly, it would probably be desirable to display such line numbers in those js and css pages.


Version: unspecified
Severity: enhancement
URL: http://en.wikipedia.org/wiki/MediaWiki:Common.js

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 11:22 PM
bzimport set Reference to bz27531.
bzimport added a subscriber: Unknown Object (MLST).

Implementing that isn't difficult. Making the syntax coloring code work in accordance with it is probably the more challenging issue.

Checked, syntax extension does line based parsing as well, so less of a problem than I first thought.

(In reply to comment #3)

Checked, syntax extension does line based parsing as well, so less of a problem
than I first thought.

Yes, use <source line lang=...></source>. But ids per line are not added.

*** Bug 60410 has been marked as a duplicate of this bug. ***

gryllida wrote:

Is this a ::Parser bug? [[:mw:Extension:SyntaxHighlight_GeSHi]] seems to be responsible for <source> tag.

The output would probably use something like <span id="mw-source-f2-l31"> where f2 indicates it is the second <source> section on the page, and l31 line 31.

If handling potential multiple occurrences on a page is too difficult, we can at least start by supporting it on entire pages such as those in the MediaWiki and User namespaces for .js and .css.
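
The ID scheme proposed above can be sketched in a few lines. This is only an illustration of the naming idea, not code from SyntaxHighlight_GeSHi; the function names `line_anchor_id` and `anchors_for_page` are hypothetical.

```python
from __future__ import annotations


def line_anchor_id(block_index: int, line_number: int) -> str:
    """Build an ID such as 'mw-source-f2-l31' (second block, line 31)."""
    if block_index < 1 or line_number < 1:
        raise ValueError("block_index and line_number are 1-based")
    return f"mw-source-f{block_index}-l{line_number}"


def anchors_for_page(blocks: list[str]) -> list[list[str]]:
    """Return one list of line IDs per <source> block, in page order."""
    return [
        [line_anchor_id(b, n) for n in range(1, len(text.splitlines()) + 1)]
        for b, text in enumerate(blocks, start=1)
    ]
```

On a page that consists of a single block (a .js or .css page), the block index carries no information, which is why a shorter form could be enough there.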

I do support this idea.

On single-block pages, that might simply be #l1, #l2, as Git does.

Those who like to make their line numbers addressable could use the id="exampleB" attribute, which creates #exampleB_1, #exampleB_2 identifiers.

WRT the proposal above:

span id="mw-source-f2-l31" where f2 indicates it is the second <syntaxhighlight> section on the page

Wiki pages are dynamic, not static nor frozen.

If I have a page with two examples

  1. id="exampleA" mapped to f1
  2. id="exampleB" mapped to f2

and someone inserts an explanation using <syntaxhighlight> at the top, or exampleA is removed, then exampleB and its line numbers would now have to be addressed from the outside world as f3 or f1, respectively.
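
The instability described here can be reproduced with a small sketch. It assumes that prefixes are assigned purely by position (f1, f2, ...); the helper `positional_prefixes` and the block name "intro" are hypothetical.

```python
def positional_prefixes(block_names):
    """Map each block's author-given name to its positional prefix f1, f2, ..."""
    return {name: f"f{i}" for i, name in enumerate(block_names, start=1)}


# Original page: two examples.
before = positional_prefixes(["exampleA", "exampleB"])

# Someone inserts a new highlighted block at the top of the page.
after_insert = positional_prefixes(["intro", "exampleA", "exampleB"])

# Alternatively, exampleA is removed from the page.
after_removal = positional_prefixes(["exampleB"])

print(before["exampleB"], after_insert["exampleB"], after_removal["exampleB"])
```

The same block ends up under three different prefixes, so any external link built from a positional prefix silently points at the wrong block after such an edit.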

Change 653142 had a related patch set uploaded (by Esanders; owner: Esanders):
[mediawiki/extensions/SyntaxHighlight_GeSHi@master] Add support for line anchors on code pages

https://gerrit.wikimedia.org/r/653142

Change 653142 merged by jenkins-bot:
[mediawiki/extensions/SyntaxHighlight_GeSHi@master] Add support for line anchors on code pages

https://gerrit.wikimedia.org/r/653142

If handling potential multiple occurrences on a page is too difficult, we can at least start by supporting it on entire pages such as those in the MediaWiki and User namespaces for .js and .css.

This task only talks about code pages, not syntax blocks.

I’m not sure link anchors in syntax blocks would be as useful, but feel free to file a separate task for it.

Dinoguy1000 renamed this task from SyntaxHighlight_GeSHi: Implement link anchors to line numbers of <source> contents to Implement link anchors to line numbers of <syntaxhighlight> contents. Jan 3 2021, 11:59 PM

Is there a way to opt out of this? Just saw the different display and thought it was a bug - some people might find it useful, but I don't. At the very least it should be possible to opt out via preferences. The original request was for link anchors on line numbers within <syntaxhighlight> tags, but for user code pages that don't have syntaxhighlight tags there are now annoying link anchors that were not originally the focus of this task.

There is some confusion over the particular targets.

  1. request of 2011 has been for entire code pages: On [[MediaWiki:Common.js]] and other [[*.js]] and [[*.css]] pages
  2. idea has been to equip <syntaxhighlight> blocks with unique id=

The first one is feasible.

The second one breaks since there could be multiple <syntaxhighlight> blocks within one page but id needs to be unique. And there is not really a robust approach to link to the same line in the third <syntaxhighlight> block since another one might be inserted before or deleted ahead.

  • One approach could be: If <syntaxhighlight id="thisExample"> then create unique id="thisExample-1" and id="thisExample-2" and id="thisExample-3" etc.
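
A minimal sketch of that approach, assuming the author-supplied id is already unique on the page. The function `wrap_lines` is hypothetical and does not reflect how the extension renders its output.

```python
from html import escape


def wrap_lines(block_id: str, code: str) -> str:
    """Wrap every line of code in a span whose ID is '<block_id>-<line number>'."""
    safe_id = escape(block_id, quote=True)
    return "\n".join(
        f'<span id="{safe_id}-{n}">{escape(line)}</span>'
        for n, line in enumerate(code.splitlines(), start=1)
    )
```

Because the prefix comes from the author rather than from the block's position, inserting or deleting other blocks does not change these IDs.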

However, the task title, which mentions <syntaxhighlight>, conflicts with the task description, which is about entire pages.

  • There, a particular line can always be linked, at least via a PermaLink to a specific revision, and a link to the current version might also stay stable for a while.

Asking for Tech News purposes to make sure I don't misunderstand anything: The link anchors are being implemented on all "code pages", e.g. .js and .css pages, but no other pages, correct? <syntaxhighlight> tags are not actually really relevant here, no matter what the task title says?

Esanders renamed this task from Implement link anchors to line numbers of <syntaxhighlight> contents to Implement link anchors to line numbers on syntax-highlighted pages (e.g. .css, .js). Jan 7 2021, 1:30 PM

Is there a way to opt-out of this?

You can hide the numbers with user CSS. Given that almost every interface that displays code these days shows line numbers (including the CodeEditor extension that handles these pages, and all the web interfaces used by MediaWiki code: Gerrit, GitHub, GitLab), I don’t see the value in making this a full user option (all user options come with an ongoing maintenance cost).

.mw-highlight a .linenos { display: none; }

The error messages of Scribunto should take advantage of this, is there a task for that? Now they link to the edit interface with a link like https://hu.wikipedia.org/w/index.php?title=Modul:Kép&action=edit#mw-ce-l29 (see the bottom of hu:Wikipédia:Évfordulók/szeptember), but that doesn’t seem to work.

Although looking at my test module on patchdemo more closely, it has a line number, but doesn’t have an actual HTML ID…

Thanks a lot, @Esanders, it's very helpful. How about Lua pages?

Another issue I found is that when there’s a horizontal scrollbar (this should normally not happen by default, but I reset <pre> to white-space:pre in my global.css), the scrolled text is visible behind the line number, making both the line number and that part of the scrolled text unreadable. This seems to be caused by

.mw-highlight .linenos {
	background: none;
}

in rESHG modules/pygments.wrapper.less:64 (at 9ddd742ad7e6), introduced in 98c644a639ec; removing that rule fixes the issue. I know that customizing the interface in global.css is at my own risk and I shouldn’t demand a fix in the extension because of this, but is there any benefit of having that rule? If it serves no purpose, please remove it. (If it’s useful, I will work it around on my own or live with it, of course.)

@Johan, correct.

And why is that? Why exclude Lua? Bizarre choice.

Just to clarify: Line numbers are working with Lua, see mw:Module:Gerrit, but no id= yet.

Note that there is a potential risk of invalid HTML since Lua modules may show a wikitext doc section above which could specify id="L-1" while css, js pages do contain native code only.

Then why not make sure such anchors are not used in existing Lua documentation, or use a unique enough prefix to avoid conflicts in the first place? It still makes little sense to leave out Scribunto because if line number anchors are going to be most useful on any type of pages, it must be Lua modules.

@Johan, correct.

And why is that? Why exclude Lua? Bizarre choice.

This task is about code pages handled in core. Lua pages are handled by a separate extension so were addressed in a separate task: T270991.

Just to clarify: Line numbers are working with Lua, see mw:Module:Gerrit, but no id= yet.

I hadn’t written the anchor feature when I resolved that task, but it should be trivial to fix now. Thanks for the reminder.

Thanks, that makes sense. Looking forward to anchors on modules.

Note that there is a potential risk of invalid HTML since Lua modules may show a wikitext doc section above which could specify id="L-1" while css, js pages do contain native code only.

There’s always a risk of ID clashes when wikitext is present. One can write id="toc", id="firstHeading", and so on. I don’t think L-1 would be especially prone to clashes.
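
Whether a rendered page actually contains such a clash can be checked mechanically. The sketch below uses only Python's standard `html.parser`; the class `IdCollector` and the function `duplicate_ids` are hypothetical helpers, not part of MediaWiki.

```python
from collections import Counter
from html.parser import HTMLParser


class IdCollector(HTMLParser):
    """Count how often each id attribute value occurs in a document."""

    def __init__(self):
        super().__init__()
        self.ids = Counter()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids[value] += 1


def duplicate_ids(html_text):
    """Return the sorted list of id values that occur more than once."""
    collector = IdCollector()
    collector.feed(html_text)
    collector.close()
    return sorted(i for i, count in collector.ids.items() if count > 1)
```

For example, a documentation section containing id="L-1" followed by a code listing whose first line also carries id="L-1" would be reported as a duplicate.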

Change 655493 had a related patch set uploaded (by Esanders; owner: Esanders):
[mediawiki/extensions/Scribunto@master] Add line link anchors

https://gerrit.wikimedia.org/r/655493

Re-using this task for the Scribunto (Lua modules) support.

Change 655493 merged by jenkins-bot:
[mediawiki/extensions/Scribunto@master] Add line link anchors

https://gerrit.wikimedia.org/r/655493

Note that there is a potential risk of invalid HTML since Lua modules may show a wikitext doc section above which could specify id="L-1" while css, js pages do contain native code only.

There’s always a risk of ID clashes when wikitext is present. One can write id="toc", id="firstHeading", and so on. I don’t think L-1 would be especially prone to clashes.

They are in fact prone to clashes. The most common way in which this happens is through headings in content and discussions, e.g. == toc ==. See also T9356. My general sense is that we should probably do one of the following for IDs in wikitext:

  1. Disallow them, allowing only the Parser to decide where, when and how to create IDs. For example, headings would still have an ID or anchor name to jump to, but the exact format of the header is decided by the software. An editor shouldn't have to worry about that. This naturally avoids conflicts when we solve T9356, by having only one source of user-generated IDs, and they all use the same known and reserved prefix that we can easily avoid in other parts of the software.
  2. Designate a second prefix for non-heading IDs generated by wikitext. I'm not sure what use cases this would have or when this is needed. As mentioned by others, this is almost always causing bugs even within the content because we (by design) can't and don't want to allow state to be maintained throughout the parsing of a page. Doing so in a stable manner would require some kind of counter variable, which goes against fundamental performance requirements and constraints we have for the wikitext parser (such as partial re-parsing and partial previews etc.). Nonetheless, if there is sufficient need and interest in this, a single reserved prefix seems easy to support. The wikitext sanitizer would simply ignore any IDs that don't use the prefix. This would work in the same way that we already drop styles if they contain unknown or insecure style rules, and we drop insecure data-* attributes if they use reserved prefixes etc. We could potentially transform unprefixed IDs to be prefixed, however, that seems confusing to end-users since presumably they need to reference that ID elsewhere, which doesn't work if we change it underneath them.
  3. Implement a complex and dynamic registry system in which we track all IDs used by the current page, and all code in core, skins, extensions, and gadgets interact with this registry so as to generate different IDs instead if they clash, and for this mapping to be available to consumers elsewhere. This seems very costly and not worth the effort, and would need to start, I think, with a good use case that can't be feasibly solved in another way.
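
Option 2 can be illustrated with a small sketch of a sanitizer step that keeps an id only when it carries the reserved prefix. The prefix "user-" and the function `sanitize_attributes` are assumptions made for illustration; this is not the API of the MediaWiki sanitizer.

```python
from __future__ import annotations

# Hypothetical reserved prefix for IDs written by editors in wikitext.
RESERVED_PREFIX = "user-"


def sanitize_attributes(attrs: dict[str, str], prefix: str = RESERVED_PREFIX) -> dict[str, str]:
    """Drop the id attribute unless its value starts with the reserved prefix."""
    return {
        name: value
        for name, value in attrs.items()
        if name != "id" or value.startswith(prefix)
    }
```

Under this rule an editor-written id="L-1" is dropped and can never collide with a software-generated line anchor, while id="user-note" passes through unchanged.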

@Krinkle You might want to cross post the above comment to T9356, as it’s out of scope for this task, but is a good outline for how that task might be resolved.

I think @PerfektesChaos' point is that the case introduced here is low risk compared to other cases that already exist: someone is unlikely to create a heading ==L-1== on a module documentation page.

Actually I was wondering why Lua did not get fragment identifiers but JS/CSS received them. There are small doc texts possible on JS/CSS code pages by MediaWiki:Clearyourcache but those are short and constant and may be kept free of duplicates easily.

However, I do agree that consideration of unique selectors should proceed on T9356.