
TemplateData: Allow hinting to specify auto-numbered parameter names in some fashion
Open, MediumPublicFeature

Description

Some templates use numbered arguments (see for example http://en.wikipedia.org/wiki/Module:Citation/CS1/Whitelist in the numbered_arguments list). For example, you can use author1, author2, ... (no limit).

Currently, there's no way to define such parameters in TemplateData, and the VE interface is also not optimal for them.

Would it be possible to handle numbered arguments in TemplateData? For example, a parameter named author# would define a numbered parameter named author with a numerical suffix. It should be possible to define a lower bound (1 by default) and an upper bound (no limit by default).

Then VE should handle them accordingly: only propose the first available parameter in the list of parameters that can be added.
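
The requested behaviour can be sketched roughly as follows. This is only an illustration in Python, not TemplateData or VisualEditor code: the function name `next_numbered_param`, its arguments, and the use of `#` as the number placeholder are assumptions made for the example.

```python
from typing import Iterable, Optional


def next_numbered_param(
    pattern: str,
    used: Iterable[str],
    lower: int = 1,
    upper: Optional[int] = None,
) -> Optional[str]:
    """Return the first unused name generated from `pattern`, or None.

    `pattern` contains one '#' placeholder (hypothetical syntax).
    `upper=None` means "no limit", the default proposed in the description.
    """
    used_names = set(used)
    n = lower
    while upper is None or n <= upper:
        candidate = pattern.replace("#", str(n), 1)
        if candidate not in used_names:
            return candidate
        n += 1
    # Every name between the lower and upper bound is already in use.
    return None
```

With `author1` and `author2` already present, only `author3` would be offered; with an upper bound of 2, nothing further would be offered.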


Version: unspecified
Severity: major
See Also:
T53740: TemplateData: page_props limits value length to 65535 bytes (MySQL 'blob' field)

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 1:55 AM
bzimport added a project: TemplateData.
bzimport set Reference to bz52582.

When implemented, this should also accept an empty string as the "base" ("author" in original example above) to apply to unnamed (positional) parameters. Would be useful for some templates like [[pl:Template:Dopracować]] which currently can't be properly described using TemplateData.

In the module code they use # to represent a number placeholder.
numbered_arguments = {

['author#'] = true,
['Author#'] = true,
['author-first#'] = true,
['author#-first'] = true,
['author-last#'] = true,
['author#-last'] = true,

...
}
As you can see the number can be in various places.

http://en.wikipedia.org/wiki/Module:Citation/CS1/Whitelist
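
Because the placeholder can sit anywhere in the name, matching a concrete parameter such as author2-first against these patterns needs more than a suffix check. A minimal sketch in Python follows; the helper names are hypothetical, and it assumes the number is a positive decimal integer without leading zeros, which the whitelist itself does not state.

```python
import re
from typing import Iterable, Optional, Tuple


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Turn e.g. 'author#-first' into a regex with one digit group."""
    prefix, _, suffix = pattern.partition("#")
    # Escape the literal parts so characters like '-' or '.' stay literal.
    return re.compile(re.escape(prefix) + r"([1-9][0-9]*)" + re.escape(suffix))


def match_numbered(name: str, patterns: Iterable[str]) -> Optional[Tuple[str, int]]:
    """Return (pattern, number) for the first pattern that matches `name`."""
    for pattern in patterns:
        m = pattern_to_regex(pattern).fullmatch(name)
        if m:
            return pattern, int(m.group(1))
    return None
```

For example, `match_numbered("author2-first", ["author#", "author#-first"])` identifies the second pattern with number 2, while a plain `author` matches neither.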

This proposal is based mostly on the idea that if TemplateData exists for a template, then editors should not be permitted to add any parameters that are not already in TemplateData.

Rather than coding a variety of flexible naming/numbering schemes, it might make more sense to have a way to specify that a given template accepts (or does not accept) whatever parameters the editor chooses to name, regardless of what's in TemplateData. If the TemplateData is complete and no unlisted parameters are useable, then editors could not add others. If the TemplateData is incomplete, or if the number of parameters is unbounded, then editors could add whatever they wanted.

The foo1, foo2, foo3 etc. model exists in many templates; however, in most cases the limit is quite low (when converted to Lua a user might have dropped the limit; not sure that is a good pattern). Only in cases where the value is for e.g. a table row do they have a high limit (up to 500 seems common).

e.g.

{{Something

|row1=..
|row2=..
|row3=..
|=..

such as {{TemplateBox@commons.wikimedia.org

|p1=..
|p2=..
|p3=..
..

However, for VisualEditor, TemplateData and any other modern program or programming language, it is very visible that such dynamically named parameters are an unwanted pattern that should not be encouraged. It is hard to work with and only has downsides over alternatives (e.g. hard to re-order, requires hacks to implement in wikitext and Lua, hard to document, hard to use).

Especially now that we have Lua, specifying a parameter as taking a list seems more sensible (e.g. comma separated, line-break separated, whatever). Even if a template isn't converted to Lua yet, it could make use of a generic Lua module to do the string split and iteration.

Also, in the interest of not implementing too soon and getting stuck in a corner, I'd like to wait for VisualEditor to catch up with the features we've already recently added to TemplateData and see how those work out.

(In reply to Krinkle from comment #4)

Especially now that we have Lua, specifying a parameter as taking a list
seems more sensible (e.g. comma separated, line-break separated, whatever).
Even if a template isn't converted to Lua yet, it could make use of a
generic Lua module to do the string split and iteration.

This solution would be awkward when several parameters are related to one another. For example, on the cite templates, you usually have several fields for each author (first name, last name, page link, ...) and they are easier for a user to manage if they are grouped.
With lists, users would have to be careful to keep the several lists consistent with each other.
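
The consistency problem can be made concrete with a small sketch. It assumes a hypothetical template that takes comma-separated `last` and `first` list parameters; neither the parameter layout nor the helper names come from an existing template or module.

```python
from typing import Dict, List


def split_list(value: str, sep: str = ",") -> List[str]:
    """Split a list-valued parameter and trim whitespace around items."""
    return [item.strip() for item in value.split(sep)]


def group_authors(params: Dict[str, str]) -> List[Dict[str, str]]:
    """Regroup parallel list parameters into one record per author."""
    columns = {key: split_list(value) for key, value in params.items()}
    lengths = {key: len(items) for key, items in columns.items()}
    if len(set(lengths.values())) > 1:
        # The lists have drifted apart; there is no safe way to pair them up.
        raise ValueError(f"list parameters have different lengths: {lengths}")
    keys = list(columns)
    return [dict(zip(keys, row)) for row in zip(*columns.values())]
```

If an editor adds a third last name but forgets the matching first name, the lists no longer line up and the template can only report an error or guess.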

(In reply to NicoV from comment #5)

This solution would be awkward when several parameters are related to one
another. For example, on the cite templates, you usually have several
fields for each author (first name, last name, page link, ...) and they are
easier for a user to manage if they are grouped.
With lists, users would have to be careful to keep the several lists
consistent with each other.

One way to deal with this could be to pass JSON blobs into template parameters. So you could have:

{{cite web

| authors = { "last": "Smith", "first": "John" }, { "last": "Doe", "first": "Jane" }

}}

This is nice from a Lua perspective, but it would be confusing for users. Mixing template syntax with JSON syntax seems ugly, and probably only a select few editors will be capable of entering syntactically correct JSON into template parameters.

The established way of doing something like this is to use a sub-template. So we would have something like:

{{cite web

| authors = {{author|last=Smith|first=John}}, {{author|last=Doe|first=Jane}}

}}

One problem with this is that the {{author}} templates are expanded before they are passed to Lua, so we lose data granularity. To solve this, we can use Lua to make the {{author}} templates output JSON and then read that JSON from the Lua module implementing {{cite web}}. But then to guarantee that we get clean JSON code we would need to make a template structure like this:

{{cite web

| authors = {{author list|{{author|last=Smith|first=John}}{{author|last=Doe|first=Jane}} }}

}}

With this we have data grouping that can be reliably processed by Lua and that editors used to using wikitext will find intuitive. However, this would be hard for VisualEditor etc. to process and display to users in a meaningful way, as there is no guarantee how the {{author list}} and {{author}} templates will be implemented on any given wiki. So, instead of using Lua to create the JSON, we could do it in PHP using a new parser function. (I suggest "#data".) That would give us the following:

{{cite web

| authors = {{#data: {{#data:last=Smith|first=John}}{{#data:last=Doe|first=Jane}} }}

}}

This would give us a data structure that could be reliably parsed from VisualEditor, as well as the other benefits mentioned above. Perhaps it could be displayed as a dynamically generated data tree (I'm thinking about something like json2html, but editable).

http://visualizer.json2html.com/

We would need to be careful about bad input:

{{cite web

| authors = foo {{#data: bar {{#data:last=Smith|first=John}}{{#data:last=Doe|first=Jane}} baz }}

}}

To sanitise this, we could make the #data implementation throw an error (perhaps like a cite.php error) for "bar" and "baz" in this example. For "foo", we could leave it to the Lua module to properly validate the input.

We would also have to think carefully about what to do with templates etc. inside #data parser functions. If template expansion inside #data parser functions worked the same way that it did in normal wikitext, it would be difficult to make #data work well in VisualEditor. But perhaps it can be made to work in the same way that VE handles other instances of templates in template parameters.

Hmm, I've messed up the JSON in the first example. Should be:

{{cite web

| authors = [{ "last": "Smith", "first": "John" }, { "last": "Doe", "first": "Jane" }]

}}

Perhaps this proves my point about users not being able to input JSON manually? :)
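
The difference between the two versions is easy to check with any strict JSON parser. The sketch below uses Python's standard `json` module purely as an illustration; the helper name `parse_authors` is hypothetical, and a real template would do this in Lua or PHP.

```python
import json
from typing import Any, Optional

# The value from the first example: two objects separated by a comma.
broken = '{ "last": "Smith", "first": "John" }, { "last": "Doe", "first": "Jane" }'
# The corrected value: the same objects wrapped in an array.
fixed = '[{ "last": "Smith", "first": "John" }, { "last": "Doe", "first": "Jane" }]'


def parse_authors(raw: str) -> Optional[Any]:
    """Return the decoded value, or None if `raw` is not valid JSON."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        return None
```

Here `parse_authors(broken)` yields nothing usable, because a JSON document must be a single value, while `parse_authors(fixed)` returns a list of two author records.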

Arg342 subscribed.

This will affect ProveIt when implemented. ProveIt relies on TemplateData to perform its work, much in the same way the VisualEditor does.

Iniquity raised the priority of this task from Low to Medium. Apr 1 2017, 6:55 PM

I think I may increase priority of this task. A lot of new templates, gadgets and scripts have autogenerated parameters.

Regarding the UX and UI of this feature, one possibility is that when a user adds content to the Author1 field, the Author2 field becomes visible, and when s/he adds content to Author2, Author3 becomes visible, and so on.

I am also thinking of templates with an unlimited number of unnamed parameters, which are not handled well in VE currently.

For a draft template I'm working on for Wiktionary, I came up with things like |définition.1.4.exemple.2.description =. That is the description of the 2nd example of the 4th subdefinition of the first main definition. That might be a somewhat extreme example, but it would be nice to take it into account while treating this ticket.

Has anyone got any new ideas? :)

@Iniquity, the way I'd ideally like to see this work is to just have a box people could check when editing a parameter's documentation in TemplateData indicating that it's an auto-numbered parameter (and specifying the limit, if applicable). The TemplateData would then display it intuitively in the documentation and allow people to use it when adding the template to a page.

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 15 2022, 9:40 PM

In lieu of a better plan, I think I'm just going to have the Citoid extension "guess" additional parameters from the last one rather naively by incrementing whatever integer is at the end. For example, if it's last7, the next param will be last8.

If the last param doesn't have an integer, we won't do this. So if it's just "last" or "isbn" no additional parameters will be added.
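
As a rough sketch of this heuristic (written in Python for illustration; this is not the Citoid patch, and the function name is hypothetical):

```python
import re
from typing import Optional

# Everything up to the final run of ASCII digits, then the digits themselves.
_TRAILING_INT = re.compile(r"^(.*?)([0-9]+)$")


def guess_next_param(last_param: str) -> Optional[str]:
    """Guess the next parameter name by incrementing a trailing integer.

    Returns None when the name does not end in an integer, in which case
    no additional parameter is invented.
    """
    m = _TRAILING_INT.match(last_param)
    if m is None:
        return None
    base, number = m.groups()
    return f"{base}{int(number) + 1}"
```

So `last7` leads to `last8`, while `last` and `isbn` produce nothing. Any name that merely happens to end in digits is incremented as well, since the heuristic cannot tell a counter from other numbers.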

Does anyone object to this? It's a bit hacky, but it's the simplest to implement.

The potential flaw of this would be if there is a parameter out there that ends in a number that we don't want incremented. I don't know if there's an example out there, but if there were, we could potentially add parameters that don't actually exist and you could see errors in the citation (which the user could then, as always, fix.)

I expect this doesn't come up much, but a potential fix to that is to extend TemplateData to add a box to suppress auto params, if we needed it.

Change 858708 had a related patch set uploaded (by Mvolz; author: Mvolz):

[mediawiki/extensions/Citoid@master] [WIP] Extrapolate from limited templatedata

https://gerrit.wikimedia.org/r/858708

I think there are still a lot of templates that don’t use Lua and thus support exactly as many parameters as documented. They probably don’t show error messages if unknown parameters are used, either, since that would also require Lua. This means that the extra parameters would simply not appear – which doesn’t cause visual breakage, but it’s also hard for the editors to notice. What does Citoid do now if there are more authors/whatever data than what the template – according to TemplateData – supports?

WMDE-TechWish here. Our WMDE-Templates-FocusArea last year focused on TemplateData and the VisualEditor template dialog. I'm aware of two situations we need to consider here:

  1. Non-Lua templates support exactly as many parameters as documented, as @Tacsipacsi explained above. For example, there is author1 to author9, but there it ends. This causes a few problems:
    • It's a little tedious to document this.
    • Both VisualEditor and MediaWiki-extensions-TemplateWizard get messy very fast the more such parameters exist. Note that having more than one is the norm, not the exception. Here is a simple example.
    • The person writing the documentation might decide to stop earlier and document only e.g. author1 to author5 exactly to avoid the problem above. author6 ff. will still work, but can only be entered in wikitext or with the "add undocumented parameter" feature. Unfortunately this creates the next problem: author6 will be listed at the very end, not after author5.
  2. Lua-based templates can support an effectively unlimited number of such parameters. Currently the only way to document this is to document a limited number, e.g. author1 to author5. This creates almost the same problems:
    • The documentation must stop at an arbitrary number. Typically this will not be author1 but e.g. author5 or author9.
    • This is tedious, as above.
    • VisualEditor and such always show all of the documented parameters.
  3. Because of all the problems above, and thanks to the power of Lua, users create more and more templates that accept multiple comma-separated values in a single parameter.
    • This creates effectively a custom parameter type with a custom syntax (namely being comma-separated).
    • TemplateData has no way to document this either. I mean, other than describing it to the user. But not in a machine-readable way.

A relevant consequence of all this is that we don't have a way to distinguish the first two situations when both are documented as author1 to author5.

What we can do, in my opinion:

  1. Collapse numbered parameters in VisualEditor. For example, when you start with an empty template you see only author1. The moment author1 does have a value author2 appears. And so on. But stop when the next number is not documented.
    • We can do this in VisualEditor without touching TemplateData.
    • The only thing users need to do is to make sure all possible numbers are documented.
    • This doesn't solve the situation where the number is unlimited, but allows for a workaround. Users just need to document a lot of numbers, e.g. author1 to author25.
  2. Ask users to document an unlimited number of parameters by only documenting the first one, e.g. author1, but nothing else. We can use this special case to distinguish the two unlimited vs. limited scenarios.
  3. Come up with some magic syntax for numbered parameters, e.g. "author[1-9]" means only 9 parameters exist and "author[1-]" means it's unlimited.
    • This allows using other number systems, e.g. Hebrew author[א-ט‎].
    • Biggest problem with an approach like this is that it's entirely incompatible with existing software that doesn't understand the feature.
  4. Add a flag that does the same, e.g. "author1": { "sequence": { "name": "author", "from": "1", "to": "5" } }. Use "from": "1", "to": "1000" to make it effectively "unlimited".
    • This can be made backwards-compatible. Users can still document author1 to author5 if they want. They just need to copy the sequence name to all of them, e.g. "author2": { "sequence": { "name": "author" } }. Parameters with the same sequence are considered one by software that understands the feature.
  5. Somehow make use of the unused set feature.
  6. Just give the user a + button at the end of numbered parameters, no matter how it's documented in TemplateData. It acts very similar to the "add undocumented parameter" feature but makes sure the order is consistent.
  7. …?

I would like to avoid presenting an extra empty parameter at the end of every sequence of numbered parameters when the documentation explicitly says it doesn't exist, or we just don't know.
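
Option 1, combined with the constraint just stated, could look roughly like the sketch below. It is an illustration in Python rather than VisualEditor code; `visible_params` and its arguments are hypothetical, and it only handles a counter at the end of the name.

```python
from typing import Dict, List


def visible_params(base: str, documented: List[str], values: Dict[str, str]) -> List[str]:
    """Return the numbered parameters to show for one sequence.

    Shows every filled parameter plus the first empty one, and never goes
    past the last documented number.
    """
    visible: List[str] = []
    n = 1
    while f"{base}{n}" in documented:
        name = f"{base}{n}"
        visible.append(name)
        if not values.get(name):
            break  # first empty field: show it and hide the rest
        n += 1
    return visible
```

With author1 to author3 documented, an empty template shows only author1, filling author1 reveals author2, and once author3 is filled no further field appears because author4 is not documented.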

  1. Collapse numbered parameters in VisualEditor. For example, when you start with an empty template you see only author1. The moment author1 does have a value author2 appears. And so on. But stop when the next number is not documented.
    • We can do this in VisualEditor without touching TemplateData.
    • The only thing users need to do is to make sure all possible numbers are documented.
    • This doesn't solve the situation where the number is unlimited, but allows for a workaround. Users just need to document a lot of numbers, e.g. author1 to author25.

This may not work in all cases. For example, en:Template:Cite web has several parameters for providing authors, but to keep the example simple, let’s concentrate on the three that are included in the vertical copy sample: firstN, lastN and author-linkN. If only the first and third author have articles, I’d like to include the following:

{{cite web |first1=John |last1=Doe |author-link1=John Doe |first2=Jane |last2=Doe |first3=Jimmy |last3=Wales |author-link3=Jimmy Wales |… }}

If you don’t display author-link3 until I filled author-link2, I can’t link to Jimbo’s article without adding a red link to Jane Doe’s article, which should not be done according to the documentation.

  2. Ask users to document an unlimited number of parameters by only documenting the first one, e.g. author1, but nothing else. We can use this special case to distinguish the two unlimited vs. limited scenarios.

I have no actual example at hand, but I’m pretty sure there are templates where a trailing “1” doesn’t denote a sequence, but a year, a code (e.g. ISO 639-1), and so on.

  3. Come up with some magic syntax for numbered parameters, e.g. "author[1-9]" means only 9 parameters exist and "author[1-]" means it's unlimited.
    • This allows using other number systems, e.g. Hebrew author[א-ט‎].
    • Biggest problem with an approach like this is that it's entirely incompatible with existing software that doesn't understand the feature.

Also, we’d need to be very careful what the syntax is, since {{{author[1-9]}}} is, for example, a legal parameter name. Even {{{a{{!}}b}}} is legal, so pipes aren’t something we can freely play with either…

  4. Add a flag that does the same, e.g. "author1": { "sequence": { "name": "author", "from": "1", "to": "5" } }. Use "from": "1", "to": "1000" to make it effectively "unlimited".
    • This can be made backwards-compatible. Users can still document author1 to author5 if they want. They just need to copy the sequence name to all of them, e.g. "author2": { "sequence": { "name": "author" } }. Parameters with the same sequence are considered one by software that understands the feature.

I like this the best. However, the counter doesn’t need to be at the end (see editor2-last in {{Cite web}}), so we need a syntax to indicate where the number goes. For example, it could be a printf-style editor%d-last (with the usual %% escape if the parameter name literally contains %d), or the MediaWiki message-style editor$1-last (what’s the escape here?). Also, it needs to accept not only ASCII digits, but anything that’s sequential (non-ASCII numbers, maybe even letters).
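
To illustrate the printf-style variant and its escape: Python's `%` string operator happens to implement the same `%d` and `%%` conventions, so a sketch needs almost no code. The function name is hypothetical, and nothing here implies that TemplateData would adopt this syntax.

```python
def expand_printf_name(template: str, number: int) -> str:
    """Fill a printf-style parameter name such as 'editor%d-last'.

    '%d' is replaced by the counter; '%%' yields a literal percent sign.
    """
    return template % number
```

For example, `expand_printf_name("editor%d-last", 2)` gives `editor2-last`, and a name written as `100%%-%d` keeps its literal percent sign.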

  5. Somehow make use of the unused set feature.

According to the documentation, a set is “a group of parameters that should be used together” – this doesn’t fit this use case, since author1 can be used without author42.

To allow linking author3 without having to link the previous ones, the field author-link3 could appear when you FOCUS on author-link2, and author-link2 when you focus on author-link1, etc.

This would also make it more obvious to users how to access collapsed parameters.

I think there are still a lot of templates that don’t use Lua and thus support exactly as many parameters as documented. They probably don’t show error messages if unknown parameters are used, either, since that would also require Lua. This means that the extra parameters would simply not appear – which doesn’t cause visual breakage, but it’s also hard for the editors to notice. What does Citoid do now if there are more authors/whatever data than what the template – according to TemplateData – supports?

Right now it silently just doesn't add the authors. If there's template data for 20, it'll add 20 and then stop. Unless it's a collapsed param, and then it concatenates them all together. So right now the results are actually better using a collapsed param because you get all the authors.

A lot of the non-Lua templates are configured to use collapsed params, i.e. they have just one author field. In that case, all the authors are concatenated and make it into the template. This is the case on de wiki, for example. It'd only be non-Lua templates that are also configured with Citoid to use separate fields.

There's a related wishlist item for this: https://meta.wikimedia.org/wiki/Community_Wishlist_Survey_2022/Citations#Preventing_VE_from_silently_omitting_co-authors

There's a suggestion to use display-authors, but coming up with a system to do that is even less fleshed out in terms of how we'd decide that. I'd rather just actually add the author.

There's also the option of producing two citations, one with and one without for the user to choose.

A funny edge case for my proposed hack: a param that looks like isbn2 might accidentally produce isbn3 and isbn4, depending on how the data is configured.

If you don’t display author-link3 until I filled author-link2, I can’t […]

That's a great example, thanks! Sounds like an argument for the + button.

templates where a trailing “1” doesn’t denote a sequence, but a year, a code (e.g. ISO 639-1) […]

True, these might behave oddly. We can't avoid this entirely, but we can limit the effect by using a clever regex like /\p{L}[\p{P}\p{Z}]*1$/u. This will not pick up the examples you listed.
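
For readers who want to try the rule outside JavaScript: Python's built-in `re` module has no `\p{…}` classes, so the sketch below reproduces the same test with Unicode general categories from `unicodedata`. The function name is hypothetical, and the code is meant as an equivalent of the regex above, not as part of any extension.

```python
import unicodedata


def looks_like_sequence_start(name: str) -> bool:
    r"""Approximate /\p{L}[\p{P}\p{Z}]*1$/u with Unicode categories."""
    if not name.endswith("1"):
        return False
    rest = name[:-1]
    # Skip any run of punctuation (P*) or separators (Z*) before the final "1".
    while rest and unicodedata.category(rest[-1])[0] in ("P", "Z"):
        rest = rest[:-1]
    # The character before that run must be a letter (L*).
    return bool(rest) and unicodedata.category(rest[-1])[0] == "L"
```

Names such as author1 or author-1 pass, whereas a name ending in 639-1 or in a year like 2001 does not, because the character before the trailing 1 (ignoring punctuation and spaces) is a digit rather than a letter.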

MediaWiki message-style editor$1-last […]

Either that or {{{1}}} to reuse a pattern users are familiar with. However, this can't replace the current parameter name for the same backwards-compatibility reason as above but must be part of the new syntax:

"author1": { "sequence": { "name": "author$1", "from": "1", "to": "5" } }

There is also another very common special case that goes like author, author2, author3, and so on without the 1. Luckily this is easy to solve with the proposed syntax:

"author": {},
"author2": { "sequence": { "name": "author$1", "from": "2", "to": "5" } }
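
How software might expand such an entry can be sketched as below. Note that "sequence" is only the syntax proposed in this comment, not an existing TemplateData key, and `expand_sequence` is a hypothetical helper that assumes plain decimal "from" and "to" values.

```python
from typing import Dict, List


def expand_sequence(sequence: Dict[str, str]) -> List[str]:
    """Expand a proposed "sequence" object into concrete parameter names.

    `sequence["name"]` contains the "$1" placeholder; "from" and "to" are
    inclusive bounds given as decimal strings, as in the examples above.
    """
    start, end = int(sequence["from"]), int(sequence["to"])
    return [sequence["name"].replace("$1", str(n)) for n in range(start, end + 1)]
```

For the special case above, the full list of names would be `["author"] + expand_sequence({"name": "author$1", "from": "2", "to": "5"})`, that is author, author2, author3, author4, author5. A placeholder in the middle, as in editor$1-last, works the same way.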

I’ll add Community-Tech team to the task as maintainers of TemplateWizard, maybe they have ideas :)

dmaza subscribed.

We are currently only passively maintaining template wizard. See https://meta.wikimedia.org/wiki/Community_Tech/Maintenance for more.