
Add #wrap parser function
Open, Low, Public, Feature

Description

Author: John

Description:
A common pattern in en.wikipedia.org templates is using #if to determine if a parameter is non-empty before wrapping the value with a prefix and/or suffix and emitting it. Here's one example from the Citation/core template:

{{#if: {{{format|}}} | ({{{format}}})}}

In the example above, if the format parameter is not empty, the template emits a space and an open paren (a prefix), the value of the format parameter, and a closing paren (the suffix). This could be replaced by a new #wrap function that accepts three parameters: prefix, value, and suffix.

{{#wrap: (|{{{format|}}}|)}}

If value is empty, there is no output. If value is non-empty, the function emits the concatenation of the prefix, value, and suffix. prefix or suffix may be empty; if both are empty, the function should still work, though using it is unnecessary in that case. If the suffix parameter is omitted, it should default to the empty string. Omitting the prefix or value parameter is invalid.
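The semantics described above can be sketched in a few lines. This is only an illustrative model in Python, not ParserFunctions code; the function name wrap and the choice to treat a whitespace-only value as empty (mirroring how #if treats its test string) are assumptions, not part of the proposal.

```python
def wrap(prefix: str, value: str, suffix: str = "") -> str:
    """Model of the proposed #wrap: emit prefix + value + suffix, or nothing."""
    # Assumption: a whitespace-only value counts as empty, as #if does
    # for its test string. The proposal itself only says "empty".
    if value.strip() == "":
        return ""
    # prefix and suffix are emitted unchanged; suffix defaults to "".
    return prefix + value + suffix


print(repr(wrap(" (", "PDF", ")")))  # value present: wrapped
print(repr(wrap(" (", "", ")")))     # value empty: no output
```

Omitting prefix or value raises a TypeError in this model, which loosely corresponds to the "invalid" case in the proposal.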

Building the test-for-empty into #wrap simplifies the code and reduces the parameter references by one. I suspect that's a marginal improvement, but this pattern is very common in templates and a small saving in lots of places would add up. In template Citation/core alone there are probably dozens of places where #wrap could replace #if.

There are more involved use cases that follow this pattern but reference multiple parameters in the #if test. In those cases, the syntax load is considerable as there are either concatenated or nested parameter references. A #wrap function would reduce the source code, and as the number of parameters used increases, the reduction gets closer and closer to 50% per call. In the long run, that may be a better reason to implement #wrap than a performance improvement.
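To make the "closer and closer to 50%" estimate concrete, here is a rough sketch that builds both forms for n parameter references and compares their lengths. The parameter names (p0, p1, ...) and the fixed " (" and ")" wrapper are made up for illustration, and source length is only a proxy for the saving.

```python
def refs(params):
    # Concatenated parameter references, e.g. {{{a|}}}{{{b|}}}
    return "".join("{{{%s|}}}" % p for p in params)


def if_form(params, prefix=" (", suffix=")"):
    # The existing pattern: every reference appears twice, once in the
    # test and once in the output (the "|" default is kept in both
    # places here for simplicity).
    r = refs(params)
    return "{{#if: %s | %s%s%s }}" % (r, prefix, r, suffix)


def wrap_form(params, prefix=" (", suffix=")"):
    # The proposed pattern: every reference appears once.
    return "{{#wrap:%s|%s|%s}}" % (prefix, refs(params), suffix)


def saving(n):
    # Fraction of source characters saved by the #wrap form.
    params = ["p%d" % i for i in range(n)]
    return 1 - len(wrap_form(params)) / len(if_form(params))


for n in (1, 5, 50):
    print(n, round(saving(n), 3))
```

In this model the saving grows with n and stays below one half, since the fixed function-call overhead never disappears.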

I am not wedded to the name, so if someone suggested a better one that would be fine with me.


Version: unspecified
Severity: enhancement

Details

Reference
bz22701

Event Timeline

bzimport raised the priority of this task to Low. Nov 21 2014, 11:11 PM
bzimport added a project: ParserFunctions.
bzimport set Reference to bz22701.
bzimport added a subscriber: Unknown Object (MLST).

As a performance enhancement, I think this would be pretty negligible in cases like

{{#if: {{{format|}}} |  ({{{format}}}) }}

however, it might have more merit for cases where {{{format|}}} is replaced on both sides with something more complicated, like its own nested conditional. Ultimately the effect should be easy to benchmark against some of EN's more complicated code.

I agree that making wiki source more legible may be its own benefit, though strictly speaking you don't need a parser function for that. You could simply define a normal template that translates

{{wrap | pre | value | suffix}}

into

{{#if: {{{value|}}} | {{{pre|}}}{{{value}}}{{{suffix|}}} }}

Doing that with a regular template would generate a small performance penalty in the general case, but again I'm not sure how large that would be even if its use were extensive. It is a case where some benchmarks would be helpful.

Off-hand, I'd say the case for implementing something like this as a parser function is pretty marginal, though I could be convinced otherwise if the performance impact on things like Citation/core is larger than I expect.

Ultimately though, I'd say the main path to performance enhancements with respect to complex templates is to restructure the parser cache so that when a page is edited we can look at the prior version and bring forward the results of any template calls that weren't changed. Re-evaluating every template on every edit is far slower than things need to be when most of the time very little of the page has changed.
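As a rough illustration of the idea of bringing forward unchanged template calls, here is a minimal sketch in Python. It is not how the MediaWiki parser cache works; the CallCache class, the (name, arguments) key, and the expand callback are all hypothetical, and it ignores the hard part, namely invalidating results when a template's own definition changes.

```python
from typing import Callable, Dict, List, Tuple

Call = Tuple[str, Tuple[str, ...]]  # (template name, arguments)


class CallCache:
    """Reuse results of template calls that did not change between edits."""

    def __init__(self, expand: Callable[[str, Tuple[str, ...]], str]):
        self._expand = expand
        self._results: Dict[Call, str] = {}  # results from the prior version
        self.expansions = 0                  # how often we really expanded

    def render(self, calls: List[Call]) -> List[str]:
        fresh: Dict[Call, str] = {}
        output = []
        for call in calls:
            if call not in fresh:
                if call in self._results:
                    # Unchanged since the prior version: bring it forward.
                    fresh[call] = self._results[call]
                else:
                    self.expansions += 1
                    fresh[call] = self._expand(*call)
            output.append(fresh[call])
        # Keep only calls present in the current version of the page.
        self._results = fresh
        return output


cache = CallCache(lambda name, args: name + ":" + ",".join(args))
cache.render([("cite", ("a",)), ("cite", ("b",))])  # first save: 2 expansions
cache.render([("cite", ("b",)), ("cite", ("c",))])  # edit: only "c" is new
print(cache.expansions)
```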

John wrote:

Thanks for taking a look at this.

AlexZ made a comment on bug 22134 and said that "my guess is that the cite templates are slow due to feature bloat and over-standardization." Was he referring to page editing only, and not page viewing (except right after a save, of course)?

I made an assumption that template interpretation had a negative effect on rendering time based on his comment and other comments made by WP editors who complained that when a page uses a lot of templates, esp. citation templates, the page takes a long time to render. Based on your comments above, template interpretation isn't the issue there, though it's probably an issue when editing the page. Citation/core adds extra markup, but given how fast browsers deal with that markup, I'd be surprised if the browser side of things made any difference except in edge cases.

I think the code simplification might make this an idea worth pursuing. I probably wouldn't use a template as the instances where it would help are already generating complaints about template performance, so even a marginal change in the wrong direction isn't a good idea.

It's a page rendering issue definitely. So it doesn't affect anonymous reading (except immediately after an edit is saved).

What many editors don't appreciate is that some of the options in user preferences require that the page be re-rendered just for them (unless someone with an identical option set has already loaded it since the last edit). This has the effect that viewing complex pages can appear to be slow for logged-in editors. However, it won't generally appear that way to logged-out editors.
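The effect described here can be pictured as a cache whose key includes the rendering-relevant preferences. The sketch below is a simplified Python model, not the actual parser cache; the RenderCache class and the "thumbsize" option name are invented for illustration.

```python
class RenderCache:
    """Cache rendered pages per (page, option set)."""

    def __init__(self, render):
        self._render = render
        self._store = {}
        self.misses = 0  # number of full re-renders

    def get(self, page, options):
        # Readers with the same option set share one cached rendering.
        key = (page, tuple(sorted(options.items())))
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._render(page, options)
        return self._store[key]

    def invalidate(self, page):
        # An edit discards every cached rendering of that page.
        self._store = {k: v for k, v in self._store.items() if k[0] != page}


cache = RenderCache(lambda page, opts: page + str(sorted(opts.items())))
cache.get("Moon", {})                      # logged-out reader: rendered once
cache.get("Moon", {})                      # next logged-out reader: cache hit
cache.get("Moon", {"thumbsize": "300px"})  # unusual preference: re-render
print(cache.misses)
```

Logged-out readers all share the default option set, which is why they rarely pay for a render in this model.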

Rewiring the parser caching scheme is probably the most practical approach to helping improve the user experience around complex pages. Of course that effort is orthogonal to the question of how to improve the templates themselves.

In addition to that, many of the programmers tend to feel we should stop messing around the edges of template and parser function syntax and design a template alternative that deals more naturally with programmatic elements (including for loops, variable assignments, etc.) and has a syntax that isn't totally insane. However, redesigning template programming (or more likely providing a better alternative) is a massive problem, so there hasn't been a lot of progress.

John wrote:

I did not know about the performance impact of being logged in. I logged out and visited a couple pages with lots of citations and saw the effect you described.

I also looked at the citation templates again and I'd like to try to convince you (and others) that #wrap is worth it. I certainly understand that an algorithmic improvement (such as the caching mechanism you described) will yield a much bigger improvement than fine-tuning template code, but algorithmic improvements tend to be a lot more work, and they are more risky. On the other hand, #wrap should be easy to implement and easy to test. It has to yield some improvement, so worst-case, it provides a small performance improvement and a nice-to-have feature in terms of code clarity, and any time saved at all is good.

I'd love to see a redesign of the template facility, perhaps implemented as a separate facility. (Leave templates in place, but add a new facility with syntax that isn't "totally insane" <g>.) I presume one hard part is designing a powerful facility that integrates with a caching system. I am not sure if there are any content management systems with a similar facility, but I recall that XSLT was designed as a declarative language in part to enable progressive rendering and specifically, to allow transforming part of a document without transforming the rest. I can't say that it succeeded; I used XSLT a lot but I am not familiar with the innards of any XSLT processors.

Thanks again for engaging this issue. If there is a place where WP technologists are discussing these sorts of issues, I'd love to know about it and participate.

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:01 AM
Aklapper removed a subscriber: wikibugs-l-list.