Page MenuHomePhabricator

Improve interface for MediaHandlers to add JavaScript
Open, LowPublicFeature

Description

Current situation:
A File object is not connected to an output context. A File has a "transform" method. The transform method returns a MediaTransformOutput object. This object only has a toHtml(), the parser directly calls toHTML on image objects from inside the parser (or actually linker). The FilePage object does something similar for the File history, as do a few Special pages.

Currently MediaHandlers that need JS, add these modules using the method parserTransformHook, which one could either directly add modules to the $parser object, or add an OuputHook, to mess with $wgOut later on. Special pages do this on title matching in the beforedisplay hook.

This is not that good an interface. Besides being rather indirect way to add resource loader modules, its (almost) impossible to trigger this from a special page where we call transform directly, and don't have a $parser object that is parsing something.

I'm not sure what a better interface would be. My first thought would be that responsibility for this should not be in MediaHandler, but instead be in the MediaTransformObject. The mto could at least have a method getModules() which special pages could call, or perhaps something like $mto->addModulesToOutput( $this->getOutput() ); Still less then ideal since people will be bound to forget to call it as most media types don't need js, but still would be much better than current situation.


Version: 1.23.0
Severity: enhancement

Event Timeline

bzimport raised the priority of this task from to Low.Nov 22 2014, 2:22 AM
bzimport set Reference to bz58478.
bzimport added a subscriber: Unknown Object (MLST).

A couple possible scenarios:

  1. Always load just enough code to check for if you need to do runtime JS transformations

In this scenario, every page that can show wikitext should have at least a tiny JS module loaded that hooks into an event that is called on load and again every time new parsed wikitext output is added to a page (such as by a preview, or a dialog box, or whatever).

This might be a very tiny piece of code that just asynchronously fires off a load of a fuller module, such as something that adds player controls to a video or sets up interactive rendering of a molecule on a canvas, or whatever, when it encounters its targets (and otherwise does nothing).

Advantages:

  • can use same code & event for initial load and new loads?
  • defers extra media modules until use
  • doesn't require recording modules used per page
  1. Implement any media viewers more complicated than a simple element without scripting via an <iframe>

Advantages:

  • the iframe content rendering worries about module loading, the parent page can ignore it totally
  • makes exposing those media for external embedding trivial, as we'd be doing the exact same thing we do in our content context

Using iframes would also decrease our attack surface significantly, if we use a separate domain for them. Plus it would make it simple to treat local and external files in a uniform way. That would restrict what the JS code is allowed to do, but a player probably doesn't need access to anything outside its own frame anyway.

(In reply to comment #2)

Using iframes would also decrease our attack surface significantly, if we
use a
separate domain for them. Plus it would make it simple to treat local and
external files in a uniform way. That would restrict what the JS code is
allowed to do, but a player probably doesn't need access to anything outside
its own frame anyway.

Hmm, that might get in the way of our current click the video and get a pop up dialog that has the video on it

mdale wrote:

Within kaltura proper we do sandbox the player in an iframe, but we still make use of parent javascript access for synchronous api ( postMessage is asynchronous ) Also HTML fullscreen on iPads and IE's we need parent page access to adjust the iframe layout to take up full browser page space.

The kaltura player uses a friendly ( same domain ) iframe, but this does not reduce attack surface, since you can just jump up to the parent frame and run any JS you want, furthermore you would have to structure things to server the player iframe from another domain, to have any effect on 'attack surface'.

Also, you need to do tricky iframe injection strategies [1] to support one click play on mobile chrome and iOS ( assuming we ever care about single click to play user experience )
[1] https://github.com/kaltura/mwEmbed/blob/master/kWidget/kWidget.js#L935
Thouse injection strategies only work for same domain iframes.

And finally safari blocks cross domain iframe cookies, so any personalization / customization / private media playback has to be structured post "click" in iframe, or via url parameterization.

Having a separate rendering / entry point, has its own sets of risks, that probably outweigh advantages of cross domain iframing the player.

I recommend we use normal precautions of localization string and api based playback ( no more video payload injection )

(In reply to comment #4)

Within kaltura proper we do sandbox the player in an iframe, but we still
make
use of parent javascript access for synchronous api ( postMessage is
asynchronous ) Also HTML fullscreen on iPads and IE's we need parent page
access to adjust the iframe layout to take up full browser page space.

The kaltura player uses a friendly ( same domain ) iframe, but this does not
reduce attack surface, since you can just jump up to the parent frame and run
any JS you want, furthermore you would have to structure things to server the
player iframe from another domain, to have any effect on 'attack surface'.

The point here is more to centralize js - js gets loaded with the iframe so that we could avoid including the TMH loader js on all page loads, well at the same time not have to keep track of which pages have a video on them.

Reviving this & adding projets, as the notion came up while working on other bugs recently. :)

TheDJ set Security to None.

How about:

MediaTransformOutput\getJsConfigVars()
MediaTransformOutput\getModules()
MediaTransformOutput\getModuleScripts()
MediaTransformOutput\getModuleStyles()

And then have:

ParserOutput::addTransformOutputMetadata(MediaTransformOutput transformOutput )
OutputPage\addTransformOutputMetadata (MediaTransformOutput transformOutput )

which copies the modules and adds it to a context ?

How about:

MediaTransformOutput\getJsConfigVars()
MediaTransformOutput\getModules()
MediaTransformOutput\getModuleScripts()
MediaTransformOutput\getModuleStyles()

And then have:

ParserOutput::addTransformOutputMetadata(MediaTransformOutput transformOutput )
OutputPage\addTransformOutputMetadata (MediaTransformOutput transformOutput )

which copies the modules and adds it to a context ?

Sounds good to me. For completeness we might even want something like

OutputPage::addMedia( MediaTransformOutput $transformOutput );

For the case where you're in a special page, and you want to just add an image to the output, including both its html, and its js.

While playing with this, I suddenly become aware again of the fact that the Linker is basically also an incredibly outdated piece of static function string generators....

We'll have to touch that invasively, cause in it's current state it is just not maintainable enough to add this on to it as well

This same problem also exists in TablePager

Change 249593 had a related patch set uploaded (by TheDJ):
[WIP] Add modules to MediaTransformOutput

https://gerrit.wikimedia.org/r/249593

bawolff: Maybe we need a new interface CanEmitModules
legoktm: yes

I think we have an interface of that nature -- it's the ParserOutput class. :) Do we just need an easier way to encapsulate "bit of HTML plus some modules" into ParserOutput instances?

I think we could extend MediaTransformOutput so in addition to keeping the toHtml() method which returns a raw HTML string, we add a toParserOutput() (?) method that returns a ParserOutput instance with the HTML *and* any necessary modules/metadata attached.

Then I guess we need a flavor of Linker::makeImageLink() that returns a ParserOutput instead of an HTML, as well. Usage sites that are emitting directly to an OutputPage can call $out->addParserOutput() on it, while those constructing HTML to fit into a bigger ParserOutput (like the Parser!) can concat in the HTML and extract the modules/etc via .....

Hmm.. There's $po->addOutputPageMetadata() which does what I want from an OutputPage, but not from another ParserOutput. How convenient. ;) Could easily add one for merging POs though.

Hmm, but ParserOutput is not an additive object though. It gets constructed with the full output by the Parser, and is not edited...

Change 250274 had a related patch set uploaded (by TheDJ):
[WIP] Add modules to MediaTransformOutput

https://gerrit.wikimedia.org/r/250274

Change 250274 abandoned by TheDJ:
[WIP] Add modules to MediaTransformOutput

https://gerrit.wikimedia.org/r/250274

Change 249593 abandoned by TheDJ:
[WIP] Add modules to MediaTransformOutput

https://gerrit.wikimedia.org/r/249593

Slightly related, this improved 'content' blob should then be fed to T155375: Extract thumbframe etc from parser into EmbeddedContentRenderer for positional framing inside the rest of the content.

Ebe123 subscribed.

All patches abandoned; no patches to review.

Aklapper changed the subtype of this task from "Task" to "Feature Request".Feb 4 2022, 11:13 AM