
Reasonably efficient interwiki transclusion
Open, LowPublicFeature

Assigned To
None
Authored By
bzimport
May 13 2007, 1:22 AM
Referenced Files
F36913472: patch.diff
Mar 15 2023, 8:34 PM

Description

Author: ayg

Description:
Currently $wgEnableScaryTranscluding seems to just pull templates from an HTTP
request and cache them for some arbitrary period (probably, not having looked
closely, indefinitely if the page isn't edited, since I bet nothing invalidates
the parser cache). It goes without saying that this is relatively slow and
unreliable; therefore the feature is not enabled on Wikimedia sites. A shared
page database is necessary, and for timely invalidation of caches, the central
repository must also know about the places that are sharing with it.
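The requirement that the central repository know who is sharing with it can be pictured as a cross-wiki usage registry. The following is only a minimal sketch under stated assumptions: the class name `CentralRepository`, its methods, and the wiki and page names are hypothetical and are not part of MediaWiki or of any existing patch.

```python
from collections import defaultdict


class CentralRepository:
    """Hypothetical central wiki that records which client pages use its templates."""

    def __init__(self):
        # template title -> set of (wiki, page) pairs that transclude it
        self._users = defaultdict(set)
        # log of every (wiki, page) whose cache we asked to be purged
        self.purged = []

    def register_use(self, template, wiki, page):
        """Called when a client wiki parses a page that transcludes `template`."""
        self._users[template].add((wiki, page))

    def on_edit(self, template):
        """Called when `template` is edited; returns the client pages to purge."""
        targets = sorted(self._users.get(template, ()))
        self.purged.extend(targets)
        return targets


repo = CentralRepository()
repo.register_use("Template:Infobox", "enwiki", "Paris")
repo.register_use("Template:Infobox", "frwiki", "Paris")
print(repo.on_edit("Template:Infobox"))
```

Without such a registry, a client wiki can only guess when its cached copy has gone stale, which is the weakness of the HTTP-and-cache approach described above.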


Version: unspecified
Severity: enhancement
See Also:
T41610: Scribunto should support global module invocations

Details

Reference
bz9890

Related Objects

Event Timeline

bzimport raised the priority of this task from to High. Nov 21 2014, 9:40 PM
bzimport set Reference to bz9890.
bzimport added a subscriber: Unknown Object (MLST).

RSYQFIOJGWZA wrote:

Do you mean to write that caching isn't down to an art form yet? :)

A shared page database would void the point of 'interwiki' transclusions, which basically mean "This data isn't mine, but I want to use it" so that's why it's defined as "Scary" transclusion and evilly cached.

A good solution for local wikis sharing content in the way you want would probably be an extension meant for transwiking content. That's actually something I'm working on, though I need a hook in the parser or preprocessor that will let me modify data grabbed from a template before it's parsed. ParserBeforeStrip is good for doing that to the content of the current page, but there is no way to do that at the start of transcluded text.

shardsofmetal wrote:

Would it be possible to allow sending variables to the templates, like you can do with local templates? Also, I didn't know if the API would be a more efficient way to do this, assuming that the functionality hasn't already been modified to do it that way. Maybe there could be something in the api that allowed variable input for templates, then it returns the appropriate content. That way, when trying to transclude a template from Wikipedia that requires variables from a third-party wiki, even when using the correct syntax sending the variable information, you wouldn't just get the documentation page returned, telling you that you need variables to correctly use the template.

(In reply to comment #3)

> Would it be possible to allow sending variables to the templates, like you can
> do with local templates? Also, I didn't know if the API would be a more
> efficient way to do this, assuming that the functionality hasn't already been
> modified to do it that way. Maybe there could be something in the api that
> allowed variable input for templates, then it returns the appropriate content.

I think what you're looking for is:

api.php?action=parse&text={{templatename|param1|param2|etc.}}

Bug 14024 is related, and probably a dependency if the API is used this way.
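As a rough illustration of the `action=parse` request suggested above, the sketch below only builds the request URL with the template call properly percent-encoded; it does not contact any server. The helper name `build_parse_url` and the `example.org` endpoint are assumptions made for the example, while `action`, `text`, `contentmodel` and `format` are standard MediaWiki API parameters.

```python
from urllib.parse import urlencode


def build_parse_url(api_base, template, *params):
    """Build an action=parse URL that expands {{template|param1|...}} remotely."""
    # Assemble the template call exactly as it would be written in wikitext.
    wikitext = "{{" + "|".join([template, *params]) + "}}"
    query = urlencode({
        "action": "parse",
        "text": wikitext,
        "contentmodel": "wikitext",
        "format": "json",
    })
    return f"{api_base}?{query}"


# Hypothetical endpoint; substitute the api.php URL of the wiki being queried.
url = build_parse_url("https://example.org/w/api.php", "templatename", "param1", "param2")
print(url)
```

Encoding matters here because the braces and pipes of the template call are not safe to send raw in a query string.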

This isn't really an enhancement; it's a fix to bug 1126, which was an enhancement.

peter017 wrote:

Could somebody sum up the difficulties in solving this problem? If I understand correctly, the only need is to create a dynamic list of shared templates and store it in a database.

What are the underlying difficulties? What are the differences between including templates and including images? Does it require a large amount of code? As I really want this feature enabled on WMF, I would like to know if I can be of any help, but I need more information for this...

ayg wrote:

I don't think there are any overwhelming difficulties. It should be pretty much the same as including images. You'd need to coordinate templatelinks across wikis, but I assume that this is already done for imagelinks on Commons. Someone does need to sit down and write the code, and get it reviewed -- it will probably be a substantial amount of code.

peter017 wrote:

I have proposed this task as my project for Google Summer of Code 2010 (see http://www.mediawiki.org/wiki/User:Peter17#Project_1)

peter017 wrote:

I have filed a student proposal on time, but nobody has reviewed it yet :-( and I got no mentoring proposal...

It is available here: http://socghop.appspot.com/gsoc/student_proposal/private/google/gsoc2010/peter17/t127055739388 for those who have a GSoC account.

I think this would be a good opportunity for this bug to be solved... If anyone is interested in mentoring me for this project, I would be very pleased!

Peter has completed his work; a small amount of additional work is needed. It would be great if this could be accomplished as a reasonable priority. The project recently mentioned on Signpost would work well with this, as would:

  • sharing Persondata
  • sharing interwiki link lists on articles (saving tens of millions of edits by interwiki bots)
  • sharing common templatology

I've talked to Peter and Roan about getting this merged. They will have time in the next few weeks.

Don't know why I marked this "Low" since it is something that we (were?) planning to merge. Re-raising to normal.

Setting high importance. Looks much better.

(In reply to comment #11)

> I've talked to Peter and Roan about getting this merged. They will have time
> in the next few weeks.

What happened here?

Mark: Do you remember? (comment 11)

I'm a little confused. This bug seems to imply that retrieving foreign wiki pages directly from the database is a *bad* thing. However, on http://www.mediawiki.org/wiki/User:Peter17/Reasonably_efficient_interwiki_transclusion it talks about fetching the wikitext from a shared database. Is it one or the other?

Second question: why the difference between {{wiki:test}} and {{raw:wiki:test}}? Is there some sort of parsing difference that I'm not aware of? Would there be a reason to prefer one over the other?

And finally, a third question based on the second question: if using {{raw:wiki:test}}, isn't all the parsing and template parameter whatnot done locally, thus eliminating the problems talked about in the aforementioned link (with the exception of the caching issue)?

I might pick up this issue depending on the answer to these questions.

Aryeh / Peter / Rich: Can any of you answer the questions in comment 16?

peter017 wrote:

  1. IIRC, the code I worked on provided both ways to fetch the templates (API request and shared DB). A shared DB is the most efficient way.
  2. I don't remember. Where have you seen {{raw:wiki:test}}?
  3. I don't know.

Links to the code I developed:

To go slightly off-topic for a minute, I've had the thought for several years now (as far back as when I was active on Wikipedia, in 2009 or 2010 or so, I think) to call this "interclusion"; does this seem like a reasonable neologism or is it just a dumb idea? :^)

> to call this "interclusion"

I think that might be difficult to translate, and be confusing for new users (as "transclusion" is - see T57434).

Dumb idea, then; I hadn't even stopped to consider translation (and, apparently, I've had the mistaken assumption for quite a while that basically everyone picked up on the meaning of "transclusion" as fast as I did).

Is someone working on this issue? What are the main blockers? Any implementation ideas? Also, would it work better if the MW API were used as the backend to perform all the needed operations? Or Parsoid?

> Is someone working on this issue? What are the main blockers?

MITRE is using interwiki transclusion of Wikipedia articles; they could be asked what issues they have found, if any. I think the main issues this task is about relate to using crosswiki templates and the like.

I was told that cache invalidation is the main issue: e.g. when a page is updated on the remote wiki, it needs to notify all the pages that transclude it that it has been updated. Plus this has to be recursive :)
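The recursive part can be sketched as a breadth-first walk over a reverse-dependency map. This is a minimal sketch, assuming a hypothetical `transcluded_by` mapping from each page to the pages that transclude it directly; nothing here reflects how MediaWiki actually stores templatelinks, and the page names are made up.

```python
from collections import deque


def pages_to_invalidate(transcluded_by, changed):
    """Return every page that directly or indirectly transcludes `changed`."""
    seen = {changed}          # guards against cycles in the transclusion graph
    queue = deque([changed])
    while queue:
        page = queue.popleft()
        for user in transcluded_by.get(page, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    # The edited page itself is already fresh, so leave it out of the result.
    return seen - {changed}


# Hypothetical graph: a Commons template is used by an enwiki template,
# which in turn is used by two articles on different wikis.
graph = {
    "commons:Template:A": ["en:Template:B"],
    "en:Template:B": ["en:Page1", "fr:Page2"],
}
print(sorted(pages_to_invalidate(graph, "commons:Template:A")))
```

The traversal itself is simple; the hard part raised in this thread is that the edges cross wiki boundaries, so the map has to be kept somewhere every participating wiki can reach.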

So is this not similar to the Wikidata usage tracking? Is there anything that can be learned or reused from their experience updating foreign pages?

There is also the experience of the xwiki use of Meta user pages as a de facto xwiki user page.

On the off chance it isn't known, interwiki transclusion is used on Wikia, where pages on Community Central can be transcluded onto any Wikia wiki. This is normally used for user pages (e.g. for staff). As far as I know, this is the only enabled case, though; I'm not aware of any other Wikia wikis whose content can be transcluded onto a different wiki.

I'm tempted to mark this fixed. ;)

I propose we use the method from T122086, at least as a stopgap, as it requires no server-side code changes.

Aklapper lowered the priority of this task from High to Low. Mar 11 2018, 12:15 PM
Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:01 AM

Here is the diff file extracted from the Google Code Archive source Peter originally stored his work in if anyone would ever want to work on integrating it: