Inclusions should not be stored in the generic 'links' table
Closed, Resolved; Public

Description

Author: rowan.collins

Description:
It has been mentioned in various places that the current practice of storing
template inclusions in the 'links' table is unhelpful at best. It is at least
partly the cause of all the bugs I've marked as depending on this one, and
perhaps of many more.

I note that there has also been a suggestion that rather than creating a new
'templatelinks' / '{trans|in}clusionlinks' table, we should actually *merge* the
various existing links tables ('imagelinks', 'categorylinks', etc) and mark them
with a 'linktype' field instead.

Either way, I just thought it would be useful to track what issues depend on
such a change.
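
To make the merged-table proposal concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names (`links`, `l_from`, `l_to`, `l_type`) and the sample rows are illustrative assumptions, not MediaWiki's actual schema; the only point is that a type column lets a "what includes this template" query ignore ordinary links to the same page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical merged table: one row per (source page, target title, kind of link).
conn.execute("""
    CREATE TABLE links (
        l_from INTEGER NOT NULL,   -- ID of the page containing the link
        l_to   TEXT    NOT NULL,   -- title of the target page
        l_type TEXT    NOT NULL    -- 'link', 'inclusion', 'image' or 'category'
    )
""")
conn.executemany(
    "INSERT INTO links VALUES (?, ?, ?)",
    [
        (1, "Template:Stub", "inclusion"),   # page 1 contains {{Stub}}
        (2, "Template:Stub", "link"),        # page 2 contains [[Template:Stub]]
        (3, "Template:Stub", "inclusion"),
        (3, "Category:Stubs", "category"),
    ],
)

def pages_using(conn, title, link_type):
    """Return IDs of pages that reference `title` with the given kind of link."""
    rows = conn.execute(
        "SELECT l_from FROM links WHERE l_to = ? AND l_type = ? ORDER BY l_from",
        (title, link_type),
    )
    return [row[0] for row in rows]

transcluders = pages_using(conn, "Template:Stub", "inclusion")
linkers = pages_using(conn, "Template:Stub", "link")
print(transcluders, linkers)
```

A separate 'templatelinks' table would answer the same question without the `l_type` filter; which of the two layouts performs better is not something this sketch tries to settle.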


Version: unspecified
Severity: normal

Details

Reference
bz1065

Event Timeline

bzimport raised the priority of this task to Medium. Nov 21 2014, 8:00 PM
bzimport set Reference to bz1065.
bzimport added a subscriber: Unknown Object (MLST).

bugzillas+padREMOVETHISdu wrote:

Does this block bug 572 as bug 734 comment 4 seems to imply?

(In reply to comment #1)

> Does this block bug 572 as bug 734 comment 4 seems to imply?

It shouldn't, except for inclusions that are not in the Template: namespace (bug 734), or regarding changes to
templates that are used in templates used on some page.

bugzillas+padREMOVETHISdu wrote:

Should an attempt to transclude an inexistent page result in adding it to the
"normal links table" or the "transcluding pages table"?

rowan.collins wrote:

(In reply to comment #3)

> Should an attempt to transclude an inexistent page result in adding it to the
> "normal links table" or the "transcluding pages table"?

A good question - in such a case the page displays a "red link", and red links
are normally stored in the brokenlinks table; are they, in fact, stored in
brokenlinks now? Perhaps if the "single table with 'link_type' field" approach
were taken, there could be a 'link_is_broken' flag; I'm thinking this would be
handy for categories as well - allowing an easy query for "category containing
pages but no introductory text".

Which brings us to another question about this idea: if I understand correctly,
some of the '*links' tables use ID->name pairs, and others use ID->ID; if there
are good reasons for this difference, there will presumably need to be at least
2 tables, one of each type. I presume this is to do with efficiency or somesuch,
since *all* links are actually to "the page with this name" not "this page
whatever its name is", so ID->ID pairs have to be carefully kept up to date;
and obviously, any broken link can't refer to an ID, so that may put paid to the
idea of "WHERE link_type=<category> AND link_is_broken=1" etc.
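
One way around the objection above, sketched here purely as an assumption and not as anything the thread settled on, is to store every target by name and derive "brokenness" at query time instead of keeping a 'link_is_broken' flag. The `page` and `links` tables, their columns, and the sample data below are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables: existing pages, and ID->name link rows.
    CREATE TABLE page  (page_id INTEGER PRIMARY KEY, page_title TEXT UNIQUE);
    CREATE TABLE links (l_from INTEGER, l_to TEXT, l_type TEXT);
""")
conn.executemany("INSERT INTO page VALUES (?, ?)", [
    (1, "Main Page"),
    (2, "Apple"),
    (3, "Category:Fruit"),
])
conn.executemany("INSERT INTO links VALUES (?, ?, ?)", [
    (2, "Category:Fruit", "category"),  # category page exists
    (2, "Category:Food", "category"),   # category has members but no page text
    (1, "Apple", "link"),
    (1, "Pear", "link"),                # ordinary red link
])

def broken_targets(conn, link_type):
    """Titles referenced by links of `link_type` that have no row in `page`."""
    rows = conn.execute("""
        SELECT DISTINCT l.l_to
        FROM links AS l
        LEFT JOIN page AS p ON p.page_title = l.l_to
        WHERE l.l_type = ? AND p.page_id IS NULL
        ORDER BY l.l_to
    """, (link_type,))
    return [row[0] for row in rows]

broken_categories = broken_targets(conn, "category")
broken_links = broken_targets(conn, "link")
print(broken_categories, broken_links)
```

The join replaces the stored flag, so nothing needs updating when a missing page is later created; the cost is a lookup by title on every such query.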

bugzillas+padREMOVETHISdu wrote:

DISCLAIMER: I know absolutely nothing about MediaWiki, so if the following is
useless gibberish, please ignore.

I presume by ID you mean oldid, which I hear is not permanent, and may change on
deleting & undeleting the page. Hence any ID->ID tables must be changed to
ID->name. Which tables are those? Shouldn't a separate bug be filed for changing
them to ID->name (or at least augmenting them with another table that is
ID->name, plus a lot of ugly processing to decide which of the two tables must be
used at any time, since ID->ID may not always be up-to-date)? Also what happened
to the fact that latest versions of pages have no IDs?

rowan.collins wrote:

(In reply to comment #5)

> DISCLAIMER: I know absolutely nothing about MediaWiki, so if the following is
> useless gibberish, please ignore.

I prefer to spread the knowledge. ;)

> I presume by ID you mean oldid, which I hear is not permanent, and may change on
> deleting & undeleting the page. [...] Also what happened
> to the fact that latest versions of pages have no IDs?

No, this is not the 'old_id': each page has a 'cur_id' which refers to "whatever
the latest version of this page is" - e.g. the main page has cur_id=1; if
someone edits it, cur_id=1 still refers to the *current* main page, not some old
version of the main page. At the moment, the current content of every page is
stored in one database table, and all old versions in another; each edit creates
a new old version (with a new old_id) but the current version (with its cur_id)
is just over-written with what is now current. This will change "soon", so that
we can say "what is the current revision for page_id=1" in the same way we
already say "what old revisions are there for page_id=1".

If you think about it, this makes sense: a link is not to a page *at the time
you created the link*, but to the page in general, whenever you feel like
following the link.
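
The cur/old arrangement described in this comment can be modelled in a few lines of Python. This is only a toy illustration under stated assumptions: the `Wiki` class and its `create` and `edit` methods are invented for the example, and the two dictionaries merely stand in for the two database tables.

```python
from dataclasses import dataclass, field
from itertools import count

@dataclass
class Wiki:
    """Toy model: one store for current text, another for superseded revisions."""
    cur: dict = field(default_factory=dict)   # cur_id -> current text
    old: dict = field(default_factory=dict)   # old_id -> (cur_id, superseded text)
    _next_cur: count = field(default_factory=lambda: count(1))
    _next_old: count = field(default_factory=lambda: count(1))

    def create(self, text):
        """Add a new page and return its cur_id."""
        cur_id = next(self._next_cur)
        self.cur[cur_id] = text
        return cur_id

    def edit(self, cur_id, new_text):
        """Archive the existing text under a fresh old_id, then overwrite it."""
        old_id = next(self._next_old)
        self.old[old_id] = (cur_id, self.cur[cur_id])
        self.cur[cur_id] = new_text
        return old_id

wiki = Wiki()
main_page = wiki.create("first draft")
wiki.edit(main_page, "second draft")
wiki.edit(main_page, "third draft")
print(main_page, wiki.cur[main_page], sorted(wiki.old))
```

However many edits are made, `main_page` keeps the same cur_id and always resolves to the latest text, which is why a link stored against that ID follows the page rather than a particular revision.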

bugzillas+padREMOVETHISdu wrote:

OK, so does that mean ID->ID links always point to the "latest name" when pages
get moved? (I assume moving a page assigns the new name to the same ID but
creates another ID for the old name.) In that case probably ID->ID links make
more sense than ID->name.

rowan.collins wrote:

(In reply to comment #7)

> (I assume moving a page assigns the new name to the same ID but
> creates another ID for the old name.)

Yes, I believe that's how it works. The redirect at the old name is technically
a new page.

> In that case probably ID->ID links make more sense than ID->name.

Not really, because the text of the article still contains a link to the old
name, not the new one. So presumably all the entries in the links table get
changed to the new ID, because that's the article the links actually point to
(the redirect, at the old name). Hence why I guessed it was something to do with
efficiency, because I can't see how else it makes sense.

(In reply to comment #8)

> So presumably all the entries in the links table get
> changed to the new ID, because that's the article the links actually point to
> (the redirect, at the old name).

Right. This is a pain in the butt, as potentially a very large number of entries may need to be updated. At
some point we will probably dump the separate links and brokenlinks tables and have a single id -> name
space which can be left peacefully intact.
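
The cost being described can be seen in a small simulation. It assumes the move behaviour discussed in the comments above (the moved page keeps its ID and the redirect left at the old name gets a new one); the `move_page` helper and the data layout are hypothetical and exist only for this illustration.

```python
def move_page(pages, id_links, old_title, new_title):
    """Move a page in place and return how many ID->ID link rows were rewritten.

    pages:    dict mapping title -> page ID
    id_links: list of (source ID, target ID) rows
    """
    moved_id = pages.pop(old_title)
    pages[new_title] = moved_id              # the page keeps its ID under the new name
    redirect_id = max(pages.values()) + 1    # the redirect left behind is a new page
    pages[old_title] = redirect_id

    rewritten = 0
    for i, (src, dst) in enumerate(id_links):
        if dst == moved_id:                  # wikitext still says [[old_title]]
            id_links[i] = (src, redirect_id)
            rewritten += 1
    return rewritten

pages = {"Foo": 1, "A": 2, "B": 3}
id_links = [(2, 1), (3, 1), (3, 2)]                # ID->ID rows
name_links = [(2, "Foo"), (3, "Foo"), (3, "A")]    # the same links as ID->name rows

rewritten = move_page(pages, id_links, "Foo", "Bar")

# The ID->name rows were never touched, yet they resolve to the same targets.
resolved = [(src, pages[title]) for src, title in name_links]
print(rewritten, id_links, resolved)
```

With ID->ID rows, every incoming link to the moved page has to be rewritten; with ID->name rows, the move changes nothing in the link table.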

*** Bug 2409 has been marked as a duplicate of this bug. ***

gangleri wrote:

Hello!
I do not know if this is relevant here.
http://jadesukka.homelinux.org:8180/mediawiki15c/index.php?title=Betawiki:Special_pages&action=purge#see_also
shows neither [[Template:Specialpages/list_(template)]] nor
[[Template:Specialpages/list_(temp)]] under "Templates used on this page:", although both are used in
{{Specialpages/list_(template)|Specialpages/list_(temp)}}.
Neither in edit nor in preview.
Regards, Reinhardt [[user:gangleri]]

gangleri wrote:

Bug 2456 ("Templates used on this page:" is not updated (properly) while a page
is changed) refers here.
It is clear neither what the dependencies are nor what should be done first.

This is fixed in CVS for release in 1.6.