Create deprecateInterwiki.php to edit pages, converting interwiki links to external links
Open, Low, Public, Feature

Description

Currently, people are reluctant to get rid of obsolete interwiki prefixes, because of historical reasons (i.e. they are still used in a lot of pages). See http://lists.wikimedia.org/pipermail/wikitech-l/2014-January/074015.html

A maintenance script should be created to go through the iwlinks table, find all the instances in which the interwiki is used, and edit the pages to replace those interwiki links with external links. Sometimes a template, e.g. [[metawikipedia:Template:w]], is used, so it would be necessary to fix those templates manually, I suppose. Those template pages don't show up in the iwlinks table.
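
A minimal sketch of the text replacement step, in Python rather than as an actual MediaWiki maintenance script. The function name and the example URL are hypothetical; the `$1` placeholder follows the convention used for interwiki table URLs. It only handles plain `[[prefix:Target]]` and `[[prefix:Target|label]]` links and does not URL-encode the target or skip nowiki sections, so treat it as an illustration only.

```python
import re


def interwiki_to_external(wikitext, prefix, url_pattern):
    """Rewrite [[prefix:Target]] and [[prefix:Target|label]] as external links.

    url_pattern uses $1 as the placeholder for the target,
    e.g. "https://example.org/wiki/$1" (a made-up URL).
    """
    link_re = re.compile(
        r"\[\[\s*" + re.escape(prefix) + r"\s*:([^\]\|]+)(?:\|([^\]]*))?\]\]",
        re.IGNORECASE,
    )

    def replace(match):
        target = match.group(1).strip()
        label = match.group(2)
        # Without a usable label, fall back to "prefix:Target" as link text.
        if label is None or not label.strip():
            label = prefix + ":" + target
        # Simplification: only spaces are normalised, nothing is URL-encoded.
        url = url_pattern.replace("$1", target.replace(" ", "_"))
        return "[" + url + " " + label.strip() + "]"

    return link_re.sub(replace, wikitext)
```

Links using any other prefix are left untouched, so the function could be run once per retired prefix.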


Version: 1.23.0
Severity: enhancement

Details

Reference
bz60135

Event Timeline

bzimport raised the priority of this task to Low. Nov 22 2014, 2:59 AM
bzimport set Reference to bz60135.
bzimport added a subscriber: Unknown Object (MLST).

A script (e.g. [[mw:Extension:InterwikiMap]]) to delete prefixes from the interwiki table could run this script prior to the deletion. Likewise with deletions done manually through Special:Interwiki.

From [[metawikipedia:Talk:Interwiki map]]: "Admins, please allow consensus to form (or at least no objections to be raised over a period of a few days) before adding new entries, as once added they are hard to remove from the many copies around the world."

Somewhat related: in Parsoid HTML we store all interwiki links in expanded form. When an interwiki prefix is removed, those links automatically serialize to wikitext in the external link form. That already happens for systems that store HTML, like Flow.

For normal MediaWiki with wikitext as the main storage it would be useful to have a list of deprecated interwiki prefixes that should only be considered when converting from wikitext to HTML, but not when converting from HTML to wikitext. This would avoid the need to convert the entire history of a wiki at once, which is not very realistic for a site like Wikipedia.
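
To illustrate the asymmetry, here is a rough sketch in Python. It is not Parsoid code; the two dictionaries, both function names, and the seattlewiki URL are assumptions made for the example. Deprecated prefixes still resolve when going from wikitext to HTML, but only active prefixes are ever written back as interwiki links.

```python
# Prefix maps; $1 marks where the page title goes.
ACTIVE = {"mw": "https://www.mediawiki.org/wiki/$1"}
DEPRECATED = {"seattlewiki": "https://seattle.example.org/$1"}  # hypothetical URL


def resolve_for_html(prefix, title):
    """wikitext -> HTML: active and deprecated prefixes both resolve."""
    pattern = ACTIVE.get(prefix) or DEPRECATED.get(prefix)
    return None if pattern is None else pattern.replace("$1", title)


def serialize_to_wikitext(url, label):
    """HTML -> wikitext: only active prefixes are emitted as [[prefix:Title]]."""
    for prefix, pattern in ACTIVE.items():
        # Simplification: assumes $1 is the last part of the URL pattern.
        base = pattern.split("$1")[0]
        if url.startswith(base):
            return "[[" + prefix + ":" + url[len(base):] + "|" + label + "]]"
    # Deprecated or unknown targets fall back to a plain external link.
    return "[" + url + " " + label + "]"
```

With this split, old revisions keep rendering, and a page loses its deprecated links only the next time it is saved from HTML.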

System administrators of small wikis who want to do the deprecations manually can use [[mw:Extension:InterwikiUsage]] to see what conversions need to be done on what pages.
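
I have not checked how InterwikiUsage builds its report, so the following is only a sketch of the kind of per-page listing meant here, assuming it can be derived from the iwlinks table (columns iwl_from, iwl_prefix, iwl_title). An in-memory SQLite table stands in for the wiki database; the function name and the sample rows are made up.

```python
import sqlite3


def prefix_usage(conn, prefix):
    """Return (page_id, link_count) pairs for pages that use the given prefix."""
    rows = conn.execute(
        "SELECT iwl_from, COUNT(*) FROM iwlinks "
        "WHERE iwl_prefix = ? GROUP BY iwl_from ORDER BY iwl_from",
        (prefix,),
    )
    return [(page_id, count) for page_id, count in rows]


# Stand-in data: iwl_from is the ID of the page containing the link.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE iwlinks (iwl_from INTEGER, iwl_prefix TEXT, iwl_title TEXT)"
)
conn.executemany(
    "INSERT INTO iwlinks VALUES (?, ?, ?)",
    [
        (1, "seattlewiki", "Pike_Place"),
        (1, "seattlewiki", "Fremont"),
        (2, "mw", "Help:Links"),
        (3, "seattlewiki", "Ballard"),
    ],
)

usage = prefix_usage(conn, "seattlewiki")
```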

I made a similar comment at [[metawikipedia:Talk:Interwiki map#Replacing broken links]], and in addition I suggested that complex embedded code such as templates, modules, and MediaWiki messages should be skipped and hand-checked by a human, in case an automated edit breaks something.
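
The hand-check suggestion could be sketched roughly as below, in Python. The function name and the input shape are hypothetical. The namespace numbers are the usual MediaWiki IDs (8 for MediaWiki messages, 10 for Template, 828 for Module with Scribunto installed), but a given wiki may want a different list.

```python
# Namespaces whose pages are left for a human to review.
HAND_CHECK_NAMESPACES = {8, 10, 828}  # MediaWiki, Template, Module


def partition_pages(pages):
    """Split (title, namespace_id) pairs into automatic and manual work lists."""
    automatic, manual = [], []
    for title, namespace_id in pages:
        if namespace_id in HAND_CHECK_NAMESPACES:
            manual.append(title)
        else:
            automatic.append(title)
    return automatic, manual
```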

Noting it here to add myself to the CC list.

Wait, what problem would a server-side script like this solve that AWB could not already handle? I imagine this kind of operation might be expensive.

(In reply to TeleComNasSprVen from comment #6)

> Wait, what problem would a server-side script like this solve that AWB could
> not already handle? I imagine this kind of operation might be expensive.

This inspired me to write [[mw:Relative advantages of bots and server-side tools]] and post there a few musings about this topic. Usually I prefer server-side tools, because the bot frameworks I use (e.g. Peachy and Chris G's botclasses) keep becoming abandonware, although that could also occur with server-side tools, I guess.

(In reply to Nathan Larson from comment #0)

> Currently, people are reluctant to get rid of obsolete interwiki prefixes,
> because of historical reasons (i.e. they are still used in a lot of pages).

I don't think this is a sufficient problem statement.

> A maintenance script should be created to go through the iwlinks table, find
> all the instances in which the interwiki is used, and edit the pages to
> replace those interwiki links with external links.

You're proposing a solution to a problem that hasn't been clearly defined.

Interwiki links are a very old and established part of wikitext. They're used on millions of pages and in many other places (for example, [[mw:foo]] is frequently used in this bug tracker). Deprecating interwiki links, if that's decided, might mean telling people to not use them, but it probably wouldn't include forcibly removing the markup, I don't think.

(In reply to MZMcBride from comment #8)

> (In reply to Nathan Larson from comment #0)

> > Currently, people are reluctant to get rid of obsolete interwiki prefixes,
> > because of historical reasons (i.e. they are still used in a lot of pages).

> I don't think this is a sufficient problem statement.

What's insufficient about it? Based on whatever criteria they deem relevant, a system administrator might decide that an interwiki prefix is obsolete. Maybe a system administrator simply thinks certain prefixes, like seattlewiki, are cluttering up the interwiki table, because they'll only be used a handful of times. When the decision is made to retire a prefix, it could be good to migrate the existing uses over to external links.

Whether bots or maintenance scripts are the best way to do that is another question; see [[mw:Relative advantages of bots and server-side tools]]. If a maintenance script is made available, then the system administrators can choose one or the other. That script can be included, or not, in the core, but that's a separate decision.

> > A maintenance script should be created to go through the iwlinks table, find
> > all the instances in which the interwiki is used, and edit the pages to
> > replace those interwiki links with external links.

> You're proposing a solution to a problem that hasn't been clearly defined.

> Interwiki links are a very old and established part of wikitext. They're
> used on millions of pages and in many other places (for example, [[mw:foo]]
> is frequently used in this bug tracker). Deprecating interwiki links, if
> that's decided, might mean telling people to not use them, but it probably
> wouldn't include forcibly removing the markup, I don't think.

With reference to the bug tracker, why would Wikimedia get rid of mw: as an interwiki link? It's not obsolete, for Wikimedia's purposes. I don't see how making a script available for making automated edits equates to "forcibly" removing anything.

This may not be an appropriate use of UNCONFIRMED; it seems like you're using it to mean "possibly INVALID" or "perhaps should be WONTFIXed". The proper use of UNCONFIRMED seems to me to be more like "hasn't been tested, but could end up being a WORKSFORME".

Aklapper changed the subtype of this task from "Task" to "Feature Request". Feb 4 2022, 11:13 AM
Aklapper removed subscribers: GWicke, leucosticte.