
Retrieve specialPageAliases through API Query Meta
Closed, Resolved (Public)

Description

Author: lars

Description:
The MediaWiki API can retrieve metainfo for namespaces and namespacealiases.
It would be nice if it could also retrieve the translated
special page names, called specialPageAliases in the PHP source code.

I suggest this could be implemented as meta=specialpagenames

Today, there is a long list of translated special page names on
http://www.mediawiki.org/wiki/Special_page_names

I'd like that to become part of the functionality specified at
http://www.mediawiki.org/wiki/API:Query_-_Meta

Some of the same phrases are available through Allmessages,
but the special page name 'Whatlinkshere' => array( 'Verweisliste' )
doesn't correspond to

http://de.wikipedia.org/w/api.php?action=query&meta=allmessages&ammessages=Whatlinkshere
which returns the translation "Links auf diese Seite".


Version: unspecified
Severity: enhancement

Details

Reference
bz13541

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 10:07 PM
bzimport set Reference to bz13541.

I don't see why you'd want to use the API for this:

First, the idea of the API is to avoid the UI. When you're using special pages, that means you're using the UI.

Second, this information is very easy to get: do an HTTP HEAD request for http://de.wikipedia.org/wiki/Spezial:Whatlinkshere and you'll receive a 302 to http://de.wikipedia.org/Spezial:Linkliste . The Spezial: prefix is available from meta=siteinfo&siprop=namespaces, or by issuing another HEAD for http://de.wikipedia.org/Special:Whatlinkshere , which will 301 to Spezial:Whatlinkshere.

Third, because of the 301 and 302 mentioned above, a link to http://de.wikipedia.org/wiki/Special:Whatlinkshere will work just fine.
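The HEAD-request probing described above could be sketched roughly as follows. This is only an illustration: the helper names `head_redirect` and `title_from_location` are made up here, the `/wiki/` article path is an assumption about the site's configuration, and the exact status codes returned depend on the server.

```python
import http.client
from urllib.parse import quote, unquote, urlsplit


def head_redirect(url):
    """Send a single HEAD request and return (status, Location header or None).

    http.client never follows redirects on its own, so a 301/302 is
    reported to the caller instead of being resolved silently.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=10)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=10)
    try:
        # Percent-encode any non-ASCII characters in the path.
        conn.request("HEAD", quote(parts.path or "/", safe="/:%"))
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def title_from_location(location, article_prefix="/wiki/"):
    """Extract the page title from a redirect target, or None if the
    target does not live under the assumed article prefix."""
    path = unquote(urlsplit(location).path)
    if not path.startswith(article_prefix):
        return None
    return path[len(article_prefix):]
```

Note that this costs one request per special page name, which is fine for a single link but awkward when all aliases are needed at once.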

Marking as WONTFIX as I don't really see a use case for this (other than linking to a special page, but then the English names always work as outlined above). Please REOPEN if you do see one that I totally forgot about.

lars wrote:

Here's my application: I'm analyzing the page view logs published
by Domas Mituzas, http://dammit.lt/wikistats/

These logs contain a mix of URLs in English (e.g. Special:Whatlinkshere)
and the site's own language (e.g. Spezial:Verweisliste).
I want to add these up, as they are just aliases for the same special page.

To get a list of the namespaces and namespacealiases used on the
site (e.g. Spezial = Special and WP = Wikipedia), I can use the API.

To get a list of the specialPageAliases used, I now have to grab
the source code from SVN, guess which version the website runs,
and parse the PHP from there. It would be more logical that the
list of used specialPageAliases was available from the API.

If this was offered by the API, it would follow the pattern
of namespaces and allmessages.
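To make the use case concrete, here is a minimal sketch of how such an alias list could be used to add up page views. It assumes the list arrives as entries with a `realname` and a list of `aliases` (the shape I believe `meta=siteinfo&siprop=specialpagealiases` returns in current MediaWiki, but treat that as an assumption), and that special page names can be matched case-insensitively. All function names are hypothetical.

```python
from collections import Counter


def normalize(name):
    """Fold a name for lookup: spaces become underscores, case is ignored
    (case-insensitive matching is an assumption of this sketch)."""
    return name.replace(" ", "_").lower()


def build_alias_index(specialpagealiases):
    """Map every alias (and the canonical name itself) to the canonical name."""
    index = {}
    for entry in specialpagealiases:
        canonical = entry["realname"]
        index[normalize(canonical)] = canonical
        for alias in entry.get("aliases", []):
            index[normalize(alias)] = canonical
    return index


def canonical_special_title(title, special_prefixes, index):
    """Rewrite e.g. 'Spezial:Verweisliste/Foo' to 'Special:Whatlinkshere/Foo'.

    Titles outside the special namespace, or with unknown names, are
    returned unchanged.
    """
    prefix, sep, rest = title.partition(":")
    known = {normalize(p) for p in special_prefixes}
    if not sep or normalize(prefix) not in known:
        return title
    name, slash, subpage = rest.partition("/")
    canonical = index.get(normalize(name))
    if canonical is None:
        return title
    return "Special:" + canonical + slash + subpage


def sum_views(rows, special_prefixes, index):
    """Add up (title, views) rows after canonicalizing special page titles."""
    totals = Counter()
    for title, views in rows:
        totals[canonical_special_title(title, special_prefixes, index)] += views
    return totals
```

The namespace prefixes (`Special`, `Spezial`) would come from the existing namespaces and namespacealiases queries; only the special page aliases themselves are missing today.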

(In reply to comment #2)

> To get a list of the specialPageAliases used, I now have to grab
> the source code from SVN, guess which version the website runs,
> and parse the PHP from there. It would be more logical that the
> list of used specialPageAliases was available from the API.
>
> If this was offered by the API, it would follow the pattern
> of namespaces and allmessages.

That is indeed a valid use case, one in which you'd want to get all the
information at once. I'll go and implement it.

lars wrote:

It would also be nice to have API access to the value of fallback8bitEncoding,
as part of meta=general I guess. It's kind of related to case="".

Bug as reported was fixed in r32538.

(In reply to comment #4)

> It would also be nice to have API access to the value of fallback8bitEncoding,
> as part of meta=general I guess. It's kind of related to case="".

Is that a configuration option? I couldn't find any reference to it on mediawiki.org

lars wrote:

fallback8bitEncoding is a per language variable defined in
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/languages/messages/MessagesEn.php?view=markup

$fallback8bitEncoding = 'windows-1252';

Most West European languages (de, fr, sv, no, da, ...) have English as fallback,
so they don't need to define this variable anew.

Polish, however, overrides it with a different value, in
http://svn.wikimedia.org/viewvc/mediawiki/trunk/phase3/languages/messages/MessagesPl.php?view=markup

$fallback8bitEncoding = 'iso-8859-2';

Non-ASCII characters are represented as UTF-8 and percent-encoded (as hex) in the URLs.
But if an incoming URL contains percent-encoded bytes that are not valid UTF-8,
this is the 8-bit encoding that is tried instead.

You can access [[de:Bär]] as
http://de.wikipedia.org/wiki/B%C3%A4r
where %C3%A4 is the correct UTF-8 code for a-umlaut (ä).

but you can also reach this page as
http://de.wikipedia.org/wiki/B%e4r
where %e4 is the ISO 8859-1 (and thus windows-1252) code for a-umlaut (ä).

Again, in my analysis of the log files, I'd like to sum up the number
of page views for these two URLs, since both go to the same page.
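For the log analysis, the fallback behaviour described above could be mirrored with something like the sketch below. It is not MediaWiki's actual code, just the same idea: percent-decode to bytes, try UTF-8, and fall back to the per-language 8-bit encoding if that fails. The function name `decode_title` and its default are assumptions for illustration; the fallback value would have to come from the language's message file (or from the API, if it were exposed there).

```python
from urllib.parse import unquote_to_bytes


def decode_title(raw_title, fallback_encoding="windows-1252"):
    """Decode a percent-encoded title from a URL.

    Valid UTF-8 wins; otherwise the bytes are interpreted in the
    language's fallback 8-bit encoding.
    """
    raw = unquote_to_bytes(raw_title)
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes undefined in the fallback encoding become U+FFFD.
        return raw.decode(fallback_encoding, errors="replace")


# Both spellings of the same page collapse to one key.
assert decode_title("B%C3%A4r") == decode_title("B%e4r") == "Bär"
```

For a Polish log file the call would pass `fallback_encoding="iso-8859-2"` instead of relying on the default.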