
Access QueryPage-based special pages via API
Closed, Resolved (Public)

Description

It would be very useful to be able to view the output of the various special pages (all the page, category, template, image, etc. ones) via API queries.

Then we wouldn't have to scrape the HTML of the special pages to get the required output.


Version: 1.13.x
Severity: enhancement

Details

Reference
bz14869

Event Timeline

bzimport raised the priority of this task to Medium. (Nov 21 2014, 10:11 PM)
bzimport set Reference to bz14869.

bdanee88 wrote:

So, these special pages would be the most important ones to access via API queries (maybe a unified implementation for all of them is better; I don't know which is easier):
*DoubleRedirects
*BrokenRedirects
*LonelyPages
*UncategorizedPages
*UncategorizedCategories
*UncategorizedImages
*UncategorizedTemplates
*UnusedTemplates
*UnusedImages
*UnusedCategories
*WantedCategories
*MissingFiles
*ShortPages
*NewPages
*AncientPages
*DeadendPages
*ListAdmins
*ListBots
*WithoutInterwiki

Bryan.TongMinh wrote:

We could do something with QueryPage.

PHP doesn't support dynamic subclassing, right?

I'll take it. I will probably make some modifications to QueryPage to integrate it with the API.

Newpages is already available through list=recentchanges, ListAdmins and ListBots through list=allusers and WithoutInterwiki through list=allpages. For the other special pages, we should either have separate list= modules or add a parameter to list=allpages. I don't think using QueryPage here is a very good idea. Also note that some of these queries are slow and must be turned off on Wikipedia-sized wikis (I suspect LonelyPages, Un* and DeadendPages fall in this category).

Bryan.TongMinh wrote:

If possible the cached results should be used for the API. The new module should probably be something like: action=query&list=querypage&querypage=LonelyPages
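As a rough illustration of the request shape proposed above, here is a minimal Python sketch that builds such a URL. The endpoint `https://wiki.example.org/w/api.php` and the function name are placeholders, and `list=querypage` with a `querypage` parameter is only the form suggested in this comment, not a confirmed API.

```python
from urllib.parse import urlencode


def build_querypage_url(api_url: str, page: str) -> str:
    """Build a request URL in the form proposed in the comment above.

    Both `list=querypage` and the `querypage` parameter are the proposal's
    names, not necessarily what a final implementation would use.
    """
    params = {
        "action": "query",
        "list": "querypage",
        "querypage": page,
    }
    # urlencode keeps the dict's insertion order, so the result is stable.
    return f"{api_url}?{urlencode(params)}"


# Hypothetical endpoint, used only for illustration.
url = build_querypage_url("https://wiki.example.org/w/api.php", "LonelyPages")
print(url)
```

A client would send a GET request to that URL instead of scraping Special:LonelyPages.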

Fair enough. As a bit of an exercise to work out what might be useful and what already exists, I've started a list matching API modules to special pages, removing any that definitely wouldn't be wanted.

Maintenance reports

  • Broken redirects -
  • Cross-namespace links -
  • Dead-end pages -
  • Double redirects -
  • Long pages -
  • Oldest articles -
  • Orphaned pages -
  • Pages with the fewest revisions -
  • Pages without language links - list=allpages&apfilterlanglinks=withoutlanglinks
  • Protected pages -
  • Protected titles -
  • Short pages -
  • Uncategorized categories -
  • Uncategorized files -
  • Uncategorized pages -
  • Uncategorized templates -
  • Unused categories -
  • Unused files -
  • Unused templates -
  • Unwatched pages -
  • Wanted categories -
  • Wanted pages -

List of pages

  • All pages - list=allpages&apfrom
  • All pages with prefix - list=allpages&apprefix
  • Categories - list=allpages&apnamespace=14
  • Disambiguation pages -
  • Redirects - list=allpages&apfilterredir=redirects

Users and rights

  • Blocked IP addresses and usernames -
  • Deleted contributions -
  • User contributions - list=usercontribs
  • Users - list=allusers

Recent changes and logs

  • Gallery of new files - list=logevents&letype=upload
  • Logs - list=logevents
  • My watchlist - list=watchlist
  • New pages - list=recentchanges
  • Recent changes - list=recentchanges
  • Related changes -

Media reports and uploads

  • Files -
  • MIME search -
  • Search for duplicate files -

Wiki data and tools

  • Expand templates - action=expandtemplates

Redirecting special pages

  • Random article - list=random
  • Random redirect -
  • Search - list=search

High use pages

  • Most linked-to categories -
  • Most linked-to images -
  • Most linked-to pages -
  • Most linked-to templates -
  • Pages with the most categories -
  • Pages with the most revisions -

Page tools

  • What links here - action=query&list=backlinks

Other special pages

  • External links -

(In reply to comment #6)

Fair enough. As a bit of an exercise to work out what might be useful and what already exists, I've started a list matching API modules to special pages, removing any that definitely wouldn't be wanted.

  • Cross-namespace links -

That's an extension, so we're not gonna support that in core. We probably do want to add a hook somewhere to let extensions add them.

  • Unwatched pages -

list=allpages&apfilterwatchlist=unwatched was recently introduced, but reverted. It's probably better to keep it out of allpages and use this mechanism instead.

  • Disambiguation pages -

list=embeddedin&eititle=Template:Disambiguation (or whatever the disambig template is, it's set in a MediaWiki: message whose name I don't remember)

  • Blocked IP addresses and usernames -

list=blocks

  • Related changes -

We should really do something like list=recentchanges&rcrelated=Foo here.

Media reports and uploads

  • Files -
  • MIME search -
  • Search for duplicate files -

These are probably best off integrated into list=allimages

  • Random redirect -

Should be built into list=random

  • External links -

list=exturlusage

Some hints for the new list=querypage module:

  • you obviously need qulimit
  • you wanna have a generic qucontinue parameter that's fed to the QueryPage and let the QueryPage build a query based on that. qucontinue could be a timestamp, a page ID or something else
  • while you need the QueryPage to build the query, you're probably best off executing the query yourself. This makes it much easier to have continuations and generator=querypage as well
  • you wanna have an array of core QueryPages somewhere (probably in QueryPage.php itself) and (like I said earlier) have a hook that allows extensions (such as CrossNamespaceLinks) to add their own. That array should also be set as the list of allowed values for qupage (or whatever it's called) in getAllowedParams()
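The limit and continuation hints above could be sketched from the client side roughly as follows. This is only an illustration in Python: `qupage`, `qulimit` and `qucontinue` are the names floated in this comment, while the response shape (`rows`, `continue`) and the in-memory backend are assumptions made so the example runs without a wiki.

```python
from typing import Callable, Dict, Iterator, List


def fake_querypage_backend(rows: List[str]) -> Callable[[Dict[str, str]], dict]:
    """Return a fetch function that pages through `rows` in memory.

    Stand-in for the real module; the continuation token here is just an
    offset, but it could equally be a timestamp or a page ID.
    """
    def fetch(params: Dict[str, str]) -> dict:
        start = int(params.get("qucontinue", "0"))
        limit = int(params["qulimit"])
        next_start = start + limit
        return {
            "rows": rows[start:next_start],
            # None signals that there is nothing left to fetch.
            "continue": str(next_start) if next_start < len(rows) else None,
        }
    return fetch


def iterate_querypage(fetch: Callable[[Dict[str, str]], dict],
                      page: str, limit: int = 2) -> Iterator[str]:
    """Yield every row of a query page, following qucontinue until exhausted."""
    params = {"action": "query", "list": "querypage",
              "qupage": page, "qulimit": str(limit)}
    while True:
        response = fetch(params)
        yield from response["rows"]
        token = response["continue"]
        if token is None:
            return
        # Treat the token as opaque and feed it back unchanged.
        params = {**params, "qucontinue": token}


fetch = fake_querypage_backend(["A", "B", "C", "D", "E"])
titles = list(iterate_querypage(fetch, "BrokenRedirects", limit=2))
```

Keeping the token opaque on the client side is what lets each QueryPage choose its own continuation key.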

Thanks for filling in a few more

list=random doesn't have an option for redirects. (Presumably you were pointing out where this should go?)

Bryan.TongMinh wrote:

(In reply to comment #7)

  • while you need the QueryPage to build the query, you're probably best off executing the query yourself. This makes it much easier to have continuations and generator=querypage as well

True, but that won't work for cached pages which we probably will want to support as well.

  • you wanna have an array of core QueryPages somewhere (probably in QueryPage.php itself) and (like I said earlier) have a hook that allows extensions (such as CrossNamespaceLinks) to add their own. That array should also be set as the list of allowed values for qupage (or whatever it's called) in getAllowedParams()

$wgQueryPages :)
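The registry-plus-hook idea discussed here might look like the sketch below. It is written in Python purely for illustration; the function names are hypothetical and do not reflect how $wgQueryPages or MediaWiki hooks are actually implemented. The point is that a single list, extended by extensions, also serves as the set of allowed parameter values.

```python
from typing import Callable, List

# Core query pages; this short list is illustrative, not the full set.
CORE_QUERY_PAGES: List[str] = ["BrokenRedirects", "DoubleRedirects", "LonelyPages"]

# Callbacks registered by extensions; each receives the list and may append to it.
_hooks: List[Callable[[List[str]], None]] = []


def register_query_pages_hook(callback: Callable[[List[str]], None]) -> None:
    """Let an extension contribute its own query pages (hypothetical hook)."""
    _hooks.append(callback)


def allowed_query_pages() -> List[str]:
    """Return core pages plus extension pages, without duplicates, in order."""
    pages = list(CORE_QUERY_PAGES)
    for hook in _hooks:
        hook(pages)
    return list(dict.fromkeys(pages))


def validate_qupage(value: str) -> str:
    """Reject values that are not in the registry, as getAllowedParams() would."""
    if value not in allowed_query_pages():
        raise ValueError(f"Unrecognized query page: {value}")
    return value


# An extension such as CrossNamespaceLinks would register itself like this.
register_query_pages_hook(lambda pages: pages.append("CrossNamespaceLinks"))
```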

(In reply to comment #8)

list=random doesn't have an option for redirects. (Presumably you were pointing out where this should go?)

Yeah, that's what I meant.

(In reply to comment #9)

(In reply to comment #7)

  • while you need the QueryPage to build the query, you're probably best off executing the query yourself. This makes it much easier to have continuations and generator=querypage as well

True, but that won't work for cached pages which we probably will want to
support as well.

Hmm, yeah, cached pages... IIRC, ApiPageSet doesn't really care whether it gets pageids, revids, titles or database rows containing ns/title pairs, so supporting generators shouldn't be hard.

*** Bug 3676 has been marked as a duplicate of this bug. ***
*** Bug 3450 has been marked as a duplicate of this bug. ***

soxred93 wrote:

Random redirects were added in r40686

Updated list (includes some changes not previously listed)

Maintenance reports

  • Broken redirects -
  • Cross-namespace links - Extension
  • Dead-end pages -
  • Double redirects -
  • Long pages -
  • Oldest articles -
  • Orphaned pages -
  • Pages with the fewest revisions -
  • Pages without language links - list=allpages&apfilterlanglinks=withoutlanglinks
  • Protected pages -
  • Protected titles -
  • Short pages -
  • Uncategorized categories -
  • Uncategorized files -
  • Uncategorized pages -
  • Uncategorized templates -
  • Unused categories -
  • Unused files -
  • Unused templates -
  • Unwatched pages -
  • Wanted categories -
  • Wanted pages -

List of pages

  • All pages - list=allpages&apfrom
  • All pages with prefix - list=allpages&apprefix
  • Categories - list=allpages&apnamespace=14
  • Category tree - n/a
  • Disambiguation pages - list=embeddedin&eititle=Template:Disambiguation
  • Redirects - list=allpages&apfilterredir=redirects

Login / sign up

  • Log in / create account - n/a

Users and rights

  • Block user - n/a
  • Blocked IP addresses and usernames - list=blocks
  • Deleted contributions -
  • Global user list - n/a
  • My preferences - n/a
  • User contributions - list=usercontribs
  • User group rights - n/a
  • User rights management - n/a
  • Users - list=allusers

Recent changes and logs

  • Gallery of new files - list=logevents&letype=upload
  • Logs - list=logevents
  • My watchlist - list=watchlist
  • New pages - list=recentchanges
  • Recent changes - list=recentchanges
  • Related changes -

Media reports and uploads

  • File path - n/a
  • Files -
  • MIME search -
  • Search for duplicate files -
  • Upload file - n/a

Wiki data and tools

  • Expand templates - action=expandtemplates
  • Gadgets - n/a
  • Parser diff test - n/a
  • Statistics - n/a
  • System messages - n/a
  • Version - n/a

Redirecting special pages

  • External links - list=exturlusage
  • Random article - list=random
  • Random redirect - list=random&rnredirect
  • Search - list=search

High use pages

  • Most linked-to categories -
  • Most linked-to images -
  • Most linked-to pages -
  • Most linked-to templates -
  • Pages with the most categories -
  • Pages with the most revisions -

Page tools

  • Cite - n/a
  • Export pages - n/a
  • Import pages - n/a
  • View deleted pages - n/a
  • What links here - action=query&list=backlinks

Other special pages

  • Book sources - n/a
  • List of globally blocked IP addresses -
  • Local status of global blocks -
  • Login unification status - n/a
  • Wikimedia Board of Trustees election - n/a
  • Wikimedia wikis - n/a

soxred93 wrote:

So from what I see, there is:

Maintenance reports

  • Broken redirects -
  • Cross-namespace links - Extension
  • Dead-end pages -
  • Double redirects -
  • Long pages -
  • Oldest articles -
  • Orphaned pages -
  • Pages with the fewest revisions -
  • Protected pages -
  • Protected titles -
  • Short pages -
  • Uncategorized categories -
  • Uncategorized files -
  • Uncategorized pages -
  • Uncategorized templates -
  • Unused categories -
  • Unused files -
  • Unused templates -
  • Unwatched pages -
  • Wanted categories -
  • Wanted pages -
  • Wanted files -

Users and rights

  • Deleted contributions -

Recent changes and logs

  • Related changes -

Media reports and uploads

  • Files -
  • MIME search -
  • Search for duplicate files -

High use pages

  • Most linked-to categories -
  • Most linked-to images -
  • Most linked-to pages -
  • Most linked-to templates -
  • Pages with the most categories -
  • Pages with the most revisions -

Other special pages

  • List of globally blocked IP addresses -
  • Local status of global blocks -

soxred93 wrote:

Subtract "Files" from that list; that can be done with list=allpages&apnamespace=6.

  • Protected pages - list=allpages&apprtype=edit|move&apprlevel=sysop|autoconfirmed

is another one

Missing (/not currently matched up):

Maintenance reports

  • Broken redirects -
  • Cross-namespace links (Extension) -
  • Dead-end pages -
  • Double redirects -
  • Long pages -
  • Oldest articles -
  • Orphaned pages -
  • Pages with the fewest revisions -
  • Protected titles -
  • Short pages -
  • Uncategorized categories -
  • Uncategorized files -
  • Uncategorized pages -
  • Uncategorized templates -
  • Unused categories -
  • Unused files -
  • Unused templates -
  • Unwatched pages -
  • Wanted categories -
  • Wanted pages -
  • Wanted files -

Users and rights

  • Deleted contributions -

Recent changes and logs

  • Related changes -

Media reports and uploads

  • MIME search -
  • Search for duplicate files -

High use pages

  • Most linked-to categories -
  • Most linked-to images -
  • Most linked-to pages -
  • Most linked-to templates -
  • Pages with the most categories -
  • Pages with the most revisions -

Other special pages

  • List of globally blocked IP addresses -
  • Local status of global blocks -

Complete List:

Maintenance reports

  • Broken redirects -
  • Cross-namespace links (Extension) -
  • Dead-end pages -
  • Double redirects -
  • Long pages -
  • Oldest articles -
  • Orphaned pages -
  • Pages with the fewest revisions -
  • Pages without language links - list=allpages&apfilterlanglinks=withoutlanglinks
  • Protected pages - list=allpages&apprtype=edit|move&apprlevel=sysop|autoconfirmed
  • Protected titles -
  • Short pages -
  • Uncategorized categories -
  • Uncategorized files -
  • Uncategorized pages -
  • Uncategorized templates -
  • Unused categories -
  • Unused files -
  • Unused templates -
  • Unwatched pages -
  • Wanted categories -
  • Wanted pages -

List of pages

  • All pages - list=allpages&apfrom
  • All pages with prefix - list=allpages&apprefix
  • Categories - list=allpages&apnamespace=14
  • Category tree - n/a
  • Disambiguation pages - list=embeddedin&eititle=Template:Disambiguation
  • Redirects - list=allpages&apfilterredir=redirects

Login / sign up

  • Log in / create account - n/a

Users and rights

  • Block user - n/a
  • Blocked IP addresses and usernames - list=blocks
  • Deleted contributions -
  • Global user list - n/a
  • My preferences - n/a
  • User contributions - list=usercontribs
  • User group rights - n/a
  • User rights management - n/a
  • Users - list=allusers

Recent changes and logs

  • Gallery of new files - list=logevents&letype=upload
  • Logs - list=logevents
  • My watchlist - list=watchlist
  • New pages - list=recentchanges
  • Recent changes - list=recentchanges
  • Related changes -

Media reports and uploads

  • File path - n/a
  • Files - list=allpages&apnamespace=6
  • MIME search -
  • Search for duplicate files -
  • Upload file - n/a

Wiki data and tools

  • Expand templates - action=expandtemplates
  • Gadgets - n/a
  • Parser diff test - n/a
  • Statistics - n/a
  • System messages - n/a
  • Version - n/a

Redirecting special pages

  • External links - list=exturlusage
  • Random article - list=random
  • Random redirect - list=random&rnredirect
  • Search - list=search

High use pages

  • Most linked-to categories -
  • Most linked-to images -
  • Most linked-to pages -
  • Most linked-to templates -
  • Pages with the most categories -
  • Pages with the most revisions -

Page tools

  • Cite - n/a
  • Export pages - n/a
  • Import pages - n/a
  • View deleted pages - n/a
  • What links here - action=query&list=backlinks

Other special pages

  • Book sources - n/a
  • List of globally blocked IP addresses -
  • Local status of global blocks -
  • Login unification status - n/a
  • Wikimedia Board of Trustees election - n/a
  • Wikimedia wikis - n/a

russblau wrote:

(In reply to comment #4)

Newpages is already available through list=recentchanges, ListAdmins and
ListBots through list=allusers and WithoutInterwiki through list=allpages. For
the other special pages, we should either have separate list= modules or add a
parameter to list=allpages. I don't think using QueryPage here is a very good
idea. Also note that some of these queries are slow and must be turned off on
Wikipedia-sized wikis (I suspect LonelyPages, Un* and DeadendPages fall in this
category).

Actually, the information yielded by list=recentchanges (with appropriate settings) is not the same as the contents of Special:NewPages. The differences mainly have to do with titles that were originally created as redirects and later edited to become pages, versus ones that were created as pages and later edited to be redirects.

(In reply to comment #20)

(In reply to comment #4)

Newpages is already available through list=recentchanges, ListAdmins and
ListBots through list=allusers and WithoutInterwiki through list=allpages. For
the other special pages, we should either have separate list= modules or add a
parameter to list=allpages. I don't think using QueryPage here is a very good
idea. Also note that some of these queries are slow and must be turned off on
Wikipedia-sized wikis (I suspect LonelyPages, Un* and DeadendPages fall in this
category).

Actually, the information yielded by list=recentchanges (with appropriate
settings) is not the same as the contents of Special:NewPages. The differences
mainly have to do with titles that were originally created as redirects and
later edited to become pages, versus ones that were created as pages and later
edited to be redirects.

NewPages was recently changed to toggle showing recently created redirects. list=recentchanges also has an option to filter out redirects (rcshow=!redirect)
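For reference, a query along the lines described here could be assembled as in this Python sketch. The function name is made up for illustration; `rctype=new` restricts recent changes to page creations and `rcshow=!redirect` is the filter mentioned above. As noted in the earlier comment, the result may still not match Special:NewPages exactly.

```python
from urllib.parse import urlencode
from typing import Dict


def new_pages_params(include_redirects: bool = False) -> Dict[str, str]:
    """Parameters approximating Special:NewPages via list=recentchanges."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rctype": "new",  # only page creations
    }
    if not include_redirects:
        # Filter out redirects, as mentioned in the comment above.
        params["rcshow"] = "!redirect"
    return params


query_string = urlencode(new_pages_params())
print(query_string)
```

Note that urlencode percent-encodes the `!`, so the filter appears as `rcshow=%21redirect` in the final query string.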

Any progress on this bug?

Thanks

sam

(In reply to comment #22)

Any progress on this bug?

Thanks

sam

VasilievVV said he'd work on this, but I haven't spoken to him for a while.

(In reply to comment #23)

(In reply to comment #22)

Any progress on this bug?

Thanks

sam

VasilievVV said he'd work on this, but I haven't spoken to him for a while.

VasilievVV requested I take the bug, because he's too busy. Reassigning to me.

  • List of globally blocked IP addresses - list=globalblocks

Another matched up

Bryan.TongMinh wrote:

Added generic query module in r53149. Does broken redirects only for now, need to investigate which modules can also be used.

Todo:

  • Generator mode
  • Non-page generating query pages

(In reply to comment #26)

Added generic query module in r53149. Does broken redirects only for now, need
to investigate which modules can also be used.

Reverted in r53167, please do this kind of development in the querypage-work branch, where a lot has been done already.

*** Bug 22180 has been marked as a duplicate of this bug. ***

Bug 14020 and bug 15552 closed as resolved/later, as they're effectively "dupes" of this.

Adding a couple of dependencies. One is already "fixed" in the querypage-work2? branches. The other might as well be done at the same time.

Removing ASSIGNED status and assigning to Sam; he's the only one doing any work on it at all currently. Sam and I intend to put our heads together at some point and finish this one once and for all.

  • ProofreadPage/SpecialProofreadPages.php, ProofreadPage/ProofreadPage.php - r78719
  • ProofreadPage/SpecialPagesWithoutScans.php -
  • SemanticDrilldown/specials/SD_Filters.php - r78717
  • SemanticDrilldown/specials/SD_BrowseData.php - Left, back compat from getSQL
  • QPoll/qp_user.php, QPoll/qp_results.php - Uses getIntervalResults
  • RdfRedland - Deleted from list
  • SemanticForms/specials/SF_Templates.php, SemanticForms/specials/SF_Forms.php - r78718
  • ApprovedRevs/SpecialApprovedRevs.php - r78721
  • PasswordReset/PasswordReset_Disabledusers.php - r78713
  • AjaxQueryPages - False positive
  • StalePages/StalePages_body.php - r78712
  • SemanticMediaWiki/specials/QueryPages/SMW_QueryPage.php - SMWQueryPage reimplements doQuery
  • SemanticMediaWiki/specials/QueryPages/SMW_SpecialTypes.php - Left, back compat from getSQL
  • SemanticMediaWiki/specials/QueryPages/SMW_SpecialUnusedProperties.php - Uses SMWQueryPage
  • SemanticMediaWiki/specials/QueryPages/SMW_SpecialWantedProperties.php - Uses SMWQueryPage
  • SemanticMediaWiki/specials/QueryPages/SMW_SpecialProperties.php - Uses SMWQueryPage
  • SemanticMediaWiki/includes/SMW_Setup.php -
  • SemanticMediaWiki/includes/SMW_SetupLight.php -
  • MetavidWiki/includes/specials/MV_SpecialListStreams.php - r78726
  • CrossNamespaceLinks/CrossNamespaceLinks.php, CrossNamespaceLinks/CrossNamespaceLinks_body.php - r78715
  • Interact/Interact.php - Deleted
  • RefreshSpecial/RefreshSpecial.body.php - Doesn't implement a querypage
  • Configure/settings/Settings.i18n.php, Configure/settings/Settings-core.php - False positive
  • EditConflict/CurrentEdits.php - Uses getIntervalResults

Branch reintegrated into trunk in r78786/r78788