
Query page to list protected pages
Closed, ResolvedPublic

Description

Author: roybb95

Description:
It would be very useful if there were a special page with a list of all protected
pages.


Version: unspecified
Severity: enhancement

Details

Reference
bz2171

Event Timeline

bzimport raised the priority of this task from to Medium. Nov 21 2014, 8:29 PM
bzimport set Reference to bz2171.

avarab wrote:

Doing this with the current schema (1.4 and 1.5) is obscenely slow; the
(?:page|cur)_restrictions field wasn't designed to be mass-searched.

On the 2005-03-09 en.wikipedia dump:
mysql> SELECT cur_title, cur_restrictions FROM cur WHERE cur_restrictions LIKE '%edit=sysop%';
[...]
297 rows in set (5 min 44.24 sec)

Would need to add an index on that field, I suppose.

An alternative is splitting the protection to a linked table (perhaps useful for getting at the
multiple settings in a clearer fashion than the blob we have now).
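
The two options above can be compared on toy data. The following is a minimal sketch using SQLite rather than MySQL, and the linked table `protection` with its columns `restriction_type` and `restriction_level` is hypothetical, not an existing MediaWiki table. It illustrates that a plain index on the blob column is unlikely to help a LIKE pattern with a leading wildcard, while an equality lookup on an indexed linked table can use the index.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with toy rows (not taken from a real dump)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Blob-style column, indexed as suggested in the comment above.
        CREATE TABLE cur (
            cur_id INTEGER PRIMARY KEY,
            cur_title TEXT,
            cur_restrictions TEXT
        );
        CREATE INDEX cur_restrictions_idx ON cur (cur_restrictions);
        INSERT INTO cur VALUES
            (1, 'Main_Page', 'edit=sysop:move=sysop'),
            (2, 'Sandbox', ''),
            (3, 'Some_Article', 'move=sysop');

        -- Hypothetical linked table: one row per (page, type, level).
        CREATE TABLE protection (
            page_id INTEGER,
            restriction_type TEXT,
            restriction_level TEXT
        );
        CREATE INDEX protection_lookup
            ON protection (restriction_type, restriction_level);
        INSERT INTO protection VALUES
            (1, 'edit', 'sysop'),
            (1, 'move', 'sysop'),
            (3, 'move', 'sysop');
    """)
    return conn


def plan(conn, sql):
    """Return the human-readable detail column of EXPLAIN QUERY PLAN."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


# Substring search on the blob: every row has to be examined.
BLOB_QUERY = (
    "SELECT cur_title FROM cur WHERE cur_restrictions LIKE '%edit=sysop%'"
)

# Equality lookup on the linked table: eligible for an index search.
LINKED_QUERY = (
    "SELECT page_id FROM protection "
    "WHERE restriction_type = 'edit' AND restriction_level = 'sysop'"
)

if __name__ == "__main__":
    conn = build_demo()
    for label, sql in (("blob", BLOB_QUERY), ("linked", LINKED_QUERY)):
        print(label, conn.execute(sql).fetchall(), plan(conn, sql))
```

Query planners differ between SQLite and MySQL, so the printed plans are only indicative; the timings quoted above would have to be re-measured on a real dump.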

*** Bug 2608 has been marked as a duplicate of this bug. ***

stanley wrote:

Is there any progress?

A friend of mine made the special page and it looks good IMO (see
http://conversion.vikimedija.org/index.php/Special:Protectedpages). The only
problem, as Avar pointed out to me on IRC, is that the page doesn't support
caching, but I think that's not hard to add. Funny thing - I didn't know
this bug even existed, so I asked Avar to make it. Now there really is a feature
request. Great.

*** Bug 4577 has been marked as a duplicate of this bug. ***

p_simoons wrote:

As reported on ANI, three admins work daily to keep the manually-generated
[[Wikipedia:Protected page]] up to date with the current situation. Automation
would apparently be a useful timesaver.

*** Bug 6211 has been marked as a duplicate of this bug. ***

And what about my code from Bug 6211? It does not search in page_restrictions; it
only checks the length. Is it faster? Can anyone with dumps check?
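
The code from Bug 6211 is not reproduced in this thread, so the following is only a sketch of the two kinds of predicate being compared, run on made-up rows in SQLite. It shows a difference in results rather than speed: a length check returns every page with any restriction, including move-only protection, whereas the substring search returns only pages whose restrictions contain `edit=sysop`.

```python
import sqlite3


def compare_predicates():
    """Return (titles matched by LIKE, titles matched by the length check)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page (page_title TEXT, page_restrictions TEXT)"
    )
    # Toy rows; the restriction strings follow the 'edit=sysop' style
    # quoted earlier in the thread.
    conn.executemany(
        "INSERT INTO page VALUES (?, ?)",
        [
            ("Main_Page", "edit=sysop:move=sysop"),
            ("Move_Locked", "move=sysop"),
            ("Sandbox", ""),
        ],
    )
    like_rows = conn.execute(
        "SELECT page_title FROM page "
        "WHERE page_restrictions LIKE '%edit=sysop%' ORDER BY page_title"
    ).fetchall()
    length_rows = conn.execute(
        "SELECT page_title FROM page "
        "WHERE LENGTH(page_restrictions) > 0 ORDER BY page_title"
    ).fetchall()
    return [t for (t,) in like_rows], [t for (t,) in length_rows]


if __name__ == "__main__":
    like_titles, length_titles = compare_predicates()
    print("LIKE:  ", like_titles)
    print("LENGTH:", length_titles)
```

Both predicates still have to read the restrictions value of every row, so whether the length check is noticeably faster would need to be measured on a real dump.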

koneko wrote:

Now, the same could be asked for the semi-protected pages ;)

If slowness is a problem, IMO it's still useful even if the page is
only updated every few days or so.

rotemliss wrote:

Patch

This is a complete patch, ready for committing. I used the following query:

SELECT 'Protectedpages' AS type, page_namespace AS namespace, page_title AS title,
  page_restrictions AS restrictions
FROM $page
WHERE page_restrictions <> '' AND page_namespace <> $mediawiki

($page = 'page' normally, $mediawiki = NS_MEDIAWIKI)

and marked it as expensive, because page_restrictions is not (yet?) indexed.
Running the query on a recent dump of dewiki (I couldn't set up enwiki; it was
too slow to add all these rows) on my computer gave:

14919 rows in set (8.29 sec)

It seems to be generally OK (my computer is a bit slow; an earlier test took
about 7 seconds), but definitely expensive. For comparison, running the query
of Special:Shortpages on dewiki gave:

482777 rows in set (6.19 sec)

Two questions:

  1. Should I define the caching in some other place?
  2. Is it the correct query, or should it be changed somehow?

Thanks.

attachment patch ignored as obsolete

I have been running the bot that maintains en wikipedia's list of protected
pages for a while now. Such a MW feature might be useful, but it will need
protection type filtering for:
*Semi-protected (edit)
*Full-protected (edit)
*Full-protected (move only)

Also, things like redirects and deleted pages (this one might be tricky) would
need to be filtered out. Deleted pages used to *flood* WP:PP until I added a
regexp XML filter check for each page (kind of tedious though), which also checks
for redirects while it's at it.

Without these two sets of filters (protection type and redir/DP type) such a
page will be of very limited use on a large wikipedia site, and likely not worth
the expense of the query. However, this would be an effective replacement for
WP:PP on smaller wiki sites.
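
The protection type filter described above could look roughly like this. It is a sketch only: it assumes restriction strings of the form `edit=sysop:move=sysop` (the `edit=sysop` part appears earlier in this thread; the `:` separator and the `autoconfirmed` level for semi-protection are assumptions), and the function names are made up for illustration. Detecting deleted pages would need a look at the page text and is not covered here.

```python
def parse_restrictions(blob):
    """Split a string like 'edit=sysop:move=sysop' into a dict."""
    result = {}
    for part in blob.split(":"):
        if "=" in part:
            action, level = part.split("=", 1)
            result[action.strip()] = level.strip()
    return result


def classify(blob):
    """Map a restrictions string to one of the three list categories.

    Returns None when the page has no recognised protection.
    """
    levels = parse_restrictions(blob)
    edit = levels.get("edit", "")
    move = levels.get("move", "")
    if edit == "sysop":
        return "full-protected (edit)"
    if edit == "autoconfirmed":  # assumed name of the semi-protection level
        return "semi-protected (edit)"
    if move == "sysop":
        return "full-protected (move only)"
    return None


def filter_listing(pages, wanted):
    """Keep titles of non-redirect pages in the wanted category.

    `pages` is an iterable of (title, is_redirect, restrictions) tuples.
    """
    return [
        title
        for title, is_redirect, blob in pages
        if not is_redirect and classify(blob) == wanted
    ]


if __name__ == "__main__":
    sample = [
        ("Main_Page", False, "edit=sysop:move=sysop"),
        ("Old_Name", True, "edit=sysop:move=sysop"),
        ("Hot_Topic", False, "edit=autoconfirmed:move=sysop"),
        ("Move_Locked", False, "move=sysop"),
    ]
    print(filter_listing(sample, "full-protected (edit)"))
```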

rotemliss wrote:

*** Bug 7575 has been marked as a duplicate of this bug. ***

cmhagar wrote:

As mentioned here,
http://mail.wikipedia.org/pipermail/wikitech-l/2006-October/039204.html, this
page should perhaps be restricted to sysops, at least for semi-protected pages.

Perhaps a page_len column or a joint query would help to filter which
pages show on the list, to keep deleted pages from drowning everything else out.

rotemliss wrote:

Comment on attachment 2463
Patch

page.page_restrictions is no longer used (the table page_restrictions is used
instead). A new patch should be created.
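
A query against the new table might be shaped like the following sketch. It runs on toy data in SQLite, uses the column names pr_page, pr_type and pr_level from the page_restrictions table, and keeps the NS_MEDIAWIKI exclusion from the earlier patch while also skipping redirects. It is not the query of any committed patch, and the helper names are hypothetical.

```python
import sqlite3

NS_MEDIAWIKI = 8  # namespace number of the MediaWiki: namespace


def build_demo():
    """Create toy page and page_restrictions tables (a subset of columns)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT,
            page_is_redirect INTEGER
        );
        CREATE TABLE page_restrictions (
            pr_page INTEGER,
            pr_type TEXT,
            pr_level TEXT
        );
        INSERT INTO page VALUES
            (1, 0, 'Main_Page', 0),
            (2, 0, 'Old_Name', 1),
            (3, 8, 'Sidebar', 0),
            (4, 0, 'Hot_Topic', 0);
        INSERT INTO page_restrictions VALUES
            (1, 'edit', 'sysop'),
            (1, 'move', 'sysop'),
            (2, 'edit', 'sysop'),
            (3, 'edit', 'sysop'),
            (4, 'edit', 'autoconfirmed');
    """)
    return conn


def list_protected(conn, pr_type, pr_level):
    """List (namespace, title) for pages with the given restriction."""
    return conn.execute(
        """
        SELECT page_namespace, page_title
        FROM page
        JOIN page_restrictions ON pr_page = page_id
        WHERE pr_type = ?
          AND pr_level = ?
          AND page_is_redirect = 0
          AND page_namespace <> ?
        ORDER BY page_namespace, page_title
        """,
        (pr_type, pr_level, NS_MEDIAWIKI),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo()
    print(list_protected(conn, "edit", "sysop"))
    print(list_protected(conn, "edit", "autoconfirmed"))
```

Passing the type and level as parameters covers the protection type filtering requested earlier in the thread without a separate query per category.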

rotemliss wrote:

Werdna added a special page, Special:Protectedpages, in r19561.