DBQ-11 Pages for users that don't exist
Closed, Declined (Public)

Description

This issue was converted from https://jira.toolserver.org/browse/DBQ-11.
Summary: Pages for users that don't exist
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Autocracy <jeff@storyinmemo.com>


From: MZMcBride <mzmcbride@gmail.com>

Date: Wed, 06 Feb 2008 22:27:42

If possible, I would like to get a list of all pages in NS:2 and NS:3 on en.wiki that are not redirects and that do not correspond to a registered user.

For example, if a page existed for [[User talk:MZMcbride]] that someone accidentally (or intentionally) created, it would be listed in the results of the query because User:MZMcbride does not exist on en.wiki.

Also, if possible, it would list subpages if their roots do not exist. So, if a page like [[User:MZMcbride/sandbox]] existed, it would be listed in the results of the query.

Thanks!


Version: unspecified
Severity: major

Details

Reference
bz59264

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:19 AM
bzimport set Reference to bz59264.

From: Kylu <kylu@ts.wikimedia.org>

Date: Thu, 07 Feb 2008 06:19:37

Exceptions:

  • IP pages should be excluded (they have no registered user), but "IP" pages that use invalid octets should still be shown.
  • Pages of unregistered users that are redirects are useful, so a separate list of them should also be generated:

Instead of simply creating a redirect page, users should be encouraged to either: 1) register the account as a doppelganger, or 2) spell other users' names correctly :)
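The IP exception above could be sketched roughly as follows. This is only an illustration, not something that was run for this request: the function name `classify_title` is hypothetical, and it assumes that only dotted-quad IPv4-style titles matter (IPv6 is ignored).

```python
import re

# Assumption: a title "looks like" an IP if it is four dot-separated groups
# of one to three digits. IPv6-style titles are not considered here.
IP_SHAPED = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def classify_title(title: str) -> str:
    """Return 'ip', 'invalid_ip', or 'name' for a user-page root title."""
    if not IP_SHAPED.fullmatch(title):
        return "name"
    octets = [int(part) for part in title.split(".")]
    # A valid IPv4 octet is at most 255; anything larger is an "IP" page
    # with invalid octets and should still be listed.
    return "ip" if all(octet <= 255 for octet in octets) else "invalid_ip"
```

Titles classified as `ip` would be skipped, while `invalid_ip` and `name` titles would still be checked against the list of registered users.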

From: Bryan Tong Minh <bryan@tools.wikimedia.de>

Date: Thu, 07 Feb 2008 19:47:16

I executed the following query:

SELECT CONCAT('* [[{{subst:ns:', page_namespace, '}}:', page_title, ']]') FROM page LEFT JOIN user ON page_title = user_name WHERE page_namespace IN (2, 3) AND user_name IS NULL AND INSTR(page_title, '/') = 0;

However, it returned more results than I expected; the uncompressed file is almost 70 MB. Anyway, see the results here:

http://tools.wikimedia.de/~bryan/stats/dbquery/no_user_name.txt.bz2


From: MZMcBride <mzmcbride@gmail.com>

Date: Fri, 08 Feb 2008 00:30:04

It seems that the list generated includes both redirects and user pages of users that exist. For example, it includes pages like [[User:JJ the Crusader]] and http://en.wikipedia.org/w/index.php?title=User%3A%D0%A1%D0%B0%D1%81%D1%83%D1%81l%D0%B5&redirect=no .


From: Bryan Tong Minh <bryan@tools.wikimedia.de>

Date: Fri, 08 Feb 2008 13:06:04

Oh yes, forgot... page_title has spaces replaced by underscores, while user_name doesn't. Running a new query and filtering redirects.
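For illustration, here is a minimal sketch of what a corrected query might look like. It is not the query that was actually run: it uses an in-memory SQLite database as a stand-in for the MySQL replica, the sample rows are made up, and it additionally matches subpages against their root user, which goes beyond what the comment above describes. The columns `page_namespace`, `page_title`, `page_is_redirect`, and `user_name` are the standard MediaWiki ones.

```python
import sqlite3

# Assumption: tiny in-memory stand-ins for the MediaWiki `page` and `user`
# tables, filled with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_namespace INTEGER, page_title TEXT, page_is_redirect INTEGER);
CREATE TABLE user (user_name TEXT);
""")
conn.executemany("INSERT INTO user VALUES (?)",
                 [("MZMcBride",), ("JJ the Crusader",)])
conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
    (2, "JJ_the_Crusader", 0),    # registered user, must not be listed
    (3, "MZMcbride", 0),          # no such user (different capitalisation)
    (2, "MZMcbride/sandbox", 0),  # subpage whose root user does not exist
    (2, "MZMcBride/sandbox", 0),  # subpage of a registered user
    (2, "Old_name", 1),           # redirect, filtered out
    (0, "Main_Page", 0),          # outside NS:2 and NS:3
])

# Take the root of the title (text before the first '/'), turn underscores
# into spaces, and only then compare it with user_name.
QUERY = """
SELECT page_namespace, page_title
FROM page
LEFT JOIN user
  ON user_name = REPLACE(
       CASE WHEN INSTR(page_title, '/') > 0
            THEN SUBSTR(page_title, 1, INSTR(page_title, '/') - 1)
            ELSE page_title END,
       '_', ' ')
WHERE page_namespace IN (2, 3)
  AND page_is_redirect = 0
  AND user_name IS NULL
ORDER BY page_namespace, page_title
"""
orphans = conn.execute(QUERY).fetchall()
print(orphans)
```

Applying a function to page_title inside the join condition generally prevents an index on that column from being used, which may be one reason a query of this shape is heavy on a wiki the size of en.wiki.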


From: Bryan Tong Minh <bryan@tools.wikimedia.de>

Date: Sun, 10 Feb 2008 19:14:39

Query is too heavy. I will consider doing this query again once everything is stable again.


From: MZMcBride <mzmcbride@gmail.com>

Date: Wed, 13 Feb 2008 00:29:51

Hmm... seems this query is simply too heavy. Could I get a list of all pages (subpages included) in NS:2 and NS:3 on en.wiki instead? It can be in any format, though it may be preferable to split the files. Whichever way works.


From: Autocracy <jeff@storyinmemo.com>

Date: Mon, 10 Mar 2008 13:09:34

Spoke with the requester; says he got what he needed in the end. Moving to "Resolved."


From: Autocracy <jeff@storyinmemo.com>

Date: Mon, 10 Mar 2008 13:10:09

The user got the list he requested in the end. This is resolved.

This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: bugzilla@storyinmemo.com
CC list: b@mzmcbride.com, Bryan.TongMinh@Gmail.com, bugzilla@storyinmemo.com