Page MenuHomePhabricator

DBQ-64 list of pages by number of deleted edits in the English Wikipedia
Closed, ResolvedPublic

Description

This issue was converted from https://jira.toolserver.org/browse/DBQ-64.
Summary: list of pages by number of deleted edits in the English Wikipedia
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: (none)


From: Graham <grahamwp@jazi.net>

Date: Sat, 30 May 2009 04:49:19

I'd like a list, in descending order of edits, of pages with more than 100 deleted edits in the English Wikipedia, in a plain text file. I would like to use the results of this query to find out if there are pages where significant history has been deleted due to page moves, like what happened at the English Wikipedia article Turin:
http://en.wikipedia.org/w/index.php?title=Special:Log&page=Turin


Version: unspecified
Severity: major

Details

Reference
bz59319

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 2:22 AM
bzimport set Reference to bz59319.

From: MZMcBride <mzmcbride@gmail.com>

Date: Sat, 30 May 2009 14:43:27

After clarification on my talk page (cf. http://en.wikipedia.org/w/index.php?diff=293282255&oldid=293258934#Database_report_idea), I should be able to do this. Currently running the first query right now.


From: MZMcBride <mzmcbride@gmail.com>

Date: Sun, 31 May 2009 20:22:57

Three queries were run. Be warned, the first two text files are very large; you'll want to download them and open them in a proper text editor.

SELECT
  ar_title,
  COUNT(*)
FROM archive
RIGHT JOIN page
ON ar_namespace = page_namespace
AND ar_title = page_title
WHERE ar_namespace = 0
GROUP BY ar_namespace, ar_title
ORDER BY COUNT(*) DESC;

Results available here: http://toolserver.org/~mzmcbride/dbq/dbq-64-current-articles-only.txt

SELECT
  CONCAT('Talk:',ar_title) AS title,
  COUNT(*)
FROM archive
RIGHT JOIN page
ON ar_namespace = page_namespace
AND ar_title = page_title
WHERE ar_namespace = 1
GROUP BY ar_namespace, ar_title
ORDER BY COUNT(*) DESC;

Results available here: http://toolserver.org/~mzmcbride/dbq/dbq-64-current-talk-pages-only.txt

SELECT
  ns_name,
  ar_title,
  COUNT(*)
FROM archive
LEFT JOIN page
ON ar_namespace = page_namespace
AND ar_title = page_title
JOIN toolserver.namespace
ON ar_namespace = ns_id
AND dbname = "enwiki_p"
GROUP BY ar_namespace, ar_title
HAVING COUNT(*) > 999
ORDER BY COUNT(*) DESC;

Results available here: http://toolserver.org/~mzmcbride/dbq/dbq-64-all-pages-1000-or-more.txt

This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: (none)
CC list: b@mzmcbride.com, grahamwp@gmail.com