Page MenuHomePhabricator

DBQ-211 Chronology of edits on the Tagalog Wikipedia, et al.
Closed, ResolvedPublic

Description

This issue was converted from https://jira.toolserver.org/browse/DBQ-211.
Summary: Chronology of edits on the Tagalog Wikipedia, et al.
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Hoo man <hoo@online.de>


From: Josh Lim <jamesjoshualim@yahoo.com>

Date: Sun, 03 Nov 2013 09:31:39

I would like to request for the following information from the database as I am currently reconstructing the history of the Tagalog Wikipedia in preparation for its tenth anniversary on December 1, 2013:

  • Oldest surviving edit in the database
  • First article created which is not the Main Page
  • 10th, 25th, 50th, 75th, 100th, 200th, 250th, 500th, 750th, 1000th, 2000th, 2500th, 5000th, 7500th, 10000th and 15000th articles created
  • First image uploaded to the local database

Thanks and I hope I'll be able to get this information soon! It would mean a lot to us in the Tagalog Wikipedia community if we can effectively reconstruct our history. ![][1]

[1]: https://jira.toolserver.org/images/icons/emoticons/smile.gif

Version: unspecified
Severity: major

Details

Reference
bz59494

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 2:34 AM
bzimport set Reference to bz59494.

From: Hoo man <hoo@online.de>

Date: Thu, 07 Nov 2013 22:45:37

Here you go (SQL for each result included below):

Oldest surviving edit in the database:
https://tl.wikipedia.org/w/index.php?title=Unang_Pahina&oldid=1
https://tl.wikipedia.org/w/index.php?oldid=8120 (First non-mainpage edit)
https://tl.wikipedia.org/w/index.php?oldid=10 (First non-mainpage article edit)

SELECT * FROM revision ORDER BY rev_timestamp ASC LIMIT 15;

First article created which is not the Main Page:
https://tl.wikipedia.org/wiki/Wikipedia

SELECT page_title FROM page INNER JOIN (SELECT MIN(rev_timestamp) as time, rev_page as id FROM revision GROUP BY rev_page) AS page_creations ON page_creations.id = page_id WHERE page_namespace = 0 ORDER BY page_creations.time ASC LIMIT 5;

First image uploaded to the local database:
https://tl.wikipedia.org/wiki/Talaksan:Ph_seal_batanes.png (that's the oldest, even considering deleted images)

SELECT * FROM image ORDER BY img_timestamp ASC LIMIT 1;

nth Articles:
Kalakhang_Maynila
Isla
1946
Setyembre_7
Oktubre_2
Disyembre_6
Emilio_F._Aguinaldo
Pamantasang_De_La_Salle
Pope_John_Paul_II
Juan_Bautista_ng_La_Salle
Ṭabariyyah
Franz_Rosenzweig
Ralph_Brambles
Syudad_ng_Davao
Sanitasyon
Adversary_(komiks)

SELECT page_title FROM page INNER JOIN (SELECT MIN(rev_timestamp) as time, rev_page as id FROM revision GROUP BY rev_page) AS page_creations ON rev_page = page_id WHERE page_namespace = 0 ORDER BY page_creations.time ASC LIMIT {n-1},1;

From: Josh Lim <jamesjoshualim@yahoo.com>

Date: Fri, 08 Nov 2013 04:36:04

Thanks a lot! ![][1]

I'd just like to ask though: does the nth articles list include redirects? Some of these articles (like Syudad ng Davao and Isla) might be redirects (to Lungsod ng Davao/Dabaw and Pulo, respectively), so I was hoping that the list generated is accurate.

[1]: https://jira.toolserver.org/images/icons/emoticons/smile.gif

From: Hoo man <hoo@online.de>

Date: Fri, 08 Nov 2013 07:23:17

The list does indeed include redirects, do you want it without?


From: Josh Lim <jamesjoshualim@yahoo.com>

Date: Fri, 08 Nov 2013 07:52:52

Yes, please. ![][1]

Also, will it be possible to ask for year-end article counts per year as well? I'm just trying to cover my bases statistics-wise.

[1]: https://jira.toolserver.org/images/icons/emoticons/smile.gif

From: Hoo man <hoo@online.de>

Date: Fri, 08 Nov 2013 18:27:15

nth articles (excluding redirects):
Enero
Hapon
Emilio_Aguinaldo
Setyembre_15
Oktubre_10
Disyembre_15
Agham_pangkompyuter
Marso_1
DeBarge
Gitnang_Kabisayaan
Demosthenes
Barbara_Benitez
Max_Surban
Herminia_Roman
Memphis
Matamis_(paglilinaw)

SELECT /* SLOW_OK */ page_title FROM page INNER JOIN (SELECT /* SLOW_OK */ MIN(rev_timestamp) as time, rev_page as id FROM revision GROUP BY rev_page) AS page_creations ON page_creations.id = page_id WHERE page_namespace = 0 AND page_is_redirect = 0 ORDER BY page_creations.time ASC LIMIT {n-1},1

Number of articles by the end of each year (from 2004 to 2012):
418
2047
4641
14774
20026
24109
46920
54233
58288

SELECT COUNT(*) FROM page INNER JOIN (SELECT MIN(rev_timestamp) as time, rev_page as id FROM revision GROUP BY rev_page) AS page_creations ON page_creations.id = page_id WHERE page_namespace = 0 AND page_is_redirect = 0 AND LEFT( page_creations.time, 4) <= {year};

This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: hoo@online.de
CC list: jamesjoshualim@yahoo.com, hoo@online.de