
DBQ-185 wanted pages in fa.wikipedia
Closed, ResolvedPublic

Description

This issue was converted from https://jira.toolserver.org/browse/DBQ-185.
Summary: wanted pages in fa.wikipedia
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Hoo man <hoo@online.de>


From: reza <reza.energy@gmail.com>

Date: Mon, 28 May 2012 07:55:07

Hi,
I tried to run a query myself, but it was not possible.
Would you please run a query that shows
the number of red links?
Thank you for your time.


Version: unspecified
Severity: major

Details

Reference
bz59466

Event Timeline

bzimport raised the priority of this task to Needs Triage. (Nov 22 2014, 2:33 AM)
bzimport set Reference to bz59466.

From: Hoo man <hoo@online.de>

Date: Mon, 28 May 2012 20:30:11

I hope I understand this right: you want the number of nonexistent articles? The following query gets the number of articles (namespace 0) that are linked from the article namespace but don't exist.
SQL:

mysql> SELECT /* SLOW_OK */ COUNT(*)
    -> FROM (
    ->     SELECT DISTINCT pl_title
    ->     FROM pagelinks
    ->     INNER JOIN page ON pl_from = page_id
    ->     WHERE pl_title NOT IN (SELECT page_title FROM page WHERE page_namespace = 0)
    ->       AND pl_namespace = 0
    ->       AND page_namespace = 0
    -> ) AS innerQuery;
+----------+
| COUNT(*) |
+----------+
|   425033 |
+----------+
1 row in set (10 min 23.63 sec)
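
For anyone who wants to check the logic of the count query without access to the replica, here is a minimal sketch that runs the same SQL against a toy in-memory SQLite database. The table layout is reduced to the columns the query touches, the sample rows are invented, and the helper names (build_toy_wiki, count_red_link_targets) are hypothetical; SQLite is only a stand-in for the MySQL server used above.

```python
import sqlite3


def build_toy_wiki():
    """Create an in-memory database with simplified page/pagelinks tables.

    Assumption: only the columns used by the query are modelled.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (
            page_id        INTEGER PRIMARY KEY,
            page_namespace INTEGER NOT NULL,
            page_title     TEXT    NOT NULL
        );
        CREATE TABLE pagelinks (
            pl_from      INTEGER NOT NULL,
            pl_namespace INTEGER NOT NULL,
            pl_title     TEXT    NOT NULL
        );
    """)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
        (1, 0, "Tehran"),
        (2, 0, "Iran"),
        (3, 1, "Tehran"),      # a talk page (namespace 1)
    ])
    conn.executemany("INSERT INTO pagelinks VALUES (?, ?, ?)", [
        (1, 0, "Iran"),        # target exists, not a red link
        (1, 0, "Shiraz"),      # red link
        (2, 0, "Shiraz"),      # same red-link target, counted once
        (2, 0, "Tabriz"),      # red link
        (3, 0, "Isfahan"),     # red link, but from a talk page: ignored
    ])
    return conn


# Same query as in the comment above, without the SLOW_OK hint.
COUNT_RED_LINK_TARGETS = """
SELECT COUNT(*) FROM (
    SELECT DISTINCT pl_title
    FROM pagelinks
    INNER JOIN page ON pl_from = page_id
    WHERE pl_title NOT IN (SELECT page_title FROM page WHERE page_namespace = 0)
      AND pl_namespace = 0
      AND page_namespace = 0
) AS innerQuery
"""


def count_red_link_targets(conn):
    """Return the number of distinct missing article titles linked from articles."""
    return conn.execute(COUNT_RED_LINK_TARGETS).fetchone()[0]


if __name__ == "__main__":
    print(count_red_link_targets(build_toy_wiki()))
```

On this toy data the distinct missing targets are Shiraz and Tabriz; the link to Isfahan is skipped because its source page is not in namespace 0.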

From: reza <reza.energy@gmail.com>

Date: Mon, 28 May 2012 20:39:01

I wanted the Special:WantedPages query (red links with the number of times each is used in the wiki).
I tried to get them myself, but the SQL was killed by the system because it took too many resources.
I used this code:

SELECT /* SLOW_OK */ pl_title,count(pl_title) FROM pagelinks LEFT JOIN page ON pl_title = page_title WHERE page_title IS NULL AND pl_namespace=0 GROUP BY pl_title ORDER BY count(pl_title) DESC;


From: Hoo man <hoo@online.de>

Date: Mon, 28 May 2012 21:34:33

Oh, sure...
SQL:

SELECT pl_title AS name, COUNT(*) AS link_count
FROM pagelinks
INNER JOIN page ON pl_from = page_id
WHERE pl_title NOT IN (SELECT page_title FROM page WHERE page_namespace = 0)
  AND pl_namespace = 0
  AND page_namespace = 0
GROUP BY pl_title
ORDER BY COUNT(*) DESC;

Result:
http://toolserver.org/~hoo/dbq/dbq-185-txt
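
As a rough illustration of what the wanted-pages query returns, the sketch below runs it against a toy in-memory SQLite database, next to an anti-join variant that matches the link target on both namespace and title. The anti-join is an alternative formulation offered for comparison, not the query that was run for this ticket. The sample rows and helper names (build_toy_wiki, wanted_pages) are invented, and SQLite stands in for the MySQL replica.

```python
import sqlite3


def build_toy_wiki():
    """Create simplified page/pagelinks tables with invented sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (
            page_id        INTEGER PRIMARY KEY,
            page_namespace INTEGER NOT NULL,
            page_title     TEXT    NOT NULL
        );
        CREATE TABLE pagelinks (
            pl_from      INTEGER NOT NULL,
            pl_namespace INTEGER NOT NULL,
            pl_title     TEXT    NOT NULL
        );
    """)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?)", [
        (1, 0, "Tehran"),
        (2, 0, "Iran"),
        (3, 1, "Tehran"),   # talk page
        (4, 1, "Yazd"),     # talk page exists, article "Yazd" does not
        (5, 0, "Qom"),
    ])
    conn.executemany("INSERT INTO pagelinks VALUES (?, ?, ?)", [
        (1, 0, "Iran"),     # existing article
        (1, 0, "Shiraz"), (2, 0, "Shiraz"), (5, 0, "Shiraz"),
        (2, 0, "Tabriz"), (5, 0, "Tabriz"),
        (1, 0, "Yazd"),     # red: the title exists only in namespace 1
        (3, 0, "Isfahan"),  # linked from a talk page only: ignored
    ])
    return conn


# The query from the comment above.
WANTED_PAGES_NOT_IN = """
SELECT pl_title AS name, COUNT(*) AS link_count
FROM pagelinks
INNER JOIN page ON pl_from = page_id
WHERE pl_title NOT IN (SELECT page_title FROM page WHERE page_namespace = 0)
  AND pl_namespace = 0
  AND page_namespace = 0
GROUP BY pl_title
ORDER BY COUNT(*) DESC
"""

# Alternative (assumption, not from the ticket): LEFT JOIN anti-join that
# compares namespace as well as title when looking for the target page.
WANTED_PAGES_ANTI_JOIN = """
SELECT pl_title AS name, COUNT(*) AS link_count
FROM pagelinks
INNER JOIN page AS src ON pl_from = src.page_id
LEFT JOIN page AS target
       ON target.page_namespace = pl_namespace
      AND target.page_title = pl_title
WHERE target.page_id IS NULL
  AND pl_namespace = 0
  AND src.page_namespace = 0
GROUP BY pl_title
ORDER BY COUNT(*) DESC
"""


def wanted_pages(conn, sql):
    """Run one of the queries and return (title, link_count) tuples."""
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    conn = build_toy_wiki()
    for name, link_count in wanted_pages(conn, WANTED_PAGES_NOT_IN):
        print(name, link_count)
```

On the toy data both queries return Shiraz (3 links), Tabriz (2) and Yazd (1). Joining on title alone, as in the earlier attempt, would treat "Yazd" as existing because a page with that title exists in the talk namespace.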

This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: hoo@online.de
CC list: reza.energy@gmail.com, hoo@online.de