Page MenuHomePhabricator

DBQ-170 problem with getting query
Closed, InvalidPublic

Description

This issue was converted from https://jira.toolserver.org/browse/DBQ-170.
Summary: problem with getting query
Issue type: Task - A task that needs to be done.
Priority: Major
Status: Done
Assignee: Hoo man <hoo@online.de>


From: reza <reza.energy@gmail.com>

Date: Sun, 04 Dec 2011 23:20:26

I used this script for fa.wiki

SELECT cur_title FROM cur WHERE cur_text LIKE "%http://en.wikipedia.org/%" AND cur_text NOT LIKE "%en:%" AND cur_namespace=0

but it doesn't show correct pages, some pages doesn't have wikipedia url or some pages are redirect!
would you please solve my problem?


Version: unspecified
Severity: major

Details

Reference
bz59439

Event Timeline

bzimport raised the priority of this task from to Needs Triage.Nov 22 2014, 2:31 AM
bzimport set Reference to bz59439.

From: Hoo man <hoo@online.de>

Date: Mon, 05 Dec 2011 15:50:41

The table 'cur' has only been used on WMF wikis till mid-2005, so I presume you wont find much useful data their.


From: reza <reza.energy@gmail.com>

Date: Mon, 05 Dec 2011 16:44:40

thank you for your answer
how can I have query that show me pages that they have wikipedia url but they dont have any interwiki?


From: Hoo man <hoo@online.de>

Date: Mon, 05 Dec 2011 16:48:43

You mean pages that have a wikipedia URL in their text? No way from the TS SQL servers, you can only work with the database dumps here.


From: Dispenser <dispenser@toolserver.org>

Date: Tue, 06 Dec 2011 06:49:43

You could try...

SELECT page_title, el_to
FROM page
JOIN externallinks ON  el_from=page_id
/* Plain text interwiki links */
LEFT JOIN iwlinks  ON iwl_from=page_id AND iwl_prefix="w" /* AND iwl_title LIKE "en:%" [[w:]] defaults to ENWP */
/* Language links (from sidebar) */
LEFT JOIN langlinks ON ll_from=page_id AND ll_lang="en"
WHERE page_namespace=0
AND el_to LIKE "http://en.wikipedia.org/wiki/_%"
AND iwl_from IS NULL
AND ll_from IS NULL 
LIMIT 20;

From: reza <reza.energy@gmail.com>

Date: Tue, 06 Dec 2011 11:33:24

Thank you

This bug was imported as RESOLVED. The original assignee has therefore not been
set, and the original reporters/responders have not been added as CC, to
prevent bugspam.

If you re-open this bug, please consider adding these people to the CC list:
Original assignee: hoo@online.de
CC list: reza.energy@gmail.com, dispenser@toolserver.org, hoo@online.de