-hintfile: option
Closed, DeclinedPublic

Description

Originally from: http://sourceforge.net/p/pywikipediabot/bugs/836/
Reported by: Anonymous user
Created on: 2009-01-23 23:29:52
Subject: -hintfile: option
Assigned to: purodha
Original description:
The newly introduced option -hintfile: is either not well documented or not working as expected.

It asks for a page to be checked (see below), while according to [2284955] ("interwiki hints from file") it is supposed to read both a local page and a hint page from the file. Please fix it. Thanks!

python interwiki.py -hintfile:
Please enter the hint filename: hints.txt
Which page to check:

Pywikipedia [http] trunk/pywikipedia (r6291, Jan 23 2009, 16:08:14)
Python 2.5.1 (r251:54863, Apr 18 2007, 08:51:08) [MSC v.1310 32 bit (Intel)]


Version: unspecified
Severity: normal
See Also:
https://sourceforge.net/p/pywikipediabot/bugs/836

Details

Reference
bz55313

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 2:33 AM
bzimport set Reference to bz55313.
bzimport added a subscriber: Unknown Object (????).
  • assigned_to: nobody --> purodha

What you want to have, in the above example, can be had with:

python interwiki.py -v -hintfile: -file:
Pywikipediabot (r6439 (wikipedia.py), Feb 24 2009, 21:48:26)
Python 2.5.2 (r252:60911, Jan 4 2009, 21:59:32)
[GCC 4.3.2]
Please enter the hint filename: hints.txt
Please enter the local file name: local-page-title.txt

There is no documentation saying that -hintfile: overrides or alters
the processing of any other parameter (and in fact, it does not).

Be aware that it is hardly useful to pass a file with several page titles
via -file: when -hintfile: is being used, since the hints would apply to each of
those pages, provoking interwiki conflicts.
Thus -hintfile: is likely used more often with a single page title on the command
line. That does not preclude, however, a single page title being read from a file
using -file:.

If, and only if, the file given via -hintfile: has only unspecific hints, such as [[10:]]
or [[en:]] or [[latin:]] (or none of the specifically hinted pages exist), then supplying a
list of pages via -file: would likely be free of conflicts.
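To illustrate the distinction, here is a minimal sketch that tells unspecific hints from specific ones. It assumes each hint is written as a wiki link of the form [[code:]] or [[code:Title]]; the name is_unspecific is hypothetical and not part of interwiki.py, and the sketch does not check whether a hinted page exists.

```python
import re

# Assumed hint syntax: "[[code:]]" or "[[code:Title]]", with an optional
# leading colon as in "[[:en:Title]]". This is an illustration only.
HINT_RE = re.compile(r'\[\[:?([^:\]]+):([^\]]*)\]\]')


def is_unspecific(hint):
    """Return True when the hint names a language or group but no page title."""
    match = HINT_RE.fullmatch(hint.strip())
    if match is None:
        raise ValueError("not a hint link: %r" % hint)
    return match.group(2).strip() == ""


hints = ["[[10:]]", "[[en:]]", "[[latin:]]", "[[en:Some page]]"]
for hint in hints:
    print(hint, "unspecific" if is_unspecific(hint) else "specific")
```

A hint file whose entries are all unspecific by this test would be the case described above in which a list of pages via -file: is less likely to cause conflicts.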

There is a difference between hints and the page being processed. In well-prepared
cases it often does not matter for the outcome where the bot starts processing
and which pages are then added because they were hinted, but it can sometimes
make a huge difference to the paths the bot follows while collecting links.
We can have hintless processing, but we cannot have a bot run on hints alone,
without a starting page.

Maybe we should add some of this to the documentation? Is that what you
are asking for?

No, it's not exactly what I asked for. In the original feature request #2284955 [http://sourceforge.net/tracker/index.php?func=detail&aid=2284955&group_id=93107&atid=603141], as far as I can see, the idea was to read both starting pages and hints from the same file, line by line, and to build an array of pages to be processed together with their relevant hints.

# [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]

Working on a single page with the -hintfile: option doesn't seem to be that useful.

I guess we need to combine "TextfilePageGenerator" from pagegenerators.py and "hintfile" from interwiki.py, so that both the page title and the hint are read, line by line, from the same hintfilename: the page title from the first pair of brackets [[ ]], and the hint from the second pair of brackets on the same line of the hint file. Is it possible to implement this, please?

This simple code should work for this purpose:

f = codecs.open(hintfilename, 'r', config.textfile_encoding)
R = re.compile(ur'\[\[:?(.*?)\]\]\s+\[\[:?(.*)\]\]')
for line in R.findall(f.read()):
    pageTitle = line[0]
    hintTitle = line[1]

Just make a proper call to

yield wikipedia.Page(site, pageTitle)

and

hints.append(hintTitle)

Anyone out there to take care of this?
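For anyone picking this up, the parsing step proposed above can be sketched in a self-contained way, without the bot framework. This is only an illustration under assumptions: the function name parse_hint_pairs and the sample text are made up for the example, the second group is matched non-greedily (unlike the snippet above), and wiring the result into a page generator and the hints list is left out.

```python
import re

# One line of the proposed format: "[[:xx:Page]] [[:en:Hint]]".
# The optional leading colon inside each link is dropped from the result.
PAIR_RE = re.compile(r'\[\[:?(.*?)\]\]\s+\[\[:?(.*?)\]\]')


def parse_hint_pairs(text):
    """Return a list of (page_title, hint_title) tuples, one per matching line."""
    pairs = []
    for line in text.splitlines():
        match = PAIR_RE.search(line)
        if match:
            pairs.append((match.group(1).strip(), match.group(2).strip()))
    return pairs


# Made-up sample input in the format requested in this task.
sample = (
    "# [[:xx:page_without_interwiki]] [[:en:English_page_used_as_a_hint]]\n"
    "a line without links is skipped\n"
    "# [[:xx:Other_page]] [[:de:Andere_Seite]]\n"
)
pairs = parse_hint_pairs(sample)
for page_title, hint_title in pairs:
    print(page_title, "->", hint_title)
```

Keeping each hint attached to its own page title, instead of collecting all hints into one shared list, would avoid applying every hint to every page, which is the conflict scenario described earlier in this task.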

Xqt triaged this task as Low priority. Apr 18 2016, 8:15 AM

interwiki.py is very rarely used these days. Feel free to reopen if you need this implemented.