Page MenuHomePhabricator

pagegenerators.GeneratorFactory() uses wrong site if instantiated before command line args have been processed
Closed, ResolvedPublic

Description

In the current code, pywikibot.handleArgs() must be called before pagegenerators.GeneratorFactory(), as pagegenerators.GeneratorFactory.__init__() calls pywikibot.Site().

If pagegenerators.GeneratorFactory() is called first, the default site per user-config is used, and command line args (-family -lang -user) are ignored. See bug 63800.

This could be almost completely fixed by changing GeneratorFactory.site to be a property, loaded on access. That prevents the typical coding bug which look like:

genFactory = pagegenerators.GeneratorFactory()
for arg in pywikibot.handleArgs():
  if genFactory.handleArg(arg):
     pass

The current solution is to use and promote the pattern:

local_args = pywikibot.handleArgs()
genFactory = pagegenerators.GeneratorFactory()
for arg in local_args:
  if genFactory.handleArg(arg):
     pass

However it doesnt prevent this:

genFactory = pagegenerators.GeneratorFactory()
genFactory.handleArg('-file:' + filename):
...
pywikibot.handleArgs()

One way to prevent that is to raise an exception in pywikibot.handleArgs if it is called after pywikibot.Site() has instantiated a default site, and possibly only if -family/-lang/-user are supplied on the command line.

Another approach (very dodgy) is for pywikibot to know which Site object is the 'default' site, and pywikibot.handleArgs() change that object if -family/-lang/-user are supplied on the command line.

See Also: T65800

Event Timeline

bzimport raised the priority of this task from to Medium.Nov 22 2014, 3:58 AM
bzimport set Reference to bz72120.
bzimport added a subscriber: Unknown Object (????).

Enforcing the order of initialisation by adding an exception in pywikibot.handle_args() (previously: handleArgs) or in GeneratorFactory.init() would need to be post 2.0

A better pattern to use and promote is:

local_args = pywikibot.handleArgs()
site = pywikibot.Site()
genFactory = pagegenerators.GeneratorFactory(site)
for arg in local_args:
  if genFactory.handleArg(arg):
     pass
  ...

As that shows the fact that genFactory needs a site object.

gerritadmin wrote:

Change 166942 had a related patch set uploaded by John Vandenberg:
Use correct site for pagegenerators

https://gerrit.wikimedia.org/r/166942

gerritadmin wrote:

Change 166942 merged by jenkins-bot:
Use correct site for pagegenerators

https://gerrit.wikimedia.org/r/166942

The last patch is almost a 100% solution, but it doesnt ensure pywikibot.Site() is only being called after handle_args.

I suppose this will be fixed by T76429, am I wrong?

My current implementation of T76429 is reading all parameters (including) at once and it usually handles also the pagegenerators automatically. But it doesn't prevent this:

genFactory = pagegenerators.GeneratorFactory()
genFactory.handleArg('-file:' + filename):
...
pywikibot.handleArgs()

But using handleArg isn't recommended with my patch anymore and I'm not sure about that usage.

jayvdb lowered the priority of this task from Medium to Low.Jan 30 2015, 6:47 AM

Change 209740 had a related patch set uploaded (by John Vandenberg):
Set sysopname to also be -user: value

https://gerrit.wikimedia.org/r/209740

Change 209740 merged by jenkins-bot:
Remove Site object creation during script imports

https://gerrit.wikimedia.org/r/209740

Change 246128 had a related patch set uploaded (by John Vandenberg):
Remove panoramiopicker Site object creation

https://gerrit.wikimedia.org/r/246128

Change 246128 merged by jenkins-bot:
Remove panoramiopicker Site object creation

https://gerrit.wikimedia.org/r/246128

Xqt claimed this task.
jayvdb updated the task description. (Show Details)
jayvdb added a subscriber: Xqt.