use one library for all http requests
Closed, Resolved (Public)

Description

requests has been chosen as the http library for pywikibot v3.0 master.

There are a few cases of urllib.urlopen (and others) being used in the pywikibot library code, and a number of scripts which use other http request routines.
Multiple routines result in multiple configurations (e.g. proxy) and multiple sets of possible bugs/errors.

All http activity should be provided by utility methods in pywikibot.comms.http, so it is easy to test and support them, and possibly use a different http library in the future if necessary.
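
As an illustration of the idea of a single entry point, here is a minimal sketch. It is not the actual pywikibot.comms.http code: the names HttpConfig, HttpClient and the fetch signature are hypothetical, and the default backend uses only the standard library so that the example is self-contained. The point is that proxy and user-agent settings live in one place, and the backend can be swapped (or faked in tests) without touching callers.

```python
import urllib.request
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class HttpConfig:
    """Settings shared by every request (hypothetical names)."""
    user_agent: str = "example-bot/0.1"
    proxy: Optional[str] = None  # e.g. "http://proxy.example:8080"
    timeout: float = 30.0


def urllib_transport(url: str, headers: Dict[str, str],
                     config: HttpConfig) -> bytes:
    """Default backend built on the standard library."""
    handlers = []
    if config.proxy:
        # Route both schemes through the one configured proxy.
        handlers.append(urllib.request.ProxyHandler(
            {"http": config.proxy, "https": config.proxy}))
    opener = urllib.request.build_opener(*handlers)
    request = urllib.request.Request(url, headers=headers)
    with opener.open(request, timeout=config.timeout) as response:
        return response.read()


Transport = Callable[[str, Dict[str, str], HttpConfig], bytes]


class HttpClient:
    """The one place where headers, proxy and backend are decided."""

    def __init__(self, config: Optional[HttpConfig] = None,
                 transport: Transport = urllib_transport) -> None:
        self.config = config or HttpConfig()
        self.transport = transport

    def fetch(self, url: str,
              headers: Optional[Dict[str, str]] = None) -> bytes:
        # Caller-supplied headers are merged over the shared defaults.
        merged = {"User-Agent": self.config.user_agent}
        merged.update(headers or {})
        return self.transport(url, merged, self.config)
```

Because the transport is just a callable, a test can pass a fake that records its arguments instead of touching the network, and a different http library could be adopted later by writing one new transport function.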

See Also: T71204

Event Timeline

bzimport raised the priority of this task from to Needs Triage. Nov 22 2014, 3:11 AM
bzimport set Reference to bz66102.

There may be issues with using httplib2 for large downloads, such as those possible in upload.py.

https://github.com/jcgregorio/httplib2/issues/224

A fork has been created to address that, and to add distributed caching.

https://github.com/madlag/streaming_httplib2
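
To make the large-download concern concrete: assuming the problem is that the whole response body ends up in memory at once, the usual remedy is to copy the response to disk in fixed-size chunks. The sketch below is independent of any particular http library; stream_to_file and CHUNK_SIZE are illustrative names, and it works with any file-like object that supports read().

```python
from typing import BinaryIO

CHUNK_SIZE = 64 * 1024  # illustrative default; tune as needed


def stream_to_file(source: BinaryIO, target: BinaryIO,
                   chunk_size: int = CHUNK_SIZE) -> int:
    """Copy source to target chunk by chunk; return the bytes written.

    At most chunk_size bytes are held in memory at any time.
    """
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read signals end of stream
            break
        target.write(chunk)
        total += len(chunk)
    return total
```

With the standard library, the source could be the object returned by urllib.request.urlopen(url) and the target a file opened with mode 'wb'.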

site.py & weblib.py use 'import urllib', but only for urlencode.

urllib:
pywikibot/page.py:1841: f = urllib.urlopen(self.fileUrl())
pywikibot/version.py:199: buf = urllib.urlopen(url).readlines()

scripts/upload.py
scripts/flickrripper.py
scripts/checkimages.py
scripts/weblinkchecker.py
scripts/imagerecat.py
scripts/maintenance/wikimedia_sites.py
scripts/data_ingestion.py

urllib2:
scripts/reflinks.py

httplib (not httplib2):
pywikibot/version.py:123: conn = httplib.HTTPSConnection('github.com')

scripts/weblinkchecker.py
scripts/reflinks.py
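
A listing like the one above could be regenerated mechanically. The following is a rough sketch, not a tool that exists in the repository: find_legacy_imports and LEGACY_MODULES are hypothetical names, and it only detects import statements, so it would report the urlencode-only users as well. It also relies on Python 3's ast module, which cannot parse Python 2-only syntax.

```python
import ast
from typing import List, Tuple

# Modules this task wants to route through one http layer.
LEGACY_MODULES = {"urllib", "urllib2", "httplib"}


def find_legacy_imports(source: str) -> List[Tuple[int, str]]:
    """Return sorted (line number, module name) pairs for legacy imports."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Compare the top-level package only, so httplib2 is not
            # mistaken for httplib.
            if name.split(".")[0] in LEGACY_MODULES:
                found.append((node.lineno, name))
    return sorted(found)
```

Running it over the text of each file under pywikibot/ and scripts/ would give a per-file list of lines to review.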

Change 152200 had a related patch set uploaded by John Vandenberg:
HTTP requests with user-agent without version

https://gerrit.wikimedia.org/r/152200

Change 153300 had a related patch set uploaded by John Vandenberg:
Replace httplib and urllib with httplib2

https://gerrit.wikimedia.org/r/153300

Change 152200 merged by jenkins-bot:
User-agent graceful degradation

https://gerrit.wikimedia.org/r/152200

Change 153300 merged by jenkins-bot:
Replace httplib and urllib with httplib2

https://gerrit.wikimedia.org/r/153300

version.py now uses httplib2.

In addition to the list above, generate_family_file.py also uses urllib2.

https://github.com/ross/python-asynchttp might be a good solution, but it doesn't appear to be very active.

Another httplib2 fork, which says it provides streaming: https://github.com/fffonion/httplib2-plus

Also, we have a patch to switch to python-requests: https://gerrit.wikimedia.org/r/#/c/189821/

Change 208479 had a related patch set uploaded (by XZise):
[IMPROV] Replace openurl with http.fetch

https://gerrit.wikimedia.org/r/208479

Change 208479 merged by jenkins-bot:
[IMPROV] Replace openurl with http.fetch

https://gerrit.wikimedia.org/r/208479

jayvdb set Security to None.
jayvdb triaged this task as High priority. Jan 10 2016, 10:48 AM
jayvdb removed a project: Patch-For-Review.

Change 281131 had a related patch set uploaded (by Xqt):
[bugfix] bugfixes and improvements for checkimages

https://gerrit.wikimedia.org/r/281131

Change 281673 had a related patch set uploaded (by Xqt):
[bugfix] bugfixes and improvements for checkimages

https://gerrit.wikimedia.org/r/281673

Two new possible GCI tasks: T184360 & T184361. More analysis is needed, and the descriptions need expanding, to ensure that the work is of a high quality.

Xqt raised the priority of this task from High to Needs Triage. Feb 3 2019, 12:27 PM
Xqt moved this task from Needs Review to Backlog on the Pywikibot board.
Xqt triaged this task as High priority. Feb 4 2019, 5:27 AM

I would focus on this task to avoid issues with urllib, urllib2, urllib3, httplib etc., which might be the cause of many other bugs we encounter.

Xqt removed jayvdb as the assignee of this task. Jan 27 2020, 7:53 PM

Change 588145 had a related patch set uploaded (by Xqt; owner: Xqt):
[pywikibot/core@master] [bugfix] Re-enable script test for imageharvest.py

https://gerrit.wikimedia.org/r/588145

Change 588145 merged by jenkins-bot:
[pywikibot/core@master] [bugfix] Re-enable script test for imageharvest.py

https://gerrit.wikimedia.org/r/588145

Xqt claimed this task.