imagetransfer.py has a unicode bug
Closed, Resolved (Public)

Description

imagetransfer.py has a bug. I ran the command below:

python pwb.py imagetransfer -family:wikipedia -lang:en 'File:John_Fowles.jpg' -tolang:fa -tofamily:wikipedia -keepname

The bot shows this error:

Traceback (most recent call last):
  File "pwb.py", line 171, in <module>
    run_python_file(fn, argv, argvu)
  File "pwb.py", line 69, in run_python_file
    exec(compile(source, filename, "exec"), main_mod.__dict__)
  File "scripts/imagetransfer.py", line 341, in <module>
    main()
  File "scripts/imagetransfer.py", line 338, in main
    bot.run()
  File "scripts/imagetransfer.py", line 291, in run
    self.transferImage(imagelist[todo])
  File "scripts/imagetransfer.py", line 186, in transferImage
    description += '\n\n' + str(sourceImagePage.getFileVersionHistoryTable())
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2013' in position 264: ordinal not in range(128)
<type 'exceptions.UnicodeEncodeError'>
CRITICAL: Waiting for 1 network thread(s) to finish. Press ctrl-c to abort
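
The failure in the traceback can be reproduced in isolation. In Python 2, calling str() on a unicode object implicitly encodes it with the ASCII codec, so any non-ASCII character such as the en dash (u'\u2013') raises UnicodeEncodeError. The sketch below is only an illustration: it performs that ASCII encode explicitly so that it also runs on Python 3, and the history_table value and the py2_style_str helper are hypothetical stand-ins, not code from imagetransfer.py.

```python
# Hypothetical stand-in for the text returned by getFileVersionHistoryTable();
# the real table is longer, but one en dash is enough to trigger the error.
history_table = u'2004\u20132014'


def py2_style_str(text):
    """Mimic what Python 2's str() does to a unicode object: ASCII-encode it."""
    return text.encode('ascii')


try:
    py2_style_str(history_table)
    error = None
except UnicodeEncodeError as exc:
    # Same exception type as in the traceback above.
    error = exc

print(type(error).__name__)
```

The exception object records the codec and the offending position, which corresponds to the "position 264" reported in the traceback for the real history table.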


Version: core-(2.0)
Severity: normal

Details

Reference
bz71399

Event Timeline

bzimport raised the priority of this task to Needs Triage. Nov 22 2014, 3:46 AM
bzimport set Reference to bz71399.
bzimport added a subscriber: Unknown Object (????).

The str() call at line 186 should be removed or wrapped in a try block.

gerritadmin wrote:

Change 163530 had a related patch set uploaded by XZise:
[FIX] Imagetransfer: Use Unicode history table

https://gerrit.wikimedia.org/r/163530

(In reply to Gerrit Notification Bot from comment #2)

Also, please add a u prefix before "\r\n\r\n" in:

description += "\r\n\r\n" + unicode(sourceImagePage)

AFAIK this is not necessary: in Python 2, 'foo' + u'bar' automatically converts 'foo' to unicode. But did you get an exception there too? I've included another solution: because unicode() is no longer available in Python 3, I just use string formatting, which calls unicode automatically.
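
The formatting approach described above can be sketched as follows. This is a minimal illustration, assuming a unicode template is used to join the two values. It is not the merged patch itself, and append_history and the sample strings are hypothetical names introduced only for this example.

```python
def append_history(description, history_table):
    """Join two text values through a unicode template.

    Neither str() nor unicode() is called explicitly, so the same line
    works on Python 2 and Python 3 without an implicit ASCII encode.
    """
    return u'{0}\n\n{1}'.format(description, history_table)


# Hypothetical sample values; the en dash is the character from the report.
description = u'Portrait of John Fowles'
history_table = u'1926\u20132005'

result = append_history(description, history_table)
print(repr(result))
```

Because the template is itself unicode, the en dash passes through unchanged instead of being encoded to ASCII.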

I think he refers to:
description += "\r\n\r\n" + unicode(sourceImagePage)

Oh yeah, I thought I'd uploaded a new version of the fix that included that.

So there you go. I had corrupted the commit message so git-review wasn't going to upload it.

gerritadmin wrote:

Change 163530 merged by jenkins-bot:
[FIX] Imagetransfer: Don't use str()

https://gerrit.wikimedia.org/r/163530