En planet hasn't updated since March 2
Version: wmf-deployment
Severity: normal
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128)
hrm... Django/utf-8/something along these lines:
i got it to update once, so it now shows March 07, but after that i ran into the same issue again. keeping this open
could be fixed (for now) by deleting all content from the cache directory and re-running:
root@zirconium:/var/cache/planet/en/
rm *
sudo -u planet /usr/bin/planet -v /usr/share/planet-venus/wikimedia/en/config.ini
this also fixed the atom link http://en.planet.wikimedia.org/atom.xml
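The cache reset above could be scripted roughly as follows. This is only a sketch: the helper names `clear_cache` and `rebuild` are hypothetical, and unlike `rm *` this version also removes dotfiles and subdirectories, so treat that as an assumption about what is safe to delete.

```python
import os
import shutil
import subprocess


def clear_cache(cache_dir):
    """Delete every entry inside cache_dir but keep the directory itself.

    Returns the number of entries removed.
    """
    removed = 0
    for name in os.listdir(cache_dir):
        path = os.path.join(cache_dir, name)
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)   # real subdirectory
        else:
            os.remove(path)       # regular file or symlink
        removed += 1
    return removed


def rebuild(cache_dir, command):
    """Empty the cache, then run the given command; returns its exit code."""
    clear_cache(cache_dir)
    return subprocess.call(command)


# Usage with the paths from this report (not executed here):
# rebuild('/var/cache/planet/en/',
#         ['sudo', '-u', 'planet', '/usr/bin/planet', '-v',
#          '/usr/share/planet-venus/wikimedia/en/config.ini'])
```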
dzahn reran Comment #4 and it's still stuck:
13 05:55:55 < jeremyb_> mutante: did planet finish?
13 05:56:50 < mutante> unfortunately, no. UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128)
INFO:planet.runner:Loading cached data
Traceback (most recent call last):
  File "/usr/bin/planet", line 138, in <module>
    splice.apply(doc.toxml('utf-8'))
  File "/usr/lib/pymodules/python2.7/planet/splice.py", line 118, in apply
    output_file = shell.run(template_file, doc)
  File "/usr/lib/pymodules/python2.7/planet/shell/__init__.py", line 66, in run
    module.run(template_resolved, doc, output_file, options)
  File "/usr/lib/pymodules/python2.7/planet/shell/tmpl.py", line 254, in run
    for key,value in template_info(doc).items():
  File "/usr/lib/pymodules/python2.7/planet/shell/tmpl.py", line 193, in template_info
    data=feedparser.parse(source)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 3525, in parse
    feedparser.feed(data)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 1662, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "/usr/lib/python2.7/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/sgmllib.py", line 143, in goahead
    k = self.parse_endtag(i)
  File "/usr/lib/python2.7/sgmllib.py", line 320, in parse_endtag
    self.finish_endtag(tag)
  File "/usr/lib/python2.7/sgmllib.py", line 360, in finish_endtag
    self.unknown_endtag(tag)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 569, in unknown_endtag
    method()
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 1512, in _end_content
    value = self.popContent('content')
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 849, in popContent
    value = self.pop(tag)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 764, in pop
    mfresults = _parseMicroformats(output, self.baseuri, self.encoding)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 2219, in _parseMicroformats
    p.vcard = p.findVCards(p.document)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 2161, in findVCards
    sVCards += '\n'.join(arLines) + '\n'
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 32: ordinal not in range(128)
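The error message itself can be reproduced outside planet. A minimal sketch, assuming the feed content is UTF-8 (the thread does not confirm this): 0xe2 is the lead byte of many three-byte UTF-8 sequences, such as the en dash or curly quotes, so decoding such bytes with the ASCII codec fails at exactly that byte. The helper name and the sample string are hypothetical.

```python
def first_non_ascii(data):
    """Return (position, byte) of the first byte ASCII cannot decode, or None."""
    try:
        data.decode('ascii')
    except UnicodeDecodeError as err:
        # err.start is the offset reported as "position N" in the message.
        return err.start, data[err.start:err.start + 1]
    return None


# Hypothetical sample: "Wikimedia", a space, an en dash (e2 80 93 in UTF-8), " blog".
sample = b'Wikimedia \xe2\x80\x93 blog'
print(first_non_ascii(sample))
```

In Python 2 the same decode happens implicitly whenever a byte string is combined with a unicode string, which is why the traceback ends in a plain string concatenation.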
You should treat all strings in python as unicode not ASCII,
sVCards += '\n'.join(arLines) + '\n'
becomes
sVCards += u'\n'.join(arLines) + u'\n'
(In reply to comment #8)
You should treat all strings in python as unicode not ASCII,
That's surely not the root cause, right?
See http://docs.python.org/2/howto/unicode.html#the-unicode-type for exact duplication of this issue
thank you very much for the reply, but it still does not work when i changed line 2161 in feedparser.py
...
sVCards += u'\n'.join(arLines) + u'\n'
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 8: ordinal not in range(128)
INFO:planet.runner:Loading cached data
Traceback (most recent call last):
  File "/usr/bin/planet", line 138, in <module>
    splice.apply(doc.toxml('utf-8'))
  File "/usr/lib/pymodules/python2.7/planet/splice.py", line 118, in apply
    output_file = shell.run(template_file, doc)
  File "/usr/lib/pymodules/python2.7/planet/shell/__init__.py", line 66, in run
    module.run(template_resolved, doc, output_file, options)
  File "/usr/lib/pymodules/python2.7/planet/shell/tmpl.py", line 254, in run
    for key,value in template_info(doc).items():
  File "/usr/lib/pymodules/python2.7/planet/shell/tmpl.py", line 193, in template_info
    data=feedparser.parse(source)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 3525, in parse
    feedparser.feed(data)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 1662, in feed
    sgmllib.SGMLParser.feed(self, data)
  File "/usr/lib/python2.7/sgmllib.py", line 104, in feed
    self.goahead(0)
  File "/usr/lib/python2.7/sgmllib.py", line 143, in goahead
    k = self.parse_endtag(i)
  File "/usr/lib/python2.7/sgmllib.py", line 320, in parse_endtag
    self.finish_endtag(tag)
  File "/usr/lib/python2.7/sgmllib.py", line 360, in finish_endtag
    self.unknown_endtag(tag)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 569, in unknown_endtag
    method()
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 1512, in _end_content
    value = self.popContent('content')
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 849, in popContent
    value = self.pop(tag)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 764, in pop
    mfresults = _parseMicroformats(output, self.baseuri, self.encoding)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 2219, in _parseMicroformats
    p.vcard = p.findVCards(p.document)
  File "/usr/lib/pymodules/python2.7/planet/vendor/feedparser.py", line 2161, in findVCards
    sVCards += u'\n'.join(arLines) + u'\n'
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 8: ordinal not in range(128)
You need to go through the file and make sure that all references and assignments make sVCards unicode also, otherwise you are just repeating the same error when you attempt to +=
searching through all of feedparser.py, there are just 3 occurrences of sVCards: the initialization sVCards = '' at the beginning, the line we changed, and the return statement.
for tonight i simply commented out line 2161 in feedparser.py as a temporary fix, since i figured this only happens because a vCard is found in one feed. This way sVCards should simply be returned empty (sVCards = '').
and this made the update work for now. We are back to March 14th on https://en.planet.wikimedia.org/
thanks, i tried that, but setting sVCards = u'' in line 1949 in addition did not fix it either. back to commenting out line 2161.
updates work for now, but this is not the real fix and we should report it upstream
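One possible reason the u'' changes make no difference: if the items in arLines are themselves byte strings containing non-ASCII bytes, Python 2 still decodes each one implicitly as ASCII inside the join, whatever type sVCards has. The sketch below decodes each item explicitly before joining. It is not the upstream fix, it assumes the lines are UTF-8 (not confirmed anywhere in this report), and `to_text` and `join_vcard_lines` are hypothetical names.

```python
def to_text(value, encoding='utf-8'):
    """Return value as a unicode string, decoding byte strings explicitly.

    Undecodable bytes are replaced instead of raising, so one bad feed
    entry cannot abort the whole run.
    """
    if isinstance(value, bytes):
        return value.decode(encoding, 'replace')
    return value


def join_vcard_lines(lines, encoding='utf-8'):
    """Join a mix of byte and unicode lines into one unicode block."""
    return u'\n'.join(to_text(line, encoding) for line in lines) + u'\n'


# Hypothetical vCard lines: one UTF-8 byte string, one unicode string.
lines = [b'FN:Example \xe2\x80\x93 Author', u'END:VCARD']
print(repr(join_vcard_lines(lines)))
```

In feedparser.py this would correspond to changing line 2161 so that it joins decoded items rather than the raw arLines. Whether self.encoding or a fixed 'utf-8' is the right codec there would need checking against the actual feed.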
and compare this: https://github.com/rubys/venus/blob/master/planet/vendor/feedparser.py to the one we have from planet-venus version "0~bzr116-1" in Ubuntu precise
http://www.intertwingly.net/code/venus/
in http://www.intertwingly.net/code/venus/docs/index.html it links to http://feedparser.org/docs/ but that is a parking domain at godaddy, sigh.
http://www.intertwingly.net/code/venus/AUTHORS
Package: planet-venus
Priority: optional
Section: universe/python
Maintainer: Ubuntu Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Noah Slater <nslater@tumbolia.org>
Version: 0~bzr116-1
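To compare the packaged feedparser.py with the upstream copy, a unified diff of the two files is probably enough. A rough sketch using the standard library; the function name, the file labels and the sample strings are placeholders and do not reflect what either version actually contains.

```python
import difflib


def diff_sources(packaged_text, upstream_text):
    """Return unified-diff lines between two versions of a source file."""
    return list(difflib.unified_diff(
        packaged_text.splitlines(),
        upstream_text.splitlines(),
        fromfile='packaged/feedparser.py',   # label only, not a real path
        tofile='upstream/feedparser.py',     # label only, not a real path
        lineterm=''))


# Placeholder contents; in practice read both files from disk first.
packaged = "line one\nline two\n"
upstream = "line one\nline 2\n"
for line in diff_sources(packaged, upstream):
    print(line)
```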
(In reply to comment #17)
in http://www.intertwingly.net/code/venus/docs/index.html it links to
http://feedparser.org/docs/ but that is a parking domain at godaddy, sigh.
Try https://code.google.com/p/feedparser/ and http://pythonhosted.org/feedparser/
see also http://feedvalidator.org/
lowering importance to normal/normal because updates are working and en.planet is at March 15 thanks to the live hack.
well, all that would be left to do here is reporting upstream.. but we're fine with the hack ...
If there is a clear testcase and if somebody tells me where upstream is I can forward the ticket. https://github.com/rubys/venus/issues ? https://code.google.com/p/feedparser/issues/list ?
upstream is the github link you pasted. feedparser would be upstream for them i suppose. but it might also be a request to Ubuntu to have newer packages.. hrmm