And the issue with HTML escaping
- From: Devin Bayer <l [at] t-0.be>
- To: webcheck-users <webcheck-users [at] lists.arthurdejong.org>
- Subject: And the issue with HTML escaping
- Date: Wed, 9 Nov 2011 14:03:23 +0100
webcheck: INFO: http://www.highresolution.info/
webcheck: DEBUG: parsing using webcheck.parsers.html
webcheck: DEBUG: crawler.Link.set_encoding('utf-8')
webcheck: DEBUG: html encoding: utf-8
webcheck: WARNING: page has unknown encoding: utf-8
webcheck: ERROR: problem parsing page: decoding Unicode is not supported
Traceback (most recent call last):
  File "/home/dev/linkcheck/webcheck/webcheck/crawler.py", line 372, in parse
    parsermodule.parse(content, link)
  File "/home/dev/linkcheck/webcheck/webcheck/parsers/html/__init__.py", line 118, in parse
    _parsefunction(content, link)
  File "/home/dev/linkcheck/webcheck/webcheck/parsers/html/htmlparser.py", line 292, in parse
    link.author = _maketxt(parser.author, link.encoding).strip()
  File "/home/dev/linkcheck/webcheck/webcheck/parsers/html/htmlparser.py", line 265, in _maketxt
    return htmlunescape(unicode(txt, errors='replace'))
TypeError: decoding Unicode is not supported
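
Python 2 raises this TypeError when unicode() is called with decoding arguments (here errors='replace') on a value that is already a unicode object, so it looks as if txt has been decoded before it reaches _maketxt. Below is a minimal sketch of a guard that decodes only when the value is still a byte string. It is written for Python 3, where bytes and str play the roles of Python 2's str and unicode. The name to_text is hypothetical and html.unescape stands in for webcheck's htmlunescape, so this illustrates the idea and is not the project's actual fix.

```python
import html


def to_text(txt, encoding="utf-8"):
    """Return HTML-unescaped text, decoding only if txt is still bytes.

    Hypothetical helper; not part of webcheck.
    """
    if isinstance(txt, bytes):
        # Only byte strings need decoding; undecodable bytes are replaced.
        txt = txt.decode(encoding, errors="replace")
    # txt is text at this point, so unescaping is safe either way.
    return html.unescape(txt)


# Python 3 analogue of the failing call: passing decoding arguments
# for a value that is already text raises TypeError.
try:
    str("already text", errors="replace")
except TypeError as exc:
    print("TypeError:", exc)

print(to_text(b"Tom &amp; Jerry"))
print(to_text("Tom &amp; Jerry"))
```

With the guard in place, both byte strings and already-decoded text take the same path through the unescaping step.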