webcheck commit: r453 - webcheck/webcheck
- From: Commits of the webcheck project <webcheck-commits [at] lists.arthurdejong.org>
- To: webcheck-commits [at] lists.arthurdejong.org
- Reply-to: webcheck-users [at] lists.arthurdejong.org
- Subject: webcheck commit: r453 - webcheck/webcheck
- Date: Sat, 8 Oct 2011 16:12:31 +0200 (CEST)
Author: arthur
Date: Sat Oct 8 16:12:30 2011
New Revision: 453
URL: http://arthurdejong.org/viewvc/webcheck?revision=453&view=revision
Log:
also handle exceptions while parsing (e.g. issue when reading the response
times out)
Modified:
webcheck/webcheck/crawler.py
Modified: webcheck/webcheck/crawler.py
==============================================================================
--- webcheck/webcheck/crawler.py Sat Oct 8 16:04:03 2011 (r452)
+++ webcheck/webcheck/crawler.py Sat Oct 8 16:12:30 2011 (r453)
@@ -363,14 +363,17 @@
         if parsermodule is None:
             debugio.debug('crawler.Link.fetch(): unsupported content-type: %s' % link.mimetype)
             return
-        # skip parsing of content if we were returned nothing
-        content = response.read()
-        if content is None:
-            return
-        # parse the content
-        debugio.debug('crawler.Link.fetch(): parsing using %s' % parsermodule.__name__)
         try:
+            # skip parsing of content if we were returned nothing
+            content = response.read()
+            if content is None:
+                return
+            # parse the content
+            debugio.debug('crawler.Link.fetch(): parsing using %s' % parsermodule.__name__)
             parsermodule.parse(content, link)
+        except KeyboardInterrupt:
+            # handle this in a higher-level exception handler
+            raise
         except Exception, e:
             import traceback
             traceback.print_exc()
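
The effect of this change can be illustrated with a small standalone sketch. It is not the webcheck code: it is written for Python 3, and the names `read_and_parse` and `TimingOutResponse` are hypothetical, introduced only for this example. It shows that once `read()` is inside the `try` block, a timeout while reading is reported in the same way as a parser failure instead of propagating to the caller.

```python
import socket
import traceback


class TimingOutResponse:
    """Hypothetical stand-in for a response whose read() times out."""

    def read(self):
        raise socket.timeout('timed out')


def read_and_parse(response, parse):
    """Read the response body and pass it to parse(); return True on success."""
    try:
        # reading happens inside the try block, so a timeout raised by
        # read() is handled the same way as a failure in the parser
        content = response.read()
        if content is None:
            return False
        parse(content)
        return True
    except KeyboardInterrupt:
        # leave this to a higher-level handler (on Python 3 this clause is
        # redundant because KeyboardInterrupt is not an Exception subclass;
        # it is kept here only to mirror the diff)
        raise
    except Exception:
        traceback.print_exc()
        return False


parsed = []
result = read_and_parse(TimingOutResponse(), parsed.append)
```

Here `result` is `False` and `parsed` stays empty: the timeout is printed as a traceback and the caller can continue with the next link.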