webcheck commit: r469 - webcheck
- From: Commits of the webcheck project <webcheck-commits [at] lists.arthurdejong.org>
- To: webcheck-commits [at] lists.arthurdejong.org
- Reply-to: webcheck-users [at] lists.arthurdejong.org
- Subject: webcheck commit: r469 - webcheck
- Date: Wed, 16 Nov 2011 13:28:09 +0100 (CET)
Author: devin
Date: Wed Nov 16 13:28:08 2011
New Revision: 469
URL: http://arthurdejong.org/viewvc/webcheck?revision=469&view=revision
Log:
update NEWS, README and HACKING
Modified:
webcheck/HACKING
webcheck/NEWS
webcheck/README
Modified: webcheck/HACKING
==============================================================================
--- webcheck/HACKING Wed Nov 16 13:07:51 2011 (r468)
+++ webcheck/HACKING Wed Nov 16 13:28:08 2011 (r469)
@@ -6,20 +6,21 @@
function. This graphs should present a simple overview of the modules and
order of calling the functions.
-webcheck.py - main program, command line parsing, etc
+webcheck/ - top-level namespace
+ \- cmd.py - main program entry point, command line parsing, etc
\- config.py - configuration settings (imported from most other
| modules)
- \- debugio.py - functions for printing output (imported from
- | most other modules)
+ \- util.py - common functions imported from most other modules
+ |
\- crawler.py - module with loop and logic for traversing a
| | website and storing all the information about
| | the website that is used later
- | \- schemes/__init__.py - front-end module to make available scheme
- | | | modules for fetching content
- | | \- schemes/*.py - per scheme (ftp/file/http) a module
- | \- parsers/__init.py - front-end module to handle parsing of content
- | \- parsers/*.py - parser modules for content (html and dummy css
- | currently)
+ \- myurllib.py - module for ftp/file/http url fetching
+ |
+ \- parsers/__init__.py - front-end module to handle parsing of content
+ | \- html/ - parser modules for html content
+ | \- css.py - parser module for css (dummy currently)
+ |
\- plugins/__init__.py - front-end module for plugin modules, this calls
| all configured plugins and has some helper
| functions for plugins
Modified: webcheck/NEWS
==============================================================================
--- webcheck/NEWS Wed Nov 16 13:07:51 2011 (r468)
+++ webcheck/NEWS Wed Nov 16 13:28:08 2011 (r469)
@@ -1,3 +1,12 @@
+changes from 1.10.4 to 1.10.5 (alpha)
+-----------------------------
+
+* added setup.py for pypi/egg-based installation
+* support --levels option to control max depth
+* detect and report on endless redirects
+* move to sqlite for storing crawler state
+
+
changes from 1.10.3 to 1.10.4
-----------------------------
Modified: webcheck/README
==============================================================================
--- webcheck/README Wed Nov 16 13:07:51 2011 (r468)
+++ webcheck/README Wed Nov 16 13:28:08 2011 (r469)
@@ -64,7 +64,13 @@
INSTALLING WEBCHECK
===================
+This will install the latest version from PyPi.
+ % easy_install webcheck
+
+
+MANUAL INSTALLATION
+===================
Installation is relatively easy. These installation instructions are for
Unix-like systems. Other operating systems may differ.
@@ -78,12 +84,6 @@
3. Put the manual page in the MANPATH.
% ln -s /opt/webcheck-1.10.4/webcheck.1 /usr/local/man/man1/webcheck.1
-webcheck does not use Distutils because that tool is meant to install Python
-modules which should end up in the default Python path (from what the author
-understands of Distutils). Since webcheck does not expose any public API, it
-is an application with only private modules. A (maintainable) setup.py which
-installs webcheck outside the public patch is welcome.
-
RUNNING WEBCHECK
================