lists.arthurdejong.org

Not getting correct report from Webcheck

Team,
I am trying to run webcheck to find broken links on one of our websites, but webcheck reports a 404 for the URL. I am fairly sure the URL is valid: the page loads when I open it in a browser, and other crawler tools can crawl the site and generate reports. So I am not sure why webcheck cannot produce a report. Can you please help with this?

I am attaching the version details, the command that I ran, and the report generated. Please help me identify the issue. Thanks in advance.


Webcheck version: webcheck-1.10.4

Command run and its output:

$ python webcheck.py -o /tmp/myreport https://online.citibank.com
webcheck: checking site....
webcheck:   getting robots.txt for https://vm-eb0e-1c90.nam.nsroot.net:447
webcheck: DEBUG: crawler.crawl(): items left to check: 1
webcheck:   https://vm-eb0e-1c90.nam.nsroot.net:447/BUSID/JPS/Portal/Index.do
webcheck: DEBUG: schemes.http.fetch: connecting to vm-eb0e-1c90.nam.nsroot.net:447
webcheck: DEBUG: schemes.http.fetch(): HTTP response: 404 Not found
webcheck: DEBUG: schemes.http.fetch(): mimetype: text/html
webcheck: DEBUG: schemes.http.fetch(): size: 292
webcheck: done.
webcheck: postprocessing....
webcheck: DEBUG: crawler.postprocess(): adding https://vm-eb0e-1c90.nam.nsroot.net:447/BUSID/JPS/Portal/Index.do to bases
webcheck: DEBUG: crawler.postprocess(): items left to examine: 1
webcheck: done.
webcheck: generating reports...
webcheck:   anchors
webcheck:   sitemap
webcheck:   urllist
webcheck:   images
webcheck:   external
webcheck:   notchkd
webcheck:   badlinks
webcheck:   old
webcheck:   new
webcheck:   size
webcheck:   notitles
webcheck:   problems
webcheck:   about
webcheck: done.
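
One way to narrow down a 404 that appears in a crawler but not in a browser is to request the same URL outside webcheck with different request headers and compare the status codes. The sketch below is only an illustration under assumptions: the log above does not show why the server answered 404, and a User-Agent-dependent response is just one possible cause. The function names and the User-Agent strings are hypothetical and are not part of webcheck.

```python
import urllib.error
import urllib.request


def probe(url, user_agent, timeout=10):
    """Return the HTTP status code the server sends for url with this User-Agent."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Redirects are followed, so this is the status of the final response.
            return response.status
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx responses; the code is still what we want.
        return error.code


def compare_user_agents(url, user_agents):
    """Map each User-Agent string to the status code it receives for url."""
    return {agent: probe(url, agent) for agent in user_agents}


# Hypothetical usage (the strings are placeholders, not webcheck's real header):
#   compare_user_agents("https://example.org/", ["Mozilla/5.0", "my-crawler/0.0"])
```

If the status codes differ between the two strings, the server is probably treating crawlers differently from browsers. If both return 404, the cause is more likely the URL, host, or port actually being requested.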

