HTTrack Website Copier
Free software offline browser - FORUM
Subject: Attempted Mirror Always Craters
Author: Jerry
Date: 04/21/2009 17:03
 
I'm writing a book (an exposé about government corruption) that is based
largely on information currently available on the web.  I'd like to capture
this information now, so that I'll still have a record of it if it is later
removed.

I've put together all of the links of this material in one webpage:
<http://www.itsonline.com/srs/sources_links_d6.html>

I'd like HTTrack to capture that page and all the webpages and documents (PDF,
*.doc, etc.) linked to from that page.  My understanding is that if I set the
mirroring and external depths to "3" it should capture all this information in
an archive that can be browsed offline.

Unfortunately it's not working.  I've tried many different settings (changing
the depth, timeout, etc.), but the mirror always stops early, before capturing
any of the linked documents or web pages.

Here's a typical error message:

HTTrack3.43-4+htsswf+htsjava launched on Tue, 21 Apr 2009 09:51:29 at
<http://www.itsonline.com/srs/sources_links_d6.html> +*.png +*.gif +*.jpg +*.css
+*.js -ad.doubleclick.net/* -mime:application/foobar
(winhttrack -qwr3%e3C2%Ps2u1%s%uN0%I0p3DaK0H0%kf2A25000%f#f -F "Mozilla/4.5
(compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by
HTTrack Website Copier/3.x [XR&CO'2008], %s -->" -%l "en, en, *"
<http://www.itsonline.com/srs/sources_links_d6.html> -O1
D:\outing_data_archive\sourcesv2\sourcesv2 +*.png +*.gif +*.jpg +*.css +*.js
-ad.doubleclick.net/* -mime:application/foobar )
Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive information,
 such as username/password authentication for websites mirrored in this project
 do not share these files/folders if you want these information to remain private
09:51:34 Error:  "Not Found" (404) at link www.itsonline.com/srs/online%20at:%20http:/www.itsonline.com/ttid/datalock_scheme.pdf (from www.itsonline.com/srs/sources_links_d6.html)
HTTrack Website Copier/3.43-4 mirror complete in 5 seconds : 4 links scanned,
2 files written (137537 bytes overall) [139198 bytes received at 27839
bytes/sec]
(1 errors, 0 warnings, 0 messages)

What's confusing to me is that the PDF file that it couldn't find
(http://www.itsonline.com/ttid/datalock_scheme.pdf) exists on the web.
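
One possible reading of that 404, sketched in Python under stated assumptions:
the link HTTrack requested is not the PDF's own address but a path under /srs/
that begins with "online at: ".  That is what a crawler would produce if the
href in the links page had that stray text in front of the URL, because a
value that does not start with a valid scheme is resolved relative to the page
that contains it.  The href below is hypothetical (it is simply the link from
the log, percent-decoded); the actual HTML of the page has not been checked,
and Python's urljoin only stands in for whatever resolution HTTrack performs
internally.

```python
from urllib.parse import quote, urljoin, urlsplit, urlunsplit

# Page being mirrored (taken from the log).
BASE = "http://www.itsonline.com/srs/sources_links_d6.html"

# HYPOTHETICAL href: the failing link from the log, percent-decoded.
# Whether the page really contains this text is an assumption.
SUSPECT_HREF = "online at: http:/www.itsonline.com/ttid/datalock_scheme.pdf"


def resolve(base: str, href: str) -> str:
    """Resolve href against base, then percent-encode the resulting path."""
    joined = urljoin(base, href)  # "online at" is not a valid scheme, so this is a relative join
    parts = urlsplit(joined)
    encoded_path = quote(parts.path, safe="/:")  # spaces become %20
    return urlunsplit(parts._replace(path=encoded_path))


resolved = resolve(BASE, SUSPECT_HREF)
print(resolved)
```

If the printed address matches the link in the error line, the thing to check
is the href attribute of that entry in sources_links_d6.html, rather than the
depth or timeout settings.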

I'm sure I'm doing something wrong (i.e., cockpit error), and would appreciate
any help and guidance.  Thanks.

Jerry
 