Unlike most of you, I am trying to de-limit rather than limit my spidering as
much as possible for an art project I'm working on. I have been using wget,
with reasonable results, but it has a tendency to die rather quickly. I've
been
experimenting with httrack for a few days, and it seems to have some
advantages, but I am having trouble crossing from one domain to another: I'll
get the homepage, but no more. I'm using the following options:
httrack http://www.somesite.org -O /Volumes/sounds/httrack_get
-C0N1003s0K%e9999r9999zI0b1nBe
.httrackrc:
assume sp=text/html,php3=text/html,cgi=image/gif
ext-depth 512
user-agent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"
What I get is about 27 files downloaded, and then nothing more downloads,
though the program is still running, displaying messages like this:
channels.netscape.com/ns/search/hotsearch.jsp (168 bytes) - OK
The matching line in my log:
23:35:33 Info: engine: save-name: local name: channels.netscape.com/ns/search/hotsearch.html -> hotsearch.html
It has apparently not been downloaded, just checked. (I realize that, being a
.jsp, it may not download, but HTML files off the main domain don't download
either.)
Now, that's Netscape, and who knows what protection they have, but I do get
this link:
23:31:54 Info: engine: transfer-status: link recorded: www.throughthecracks.org/index.html -> /Volumes/sounds/httrack_get_pan2/index-9.html
I have that file -- it's another homepage, but I'm not getting anything from
the throughthecracks.org site past that point. If I try the site directly, it
downloads, no problem.
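(By "try the site directly" I mean starting a fresh mirror of just that one
site, roughly like this; the scratch output directory is just something I made
up for the test:

httrack http://www.throughthecracks.org/ -O /Volumes/sounds/httrack_test -r9999

Run that way it pulls pages down without trouble; it's only when the site is
reached as an external link from the first mirror that nothing past the
homepage arrives.)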
I thought the e flag, plus the %e depth, would cover this. What am I doing
wrong? And have you any other tips for promiscuous downloading?
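One more thing I've been wondering: do I also need an explicit catch-all scan
rule after the URL? Something along these lines, with the "+*" filter taken
from the examples in the httrack documentation (I haven't tested this variant
yet):

httrack http://www.somesite.org -O /Volumes/sounds/httrack_get -C0N1003s0K%e9999r9999zI0b1nBe "+*"

Or is the e flag supposed to make that unnecessary?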
Thanks,
\M