HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: configuration questions
Author: Simon Lai
Date: 05/10/2002 12:13
 
> hi, any help on the following questions is much 
> appreciated:

Lots of questions :) I'll try to answer them to the
best of my knowledge - but keep in mind I am not on the
development team so my responses may not always be
right (but are from my perspective)
> 
> 1. i have a dial-up connection and am not behind a 
> firewall, do i uncheck 'use proxy for ftp transfers'?
Firewalls have nothing to do with proxies. My ISP uses
a firewall and I have a personal firewall program
installed, and I still use a proxy setting. Proxies are
there to "speed up" downloads: your ISP's proxy may
already hold the latest copy of a file, so it doesn't
need to fetch it again. If your ISP has a proxy for FTP, then
I recommend using it unless you encounter any problems
using it.
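
To picture why a caching proxy can save time, here is a rough Python sketch. It is only a toy model: the class name `CachingProxy` and the `fetch_origin` callable are made up for illustration, and a real proxy would also check whether its stored copy is still fresh.

```python
class CachingProxy:
    """Toy model of a proxy that remembers what it has already fetched."""

    def __init__(self, fetch_origin):
        # fetch_origin is a hypothetical callable: url -> content
        self._fetch_origin = fetch_origin
        self._cache = {}
        self.origin_fetches = 0  # how often we really went to the remote site

    def get(self, url):
        if url not in self._cache:
            # Cache miss: go to the origin server once and keep the result
            self.origin_fetches += 1
            self._cache[url] = self._fetch_origin(url)
        # Cache hit (or freshly stored): answer from the local copy
        return self._cache[url]
```

Asking for the same URL twice only reaches the remote site once; the second request is answered from the proxy's own copy.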
> 
> 2. since no external links are downloaded by default, 
> is the -ad.doubleclick.net/* filter necessary?
It's necessary unless you don't mind grabbing the
images from ad sites. HTTrack won't download HTML
files from outside the given URL, but images, ZIP
files, etc. can still be downloaded from other sites.
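
As a rough illustration of what an exclusion filter like that does, here is a small Python sketch. It assumes plain shell-style wildcard matching (via `fnmatchcase`), which is only an approximation of HTTrack's own filter rules; the name `is_excluded` and the sample URLs are made up.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Exclusion patterns, written here without the leading "-" (an assumption
# of this sketch, not HTTrack's internal representation)
EXCLUDE = ["ad.doubleclick.net/*"]

def is_excluded(url: str) -> bool:
    """Return True if the URL's host + path matches any exclusion pattern."""
    parts = urlsplit(url)
    target = parts.netloc + parts.path
    return any(fnmatchcase(target, pattern) for pattern in EXCLUDE)
```

With this, `is_excluded("http://ad.doubleclick.net/banner.gif")` is true, while an image on any other host is still let through.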

> 3. the help file says a maximum mirroring depth of 3 
> means download the current page and pages up to two 
> levels below the current page. then does that mean a 
> level of 0 prevents the page from being downloaded? in 
> which situation would level 0 be useful for?
Yes, it means it downloads absolutely nothing (I just
checked it, hehe). Not quite sure why that is there...
;) Maybe the developer left it in so people don't accuse
them of not being able to count? :)
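
Here is a minimal Python sketch of how a depth limit like this could work, following the help file's description quoted above (the start page counts as level 1). The link graph and the function name `pages_within_depth` are invented for illustration; this is not HTTrack's actual code.

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to
LINKS = {
    "index": ["a", "b"],
    "a": ["a1"],
    "b": [],
    "a1": ["a2"],
    "a2": [],
}

def pages_within_depth(start: str, max_depth: int) -> list[str]:
    """Breadth-first walk that treats the start page as level 1."""
    fetched = []
    seen = {start}
    queue = deque([(start, 1)])
    while queue:
        page, level = queue.popleft()
        if level > max_depth:
            continue  # too deep: neither fetch it nor follow its links
        fetched.append(page)
        for linked in LINKS.get(page, []):
            if linked not in seen:
                seen.add(linked)
                queue.append((linked, level + 1))
    return fetched
```

Under this counting, a depth of 3 gives the start page plus two levels below it, a depth of 1 gives only the start page, and a depth of 0 gives an empty list, which matches what I saw.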

> 4. i have a dial-up connection with average speed of 
> 44kbps...what is the best number of connections for 
> me? download acceleration programs that i've tried had 
> this set to 4-5, so is this the optimum setting for my 
> connection?
This really differs depending on which site you are
downloading from and also how many connections your ISP
will allow you to open, as well as how many other
Internet-using programs you are running at the same time
besides HTTrack. If you are not sure, then leaving it at the
default is probably the best way to go. It's really a
matter of experimenting a couple of times...

> 5. under flow control, how do i set the program to 
> retry indefinitely?
If it's still timing out after 6 tries, then there's
probably something wrong with the website. Infinite
retries are probably not an option, as they may
inadvertently take up all the available connections
(e.g. if you set the maximum number of connections to 4
and 4 web pages keep giving timeout errors, then your
download would effectively stop).
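
The idea of giving up after a fixed number of tries can be sketched in Python like this. The figure of 6 is just the number mentioned above, and `fetch` is a hypothetical callable standing in for a real download; none of this is HTTrack's own API.

```python
def fetch_with_retries(fetch, url, max_tries=6):
    """Try fetch(url) up to max_tries times.

    Returns (result, tries_used); result is None if every try timed out.
    """
    for attempt in range(1, max_tries + 1):
        try:
            return fetch(url), attempt
        except TimeoutError:
            # The connection slot stays tied up with this URL until we give up
            pass
    return None, max_tries
```

Because the loop is bounded, a dead page eventually frees its connection for the next file. With an unbounded loop, each dead page would hold its connection forever.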

> 6. should i check 'get HTML files first' for better 
> performance? if this option is indeed beneficial for 
> most users, then i think it should be checked by 
> default.

It probably should be set as default (I always leave it
on myself). As the help file explains, by downloading
HTML files first it can scan all the valid links first.
> 
> 7. are there any default MIME type associations, or do 
> i have to manually map everything, including
> html->text/html etc.?
I am actually unsure of the significance of this
feature. Considering that most computers can probably
parse files faster than they can download them,
it probably isn't of great interest (from the
help file description it looks like it's only used to
rename files, or else to speed up parsing by declaring
that files with one extension are of another type). What this
means (I think) is that if HTTrack comes across an
unsupported file format, it uses a time-consuming
algorithm to parse the file for URLs. But if you define
it to be a common/pre-defined file format, it
uses a more efficient algorithm to parse the file.

Default MIME types will be used if HTTrack supports the
format. That is, if it comes across HTM or HTML files,
it'll use the MIME type for them. But as for which types
are pre-defined, I am uncertain.
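
To show the general idea of built-in defaults plus manual mappings, here is a Python sketch using the standard `mimetypes` module. It says nothing about which types HTTrack itself pre-defines; the `OVERRIDES` table and the `guess_mime` helper are hypothetical.

```python
import mimetypes
import os

# Hypothetical manual mappings: extension (without the dot) -> MIME type
OVERRIDES = {"php": "text/html"}

def guess_mime(filename: str) -> str:
    """Manual mappings win; otherwise fall back to the built-in table."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    if ext in OVERRIDES:
        return OVERRIDES[ext]
    mime, _encoding = mimetypes.guess_type(filename)
    # Unknown extensions get a generic "binary data" type
    return mime or "application/octet-stream"
```

Here `.htm` and `.html` resolve to `text/html` without any manual entry, and only the unusual cases need to be mapped by hand.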

> 8. under primary scan rule (experts tab), what 
> does 'store html files first' mean?
I am uncertain of what this feature may
mean exactly.
But my guess is this determines what files are actually
saved. The last option means it'll always save the HTML
files first (lest you run out of hard-disk space), then
it'll save the other non-HTML stuff.

Hope my responses helped somewhat? :)
 