HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Questions and a planted seed to grow interest
Author: Xavier Roche
Date: 12/17/2002 21:47
 
> First off, I have been using WinHTTrack for almost two 
> years now.  It was amazing software when I first started 
> using it and has had several out of this world 
> improvements since.  Thanks Xavier Roche et al.

Thanks :)

> Q:  Can we update all these sites (different project 
> files) at once with one script?
This feature is not implemented by default.
You will have to use some command-line scripting. On Unix 
systems this is quite easy; on Windows systems I suggest 
you install the Cygwin tools (http://cygwin.com/) to get 
powerful Unix-like scripting, and especially bash.

You then will be able to fire updates using scripting, with 
something like:

# Loop over each project directory (the glob skips "." and copes with spaces)
for i in */; do
  (cd "$i" && httrack --update)
done
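
A slightly more defensive variant of the loop above could look like the sketch below. It assumes that every project directory contains an hts-cache subdirectory and skips anything that does not; the function name update_all and the HTTRACK override variable are only illustrative names, not part of httrack.

```shell
# Command to run; override for a dry run, e.g. HTTRACK=echo (illustrative variable).
HTTRACK=${HTTRACK:-httrack}

# update_all [BASE_DIR]: run "httrack --update" inside every project directory
# found directly under BASE_DIR (default: current directory).
update_all() {
  local base="${1:-.}" dir
  for dir in "$base"/*/; do
    # Assumption: a directory is a project only if it has an hts-cache folder.
    [ -d "${dir}hts-cache" ] || continue
    echo "Updating ${dir%/}"
    # Subshell so the cd does not leak; report failures and carry on.
    (cd "$dir" && "$HTTRACK" --update) || echo "Update failed in ${dir%/}" >&2
  done
}
```

Calling update_all /path/to/projects would then update each project in turn, and HTTRACK=echo gives a dry run that only prints what would be executed.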

> Q:  Can we tell WinHTTrack to store cache and settings for 
> an individual site in a different place than the actual 
> copied site (for example: two separate drives on the same 
> machine) and still have it update the site?  (would have 
> one master hard drive with all sites on it and make direct 
> copies of it for sending to Africa, no need to send 
> overhead files from HTTrack to Africa.)

Yes, using the -O <data-path>,<httrack-cache-path> option, 
such as:
-O ./files,./private linux
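
As a rough sketch, a full command line using the two-path form of -O could be wrapped like this. The function name mirror_split, the HTTRACK override variable and the example paths are assumptions made for illustration; only the -O <data-path>,<httrack-cache-path> form comes from the answer above.

```shell
# Command to run; override for a dry run, e.g. HTTRACK=echo (illustrative variable).
HTTRACK=${HTTRACK:-httrack}

# mirror_split URL DATA_DIR CACHE_DIR [extra httrack options...]
# Keeps the copied site in DATA_DIR and the cache/log files in CACHE_DIR.
mirror_split() {
  local url="$1" data_dir="$2" cache_dir="$3"
  shift 3
  # Create both target directories if they do not exist yet.
  mkdir -p "$data_dir" "$cache_dir" || return 1
  "$HTTRACK" "$url" -O "$data_dir,$cache_dir" "$@"
}
```

For example, mirror_split http://www.example.com/ /master/files /private/cache would leave only the browsable copy under /master/files, so that directory alone can be copied to the outgoing drive.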

> Q:  If a site we are downloading new links to a site we 
> have already downloaded can we tell HTTrack to 
> automatically link to our local copy (separate project 
> files)?  

No, unfortunately linking within different projects is not 
yet possible, as it would require some very long indexing.

> Q:  We have downloaded www.bartelby.com and each page has 
> a JavaScript that points to an outside link (advertising 
> server).  This can do some weird things when there is no 
> Internet connection present and even weirder things when 
> there is a slow connection.  We have already had limited 
> success with programs that can strip out this code after 
> the download.

Another solution is to use the 'No external page' option, 
that is, the --replace-external option.

>  Is there a permanent solution HTTrack can 
> do during the download so we don't have to fix it after 
> every update?

No, although post-processing is not very difficult: use 
httrack's -V option, which runs a command on each file 
after it is downloaded, together with a script.
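
A post-processing script of that kind might look like the sketch below. It is only an illustration: the function name strip_ad_scripts and the host ads.example.com are hypothetical, and the sed expression only removes <script> elements that sit on a single line and mention the given host (the host is used as a regular expression, so dots match any character).

```shell
# strip_ad_scripts FILE HOST
# Remove single-line <script ...>...</script> elements that reference HOST.
strip_ad_scripts() {
  local file="$1" host="$2" tmp
  # Only touch HTML files; leave images, archives, etc. alone.
  case "$file" in
    *.html|*.htm) ;;
    *) return 0 ;;
  esac
  tmp=$(mktemp) || return 1
  # Write to a temporary file first, then copy back so permissions are kept.
  if sed "s#<script[^>]*${host}[^>]*>[^<]*</script>##g" "$file" > "$tmp"; then
    cat "$tmp" > "$file"
  fi
  rm -f "$tmp"
}
```

Saved in a small wrapper script that calls strip_ad_scripts "$1" ads.example.com, it could be hooked in with something like httrack http://www.example.com/ -V "/path/to/wrapper.sh \$0", where $0 is replaced by httrack with the name of the file just written. Scripts spread over several lines would need a more careful tool than sed.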

> We'd also love to hear from any hardcore proxy server 
> programmers.  It would be great to have these sites work 
> seamlessly on campuses in Africa so that students can type 
> in the actual URL of a site and either get the version 
> cached on a 120GB hard drive with other entire sites or 
> the new page if it has been updated

I think the best solution would be to use Squid, with either 
a redirect_program script or a list of rewrite rules in 
squid.conf (see the manpages, I have forgotten the exact 
syntax!)
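
To give an idea of the redirect_program approach, here is a minimal sketch of such a helper. It assumes the classic Squid redirector protocol (one request per line on standard input with the URL as the first field, and one reply line per request: a replacement URL, or an empty line to leave the request unchanged); newer Squid versions call the directive url_rewrite_program, so check the documentation for your version. MIRROR_ROOT, MIRROR_URL and the <hostname>/<path> layout under the mirror directory are assumptions for illustration, and the sketch only serves the local copy when it exists; it does not check whether the live page is newer.

```shell
# Hypothetical locations: where the mirrored files live, and the URL of a
# local web server that serves that directory.
MIRROR_ROOT=${MIRROR_ROOT:-/mirror}
MIRROR_URL=${MIRROR_URL:-http://mirror.local}

# rewrite_url URL: print the local mirror URL if a copy exists, else a blank line.
rewrite_url() {
  local url="$1" rest
  case "$url" in
    http://*) rest="${url#http://}" ;;
    *) echo; return 0 ;;                 # leave non-http requests alone
  esac
  rest=${rest%%\?*}                      # drop any query string
  case "$rest" in *..*) echo; return 0 ;; esac        # refuse path tricks
  case "$rest" in */*) ;; *) rest="$rest/" ;; esac    # bare host -> host/
  case "$rest" in */) rest="${rest}index.html" ;; esac
  if [ -f "$MIRROR_ROOT/$rest" ]; then
    printf '%s\n' "$MIRROR_URL/$rest"
  else
    echo                                 # blank line: Squid fetches as usual
  fi
}

# redirector_loop: answer one line for every request line read from stdin.
redirector_loop() {
  local url _
  while read -r url _; do
    rewrite_url "$url"
  done
}
```

A real helper script would end with a call to redirector_loop and be referenced from squid.conf through the redirect_program directive.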


> Sorry so long.  Wanted to let the developers know that 
> their work makes a difference globally.  Not only a 
> difference but it has changed lives and sparked many tears 
> of joy!

It's good to see that our work is useful for many people. 
Thank you very much for this feedback!
 