> First off, I have been using WinHTTrack for almost two
> years now. It was amazing software when I first started
> using it and has had several out of this world
> improvements since. Thanks Xavier Roche et al.
Thanks :)
> Q: Can we update all these sites (different project
> files) at once with one script?
This feature is not implemented by default.
You will have to use some command-line scripting. On Unix
systems this is quite easy; on Windows systems I suggest
you install the Cygwin tools (http://cygwin.com/) to get
the powerful Unix-like scripting environment, and
especially bash.
You then will be able to fire updates using scripting, with
something like:
# update every project directory directly under the current one
for i in `find . -mindepth 1 -maxdepth 1 -type d`; do
  (cd "$i" && httrack --update)
done
> Q: Can we tell WinHTTrack to store cache and settings
> for an individual site in a different place than the
> actual copied site (for example: two separate drives on
> the same machine) and still have it update the site? (We
> would have one master hard drive with all sites on it
> and make direct copies of it for sending to Africa, no
> need to send overhead files from HTTrack to Africa.)
Yes, using the -O <data-path>,<httrack-cache-path> option,
such as (Unix-style paths):
-O ./files,./private
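As a fuller sketch (all paths and the project name here are illustrative, not from the original post), a small wrapper can keep the mirrored files and the HTTrack cache on different drives:

```shell
#!/bin/sh
# Hedged sketch: keep the mirrored site and the HTTrack cache on
# separate drives via -O <data-path>,<cache-path>. Paths are examples.
DATA_DIR=/mnt/master/example      # mirrored site (copied and shipped)
CACHE_DIR=/mnt/private/example    # hts-cache and logs (stay at home)

build_cmd() {
  # print the httrack invocation instead of running it (dry run)
  echo "httrack --update -O $DATA_DIR,$CACHE_DIR"
}

build_cmd        # pipe the output to sh to actually run the update
```

Printing the command first makes it easy to check the -O argument before letting an unattended update loose on a large mirror.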
> Q: If a site we are downloading links to a site we have
> already downloaded, can we tell HTTrack to automatically
> link to our local copy (separate project files)?
No, unfortunately linking across different projects is not
yet possible, as it would require some very long indexing.
> Q: We have downloaded www.bartelby.com and each page has
> a JavaScript that points to an outside link (advertising
> server). This can do some weird things when there is no
> Internet connection present, and even weirder things when
> there is a slow connection. We have already had limited
> success with programs that can strip out this code after
> the download.
Another solution is to use the 'No external pages' option,
that is, the --replace-external option.
> Is there a permanent solution HTTrack can
> do during the download so we don't have to fix it after
> every update?
No, but the post-processing is not very difficult, using
the -V option of httrack (which executes a command after
each downloaded file) and a script.
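For instance, such a post-processing step might look like the sketch below. The ad-server hostname (ads.example.com) is an assumption, and the sed pattern must match how the real pages actually embed the script:

```shell
#!/bin/sh
# Hedged sketch: strip an injected ad <script> line after download.
# The hostname ads.example.com is an assumption, not from the post.
strip_ads() {
  # delete every line referencing the ad server from one saved HTML file
  sed -i.bak '/ads\.example\.com/d' "$1" && rm -f "$1.bak"
}

# httrack can call such a command per file via -V; $0 is replaced
# by the filename of each downloaded file:
#   httrack --update -V 'sed -i "/ads\.example\.com/d" $0'
```

This only works cleanly if the ad reference sits on its own line; for markup spread across lines a small HTML-aware filter would be safer.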
> We'd also love to hear from any hardcore proxy server
> programmers. It would be great to have these sites work
> seamlessly on campuses in Africa so that students can
> type in the actual URL of a site and either get the
> version cached on a 120GB hard drive with other entire
> sites or the new page if it has been updated.
I think the best solution would be to use Squid, with
either a redirect_program script or a list of rewrite
rules in squid.conf (see the man pages, I forget the
exact syntax!)
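As a rough sketch of such a redirector (the mirror host mirror.local is an assumption, and the exact input format Squid passes should be checked against the Squid documentation), a redirect_program helper reads one request per line on stdin and answers with a rewritten URL, or an empty line to leave the request alone:

```shell
#!/bin/sh
# Hedged sketch of a Squid redirect_program helper. Hostnames are
# illustrative; check squid.conf(5) for the exact stdin format.
rewrite_url() {
  # Squid sends: URL client-ip/fqdn ident method; we only use the URL
  url=$1
  case "$url" in
    http://www.bartelby.com/*)
      # send the browser to the local mirror of the site
      echo "http://mirror.local/www.bartelby.com/${url#http://www.bartelby.com/}"
      ;;
    *)
      echo ""   # empty line: let Squid fetch the URL unchanged
      ;;
  esac
}

# one request per input line, one answer per output line
while read url rest; do
  rewrite_url "$url"
done
```

Requests for unmirrored sites fall through unchanged, which gives exactly the "cached copy if we have it, live page otherwise" behaviour described above.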
> Sorry so long. Wanted to let the developers know that
> their work makes a difference globally. Not only a
> difference but it has changed lives and sparked many
> tears of joy!
It's good to see that our work is useful for many people.
Thank you very much for this feedback!