HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: dot com only restriction?
Author: William Roeder
Date: 08/04/2008 14:27
 
> Just to double check... in a UNIX terminal
> httrack http://www.my.edu/ -O
> "/archive/www.my.edu" -* +*.edu/* '%P0' -v

I don't use the command line on Unix, but <http://httrack.com/html/fcguide.html>
shows that you'll have to quote the asterisks: "-*" "+*.edu/*"
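
As a rough sketch of why the quoting matters: an unquoted * can be expanded by the shell against local file names before httrack ever sees it. The helper show_args below is hypothetical (not part of httrack); it only prints each argument so the quoting can be checked before the real run. The URL, path and options are the ones from the question.

```shell
#!/bin/sh
# Sketch: keep the shell from expanding the filter patterns.
URL="http://www.my.edu/"
DEST="/archive/www.my.edu"

# Hypothetical helper: print each argument on its own line, in brackets,
# so it is easy to see exactly what the program would receive.
show_args() {
  for arg in "$@"; do
    printf '[%s]\n' "$arg"
  done
}

show_args httrack "$URL" -O "$DEST" "-*" "+*.edu/*" '%P0' -v

# Once the list looks right, run the same command without show_args:
# httrack "$URL" -O "$DEST" "-*" "+*.edu/*" '%P0' -v
```

If the filters come out as literal -* and +*.edu/* in the printed list, the shell has left them alone.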

> the time. But I can restart it like this: 
> httrack http://www.my.edu/ -O
> "/archive/www.my.edu" -* +*.edu/* '%P0' -vi

Yes.

> Is there a smooth way to shut it down each time so
> the collection doesn't get broken?

Send one interrupt; it should finish the current transfers and then stop.
The limits options listed at the above link also allow:
  MN maximum overall size that can be uploaded/scanned
  EN maximum mirror time in seconds (60=1 minute, 3600=1 hour)
  GN pause transfer if N bytes reached, and wait until lock file is deleted
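
A minimal sketch of using the EN limit so a run ends on its own after a fixed window. The 8-hour figure is an assumption for illustration, and writing the option as -E followed by the number is my reading of the guide, so check it against your version.

```shell
#!/bin/sh
# Sketch: cap a mirror run at a fixed number of hours using the EN option.
HOURS=8                          # hypothetical window, pick your own
MAX_SECONDS=$((HOURS * 3600))    # EN takes seconds (3600 = 1 hour)
TIME_LIMIT="-E${MAX_SECONDS}"

echo "time limit option: $TIME_LIMIT"

# Assumed usage, with the URL and path from the question:
# httrack "http://www.my.edu/" -O "/archive/www.my.edu" "-*" "+*.edu/*" "$TIME_LIMIT" -v
```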

> Also, what will come of the links to commercial
> sites? Can I give httrack instructions to deliver
> those links to a local php script for processing?

No, you can't get PHP files from servers. Servers execute PHP, CGI, ASP, etc.
files and deliver HTML. A mirror is not a backup of a site; it is a static copy.

> Since the mirror is going to end up on an intranet,
> I'd like to have a script generate a message saying
> "You have selected a commercial link that requires a
> connection to the Internet. Click here to continue
> or click here to return to the previous page."

The guide lists an option for that:
  x   replace external html links by error pages

> And finally, will the mirror get my school first
> and then come back to get the others? Or would I

It will start with my.edu and spider down and away from there. Normally it
would stay on the starting site, but the +*.edu/* filter overrides that.
You might want to place separate sites in separate subdirectories:
  N104 Identical to N4 except that "web" is replaced by the site's name
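
Putting the pieces from this reply together, here is a sketch of one possible command line. It only assembles and prints the command; I have not run this combination, so treat the option spelling (-x, -N104) as an assumption taken from the guide's option list.

```shell
#!/bin/sh
# Sketch only: combines the options named in this reply.
set -- "http://www.my.edu/" -O "/archive/www.my.edu" "-*" "+*.edu/*"
set -- "$@" -x      # replace external html links by error pages
set -- "$@" -N104   # like N4, but "web" is replaced by the site's name

# Show the final command instead of running it.
CMD="httrack $*"
echo "$CMD"

# To run it for real, with the quoting preserved:
# httrack "$@"
```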
 