HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: how to not redownload existing files
Author: William Roeder
Date: 07/30/2011 16:13
 
> on-line version.  The site is too big to crawl in
> one go so I have to cancel.  I do let the batch
> complete though.
No, you don't. Pause it, minimize it, hibernate if necessary, and resume later.
Don't kill the program, don't close it, don't cancel.

> The problem is that PDF files are downloaded twice. 
> They are named MYPDF-2.PDF and I effectively have
Because the file is referenced via two different URLs. HTTrack has no way of
knowing whether they are the same file (e.g. images can differ even with the
same size and timestamp). Try Options -> Spider -> Join Similar URLs.
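
To illustrate why two URLs for one PDF end up as two local files, and roughly what joining similar URLs buys you, here is a minimal Python sketch. It is not HTTrack's code. The two rules in it (dropping a leading "www." and collapsing repeated slashes) and the names normalize and plan_downloads are assumptions made up for this example.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize(url: str) -> str:
    """Hypothetical rule set: map 'similar' URLs onto one comparison key."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # assumed rule: www.foo.com == foo.com
    path = re.sub(r"/{2,}", "/", parts.path)  # assumed rule: collapse '//'
    return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))


def plan_downloads(urls):
    """Keep only the first URL seen for each normalized key."""
    seen = set()
    todo = []
    for url in urls:
        key = normalize(url)
        if key not in seen:
            seen.add(key)
            todo.append(url)
    return todo


links = [
    "http://www.example.com//docs/MYPDF.PDF",
    "http://example.com/docs/MYPDF.PDF",
]
# Without normalization these are two distinct URLs, hence two downloads.
# With it, only the first one is kept.
print(plan_downloads(links))
```

URLs that differ in any other way (another directory, another query string) still count as different files in this sketch, so a rule like this will not catch every duplicate.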

> Is there a way for winHTTrack to say "if the file
> exists locally, don't download" it does not need to
> check the server date and time, just ignore it?
Nope.
 

