HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: .tmp files and the problem of continuing a mirror
Author: Jim
Date: 04/07/2012 13:17
 
You need a newer version of httrack; version 3.45.3 works for this.

Un-install any old versions of httrack.

Get the source and unpack it, cd into the folder, then run ./configure and
then make. If you don't know how to do that, Google it; it's a typical Linux
compile. You will need developer tools such as "gcc" installed.

Now run: sudo make install
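
For anyone who wants the build steps above in one place, here is a rough sketch. The helper names (run, need_tools, build_httrack), the DRY_RUN switch and the folder name in the example are made up for illustration and are not part of httrack. With DRY_RUN=1 it only prints the commands, so you can check them before running anything for real.

```shell
#!/bin/sh
# Sketch of the build steps. All helper names here are hypothetical.

# run CMD...: execute CMD, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# need_tools TOOL...: fail if any of the named tools is not on PATH.
need_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || {
            echo "missing build tool: $tool" >&2
            return 1
        }
    done
}

# build_httrack SRC_DIR: configure, compile and install from the
# unpacked source folder SRC_DIR. Runs in a subshell so the caller's
# working directory is left alone.
build_httrack() {
    (
        cd "$1" || exit 1
        run ./configure &&
        run make &&
        run sudo make install
    )
}

# Example (the folder name is an assumption; use wherever you unpacked it):
#   need_tools gcc make && DRY_RUN=1 build_httrack ./httrack-3.45.3
```

Drop DRY_RUN=1 once the printed commands look right.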

Check the version when you run it.
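
If you want the version check scripted, something like the sketch below should do. The function names are made up, it relies on "sort -V" from GNU coreutils, and the commented example assumes httrack prints its version when given --version, so check that on your own install.

```shell
#!/bin/sh
# version_ge A B: succeed when version A is at least version B.
# Uses "sort -V" (version sort) and checks that B sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# extract_version TEXT: print the first dotted version number in TEXT.
extract_version() {
    printf '%s\n' "$1" | grep -Eo '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Example (assumes httrack is on PATH and answers --version):
#   installed=$(extract_version "$(httrack --version 2>&1)")
#   if version_ge "$installed" 3.45.3; then
#       echo "new enough: $installed"
#   else
#       echo "too old: $installed"
#   fi
```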

Run it from the same directory as before, but add --continue. If it's a large
site, also raise the limit on the number of URLs, as in the line below:

httrack --continue -#L1000000 your.site.com

Let it run for a while. It will seem to skip a lot of URLs; those are the
ones it already got.

You will need to work out when it has started going round and round,
re-downloading pages it already has. Either figure that out or just let it
run longer than you think it should.

Now, when you hit ctrl-C it will say it's finishing up! Let it do that and
then you should be in pretty good shape.

cd into the folder where the "NAME.html.tmp" files are; you can then delete
them with the lines below. BE CAREFUL, this will delete like crazy!!!! Run the
ls commands FIRST and check that each one picks up the right files.

Of course you don't want to delete the .html files!

ls -1 *html.tmp | less        # LOOK AND SEE IF IT'S OK!!!!
ls -1 *-2.html | less         # THESE ARE THE FILES IT WILL DELETE!!!

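find . -maxdepth 1 -name '*html.tmp' -print0 | xargs -0 -n 10 rm -f --
find . -maxdepth 1 -name '*-2.html' -print0 | xargs -0 -n 10 rm -f --
find . -maxdepth 1 -name '*-3.html' -print0 | xargs -0 -n 10 rm -f --
find . -maxdepth 1 -name '*-4.html' -print0 | xargs -0 -n 10 rm -f --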
 (and so on...)
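
Rather than repeating the line for every number, the cleanup can be wrapped up roughly as below. This is only a sketch: clean_mirror is a made-up name, and it assumes the duplicates are numbered from -2 up to -99 as in the lines above. It will also match a real page that happens to be called something like chapter-2.html, so read the listing before deleting. By default it only lists; nothing is removed until you pass --delete.

```shell
#!/bin/sh
# clean_mirror DIR [--delete]
# Lists (default) or deletes leftover *html.tmp files and numbered
# duplicates such as page-2.html ... page-99.html directly inside DIR.
# Subfolders are not touched; run it once per folder.
clean_mirror() {
    dir=$1
    action=${2:-}
    # Build the find arguments once so listing and deleting match
    # exactly the same set of files.
    set -- "$dir" -maxdepth 1 -type f \( \
        -name '*html.tmp' -o \
        -name '*-[2-9].html' -o \
        -name '*-[1-9][0-9].html' \)
    if [ "$action" = "--delete" ]; then
        find "$@" -exec rm -f -- {} +
    else
        find "$@" -print
    fi
}

# Example:
#   clean_mirror . | less        # look first
#   clean_mirror . --delete      # then delete
```

Using find instead of ls also copes with file names that contain spaces and with folders holding more files than the shell can expand on one command line.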

IF YOU DON'T KNOW SHELL COMMANDS GO READ ON GOOGLE!!!

That's it, everything should be wonderful.

 

