Hi,
I am running HTTrack version 3.46.
I had a huge memory problem on this run and had to kill it because
nothing else could be done on our machine.
****************************************************
The question is: how did this memory get so inflated?
****************************************************
 PID PR NI  VIRT  RES SHR S %CPU %MEM   TIME+ COMMAND
5232 20  0 7690m 6.6g 608 S  0.0 21.2 0:10.39 httrack
The log is attached at the end.
The run wouldn't have done much, as the site now redirects and I don't allow
straying; that I understand, no problem there.
The problem is the memory consumed. I am trying to run several httrack
processes at the same time; the other processes were consuming around 3g of
memory. 5232 was the winner though!
Do I need to be worried that a string of -iC2 options gets added at the end of
the command line when starting the update?
The aim was to extract as many html and text files as possible, which don't
necessarily have any extensions; also xml files, as rss feeds can be in them
and they have a lot of links. I wanted to exclude everything else.
As I understand from the manual, using mime types in options is slower than
simple name filters. Is this correct?
Is the -#L1000000000 setting (a limit of 1,000,000,000 links) an issue here?
I do want to go through a lot of links; some of the sites I am interested in
are pretty big. A limit of 1,000,000 links was too low for one of them
("www.bbc.co.uk/news/"), and I want as complete a set as possible. I
deliberately overshot so as not to have to do this again: if I am correct, I
cannot change options when running "-update", and I don't want to have to
start from scratch again.
If the link number is the problem, is there some other way to run httrack that
will still cover a very large site without disabling my server?
I am not sure from the manual and Fred's guide how to use the -cN option
to my advantage here. Could it help?
I will be very grateful for your help.
Let me know if you need any more information.
Krys
Here is the entire log file for 5232:
HTTrack3.46+libhtsjava.so.2 launched on Sat, 13 Apr 2013 06:40:13 at
<http://business.financialpost.com/> -*?print=* -*?page=* -*.mp3 -*.mp4 -*.wav
-*.avi -*.dvi -*.mpg -*.mpeg -*.mov -*.bmp -*.css -*.sxml -*.xlsx -*.xls
-*.doc -*.tar -*.zip -*.swf -*.stm -*.js -*.gif -*.jpg -*.jpeg -*.png -*.pdf
(/usr/local/bin/httrack <http://business.financialpost.com/> -X0 -A100000
-#L1000000000 -z -v -O
/home/krysb/httrack/round_robin/business.financialpost.com -*?print=*
-*?page=* -*.mp3 -*.mp4 -*.wav -*.avi -*.dvi -*.mpg -*.mpeg -*.mov -*.bmp
-*.css -*.sxml -*.xlsx -*.xls -*.doc -*.tar -*.zip -*.swf -*.stm -*.js -*.gif
-*.jpg -*.jpeg -*.png -*.pdf -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2
-iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2
-iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 -iC2 )
Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive
information,
such as username/password authentication for websites mirrored in this
project
do not share these files/folders if you want these information to
remain private
Mirror launched on Sat, 13 Apr 2013 06:40:13 by HTTrack Website
Copier/3.46+libhtsjava.so.2 [XR&CO'2010]
mirroring <http://business.financialpost.com/> -*?print=* -*?page=* -*.mp3
-*.mp4 -*.wav -*.avi -*.dvi -*.mpg -*.mpeg -*.mov -*.bmp -*.css -*.sxml
-*.xlsx -*.xls -*.doc -*.tar -*.zip -*.swf -*.stm -*.js -*.gif -*.jpg -*.jpeg
-*.png -*.pdf with the wizard help..
06:40:13 Info: engine: init
07:20:54 Debug: Cache: enabled=2, base=hts-cache/, ro=0
07:20:54 Debug: Cache: rename hts-cache/new.zip -> hts-cache/old.zip
(0x7f971a5923b4 0x7f971a5b03b4)
07:20:54 Debug: Cache: successfully renamed
07:20:54 Debug: Cache: size 1537
07:20:54 Debug: Cache index loaded: 2 entries loaded
07:20:55 Info: engine: start
07:20:55 Info: engine: check-html: primary/primary
07:20:55 Info: engine: preprocess-html: primary/primary
07:20:55 Info: engine: save-name: local name:
business.financialpost.com/index.html ->
business.financialpost.com/index.html
Exit requested to engine (signal 15)
End of log file for 5232.