HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: httrack always downloads external pages... ?
Author: Richard
Date: 03/05/2001 17:14
 
Hi !

> So if you want files from your original website only,
> you must express all allowing filters (+) as
> www.downloadthissite.com*html or any other filetype
> you want. If you want all, you can just
> +www.downloadthissite.com*

As I already wrote in the posting you replied to, I can't update the filters
for each page, because I use a feature that WinHTTrack (or httrack) supports:
a file that WinHTTrack reads, containing a lot of URLs from which I have to
download specified files automatically.

So using filters to download files, selected by size and type, from a
list of URLs and subdirectories on the same server (and from nowhere else),
given in "URL list (.txt):", is impossible?
Do I really have to set a filter containing -all- file types to download, for
-each- URL I specify in my URL list file?!?
I tried it. 70 URLs. If I enter them in the URL window, everything works fine
(except for downloading the damn :-) external pages).

If I add the 75 necessary filters, WinHTTrack crashes with:

(sorry, I get the message in German; I'll try to translate it into English :-) ):

WINHTTRACK causes an error in WINHTTRACK.EXE 
WINHTTRACK is being closed now.
If the problem occurs again, restart your computer.
Details:
WINHTTRACK causes a page-error in module WINHTTRACK.EXE at 016f:0040ec30.
Registers:
EAX=7261702f CS=016f EIP=0040ec30 EFLGS=00010212
EBX=0000025c SS=0177 ESP=01eae604 EBP=004c2168
ECX=01eae5d8 DS=0177 ESI=004bb992 FS=53cf
EDX=00000295 ES=0177 EDI=004d8034 GS=0000
Bytes at CS:EIP:
8b 3c 90 85 ff 74 40 8b f5 8d 47 04 8a 10 8a 1e 
Stack Values:
004d8034 004bb992 01ebff64 0000025c 00438fc9 7261702f 00000400 004c2168
01eae628 00000000 00443de6 004c2168 004c2174 004c2180 0075ef2c 015c6f20 
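
One detail in the dump can be checked by hand: the first bytes at CS:EIP (8b 3c 90) decode to a read from [eax+edx*4], so an invalid EAX would fit a page fault at that address. The small sketch below reads the EAX value as four little-endian bytes of text. The helper name register_as_ascii is made up for this illustration, and the interpretation (a pointer overwritten by string data, perhaps part of a URL or filter) is only a guess, not a diagnosis.

```python
import struct


def register_as_ascii(value):
    """Show a 32-bit register value as four little-endian bytes of text.

    Non-printable bytes are shown as '.'.
    """
    raw = struct.pack("<I", value)  # little-endian, as stored in memory on x86
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# EAX as reported in the crash dump above
print(register_as_ascii(0x7261702F))  # prints "/par"
```

A register that should hold a pointer but reads as printable text may mean a buffer was overrun somewhere before the crash, but confirming that would need a debugger and the matching source.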

That's a pity, because httrack looked like just what I was searching for... but
this renders the URL list text file import rather useless...

Btw., I thought that when I set the "global travel mode" under "experts only"
to "stay on the same domain" or similar, no pages/files from other domains
would be downloaded? But that doesn't seem to be the case.
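
To make the expectation concrete, here is a minimal sketch of the check I would expect such an option to perform. It is only an illustration of how the option name reads, not HTTrack's actual logic; the function name same_host and the example URLs are invented, and comparing full host names is a simplification of "same domain".

```python
from urllib.parse import urlsplit


def same_host(start_url, link_url):
    """Return True if both URLs name the same host (case-insensitive).

    URLs without a scheme have no parsed host and are treated as non-matching.
    """
    start = urlsplit(start_url).hostname  # hostname is already lower-cased
    link = urlsplit(link_url).hostname
    return start is not None and start == link


# Hypothetical example URLs
print(same_host("http://www.example.com/pics/", "http://WWW.EXAMPLE.com/a.jpg"))
print(same_host("http://www.example.com/pics/", "http://ads.example.net/x.gif"))
```

Under this reading, a link to another host would never be followed, which is not what I observe.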

Hm. Wouldn't it be possible to implement something like variables in the
filters, containing the current URL from the file list, so that one could
filter something like

-*
+[current_url_from_filelist]/*.htm?
+[current_url_from_filelist]/*.jpg[>10]

(I looked at httrack.c to patch it myself, but without understanding a word of
French, and thus without being able to read any comments, it's not that
easy :-)) )...
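
Until something like that exists, the same effect could be approximated outside of HTTrack. The sketch below expands the placeholder for every URL in the list and produces one concrete filter per URL and template. The placeholder [current_url_from_filelist] is only the name proposed above and is not understood by HTTrack; the function name, the example URLs and the template are assumptions for illustration. The scheme is dropped because the filters quoted earlier in this thread are written that way (+www.downloadthissite.com*).

```python
from urllib.parse import urlsplit

# Hypothetical placeholder from the proposal above; HTTrack does not know it.
PLACEHOLDER = "[current_url_from_filelist]"


def expand_filters(urls, templates):
    """Return '-*' followed by one concrete filter per (URL, template) pair."""
    rules = ["-*"]  # exclude everything first, then allow per URL
    for url in urls:
        parts = urlsplit(url.strip())
        # Drop the scheme and any trailing slash: host + path only
        base = (parts.netloc + parts.path).rstrip("/")
        if not base:
            continue  # skip blank lines in the URL list
        for template in templates:
            rules.append(template.replace(PLACEHOLDER, base))
    return rules


if __name__ == "__main__":
    # Hypothetical URL list and template
    url_list = ["http://www.example.com/pics/", "www.example.org/docs"]
    templates = ["+[current_url_from_filelist]/*.jpg[>10]"]
    print("\n".join(expand_filters(url_list, templates)))
```

This still produces one filter per URL, so it does not get around the crash described above when the resulting list becomes long; it only saves typing.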


Thanks, bye, Richard :-)
 

