HTTrack Website Copier
Free software offline browser - FORUM
Subject: Problematic site
Author: Juan Fco Rodriguez
Date: 06/14/2005 10:17
 
Hello,

I'm experiencing problems with updates when I execute
httrack with the following command line options:

httrack -iC2 -q -k -%s -%u -%k -v -%v0 -u1 -o0 -X0 -%p -I0 -a -%P -c4 -%c10 -N %h%p/%n.%t%k -R2 -T20 -M3221225472 -r8 -A100000000 http://www.cccyl.es/

The first time, it downloads a lot of PDF documents under
www.cccyl.es/downloads... but when I try to update the site,
it doesn't download them and there is no trace of them in
the new.txt file.
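
One way to confirm this is to compare the PDF entries logged by the first run with those logged by the update. The following is only a rough Python sketch. It assumes a copy of new.txt was saved after each run, and it assumes nothing about the column layout of new.txt: any whitespace-delimited token ending in .pdf is treated as a PDF entry. The function names and the file names in the usage comment are hypothetical.

```python
import re

# Any run of non-space characters ending in ".pdf" (case-insensitive).
# This is a format-agnostic guess, not a parser for the real new.txt layout.
PDF_TOKEN = re.compile(r"\S+\.pdf\b", re.IGNORECASE)


def pdf_tokens(log_text):
    """Return the set of tokens in the log text that end in .pdf."""
    return set(PDF_TOKEN.findall(log_text))


def missing_after_update(first_log, update_log):
    """Return PDF tokens present in the first log but absent from the update log."""
    return sorted(pdf_tokens(first_log) - pdf_tokens(update_log))


# Usage (hypothetical file names, copies of new.txt saved after each run):
#   with open("new-first.txt") as f1, open("new-update.txt") as f2:
#       for entry in missing_after_update(f1.read(), f2.read()):
#           print(entry)
```

If the list is non-empty, the update pass did not log those PDFs at all, which matches the symptom described above.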

I've been digging into the site, and it implements some kind of
"antileech" mechanism. It seems necessary to follow
two 302 redirects and then issue a HEAD request to
be able to download the PDFs... (I'm not sure of this;
I found it out after a lot of tests with the -%H option
enabled).
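
To double-check the redirect chain independently of HTTrack, the hops can be followed by hand. The sketch below is a minimal Python illustration, not how HTTrack handles redirects internally: it sends HEAD requests with the standard http.client module, does not follow redirects automatically, and records each status code. The helper names (head, trace_redirects) and the PDF path in the usage comment are hypothetical.

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head(url):
    """Send one HEAD request without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=20)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=20)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=head, max_hops=5):
    """Follow redirects manually and return the list of (url, status) hops."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # the Location header may be relative
    return hops


# Usage (the PDF path is a made-up placeholder):
#   for url, status in trace_redirects("http://www.cccyl.es/downloads/example.pdf"):
#       print(status, url)
```

If the observation above is right, the output should show two 302 lines followed by the final status. The fetch parameter exists only so the chain logic can be exercised without network access.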

The saddest thing about all this is that if I execute
httrack -iC2 http://www.cccyl.es, it seems to work perfectly!?
What am I doing wrong? I think there is some kind of
bug in the way 302 responses are handled by HTTrack's
cache, or something similar... Please help me, I'm desperate! :)

 