HTTrack Website Copier
Free software offline browser - FORUM
Subject: Changes in alpha-8 => problems w/redirect
Author: Lars Clausen
Date: 02/05/2004 10:46
 
I have been looking at the alpha-8 version, and something
seems to have changed that causes problems.  In particular,
the body of a redirect is now collected rather than just the
headers, which threw my arc writer for a loop.  More seriously,
something seems to go wrong in working out which URL is the
original and which is the redirect target: it downloads the
redirect body but refuses to fetch the target body, and so
finds no further links.  Below is the log of an attempt to
mirror www.tv2.dk, which stops after the first page.

The same problem seems to have been present in alpha-7.

There were also some installation problems: the
share/httrack/html link setup broke when installing on top
of old installations (I think three installations are
required for this to happen).  That got cleared up by running
make uninstall and removing the share dirs.

-Lars

HTTrack3.31-ALPHA-8-noV6-nossl launched on Thu, 05 Feb 2004
10:44:02 at www.tv2.dk
(httrack -%W receive-header=httrack-arc:get_header -%W
transfer-status=httrack-arc:dump_chunk -F "HTTrack 3.30.91
(non-archiving test version, see
www.netarkivet.dk/website/info.html)" -n -Z -d -A10000
-#L10000000 www.tv2.dk )

Information, Warnings and Errors reported for this mirror:
note:   the hts-log.txt file, and hts-cache folder, may
contain sensitive information,
        such as username/password authentication for
websites mirrored in this project
        do not share these files/folders if you want these
information to remain private

10:44:02        Info:   engine: init
10:44:02        Info:   engine: start
10:44:02        Debug:  Wait get: primary/primary
10:44:02        Info:   engine: check-html: primary/primary
10:44:02        Debug:  scan file..
10:44:02        Debug:  link detected in html: <http://www.tv2.dk>
10:44:02        Debug:  position link check <http://www.tv2.dk>
10:44:02        Debug:  build relative link
<http://www.tv2.dk> with primary/primary
10:44:02        Debug:  built relative link
<http://www.tv2.dk> with primary/primary -> www.tv2.dk/
10:44:02        Debug:  wizard link test at www.tv2.dk/..
10:44:02        Debug:  wizard test begins: www.tv2.dk/
10:44:02        Debug:  Compare addresses: www.tv2.dk!=primary
10:44:02        Debug:  result for wizard link test: 0
10:44:02        Info:   engine: save-name: local name:
www.tv2.dk/index.html -> www.tv2.dk/index.html
10:44:02        Debug:  Record: www.tv2.dk/ ->
www.tv2.dk/index.html
10:44:02        Debug:  relative link at www.tv2.dk build
with www.tv2.dk/index.html and index.html: www.tv2.dk/index.html
10:44:02        Debug:  robots.txt added at www.tv2.dk
10:44:02        Debug:  OK, NOTE: www.tv2.dk/ ->
www.tv2.dk/index.html
10:44:02        Debug:  Wait get: www.tv2.dk/robots.txt
10:44:03        Info:   engine: transfer-status: link error
(301, 'Moved Permanently'): www.tv2.dk/robots.txt
10:44:03        Debug:  File checked by cache: www.tv2.dk
10:44:03        Warning:        Moved Permanently for
www.tv2.dk/robots.txt
10:44:03        Debug:  wizard link test for moved file at
tv2.dk/robots.txt..
10:44:03        Debug:  wizard test begins: tv2.dk/robots.txt
10:44:03        Debug:  moved link accepted: tv2.dk/robots.txt
10:44:03        Warning:        Warning moved treated for
www.tv2.dk/robots.txt (real one is tv2.dk/robots.txt)
10:44:03        Info:   engine: save-name: local name:
tv2.dk/robots.txt -> tv2.dk/robots.txt
10:44:03        Debug:  Wait get: www.tv2.dk/
10:44:03        Info:   engine: transfer-status: link error
(301, 'Moved Permanently'): www.tv2.dk/
10:44:03        Debug:  File checked by cache: www.tv2.dk
10:44:03        Warning:        Moved Permanently for
www.tv2.dk/
10:44:03        Warning:        File has moved from
www.tv2.dk/ to <http://tv2.dk/>
10:44:03        Info:   engine: check-html: www.tv2.dk/
10:44:03        Debug:  scan file..
10:44:03        Debug:  link detected in html: <http://tv2.dk/>
10:44:03        Debug:  position link check <http://tv2.dk/>
10:44:03        Debug:  build relative link <http://tv2.dk/>
with www.tv2.dk/
10:44:03        Debug:  built relative link <http://tv2.dk/>
with www.tv2.dk/ -> tv2.dk/
10:44:03        Debug:  wizard link test at tv2.dk/..
10:44:03        Debug:  wizard test begins: tv2.dk/
10:44:03        Debug:  result for wizard link test: 0
10:44:03        Debug:  Record: tv2.dk/ -> www.tv2.dk/index.html
10:44:03        Debug:  relative link at tv2.dk build with
www.tv2.dk/index.html and www.tv2.dk/index.html: index.html
10:44:03        Debug:  merging similar links tv2.dk/ and
www.tv2.dk/
10:44:03        Debug:  link has already been recorded,
cancelled: www.tv2.dk/index.html
10:44:03        Debug:  link detected in html: <http://tv2.dk/>
10:44:03        Debug:  position link check <http://tv2.dk/>
10:44:03        Debug:  build relative link <http://tv2.dk/>
with www.tv2.dk/
10:44:03        Debug:  built relative link <http://tv2.dk/>
with www.tv2.dk/ -> tv2.dk/
10:44:03        Debug:  wizard link test at tv2.dk/..
10:44:03        Debug:  wizard test begins: tv2.dk/
10:44:03        Debug:  result for wizard link test: 0
10:44:03        Debug:  Record: tv2.dk/ -> www.tv2.dk/index.html
10:44:03        Debug:  relative link at tv2.dk build with
www.tv2.dk/index.html and www.tv2.dk/index.html: index.html
10:44:03        Debug:  merging similar links tv2.dk/ and
www.tv2.dk/
10:44:03        Debug:  link has already been recorded,
cancelled: www.tv2.dk/index.html
10:44:03        Debug:  Wait get: tv2.dk/robots.txt
10:44:03        Debug:  (Keep-Alive): successfully preserved
#2 (tv2.dk)
10:44:03        Info:   engine: transfer-status: link error
(404, 'Not Found'): tv2.dk/robots.txt
10:44:03        Debug:  File checked by cache: tv2.dk
10:44:03        Info:   No robots.txt rules at tv2.dk
10:44:03        Info:   No data seems to have been
transfered during this session! : restoring previous one!
10:44:03        Info:   engine: end
10:44:03        Info:   engine: free
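
One possible reading of the log above, sketched in Python. This is not HTTrack's code: the normalization rule, the LinkTable class and its record() method are all hypothetical names invented for illustration. The sketch only assumes what the "merging similar links tv2.dk/ and www.tv2.dk/" and "link has already been recorded, cancelled" lines suggest: that the www and non-www hosts are treated as the same link, so the redirect target looks like a duplicate of the page that redirected to it and is never fetched.

```python
def normalize_link(link: str) -> str:
    """Assumed merge rule: drop a leading 'www.' so both hosts compare equal."""
    host, _, path = link.partition("/")
    if host.startswith("www."):
        host = host[len("www."):]
    return f"{host}/{path}"


class LinkTable:
    """Hypothetical stand-in for the engine's table of recorded links."""

    def __init__(self) -> None:
        self.recorded = {}  # normalized link -> local save name

    def record(self, link: str, save_name: str) -> bool:
        key = normalize_link(link)
        if key in self.recorded:
            # Corresponds to "link has already been recorded, cancelled".
            return False
        self.recorded[key] = save_name
        return True


table = LinkTable()
# The start URL is recorded first, then answers with a 301 to tv2.dk/.
original_queued = table.record("www.tv2.dk/", "www.tv2.dk/index.html")
# The redirect target merges with the original, so it is never queued.
target_queued = table.record("tv2.dk/", "www.tv2.dk/index.html")
print(original_queued, target_queued)
```

If this reading is right, the target would need to replace the original entry (or be exempt from merging when it comes from a redirect) for the mirror to get past the first page.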

 