Hi! I'm trying to start a mirror from a local file. I've
looked through the forum and am still having trouble.
1. The URL (file://C:\foo.html) works when I cut & paste it
into IE
2. all the file's links are external
3. I've turned on external spidering: max external depth set
to 2, global travel mode = "go everywhere on web", and "can go
both up and down", to be all-inclusive
4. turned off robots.txt rules
5. turned on debug
6. I'm using WinHTTrack 3.30. Any known problems with file:
on that version?
7. Please add a new item to the FAQ about downloading local
files!
TIA!
-y
Here's foo.html (first few lines):
<html><head><title>zookeeper playlists as of feb 13 2004</title></head>
<body>
<a href=http://zookeeper.stanford.edu/index.php3?seq=selList&action=viewDJ&playlist=4172>2004-02-10 * Meat Man Whistle Punching Sex Context</a><BR>
<a href=http://zookeeper.stanford.edu/index.php3?seq=selList&action=viewDJ&playlist=4120>2004-02-03 * MMMMeeeet mmmmaaaaNNN wisssle pbpbpunchiinnnng SEXXX qontest</a><br>
<a href=http://zookeeper.stanford.edu/index.php3?seq=selList&action=viewDJ&playlist=4065>2004-01-27 * Meat Man Whistle Punching Sex Contest</a><br>
<a href=http://zookeeper.stanford.edu/index.php3?seq=selList&action=viewDJ&playlist=4003>2004-01-20 * Meat Man Whistle Punching Sex Contest</a><br>
....
Here's the debug output:
HTTrack3.30+swf launched on Fri, 13 Feb 2004 10:16:42 at <file://foo.html> +*.*
(winhttrack -qwr2%e2C2%P0ns0u1Z%sN0%Ip3BeK0H0%kf2%f#f -F "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by HTTrack Website Copier/3.x [XR&CO'2003], %s -->" -%l "en, *" <file://foo.html> -O C:\tmp\X,C:\tmp\X +*.* -%A php3,php,php2,asp,jsp,pl,cfm,nsf=text/html )
Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive information,
such as username/password authentication for websites mirrored in this project
do not share these files/folders if you want these information to remain private
10:16:42 Info: engine: init
10:16:42 Info: engine: start
10:16:42 Debug: Wait get: primary/primary
10:16:42 Info: engine: check-html: primary/primary
10:16:42 Debug: scan file..
10:16:42 Debug: indexing file..done
10:16:42 Debug: link detected in html: <file://foo.html>
10:16:42 Debug: position link check <file://foo.html>
10:16:42 Debug: build relative link <file://foo.html> with primary/primary
10:16:42 Debug: built relative link <file://foo.html> with primary/primary -> <file:////foo.html>
10:16:42 Debug: wizard link test at <file:////foo.html>..
10:16:42 Debug: wizard test begins: <file:////foo.html>
10:16:42 Debug: Compare addresses: <file://!=primary>
10:16:42 Debug: result for wizard link test: 0
10:16:42 Info: engine: save-name: local name: //foo.html -> localhost_/foo.html
10:16:42 Debug: Record: <file:////foo.html> -> C:/tmp/X/localhost_/foo.html
10:16:42 Debug: relative link at file:// build with C:/tmp/X/localhost_/foo.html and C:/tmp/X/index.html: localhost_/foo.html
10:16:42 Debug: OK, NOTE: <file:////foo.html> -> C:/tmp/X/localhost_/foo.html
10:16:42 Debug: Wait get: <file:////foo.html>
10:16:42 Debug: link #1 is ready, no more on the stack, skipping: <file:///foo.html>..
10:16:42 Info: No data seems to have been transfered during this session! : restoring previous one!
10:16:42 Info: engine: end
10:16:42 Info: engine: free
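
One thing that stands out: the URL in item 1 mixes the file: scheme with a Windows backslash path, and the log above shows file://foo.html with no drive letter. For comparison, here is a minimal sketch of how the conventional file URL for that path can be derived. It assumes Python is available (not part of my setup above), and the helper name windows_path_to_file_url is made up for illustration. Whether WinHTTrack 3.30 accepts the resulting form is not confirmed here; it only shows what a well-formed file URL for C:\foo.html looks like.

```python
from pathlib import PureWindowsPath


def windows_path_to_file_url(path: str) -> str:
    """Return the file: URL for an absolute Windows path.

    Hypothetical helper for illustration; it is not part of HTTrack.
    """
    # PureWindowsPath parses Windows syntax on any OS.
    # as_uri() raises ValueError if the path is not absolute.
    return PureWindowsPath(path).as_uri()


# The path from item 1 of the list above.
print(windows_path_to_file_url(r"C:\foo.html"))  # prints file:///C:/foo.html
```

Note the three slashes after file: (an empty host part), followed by the drive letter and forward slashes.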