Change user agent identity under "Browser ID" tab
=========================================================
Using the default options, the User Agent ID of HTTrack is
"Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
as noted in your log file above.
Changing it to
"Mozilla/4.5 (compatible; MSIE 4.01; Windows NT)"
worked for me, with all other settings at their defaults. See log snippet:
=========================================================
HTTrack3.43-7+htsswf+htsjava launched on Thu, 19 Nov 2009 12:10:14 at
<http://www.w3schools.com/css/> +*.png +*.gif +*.jpg +*.css +*.js
-ad.doubleclick.net/* -mime:application/foobar
(winhttrack -qwC2%Ps2u1%s%uN0%I0p3DaK0H0%kf2A25000%f#f -F "Mozilla/4.5
(compatible; MSIE 4.01; Windows NT)" -%F "<!-- Mirrored from %s%s by HTTrack
Website Copier/3.x [XR&CO'2008], %s -->" -%l "en, en, *"
<http://www.w3schools.com/css/> -O1 "C:\My Web Sites\W3S CSS" +*.png +*.gif
+*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar )
[...]
12:10:15 Info: Note: due to www.w3schools.com remote robots.txt rules, links
begining with these path will be forbidden: /quiztest, /banners, /images,
/ado/demo_db_edit.asp, /html/tryit.asp, /css/tryit.asp, /dom/tryit.asp,
/js/tryit.asp, /dhtml/tryit.asp, /jsref/tryit.asp, *.aspx$, *.php$ (see in the
options to disable this)
[...]
HTTrack Website Copier/3.43-7 mirror complete in 3 minutes 56 seconds : 362
links scanned, 362 files written (6146961 bytes overall) [6224304 bytes
received at 26374 bytes/sec], 32441 bytes transfered using HTTP compression in
1 files, ratio 33%
(No errors, 0 warnings, 2 messages)
=========================================================
Note the robots.txt warning in the log. Now you have somewhere to start. If
you don't get everything you want or need from the mirror, the next step is to
disable following the robots.txt rules under Options -> Spider.
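
If you prefer the command line, the two changes above can be sketched roughly as follows. This is only an illustration, not something I ran: the -F flag is the one shown in the log, -s0 is the command-line switch for never following robots.txt as I understand the HTTrack options, and the output directory ./w3s-css is a made-up name. The script only prints the commands so you can inspect them before running anything.

```shell
# Sketch only: builds and prints the httrack commands instead of running them.
UA='Mozilla/4.5 (compatible; MSIE 4.01; Windows NT)'
URL='http://www.w3schools.com/css/'
OUT='./w3s-css'   # hypothetical output directory, pick your own

# -F sets the user agent string sent to the server (same flag as in the log).
CMD_UA="httrack \"$URL\" -O \"$OUT\" -F \"$UA\""

# -s0 asks HTTrack never to follow robots.txt rules
# (the log's option string shows s2, the default).
CMD_NOROBOTS="$CMD_UA -s0"

echo "$CMD_UA"
echo "$CMD_NOROBOTS"
```

Try the first command on its own first, and only add -s0 if the mirror is still missing files you need.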
In case you haven't already, you might want to check out my reply to your
other question:
<http://forum.httrack.com/readmsg/22308/22220/index.html>
and in response to that, I also posted
"The same or similar info can be found at
<http://www.httrack.com/html/filters.html> FYI"