HTTrack Website Copier
Free software offline browser - FORUM
Subject: Change user agent identity under "Browser ID" tab
Author: Bandit
Date: 11/19/2009 18:30
 
Change user agent identity under "Browser ID" tab
=========================================================

Using the default options, the User Agent ID of HTTrack is
"Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
as noted in your log file above.

Changing it to
"Mozilla/4.5 (compatible; MSIE 4.01; Windows NT)"
worked for me, with all other settings at their defaults.  See log snippet:
=========================================================

HTTrack3.43-7+htsswf+htsjava launched on Thu, 19 Nov 2009 12:10:14 at
<http://www.w3schools.com/css/> +*.png +*.gif +*.jpg +*.css +*.js
-ad.doubleclick.net/* -mime:application/foobar

(winhttrack -qwC2%Ps2u1%s%uN0%I0p3DaK0H0%kf2A25000%f#f -F "Mozilla/4.5
(compatible; MSIE 4.01; Windows NT)" -%F "<!-- Mirrored from %s%s by HTTrack
Website Copier/3.x [XR&CO'2008], %s -->" -%l "en, en, *"
<http://www.w3schools.com/css/> -O1 "C:\My Web Sites\W3S CSS" +*.png +*.gif
+*.jpg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar )
[...]
12:10:15 Info:  Note: due to www.w3schools.com remote robots.txt rules, links
begining with these path will be forbidden: /quiztest, /banners, /images,
/ado/demo_db_edit.asp, /html/tryit.asp, /css/tryit.asp, /dom/tryit.asp,
/js/tryit.asp, /dhtml/tryit.asp, /jsref/tryit.asp, *.aspx$, *.php$ (see in the
options to disable this)
[...]
HTTrack Website Copier/3.43-7 mirror complete in 3 minutes 56 seconds : 362
links scanned, 362 files written (6146961 bytes overall) [6224304 bytes
received at 26374 bytes/sec], 32441 bytes transfered using HTTP compression in
1 files, ratio 33%
(No errors, 0 warnings, 2 messages)
=========================================================
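
If you want to check whether a server reacts to the browser identity before re-running a whole mirror, a small probe along these lines may help. This is only a sketch using Python's standard urllib, not anything HTTrack itself does; the helper names build_request and probe are made up for illustration, and the two identity strings are the ones quoted above.

```python
import urllib.error
import urllib.request

# The two identities discussed in this post.
DEFAULT_UA = "Mozilla/4.5 (compatible; HTTrack 3.0x; Windows 98)"
MSIE_UA = "Mozilla/4.5 (compatible; MSIE 4.01; Windows NT)"


def build_request(url, user_agent):
    """Return a GET request that identifies itself with the given User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


def probe(url, user_agent, timeout=10):
    """Fetch url once and return the HTTP status code the server answers with."""
    try:
        request = build_request(url, user_agent)
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises 4xx/5xx answers as exceptions; the code is still useful.
        return err.code
```

Calling probe("http://www.w3schools.com/css/", DEFAULT_UA) and then the same URL with MSIE_UA and comparing the two status codes gives a rough hint: if they differ, the server is probably filtering on the identity string. Identical codes do not prove the opposite, since a server may also return different content with the same status.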

Note the robots.txt warning in the log.  This gives you somewhere to start: if
the mirror still does not contain everything you want/need, the next step is to
disable following the robots.txt rules under Options -> Spider.
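
To get a feel for which links such rules rule out, here is a rough sketch using Python's standard urllib.robotparser. The robots.txt text is an assumed reconstruction from a few of the plain path rules in the log note (the real file may differ), the wildcard rules (*.aspx$, *.php$) are left out because, as far as I know, that parser only does prefix matching, and the function name blocked and the sample URLs are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumption: a partial reconstruction from the log note, not the site's real file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /quiztest
Disallow: /banners
Disallow: /images
Disallow: /css/tryit.asp
"""


def blocked(urls, robots_txt=ROBOTS_TXT, agent="*"):
    """Return the URLs that a robots.txt-respecting crawler would skip."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


# Hypothetical sample links, chosen only to exercise the rules above.
SAMPLE_URLS = [
    "http://www.w3schools.com/css/default.asp",
    "http://www.w3schools.com/images/logo.gif",
    "http://www.w3schools.com/css/tryit.asp",
]
```

Running blocked(SAMPLE_URLS, agent="HTTrack") on these samples flags the /images and /css/tryit.asp links and leaves the first one alone, which is the kind of gap (missing pictures, missing "try it" pages) you would expect in a mirror made with the robots.txt rules enabled.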

In case you haven't already, you might want to check out my reply to your
other question:
<http://forum.httrack.com/readmsg/22308/22220/index.html>
In a follow-up to that reply I also noted that the same or similar info can be
found at <http://www.httrack.com/html/filters.html>, FYI.

 