HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: httrack on Ubuntu not downloading what expecte
Author: William Roeder
Date: 01/10/2009 16:20
 
> 11:46:45	Info: 	Note: due to www.w3.org remote
> robots.txt rules, links begining with these path
> will be forbidden: ...

> 11:46:45	Warning: 	File has moved from
> www.w3.org/1999/xhtml to
> <http://www.w3.org/1999/xhtml/>

> HTTrack Website Copier/3.42-3 mirror complete in 3
> seconds : 6 links scanned, 3 files written (8919
> bytes overall), 1 files updated [3680 bytes received
> at 1226 bytes/sec]

With robots.txt rules followed I get:
HTTrack Website Copier/3.43-2 mirror complete in 8 seconds : 16 links scanned,
16 files written (291902 bytes overall) [261160 bytes received at 32645
bytes/sec], 53457 bytes transfered using HTTP compression in 1 files, ratio
34%
(No errors, 1 warnings, 93 messages)

With robots.txt rules ignored:
HTTrack Website Copier/3.43-2 mirror complete in 15 seconds : 42 links
scanned, 44 files written (462331 bytes overall) [438824 bytes received at
29254 bytes/sec], 53615 bytes transfered using HTTP compression in 1 files,
ratio 34%
(No errors, 3 warnings, 203 messages)
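
To put the two runs side by side, the summary lines can be parsed mechanically. This is only a rough Python sketch, assuming the summary wording stays as it appears in the logs above; the name parse_summary and the regular expression are made up for illustration and are not part of HTTrack.

```python
import re

# Pattern for the end-of-mirror summary line as it appears in the logs above.
# Assumption: the wording is stable between versions; this is not an official format.
SUMMARY_RE = re.compile(
    r"mirror complete in (?P<seconds>\d+) seconds\s*:\s*"
    r"(?P<links>\d+) links scanned,\s*"
    r"(?P<files>\d+) files written \((?P<bytes>\d+) bytes overall\)"
)

def parse_summary(text):
    """Return the counters from one summary line (line wraps are tolerated)."""
    flat = " ".join(text.split())  # undo the forum's line wrapping
    match = SUMMARY_RE.search(flat)
    if match is None:
        raise ValueError("no HTTrack summary found")
    return {key: int(value) for key, value in match.groupdict().items()}

with_robots = parse_summary("""HTTrack Website Copier/3.43-2 mirror complete in 8 seconds : 16 links scanned,
16 files written (291902 bytes overall) [261160 bytes received at 32645
bytes/sec]""")
without_robots = parse_summary("""HTTrack Website Copier/3.43-2 mirror complete in 15 seconds : 42 links
scanned, 44 files written (462331 bytes overall) [438824 bytes received at
29254 bytes/sec]""")

# Show how each counter changes once robots.txt rules are no longer followed.
for key in ("links", "files", "bytes"):
    print(key, with_robots[key], "->", without_robots[key])
```

Running the same function over the original poster's summary line should make it easy to see which counters differ from mine.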

Here's my winprofile.ini for comparison:
Near=1
Test=0
ParseAll=1
HTMLFirst=1
Cache=1
NoRecatch=0
Dos=0
Index=1
WordIndex=0
Log=1
RemoveTimeout=0
RemoveRateout=1
KeepAlive=1
FollowRobotsTxt=0
NoErrorPages=1
NoExternalPages=1
NoPwdInPages=0
NoQueryStrings=0
NoPurgeOldFiles=1
Cookies=1
CheckType=1
ParseJava=1
HTTP10=0
TolerantRequests=0
UpdateHack=1
URLHack=1
StoreAllInCache=0
LogType=1
UseHTTPProxyForFTP=1
Build=5
PrimaryScan=3
Travel=1
GlobalTravel=0
RewriteLinks=0
BuildString=%%h%%p/%%n%%q.%%t
Category=
MaxHtml=
MaxOther=
MaxAll=
MaxWait=
Sockets=2
Retry=9
MaxTime=
TimeOut=300
RateOut=5
UserID=Mozilla/4.0 (compatible; MSIE 5.0; Win32)
Footer=<!-- Mirrored from %%s%%s by HTTrack Website Copier/3.x [XR&CO'2004], %%s -->
MaxRate=95232
WildCardFilters=+*.css +*.js -ad.doubleclick.net/* -*microsoft.com* -*paypal.com* -*ccbill.com*%0d%0a+*.gif +*.jpg +*.png +*.tif +*.bmp -*.gif[<9] -*.jpg[<9]%0d%0a+*.mov +*.mpg +*.mpeg +*.avi +*.asf +*.mp3 +*.mp2 +*.rm +*.wav +*.vob +*.qt +*.vid +*.ac3 +*.wma +*.wmv%0d%0a+*.zip +*.tar +*.tgz +*.gz +*.rar +*.z +*.exe%0d%0a-*%3dD -*b_images/* -*.fling.com*
Proxy=
Port=
Depth=2
ExtDepth=
MaxConn=5
MaxLinks=
MIMEDefsExt1=asp,php3,php,php2,asp,jsp,pl,cfm,nsf
MIMEDefsExt2=wmv
MIMEDefsExt3=rmj
MIMEDefsExt4=
MIMEDefsExt5=
MIMEDefsExt6=
MIMEDefsExt7=
MIMEDefsExt8=
MIMEDefsMime1=text/html
MIMEDefsMime2=video/x-ms-wmv
MIMEDefsMime3=application/vnd.rn-realsystem-rmj
MIMEDefsMime4=
MIMEDefsMime5=
MIMEDefsMime6=
MIMEDefsMime7=
MIMEDefsMime8=
CurrentUrl=http://www.w3schools.com/XPath/
CurrentAction=0
CurrentURLList=
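
The WildCardFilters value is hard to read in this form, so here is a small Python sketch that splits it back into individual rules for comparison with another profile. It assumes that "%0d%0a" is an escaped CR LF separating the lines of the filter box and leaves every other escape untouched; split_filters is a made-up helper name, and the sample is only the first part of the value above.

```python
def split_filters(raw):
    """Return the filter rules as ordered (sign, pattern) pairs."""
    # Assumption: "%0d%0a" is an escaped CR LF between filter lines;
    # other %-escapes (such as %3d) are deliberately left as they are.
    text = raw.replace("%0d%0a", "\n")
    rules = []
    for token in text.split():
        if token[0] in "+-":  # "+" includes a pattern, "-" excludes it
            rules.append((token[0], token[1:]))
    return rules

# First two lines of the WildCardFilters value from the profile above.
sample = (
    "+*.css +*.js -ad.doubleclick.net/* -*microsoft.com* "
    "-*paypal.com* -*ccbill.com*%0d%0a"
    "+*.gif +*.jpg +*.png +*.tif +*.bmp -*.gif[<9] -*.jpg[<9]"
)

rules = split_filters(sample)
includes = [pattern for sign, pattern in rules if sign == "+"]
excludes = [pattern for sign, pattern in rules if sign == "-"]

print("include:", includes)
print("exclude:", excludes)
```

The rules are kept in their original order on purpose, since reordering them could change which rule wins when several match.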
 