HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Downloading one single filetype only from pages
Author: William Roeder
Date: 04/23/2009 23:12
 
> 17:26:22 Info:  Note: due to www.ted.com remote
> robots.txt rules, links begining with these path
> will be forbidden: /index.php/profiles/browse (see
> in the options to disable this)

Self-explanatory: HTTrack follows the site's robots.txt rules by default.
<http://www.ted.com/robots.txt> says:
User-agent: *
Disallow: /index.php/profiles/browse
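
To see what that rule does and doesn't block, here is a minimal sketch using Python's standard urllib.robotparser. It only illustrates how a robots.txt Disallow line is matched against URL paths; it is not how HTTrack implements it. The two rule lines are copied from above; the function name and the sample URLs are made up for the example.

```python
from urllib.robotparser import RobotFileParser

# The two rule lines quoted from www.ted.com/robots.txt in this post.
ROBOTS_TXT = """\
User-agent: *
Disallow: /index.php/profiles/browse
"""


def is_allowed(url, user_agent="*"):
    """Return True if the quoted rules let `user_agent` fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file contents as a list of lines, so no network is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Sample URLs are illustrative only.
    for url in (
        "http://www.ted.com/",
        "http://www.ted.com/index.php/profiles/browse/page2",
    ):
        print(url, "allowed" if is_allowed(url) else "forbidden")
```

Only paths starting with /index.php/profiles/browse are refused, so this Info line by itself does not stop the rest of the site from being mirrored.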

> 17:26:24 Warning:  File not parsed, looks like
> binary: www.ted.com/
> 17:26:24 Error:  "Open error when decompressing"
> (-1) at link www.ted.com/ (from primary/primary)

I've seen this on some sites that don't like the default browser ID (the
User-Agent string HTTrack sends with each request). Change it to something
else. I use:
Mozilla/4.0 (compatible; MSIE 5.0; Win32)
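
If it helps to see what "browser ID" means on the wire, here is a small sketch with Python's standard urllib.request that builds a request carrying that string as its User-Agent header. It is only an illustration under my assumption that the site is reacting to the User-Agent; the function name is made up, and nothing is actually sent unless you uncomment the last lines.

```python
from urllib.request import Request

# The browser ID string suggested in this post.
USER_AGENT = "Mozilla/4.0 (compatible; MSIE 5.0; Win32)"


def build_request(url, user_agent=USER_AGENT):
    """Build (but do not send) a request that identifies itself as `user_agent`."""
    return Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    req = build_request("http://www.ted.com/")
    # urllib stores header names capitalized, hence "User-agent" here.
    print(req.get_header("User-agent"))
    # To actually try it against the site (needs network access):
    # from urllib.request import urlopen
    # with urlopen(req) as resp:
    #     print(resp.status, resp.headers.get("Content-Encoding"))
```

In HTTrack itself the same string goes into the browser ID setting in the options, or after -F on the command line.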
 