HTTrack Website Copier
Free software offline browser - FORUM
Subject: Please check other post
Author: Bandit
Date: 11/25/2009 19:18
> Info:  Note: due to remote
> robots.txt rules, links begining with these path
> will be forbidden: /includes, /images (see in the
> options to disable this)


Please check my other post:

Referring to that, please try:
1. set the filters ("scan rules"), suggested:
-* +** 

2. change the build structure:
either %h%p/%n%[cover_id:.ID=:::].%t%[file:.:::]
or %h/%[cover_id:View.ID=:.html::]%[file::::]
(NOTE: this line:
was a copy/pasting error, sorry)

3a. set max mirroring depth to 2 (not critical)
3b. add the proper list of starting URLs
e.g. <>
up to however high a "cover_id=###" you want to test with
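
For step 3b, typing a long run of starting URLs by hand gets tedious, so here is a minimal sketch that writes them to a text file, one per line. The base URL is a placeholder (the real address is not shown in this post), and the file name start_urls.txt and the function names are just assumptions for illustration.

```python
# Hypothetical base URL: substitute the real site address.
# Only the "cover_id" parameter name comes from this thread.
BASE_URL = "http://www.example.com/download_cover.php?cover_id={}"


def build_start_urls(highest_id, lowest_id=1):
    """Return one starting URL per cover_id from lowest_id to highest_id inclusive."""
    return [BASE_URL.format(cover_id) for cover_id in range(lowest_id, highest_id + 1)]


def write_url_list(path, urls):
    """Write the URLs to a plain-text file, one URL per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(urls) + "\n")


if __name__ == "__main__":
    # 50 is an arbitrary upper bound for a first test run.
    write_url_list("start_urls.txt", build_start_urls(50))
```

The resulting lines can be pasted into the starting URL box, or the file can be handed to HTTrack as a URL list (on the command line I believe that is the -%L option, which adds every URL found in a text file).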

I don't think you need to disable robots.txt, at least not as evidenced by the
screenshot you posted.

I think the "Content-Type: application/x-download" header that the server
reports for download_cover.php is preventing HTTrack from naming the file as
a .jpg on its own. I haven't found a way to work around that yet, other than
setting up the user-defined structure.
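
To make it easier to see what the user-defined structure from step 2 would produce, here is a rough sketch that imitates the expansion in Python. This is not HTTrack's code. It assumes that %[param:before:after:empty:notfound] means "before + value + after" when the parameter has a value, the "empty" text when it is present but blank, and the "notfound" text when it is missing, and it only handles %h, %p, %n and %t. The example URLs and the "file=jpg" value are made up.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit


def expand_param(query, spec):
    """Expand one %[name:before:after:empty:notfound] field (assumed semantics)."""
    parts = spec.split(":")
    name = parts[0]
    # Pad so that a bare %[name] also works.
    before, after, empty, notfound = (parts[1:] + [""] * 4)[:4]
    if name not in query:
        return notfound
    value = query[name][0]
    if value == "":
        return empty
    return before + value + after


def expand_structure(structure, url):
    """Imitate a small subset of the user-defined structure for one URL."""
    parts = urlsplit(url)
    query = parse_qs(parts.query, keep_blank_values=True)
    directory, filename = posixpath.split(parts.path)
    name, ext = posixpath.splitext(filename)
    simple = {
        "h": parts.netloc,            # host name
        "p": directory.rstrip("/"),   # path without trailing slash
        "n": name,                    # file name without type
        "t": ext.lstrip("."),         # file type
    }
    out = []
    i = 0
    while i < len(structure):
        if structure[i] == "%" and i + 1 < len(structure):
            code = structure[i + 1]
            if code == "[":
                end = structure.index("]", i)
                out.append(expand_param(query, structure[i + 2:end]))
                i = end + 1
                continue
            if code in simple:
                out.append(simple[code])
                i += 2
                continue
        out.append(structure[i])  # literal character, copied as-is
        i += 1
    return "".join(out)


if __name__ == "__main__":
    url = "http://www.example.com/covers/download_cover.php?cover_id=123&file=jpg"
    print(expand_structure("%h%p/%n%[cover_id:.ID=:::].%t%[file:.:::]", url))
    # prints: www.example.com/covers/download_cover.ID=123.php.jpg
```

Under those assumptions the saved name ends in whatever the "file" parameter carries, which is how the structure gets around the Content-Type the server reports.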

