Hello!
I have read your FAQ, searched the forum (e.g. for "robot"), and read the existing posts.
Setup: Debian, httrack in command-line mode, using urllist and ruleslist files. To disable robots.txt handling I have tried --robots=0 (and also --robots=N0, -sN0, and -s0).
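For reference, my invocation looks roughly like this (the file names and mirror directory below are examples, not my exact paths):

```shell
# Illustrative httrack call: read URLs from a list file and
# disable robots.txt rules with -s0 (same as --robots=0).
httrack --list urllist.txt \
        -s0 \
        -O /path/to/mirror
```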
I am trying to mirror the blog at load_domain.domain.com, so I added
http://load_domain.domain.com/
to the urllist. httrack downloads http://load_domain.domain.com/robots.txt, reads it... and finishes without mirroring anything else.
Next I tried pasting a direct page link, http://load_domain.domain.com/link.html.
httrack downloads link.html and robots.txt... and again finishes.
The robots.txt contains:
User-Agent: *
Disallow: /
P.S.
1) Other blogs on the same domain mirror fine with httrack; their robots.txt contains only "User-Agent: *".
2) Opening http://load_domain.domain.com/ in a browser works fine: the site loads and displays normally.
Does the browser succeed because of its specific User-Agent string, or by some other mechanism?
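If the User-Agent is the difference, one thing I am considering is making httrack identify itself as a browser with -F while still disabling robots rules (the UA string below is just an example; any common browser string should work):

```shell
# Illustrative: combine the robots bypass (-s0) with a
# browser-like User-Agent (-F). Paths are examples.
httrack http://load_domain.domain.com/ \
        -s0 \
        -F "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/115.0" \
        -O /path/to/mirror
```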