I've seen larger, more established websites do this: the page loads a file,
and you can find the file's direct URL by viewing the page source and access
it that way. However, when you try to access the directory containing that
file and others like it, the site returns a 404.
For example, sites like NotDoppler have this protection: you can crawl
<http://i.notdoppler.com/files/strikeforceheroes2.swf>, but when you try to
spider <http://i.notdoppler.com/files/> or <http://i.notdoppler.com>, HTTrack
returns:

    Error: "Not Found" (404) at link i.notdoppler.com/
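
To rule out an HTTrack quirk, the same asymmetry is easy to reproduce with a
few HEAD requests. This is just a minimal Python sketch using the standard
library; the URLs are the ones above, and I'm assuming the server answers
HEAD the same way it answers GET:

    import urllib.request
    import urllib.error

    # Probe the direct file URL and the directory URLs from the example above.
    urls = [
        "http://i.notdoppler.com/files/strikeforceheroes2.swf",  # direct file
        "http://i.notdoppler.com/files/",                        # directory
        "http://i.notdoppler.com/",                              # site root
    ]

    for url in urls:
        req = urllib.request.Request(url, method="HEAD")
        try:
            with urllib.request.urlopen(req) as resp:
                print(url, "->", resp.status)   # the file URL succeeds (200)
        except urllib.error.HTTPError as e:
            print(url, "->", e.code)            # the directories report 404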
Other similar sites, such as 1cup1coffee.com, do not have these restrictions
and allow downloading all of the SWF content on the site.
HTTrack is set to ignore robots.txt, and entering the directory URLs above in
a browser returns a 404 as well. How can I download the data, and how does
this protection work?
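
For what it's worth, the only workaround I can think of is to skip spidering
entirely and fetch files by explicit URL. A minimal sketch, assuming the
filenames are already known (e.g. collected from the HTML pages that embed
them, since the directory itself can't be listed):

    import os
    import urllib.request

    base = "http://i.notdoppler.com/files/"
    # Hypothetical list: only the first name comes from the example above;
    # in practice these would be scraped from the embedding pages.
    filenames = ["strikeforceheroes2.swf"]

    os.makedirs("swf", exist_ok=True)
    for name in filenames:
        dest = os.path.join("swf", name)
        urllib.request.urlretrieve(base + name, dest)  # direct GET, no crawling
        print("saved", dest)

Is that the right approach, or is there a way to get a crawler to discover
the files despite the 404s?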