Always post the command line used (or line two of the log file) so we know what you
did, not what you think you did.
> I've set my project to download the entire
> [website-url].com site, checked the "get non-HTML
> files", "no error pages" and "no external pages"
> options, and added +*.PDF as a filter.
Since you already told it to get everything ("get non-HTML files"), the +*.pdf filter does nothing.
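Keep in mind that scan rules are matched against the URL, not against what the server actually returns, and a bare extension rule applies to any host. Roughly (the second and third rules below are only illustrations of the syntax, using the placeholder names from your post, not something your current settings need):

  +*.pdf                            any URL ending in .pdf, on any host
  +[website-url].com/*.pdf          the same rule limited to your own site
  -*[another-random-url].com/*      explicitly exclude the external domain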
> But if an old link like
> [another-random-url].com/file.pdf returns an HTML
> page for a file-not-found error without clearly
You added a filter to grab anything whose URL ends in .pdf. That URL matched, but the server returned an
HTML page, so that is what you got.
> explaining that it is an error page, HTTrack starts
> to download the entire [another-random-url].com
> domain.
>
> Is there any way to automatically prevent this?
Drop your filter.
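For reference, the command-line equivalent would be something along these lines (a rough sketch only: the output path is a placeholder, the URLs are the ones from your post, and the option letters are from memory, so verify them against httrack --help or against line two of your own hts-log.txt):

  httrack "http://[website-url].com/" -O "/path/to/mirror" -n -o0

Here -n is "get non-HTML files related to a link" and -o0 is "no error pages"; the important part is that there is no +*.pdf rule anywhere. If you want an extra guard, appending the scan rule "-*[another-random-url].com/*" keeps HTTrack off that domain even when a broken link points there.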