HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Ignore redirection?
Author: William Roeder
Date: 07/07/2008 22:51
 
> I'm trying to mirror a subdirectory:
> 
> http://www.site.com/pdf/*
> I ignore the robots.txt file
> 
> I get an error "first file can't be found"

An asterisk is not valid in a file name, so it cannot be part of the start URL. Asterisks are valid only in filters.
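
To illustrate the difference, here is a rough Python sketch of how a wildcard filter is compared against URLs. It is not HTTrack's actual matching code: the function name filter_allows is made up for this example, the URLs are written without the scheme as in the filter you tried, and Python's fnmatch stands in for HTTrack's own wildcard handling. The one behaviour taken from HTTrack is that a rule declared later takes priority over an earlier one.

```python
from fnmatch import fnmatchcase


def filter_allows(url, rules):
    """Return True/False from the last matching +/- rule, or None if no rule matches.

    Hypothetical helper: approximates scan-rule evaluation, where later rules
    override earlier ones. fnmatch is only a stand-in for the real matcher.
    """
    verdict = None
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict


rules = ["+www.site.com/pdf/*.pdf"]
print(filter_allows("www.site.com/pdf/file01.pdf", rules))  # matched by the + rule
print(filter_allows("www.site.com/index.html", rules))      # no rule applies
```

A filter only decides whether a link HTTrack has already found is accepted. It does not discover files, which is why the start URL still has to be a real address.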

> I know there are a bunch of pdf files (unknown names
> and quantity) in this directory that are linked to
> from other folders.  When I use my browser to go
> to this directory, it redirects me to
> www.site.com.  I can view a file: 
> www.site.com/pdf/file01.pdf

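If you could view "www.site.com/pdf/" in your browser, then HTTrack could get all the files in that
directory. Since that directory is protected (it redirects instead of listing its contents), HTTrack
can only find the PDFs through the pages that link to them, so you'll have to mirror the site.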
 
> Any ideas on how to mirror this specific directory? 
> I'm new to HT, so I may not have provided enough
> information.
> 
> I tried adding a filter +www.site.com/pdf/*.pdf

Mirror the site.  Under Options -> Links, uncheck "Get non-HTML files related to a link", and add the filter
+*.pdf
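
If you prefer the command line, the same setup might look roughly like the sketch below, which only builds and prints the command rather than running it. The assumptions are mine: the output directory ./site-mirror and the helper name build_httrack_command are made up, -s0 is included because you said you ignore robots.txt, and I am assuming that leaving out -n (--near) corresponds to unchecking the non-HTML option in the GUI.

```python
import shlex


def build_httrack_command(start_url, output_dir, filters):
    """Assemble an httrack argument list (hypothetical helper for illustration)."""
    # -O sets the output directory; -s0 tells HTTrack never to follow robots.txt.
    # -n (--near) is deliberately omitted (assumed equivalent to the unchecked GUI box).
    args = ["httrack", start_url, "-O", output_dir, "-s0"]
    # Filters go last; the start URL itself contains no wildcard.
    args.extend(filters)
    return args


cmd = build_httrack_command("http://www.site.com/", "./site-mirror", ["+*.pdf"])
# shlex.join quotes the filter so the shell does not expand the asterisk.
print(shlex.join(cmd))
```

Note that the start URL is the site root, not the /pdf/ directory, and the wildcard appears only in the filter.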
 