HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: asp links not resolved
Author: Dan_A
Date: 10/18/2004 18:01
 
> > It looks like httracker followed all of the asp links 
> > correctly, downloading all of the files (mirror depth=4), 
> > but did not replace the asp links on most of the pages with 
> > links pointing to the new html files.  Furthermore, it 
> > added the server in front of each pointer, whether for 
> > style sheets or images 
> > (e.g., <http://server.name/css/style.css> instead of 
> > leaving the original "css/style.css" - same for images).  
> 
> It means that httrack considered these links as "outside 
> the default mirror scope" -- that is, unsuitable to be 
> downloaded by default.
> 
> Check if you have robots.txt limits, and/or use scan rules 
> (Set Options / Scan rules) to widen the default mirror 
> scope (using for example +www.example.com/*)
> 

The files were downloaded, but the links to those files 
were not altered. For example, an href pointing to 
"index.asp?p1=x&p2=y" remained as-is, although the file it 
pointed to on the web had been saved locally under a new 
name, say "index3c86.html?p1=x&p2=y".

I always had "no robots.txt" explicitly set, and had the 
server in the scan rules, e.g.:
-*
+www.server.tld*
etc.

Redoing all those links, even with a shell script, will 
take forever....  Any ideas to avoid that would be very 
much appreciated.
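
For reference, the brute-force fallback I have in mind would look roughly like the Python sketch below. It is only an illustration under assumptions: the LINK_MAP table is hypothetical and would have to be filled in by hand or built from the mirror, the local name "index3c86.html" is a placeholder, and the regex only handles quoted href attributes.

```python
import re
from pathlib import Path

# Hypothetical table: original link as it appears in the pages -> local file
# name that ended up in the mirror. These entries are placeholders.
LINK_MAP = {
    "index.asp?p1=x&p2=y": "index3c86.html",
}

# Matches href="..." or href='...' (quoted attribute values only).
HREF_RE = re.compile(r'(href\s*=\s*)(["\'])(.*?)\2', re.IGNORECASE)


def rewrite_links(html, link_map):
    """Return html with every href found in link_map replaced by its local name."""
    def repl(match):
        prefix, quote, target = match.group(1), match.group(2), match.group(3)
        # Pages may write the query separator as &amp; so normalise before lookup.
        local = link_map.get(target.replace("&amp;", "&"))
        if local is None:
            return match.group(0)  # unknown link: leave untouched
        return f"{prefix}{quote}{local}{quote}"
    return HREF_RE.sub(repl, html)


def rewrite_tree(root, link_map):
    """Rewrite every .html file under root in place; return how many changed."""
    changed = 0
    for page in Path(root).rglob("*.html"):
        # latin-1 maps every byte to a character, so the text round-trips.
        old = page.read_text(encoding="latin-1")
        new = rewrite_links(old, link_map)
        if new != old:
            page.write_text(new, encoding="latin-1")
            changed += 1
    return changed
```

Running rewrite_tree("path/to/mirror", LINK_MAP) would patch the pages in place, so it should only be tried on a copy of the mirror. Building the table for every asp link is exactly the tedious part.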

TIA.
 