HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: website recursiveness problems
Author: Xavier Roche
Date: 04/10/2002 21:00
 
> It's hard for me to pinpoint exactly what page had this
> problem in particular because it only has happened to
> me on crawls of large websites with hundreds or
> thousands of pages.

By the way, could the filter
-*//*

be a (temporary) way to avoid the problem?
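
To illustrate what a wildcard exclusion like the one above would catch, here is a minimal sketch in Python. It is not HTTrack's own matching code: the function name is_excluded is hypothetical, and it assumes the pattern is tested against host plus path with the scheme removed, so that the "//" in "http://" does not count as a match.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit


def is_excluded(url: str, pattern: str = "*//*") -> bool:
    """Return True if the URL would be dropped by a '-pattern' style exclusion.

    Assumption: the pattern is compared with host + path only, so the
    scheme separator in 'http://' is ignored. The query string is
    ignored too, to keep the sketch simple.
    """
    parts = urlsplit(url)
    target = parts.netloc + parts.path
    # fnmatchcase treats '*' as "any run of characters, slashes included".
    return fnmatchcase(target, pattern)


if __name__ == "__main__":
    for candidate in (
        "http://www.example.com/a/b/index.html",
        "http://www.example.com/a//b/index.html",
        "http://www.example.com//a/b/index.html",
    ):
        print(candidate, "excluded" if is_excluded(candidate) else "kept")
```

Under these assumptions, only URLs with a doubled slash somewhere after the host name are excluded; ordinary paths are left alone.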

> Lastly, as an FYI I've found that httrack can't
> successfully handle the javascript used by archive.org

Yes: because all the links are WRONG, and the embedded
JavaScript patches them on the fly after the document
loads. Try disabling JavaScript in IE or Netscape and
browsing the archive: it will be totally broken. This
is one of the 'impossible' sites for an offline
browser, unless it uses a JavaScript engine (yuk!)
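
As a rough illustration of why a crawler that does not run scripts ends up with the wrong links, here is a small Python sketch. Everything in it is an assumption made for the example: the prefix, the page URL, and the rewrite_link helper are hypothetical, and the real script used by archive.org may work differently.

```python
from urllib.parse import urljoin

# Hypothetical snapshot prefix, used only for this illustration.
ARCHIVE_PREFIX = "http://web.archive.org/web/20020401000000/"


def rewrite_link(href: str, page_url: str, prefix: str = ARCHIVE_PREFIX) -> str:
    """Mimic an on-load script that repoints a link into the archive.

    The href is first resolved against the original page URL, then the
    archive prefix is put in front of the absolute result.
    """
    absolute = urljoin(page_url, href)
    return prefix + absolute


if __name__ == "__main__":
    page = "http://www.example.com/docs/index.html"
    raw_links = ["about.html", "/contact.html", "http://www.example.com/faq.html"]

    for href in raw_links:
        # What a crawler without a script engine sees in the HTML source.
        print("raw:      ", href)
        # What a browser would follow once the script has run.
        print("rewritten:", rewrite_link(href, page))
```

A crawler that only reads the HTML source gets the "raw" values, which point away from the archived copy; only a client that executes the script ever sees the "rewritten" ones.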


 





