HTTrack Website Copier
Free software offline browser - FORUM
Subject: Repeated scanning same page.
Author: Vincent
Date: 04/10/2002 18:25
 
I am trying to scan a site hosted on IBM/Lotus Domino.

Most of the pages have a link back to the home page, 
and the home page is a frame page that contains three 
dynamically generated pages, one of which is just a counter.

When I run the download, it repeatedly scans this 
home page, and all threads wait for it.  So 
if the site has 10,000 pages, this page might be 
downloaded and scanned 10,000 times.  

The same thing happens with the main page of each sub-area.

A side effect is that the counter increases every 
time, so the page view count shot up a lot 
after I tried to scan the site.

What should I do to avoid this?  I tried limiting the 
depth, but that is not a good solution.

Is there any way to add a switch so that each URL is 
updated only once per update operation?  Or better, one 
that sets a maximum number of updates per URL.

When you insert a new URL into the queue, do you check 
whether the link has already been updated, is already in 
the queue, or has already been processed?
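
To make the question concrete, here is a minimal sketch in Python of the kind of check I mean. It is not HTTrack's actual code, and I do not know how HTTrack handles this internally; the names `normalize` and `CrawlQueue` and the example.com URLs are made up for illustration. The idea is simply that a URL is remembered the first time it is queued and refused afterwards, so the home page is fetched once no matter how many pages link back to it.

```python
from collections import deque
from urllib.parse import urldefrag, urlsplit, urlunsplit


def normalize(url):
    """Drop the #fragment and lower-case the scheme and host, so that
    trivially different spellings of one URL compare equal."""
    url, _fragment = urldefrag(url)
    parts = urlsplit(url)
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path or "/", parts.query, ""))


class CrawlQueue:
    """Hypothetical queue that hands out each URL at most once per run."""

    def __init__(self):
        self._pending = deque()
        # Every URL ever queued in this run: pending, in progress, or done.
        self._seen = set()

    def add(self, url):
        """Queue url unless it was already queued. Returns True if new."""
        key = normalize(url)
        if key in self._seen:
            return False
        self._seen.add(key)
        self._pending.append(key)
        return True

    def next(self):
        """Return the next URL to fetch, or None when the queue is empty."""
        return self._pending.popleft() if self._pending else None


queue = CrawlQueue()
home = "http://www.example.com/home.nsf"  # made-up URL
queue.add(home)

# Simulate 10,000 pages that each link back to the home page.
for i in range(10000):
    queue.add(f"http://www.example.com/page{i}")
    queue.add(home + "#top")

fetched = []
while (url := queue.next()) is not None:
    fetched.append(url)

print(fetched.count(home))  # 1
print(len(fetched))         # 10001
```

For the "maximum number of updates" variant, the set could be replaced by a dictionary that counts how often each URL has been queued and refuses it once the count reaches the limit.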
 