HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: links scanned - parsing huge site
Author: William Roeder
Date: 04/03/2009 22:00
 
> Okay that clears up a lot, i expect it'll take some
> space and some time.
> 
> If I have that many files 'downloaded in the
> background', does that mean they are queued up to be
> browsed for links or mirrored?

It means the files have already been downloaded. When HTTrack gets to them, it
will scan them for new links and/or rewrite their links to point to the local copies.

> I'm trying to figure out what my bottleneck is. I'd
> assumed it would be the site's bandwidth [limiting so
> as not to overload it], but the numbers make it seem
> like the bottleneck is my CPU? The only thing is, my
> CPUs aren't running near capacity..

It's request latency: send out a request, wait for the reply, realize the file
hasn't changed. About 5 requests/sec is my maximum on sites that don't support
persistent connections (Options -> Flow control).

CPU is irrelevant. Versions 3.32 and earlier use all available connections;
the latest version uses only one.
<http://forum.httrack.com/readmsg/19897/19894/index.html>
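For command-line users, a sketch of how the connection settings mentioned above can be set explicitly (option letters as documented in HTTrack's own help output; the URL and output directory are placeholders):

```shell
#   -c8   up to 8 simultaneous connections
#   -%k   use keep-alive (persistent) connections when the server allows it
httrack "http://www.example.com/" -O ./mirror -c8 -%k
```

Whether multiple connections actually help depends on the server; the 5/sec ceiling above comes from round-trip latency, not bandwidth.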

> One last thing.. I'm going about 3 deep, and the
> first two levels are just large amounts of links;
> the third level is the HTML that I want. Is there a
> way to grab the links in the first two levels and the
> HTML in the last level?

To get the links, you have to download the HTML anyway.
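A command-line sketch of that depth setup (option letters per HTTrack's help text; the URL is a placeholder). Filters can't skip the HTML at levels 1-2, since it must be parsed to find the level-3 links, but they can keep non-HTML files out of the mirror:

```shell
#   -r3              mirror to a depth of 3 levels
#   "-*" "+*.html"   exclude everything, then re-allow HTML pages only
httrack "http://www.example.com/" -O ./mirror -r3 "-*" "+*.html"
```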
 