HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Parsing HTML & ETA Estimation.
Author: William Roeder
Date: 03/14/2011 17:55
 
> 1) Is it possible to make the parsing / downloading
> any faster? 300 B/s looks like I'm gonna have to
Parsing is fast (on the order of a million lines/sec); download speed depends
on your bandwidth and on the site.

> 2) Parsing doesn't mean the actual downloading,
> right? It's only some sort of scanning process?
Correct.

> 3) Is it possible that the forum site was somehow
> overloaded by the scanning and downloading and thus
> it looks like it's stuck?
Very possible.

> 4) How come that 0 to 180 MB was a matter of like 30
> - 45 minutes and then it looks like, it's stuck, but
> still it's doing something, but looks terribly
> slow.
Either you overloaded the server and crashed it, or you overtaxed it and it has
slowed you down (treating the mirror as a denial-of-service attack), or you
overrode robots.txt and you're now in a robot trap.
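
To put numbers on how sharp that slowdown is, here is a rough calculation from the figures quoted in this thread (180 MB in 30 to 45 minutes, then about 300 B/s). It assumes "MB" means 1024*1024 bytes and that the 300 B/s figure is the rate during the stall; the function name is just for illustration. The early phase works out to roughly 70 to 105 kB/s, so the stalled rate is a few hundred times slower, which points at the server or a trap rather than at parsing.

```python
def throughput_bytes_per_sec(megabytes: float, minutes: float) -> float:
    """Average throughput for `megabytes` (taken as MiB) moved in `minutes`."""
    return megabytes * 1024 * 1024 / (minutes * 60)

fast_low = throughput_bytes_per_sec(180, 45)   # slower end of the early phase
fast_high = throughput_bytes_per_sec(180, 30)  # faster end of the early phase
stalled = 300.0                                # B/s, as reported in question 1

slowdown_low = fast_low / stalled
slowdown_high = fast_high / stalled

print(f"early phase: {fast_low:.0f} to {fast_high:.0f} B/s")
print(f"slowdown: roughly {slowdown_low:.0f}x to {slowdown_high:.0f}x")
```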

Pause the mirror and wait an hour. Then set the maximum connections per second
to 1 and the number of simultaneous connections to 1 or 2.
If that doesn't help, changing your IP address might (on DSL or cable,
restarting the modem may get you a new address).
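
For anyone running the command-line version, the throttled settings above could be assembled like this. This is only a sketch: the URL and output directory are placeholders, the helper name is made up, and the flags (-cN for simultaneous connections, -%cN for connections per second, -s2 to always follow robots.txt, -O for the output path) are as I understand them from the HTTrack option list, so check them against `httrack --help` on your build.

```python
import shlex

def polite_httrack_args(url: str, out_dir: str, sockets: int = 2,
                        conn_per_sec: int = 1) -> list[str]:
    """Build an httrack argument list with throttled connection settings."""
    if sockets < 1 or conn_per_sec < 1:
        raise ValueError("sockets and conn_per_sec must be at least 1")
    return [
        "httrack", url,
        "-O", out_dir,           # output directory for the mirror
        f"-c{sockets}",          # number of simultaneous connections
        f"-%c{conn_per_sec}",    # maximum new connections per second
        "-s2",                   # always follow robots.txt
    ]

# Placeholder URL and path; substitute the real forum and a local folder.
args = polite_httrack_args("http://forum.example.com/", "./mirror")
print(shlex.join(args))
```

The list form can be handed to `subprocess.run(args)` directly, which avoids shell quoting problems with the `%` in `-%c1`.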

> 5) Any possible way, how to determine, how big that
> mirror's gonna be? I used the default settings,
Not until the last HTML page has been parsed and the last file has started
downloading; before that, the total size is unknown.
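
A toy model of why the size can't be known earlier: each parsed page can add new files to the queue, so the running total is only a lower bound until the queue is empty. The site map below is entirely made up, and this is not how HTTrack is implemented internally, just an illustration of the principle.

```python
from collections import deque

# Hypothetical site: name -> (size in bytes, links found when parsed).
SITE = {
    "index.html": (2_000, ["a.html", "b.html"]),
    "a.html": (3_000, ["c.html", "big.zip"]),
    "b.html": (1_500, []),
    "c.html": (2_500, ["index.html"]),
    "big.zip": (500_000, []),
}

def crawl(start: str):
    """Yield (bytes_downloaded, files_still_queued) after each fetch."""
    queue, seen, total = deque([start]), {start}, 0
    while queue:
        name = queue.popleft()
        size, links = SITE[name]
        total += size
        for link in links:          # new work appears only after parsing
            if link not in seen:
                seen.add(link)
                queue.append(link)
        yield total, len(queue)

history = list(crawl("index.html"))
for downloaded, pending in history:
    print(f"downloaded={downloaded:>7} pending={pending}")
```

Note how one late link (the large archive here) dominates the final size, which is why an estimate made halfway through can be far off.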
 