HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: very large file (>13GB)
Author: William Roeder
Date: 11/05/2010 15:01
 
> <http://www.lexsoft.de/lexisnexis/justizportal_nrw.cgi?chosenIndex=Dummy_nv_68&chosenIndex=Dummy_nv_68&templateID=gliederung&tree_ordner_id=0000024:RootID&>
> 
> HTTrack says it has written 45000 files, and worked
> 4000/50000 (+46000) (what do these numbers mean?)
See the FAQ entry "What is the meaning of the Links scanned: 12/34 (+5) line in
WinHTTrack/WebHTTrack?": <http://www.httrack.com/html/faq.html#QM10b>

> copy only the needed files - I'm afraid it copies the
> same site multiple times because its hyperlink is
> different? 
Yes. HTTrack can't know that different links lead to the same page, so it downloads each one.

> Is there any possibility to change my
> options/filters to decrease estimated download time
> and file size? Is just "wait" a possible option, so
> it's just a matter of time and HDD-size (which
> wouldn't be a problem)?
Each link on the start page is in the form:
<http://www.lexsoft.de/lexisnexis/justizportal_nrw.cgi?sessionID=2128682748599726980&chosenIndex=Dummy_nv_68&templateID=gliederung&tree_ordner_id=0000024.3321169:RootID>

Because of the sessionID, you have to mirror the entire site in one go. Otherwise, on a
later run HTTrack gets a new sessionID, downloads the new links and deletes all the previous ones.
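
To illustrate why the sessionID matters, here is a rough Python sketch (not something HTTrack does itself). It shows that two links differing only in sessionID are different URLs as far as a mirroring tool can tell, even though they point at the same page. The second sessionID value and the helper name strip_session are made up for the example.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_session(url, param="sessionID"):
    """Return the URL with one query parameter removed (illustration only)."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != param]
    return urlunsplit(parts._replace(query=urlencode(query)))

base = "http://www.lexsoft.de/lexisnexis/justizportal_nrw.cgi"
rest = ("chosenIndex=Dummy_nv_68&templateID=gliederung"
        "&tree_ordner_id=0000024.3321169:RootID")

first = f"{base}?sessionID=2128682748599726980&{rest}"
# Hypothetical sessionID, as a second visit might receive.
second = f"{base}?sessionID=1111111111111111111&{rest}"

print(first == second)                                # False: treated as two pages
print(strip_session(first) == strip_session(second))  # True: same page underneath
```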

On the document level there are the hierarchy links:
<http://www.lexsoft.de/lexisnexis/justizportal_nrw.cgi?chosenIndex=Dummy_nv_68&templateID=document&source=tree&chosenIndex=Dummy_nv_68&highlighting=off&xid=3321169,2&tree_ud_xid=3321169,2&>
and the next-page navigation links:
<http://www.lexsoft.de/lexisnexis/justizportal_nrw.cgi?chosenIndex=Dummy_nv_68&templateID=document&source=lawnavi&chosenIndex=Dummy_nv_68&xid=3321169,3&>
Thus, you could remove the navigational links with the filter: -*lawnavi*

Also, each page has various options.
The view all:
<http://www.lexsoft.de/lexisnexis/justizportal_nrw.cgi?chosenIndex=Dummy_nv_68&templateID=document&source=document&chosenIndex=Dummy_nv_68&xid=166694,3&task=chose_fliesstext&#gesetz_fliesstext_166694,3>
filter: -*chose_fliesstext*
The view full screen:
<http://www.lexsoft.de/lexisnexis/justizportal_nrw.cgi?chosenIndex=Dummy_nv_68&templateID=vollbild_fliesstext&>
filter: -*vollbild_fliesstext*
The pdf:
<http://www.lexsoft.de/lexisnexis/justizportal_nrw.cgi?chosenIndex=Dummy_nv_68&templateID=chtmltopdf&law=1&>
filter: -*chtmltopdf*
...
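
If you want to check the exclusion patterns before starting a long mirror, a small Python sketch like this can help. It uses Python's fnmatch as a stand-in for HTTrack's wildcard matching, which is only an approximation (HTTrack's filter syntax has more features), and the dictionary labels are just names made up for the example. The URLs are the samples from this post.

```python
from fnmatch import fnmatchcase

BASE = "http://www.lexsoft.de/lexisnexis/justizportal_nrw.cgi?"

# Sample URLs from this thread; the labels are only for readability.
urls = {
    "hierarchy": BASE + "chosenIndex=Dummy_nv_68&templateID=document&source=tree"
                        "&chosenIndex=Dummy_nv_68&highlighting=off"
                        "&xid=3321169,2&tree_ud_xid=3321169,2&",
    "next page": BASE + "chosenIndex=Dummy_nv_68&templateID=document"
                        "&source=lawnavi&chosenIndex=Dummy_nv_68&xid=3321169,3&",
    "view all": BASE + "chosenIndex=Dummy_nv_68&templateID=document"
                       "&source=document&chosenIndex=Dummy_nv_68"
                       "&xid=166694,3&task=chose_fliesstext&",
    "full screen": BASE + "chosenIndex=Dummy_nv_68&templateID=vollbild_fliesstext&",
    "pdf": BASE + "chosenIndex=Dummy_nv_68&templateID=chtmltopdf&law=1&",
}

# Exclusion patterns without the leading "-" that HTTrack uses.
# chose_fliesstext is spelled as in the URL's task= value.
excludes = ["*lawnavi*", "*chose_fliesstext*", "*vollbild_fliesstext*", "*chtmltopdf*"]

def excluded(url):
    """True if any exclusion pattern matches the whole URL."""
    return any(fnmatchcase(url, pattern) for pattern in excludes)

kept = [name for name, url in urls.items() if not excluded(url)]
print(kept)  # ['hierarchy']
```

Only the hierarchy link survives, which is the point: the document gets fetched once through the tree, and the alternate views of the same text are skipped.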
 