HTTrack Website Copier
Free software offline browser - FORUM
Subject: Following chains of urls in page-turning sites
Author: B.T. Raven
Date: 01/13/2006 18:50
 
I am trying to harvest some unedited OCR output from HTML pages with URLs of the form
<http://.../cgi-bin/witch/docviewer?did=060&seq=158&frames=0&view=text>. The material
of interest is available both as manuscript images (GIF, I think) and as text. Is
it possible to mirror all the text pages, from ...&seq=1 to the end of the chain?
In the text view, the page text appears in the HTML source inside this block:

 <h3>Text of page:</h3>
<pre>
<P><B>Page: </B>158<P>
<I>

text here

</I> 91.<BR>
</pre>
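
For illustration only, the page-turning described above could be sketched as a short Python script that is independent of HTTrack: it increments seq from 1, fetches each text view, and keeps whatever sits in the <pre> block after the "Text of page:" heading. The function names, the latin-1 encoding, the max_pages cap, and the stop rule (a page without the text block marks the end of the chain) are all assumptions. The host in BASE is elided in the post and is left as a placeholder to be filled in by hand.

```python
import re
from urllib.parse import urlencode
from urllib.request import urlopen

# Placeholder: the host is elided in the post and must be filled in by hand.
BASE = "http://.../cgi-bin/witch/docviewer"

# Matches the <pre> block that follows the "Text of page:" heading.
TEXT_BLOCK = re.compile(
    r"<h3>Text of page:</h3>\s*<pre>(.*?)</pre>",
    re.DOTALL | re.IGNORECASE,
)


def page_url(seq, did="060", base=BASE):
    """Build the text-view URL for one page of the chain."""
    query = urlencode({"did": did, "seq": seq, "frames": 0, "view": "text"})
    return f"{base}?{query}"


def extract_page_text(html):
    """Return the contents of the text block, or None if it is missing."""
    match = TEXT_BLOCK.search(html)
    return match.group(1) if match else None


def fetch(url):
    """Download one page; the latin-1 encoding is a guess."""
    with urlopen(url) as response:
        return response.read().decode("latin-1")


def harvest(fetch_page=fetch, did="060", max_pages=1000):
    """Walk seq=1, 2, ... and collect text blocks until one is missing.

    Assumption: past the last page the server returns a page without
    the "Text of page:" block. A server that answers with an HTTP error
    instead would raise from fetch() and need separate handling.
    """
    pages = []
    for seq in range(1, max_pages + 1):
        text = extract_page_text(fetch_page(page_url(seq, did)))
        if text is None:
            break
        pages.append(text)
    return pages
```

Calling harvest() with the default fetch would request the pages over the network one by one; passing a different fetch_page callable (for example one that reads pages already saved to disk) allows the extraction to be tried without touching the server.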


Thanks.
 



