HTTrack Website Copier
Free software offline browser - FORUM
Subject: generating abstract web graph
Author: kostas
Date: 04/08/2008 19:37
 
hi,

what i need to produce is:

a file, let's call it IDS:

URL1 id1\n
URL2 id2\n
URL3 id3\n
...

a file, let's call it SIZES:

id1 41234
id2 362627
id3 53667
...

a file, let's call it GRAPH:

id1 id2
id1 id3
id2 id4
...


the IDS file will contain two columns:
the first column is the absolute url of each and every page, image, etc.
the second column is a unique id associated with each url.

the SIZES file contains two columns:
the first is an id taken from the IDS file
the second is the corresponding size in bytes

the GRAPH file contains two columns:
the first is the id of the FROM url (the page containing the link)
the second is the id of the TARGET url (the page or file being linked to)

thus the web graph can be reproduced e.g.
id1 -> id2 -> id3
    -> id4 -> id2
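
to make the three formats concrete, here is a minimal sketch in python. it assumes the crawl data is already available as (url, size in bytes, list of linked urls) records, and that urls contain no spaces. the function name build_graph_files and the example urls are made up for illustration; this is not an httrack feature.

```python
def build_graph_files(pages):
    """Build the text of the IDS, SIZES and GRAPH files.

    pages: iterable of (url, size_in_bytes, linked_urls) tuples.
    This input shape is an assumption for the sketch, not an HTTrack format.
    """
    ids = {}  # url -> id, kept in first-seen order

    def get_id(url):
        # Assign the next free id the first time a url is seen.
        if url not in ids:
            ids[url] = "id%d" % (len(ids) + 1)
        return ids[url]

    sizes = {}  # id -> size in bytes (only for pages that were fetched)
    edges = []  # (from_id, target_id) pairs, one per link

    for url, size, links in pages:
        src = get_id(url)
        sizes[src] = size
        for target in links:
            edges.append((src, get_id(target)))

    ids_text = "".join("%s %s\n" % (u, i) for u, i in ids.items())
    sizes_text = "".join("%s %d\n" % (i, s) for i, s in sizes.items())
    graph_text = "".join("%s %s\n" % (a, b) for a, b in edges)
    return ids_text, sizes_text, graph_text


if __name__ == "__main__":
    # Hypothetical crawl records, for illustration only.
    example = [
        ("http://example.com/", 41234,
         ["http://example.com/a.html", "http://example.com/b.png"]),
        ("http://example.com/a.html", 362627,
         ["http://example.com/c.html"]),
    ]
    for text in build_graph_files(example):
        print(text)
```

a url that is linked to but never fetched still gets an id in IDS and can appear in GRAPH, but it has no line in SIZES.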


is there a way to produce such files with httrack? if not exactly these files,
then at least something that i can work with...

many thanks,
kostas

 