hi,
what i need to produce is:
a file, let's call it IDS:
URL1 id1\n
URL2 id2\n
URL3 id3\n
...
a file, let's call it SIZES:
id1 41234
id2 362627
id3 53667
...
a file, let's call it GRAPH:
id1 id2
id1 id3
id2 id4
...
the IDS file will contain two columns,
the first column is the absolute url of each and every page, image, etc...
the second column is a unique id associated with each url.
the SIZES file contains two columns,
the first is an id taken from the IDS file
the second is the corresponding size in bytes
the GRAPH file contains two columns,
the first is the id of a FROM url (the page containing the link)
and the second is the id of the TARGET url (the page or file it links to)
thus the web graph can be reproduced e.g.
id1 -> id2 -> id3
-> id4 -> id2
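to make the format concrete, here is a rough python sketch of how the three files could be built if the crawl data were already available as, for each url, its size and its outgoing links. the `pages` dict and the function names `build_tables` and `load_graph` are made up for illustration; nothing here is an httrack feature, and it assumes urls contain no unescaped spaces.

```python
from collections import defaultdict


def build_tables(pages):
    """Build the lines of the IDS, SIZES and GRAPH files.

    pages is a hypothetical input: a dict mapping an absolute URL to a
    tuple (size_in_bytes, [absolute URLs linked from that page]).
    """
    ids = {}  # url -> id, in order of first appearance

    def get_id(url):
        # Assign the next free id the first time a URL is seen.
        if url not in ids:
            ids[url] = "id%d" % (len(ids) + 1)
        return ids[url]

    sizes = {}  # id -> size in bytes (only for URLs that were fetched)
    edges = []  # (from_id, target_id) pairs
    for url, (size, links) in pages.items():
        src = get_id(url)
        sizes[src] = size
        for target in links:
            edges.append((src, get_id(target)))

    ids_lines = ["%s %s" % (url, i) for url, i in ids.items()]
    sizes_lines = ["%s %d" % (i, s) for i, s in sizes.items()]
    graph_lines = ["%s %s" % (a, b) for a, b in edges]
    return ids_lines, sizes_lines, graph_lines


def load_graph(graph_lines):
    """Rebuild the web graph as an adjacency list from GRAPH lines."""
    adjacency = defaultdict(list)
    for line in graph_lines:
        src, dst = line.split()
        adjacency[src].append(dst)
    return dict(adjacency)
```

a url that is linked to but never fetched still gets an id in IDS and appears in GRAPH, but has no row in SIZES.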
is there a way to produce such files with httrack? not necessarily these exact files,
but something that i can work with...
many thanks,
kostas