| Sorry, hit the wrong key, not done yet.
So we would like to process the downloaded HTML before
its URLs are analyzed, and rewrite each link so that it
always points to the original form:
<HTTP://ccool.ccim.org/htdocs/Ccool.nsf/1d245ada2835949e07256aaa00735c16/042DD57E7BB3918685256C120011049502ec.html>
=>
<HTTP://ccool.ccim.org/htdocs/Ccool.nsf/0/042DD57E7BB3918685256C120011049502ec.html>
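To make the rewrite concrete, here is a rough sketch of the text substitution I have in mind. It is written in Python only for illustration; a Perl one-liner could do the same. It assumes the part to replace is always the 32-hex-digit path segment right after Ccool.nsf/, and the name normalize_links is made up for this example, not something from the HTTrack source.

```python
import re

# Assumption: the segment to replace is exactly 32 hex digits and sits
# directly after "/Ccool.nsf/", followed by another "/".
VIEW_ID = re.compile(r'(/Ccool\.nsf/)[0-9A-Fa-f]{32}(?=/)')

def normalize_links(html: str) -> str:
    """Replace the 32-hex-digit segment after Ccool.nsf/ with "0"."""
    # \g<1> keeps the "/Ccool.nsf/" prefix; the literal 0 replaces the ID.
    return VIEW_ID.sub(r'\g<1>0', html)
```

Run as a filter over each downloaded page before link analysis, this would make every link to the same article identical, so the article should only be fetched once.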
I found a "User Command" switch in HTTrack that does
alter the document, but it runs after the HTML has been
analyzed. So each article is still downloaded and scanned
multiple times.
Where should I modify the code to add a function that
does this processing? The best way would be to call
system(), like "User Command" does, so that we could use
Perl to do the simple text processing.
Much appreciated.
Vincent | |