I've heard about and read a little about Python scripts and using HTTrack from the terminal. Would it be possible, say, to use the HTTrack spider to crawl one directory up and one directory down, as configured (this would make it possible to collect links for dynamic pages), until it has built a list of links, removing invalid ones along the way? Then pass those links to a browser environment that loads each page fully, and have HTTrack scroll and scrape the page only after that full load has taken place. Regular expressions or keywords would be used to discriminate between links that lead to ascending or descending directories.