Adding the URL of the page to the resulting file

Subject: Adding the URL of the page to the resulting file

Author: Deane Barker

Date: 08/30/2006 15:46

I want to use HTTrack to spider a Web site to flat files, then use a search
indexer to consume that set of files.

However, I need a way for the indexer to know the corresponding URL of the
flat file it consumes.  So, when the indexer sucks up "a_page.html", it needs
to know that the actual URL to that page on the Internet is
<http://mysite.com/directory/a_page.php?id=7>.

First, is there an easy way to do this?  Does HTTrack log the URL of the page
anywhere in the file it produces?

Can HTTrack embed META tags when it spiders?  If I could have it drop the URL
it spidered into a META tag in the resulting file, then I could use that in the
search results to find my way back to the page URL.

Possible?
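
To make the META tag idea concrete, here is a rough Python sketch of the kind of post-processing step I have in mind, run over the flat files after the crawl. It is only an illustration under assumptions: the function name add_source_url_meta and the META name "source-url" are made up for this example, it does not use any HTTrack feature, and it assumes the original URL for each file is already known from somewhere.

```python
import re
from html import escape


def add_source_url_meta(html_text, source_url, meta_name="source-url"):
    """Return html_text with a META tag carrying source_url added.

    Hypothetical helper: the name and the default meta_name are
    assumptions for illustration, not part of HTTrack.
    """
    # Escape the values so characters such as & or " cannot break the attribute.
    tag = '<meta name="%s" content="%s">' % (
        escape(meta_name, quote=True),
        escape(source_url, quote=True),
    )
    # Put the tag right after the opening <head> tag when the page has one.
    match = re.search(r"<head\b[^>]*>", html_text, flags=re.IGNORECASE)
    if match:
        return html_text[:match.end()] + "\n" + tag + html_text[match.end():]
    # No <head> found: prepend the tag so an indexer can still read it.
    return tag + "\n" + html_text


if __name__ == "__main__":
    page = "<html><head><title>A page</title></head><body>Hi</body></html>"
    print(add_source_url_meta(page, "http://mysite.com/directory/a_page.php?id=7"))
```

The indexer would then read the "source-url" META tag from each file instead of using the flat file name in the search results.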
