HTTrack Website Copier
Free software offline browser - FORUM
Subject: Adding the URL of the page to the resulting file
Author: Deane Barker
Date: 08/30/2006 15:46
 
I want to use HTTrack to spider a Web site to flat files, then use a search
indexer to consume that set of files.

However, I need a way for the indexer to know the corresponding URL of the
flat file it consumes.  So, when the indexer sucks up "a_page.html", it needs
to know that the actual URL to that page on the Internet is
<http://mysite.com/directory/a_page.php?id=7>.

First, is there an easy way to do this?  Does HTTrack log the URL of the page
anywhere in the file it produces?

Second, can HTTrack embed META tags when it spiders?  If I could have it drop
the URL it spidered into a META tag in the resulting file, then I could use
that in the search results to find my way back to the page URL.

Possible?
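
To make the META-tag idea concrete, here is a rough sketch of it as a post-processing step run over the saved files. It does not use any HTTrack feature; it assumes the file-to-URL mapping is already known from somewhere. The tag name "source-url" and both function names are made up for illustration.

```python
import html
import re
from typing import Optional

# Hypothetical META name; any name the search indexer can read would do.
META_NAME = "source-url"

# Matches an opening <head> tag (with or without attributes), not <header>.
_HEAD_OPEN = re.compile(r"<head\b[^>]*>", re.IGNORECASE)
_META_TAG = re.compile(
    r'<meta\s+name="' + re.escape(META_NAME) + r'"\s+content="([^"]*)"',
    re.IGNORECASE,
)


def embed_source_url(page: str, url: str) -> str:
    """Return the page text with a META tag recording the original URL."""
    tag = '<meta name="{}" content="{}">'.format(
        META_NAME, html.escape(url, quote=True)
    )
    match = _HEAD_OPEN.search(page)
    if match is None:
        # No <head> found: prepend the tag so the URL is still recorded.
        return tag + "\n" + page
    return page[: match.end()] + "\n" + tag + page[match.end():]


def read_source_url(page: str) -> Optional[str]:
    """Return the URL stored by embed_source_url, or None if absent."""
    match = _META_TAG.search(page)
    return html.unescape(match.group(1)) if match else None
```

The indexer side would then call something like read_source_url() on "a_page.html" to recover <http://mysite.com/directory/a_page.php?id=7> for the search results.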
 