HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: HTTrack development question.
Author: DavidS
Date: 02/26/2005 11:51
 
As for merging projects: I have had some experimental success
using the Apache web server to host two HTTrack projects locally
and then mirroring those local pages with HTTrack. This method
creates a new project and doesn't rely on the previous HTTrack caches.

I believe that HTTrack could do this operation without Apache
if HTTrack would accept "Directory Paths" in addition to "URLs".

Here is the method:


1) *** Be sure to back up whatever files you modify.
   *** Be sure to have your HTTrack projects that you wish to merge,
       and be sure Apache (httpd) has Read permission for your files.

/home/tester/websites/HS/www.healthandsafetycentre.org   (project 1)
/home/tester/websites/CCOHS/www.ccohs.ca                 (project 2)

Step 6) shows the new project that will be made.
/home/tester/websites/CCOHSmerged                        (merged project)

2) *** Make soft links in the directories mentioned above.
       These soft links are harmless and facilitate hosting the
       project with Apache (the web site must have the same name as
       the directory to make HTTrack think it is surfing the web).

Note that ~/websites/HS/www.healthandsafetycentre.org may have several
linked websites, e.g.
~/websites/HS/whmis.healthandsafetycentre.org,
~/websites/HS/fishing.healthandsafetycentre.org, etc.

You must make soft links in every HTTrack-linked URL directory in
a given project (the project directory itself and the URL
subdirectories are excluded).
 e.g. the following soft links will be required in
 ~/websites/HS/www.healthandsafetycentre.org

 ln -s ../www.healthandsafetycentre.org www.healthandsafetycentre.org
 ln -s ../whmis.healthandsafetycentre.org whmis.healthandsafetycentre.org
 ln -s ../fishing.healthandsafetycentre.org fishing.healthandsafetycentre.org

~/websites/HS/whmis.healthandsafetycentre.org will require the same,
and so will ~/websites/HS/fishing.healthandsafetycentre.org, for a
total of nine soft links.

Furthermore, the same consideration must be given to 
~/websites/CCOHS/www.ccohs.ca with soft links that are
only related to this project.
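
Typing nine ln commands by hand is error-prone, so here is a minimal sketch of step 2 as a loop. The function name link_hosts is my own invention and is not part of HTTrack. It assumes that every directory directly under the project directory whose name contains a dot is a mirrored host, which skips hts-cache but may not fit every project.

```shell
#!/bin/sh
# Sketch only: link_hosts is a hypothetical helper, not an HTTrack command.
# Assumption: each directory in the project whose name contains a dot
# is a mirrored host directory (e.g. www.ccohs.ca).
link_hosts() {
    project=$1
    for host in "$project"/*.*/; do
        [ -d "$host" ] || continue            # glob matched nothing
        for target in "$project"/*.*/; do
            [ -d "$target" ] || continue
            name=$(basename "$target")
            # Skip names that already exist so the function can be re-run.
            if [ ! -e "$host$name" ] && [ ! -L "$host$name" ]; then
                ln -s "../$name" "$host$name"
            fi
        done
    done
}
```

With the paths from step 1 you would call it once per project, for example: link_hosts /home/tester/websites/HS and then link_hosts /home/tester/websites/CCOHS.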


3) *** Modify /etc/hosts to add the www. hostnames ***

127.0.0.2	www.healthandsafetycentre.org
127.0.0.3	www.ccohs.ca
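
Since step 1 asks for backups, the two entries can also be added with a small helper that keeps a copy of the original file. This is only a sketch: add_host is a hypothetical name, the .bak suffix is my choice, and the duplicate check is a simple word match rather than a real parser for the hosts format.

```shell
#!/bin/sh
# Sketch only: add_host is a hypothetical helper.
# Usage: add_host FILE IP NAME
add_host() {
    file=$1
    ip=$2
    name=$3
    # Leave the file alone if the name already appears in it.
    if grep -qwF -- "$name" "$file"; then
        return 0
    fi
    # Keep the first backup only, so it always holds the original file.
    [ -e "$file.bak" ] || cp -p "$file" "$file.bak"
    printf '%s\t%s\n' "$ip" "$name" >> "$file"
}
```

Run as root, that would be: add_host /etc/hosts 127.0.0.2 www.healthandsafetycentre.org followed by add_host /etc/hosts 127.0.0.3 www.ccohs.ca.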


4) *** Add some virtual hosts to /etc/httpd/conf (Apache 2.0) ***

# Virtual host www.healthandsafetycentre.org
<VirtualHost 127.0.0.2>
    DocumentRoot /home/tester/websites/HS/www.healthandsafetycentre.org/
    ServerAdmin tester@mycomputer.test
    ServerName www.healthandsafetycentre.org
    DirectoryIndex index.php index.html index.htm index.shtml

    <Directory "/home/tester/websites/HS/www.healthandsafetycentre.org/">
        AllowOverride none
    </Directory>
    HostNameLookups off
</VirtualHost>


# Virtual host www.ccohs.ca
<VirtualHost 127.0.0.3>
    DocumentRoot /home/tester/websites/CCOHS/www.ccohs.ca/
    ServerAdmin tester@mycomputer.test
    ServerName www.ccohs.ca
    DirectoryIndex index.php index.html index.htm index.shtml

    <Directory "/home/tester/websites/CCOHS/www.ccohs.ca/">
        AllowOverride none
    </Directory>
    HostNameLookups off
</VirtualHost>
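
The two stanzas differ only in address, host name and document root, so they could be generated instead of typed. The sketch below prints a stanza in the same shape as the ones above. vhost_stanza is a hypothetical helper name, and the ServerAdmin address is simply copied from the example.

```shell
#!/bin/sh
# Sketch only: vhost_stanza is a hypothetical helper.
# Usage: vhost_stanza IP NAME DOCROOT
vhost_stanza() {
    cat <<EOF
# Virtual host $2
<VirtualHost $1>
    DocumentRoot $3
    ServerAdmin tester@mycomputer.test
    ServerName $2
    DirectoryIndex index.php index.html index.htm index.shtml
    <Directory "$3">
        AllowOverride none
    </Directory>
    HostNameLookups off
</VirtualHost>
EOF
}
```

The output of vhost_stanza 127.0.0.3 www.ccohs.ca /home/tester/websites/CCOHS/www.ccohs.ca/ can be appended to your Apache configuration. Running httpd -t afterwards checks the configuration syntax before you start the server.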



5) *** Start Apache ***

/sbin/service httpd start



6) *** Launch HTTrack ***
   *** Note that HTTrack does a better job of linking directories
       (../../) in ~/websites/CCOHSmerged than in /tmp/CCOHSmerged

mkdir /home/tester/websites/CCOHSmerged
cd /home/tester/websites/CCOHSmerged
httrack www.ccohs.ca www.healthandsafetycentre.org -c100


7) Clean up whatever you've modified (the soft links, the /etc/hosts
   entries and the Apache virtual hosts) before you forget.
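
For the soft links from step 2, the cleanup can be sketched as a loop that removes only links of the exact form "name -> ../name" inside each host directory and leaves everything else alone. unlink_hosts is a hypothetical helper name, and it relies on readlink, which is not in every minimal system.

```shell
#!/bin/sh
# Sketch only: unlink_hosts is a hypothetical helper, not an HTTrack command.
unlink_hosts() {
    project=$1
    for link in "$project"/*.*/*.*; do
        [ -L "$link" ] || continue            # ignore regular files and dirs
        name=$(basename "$link")
        # Remove only links that point at ../<their own name>, as made in step 2.
        if [ "$(readlink "$link")" = "../$name" ]; then
            rm -- "$link"
        fi
    done
}
```

For example: unlink_hosts /home/tester/websites/HS. The /etc/hosts lines and the virtual hosts still have to be removed by hand or restored from your backups.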

