HTTrack Website Copier
Free software offline browser - FORUM
Subject: (creating a) problematic sites list
Author: Haudy Kazemi
Date: 05/13/2002 10:14
 
This is another comment/FYI thing...

I've noticed that there are some problematic/broken 
websites out there that confuse HTTrack or simply 
make 'clean mirroring' difficult.  Here are some:

'Bad' servers like:
home.att.net: according to Xavier, "This server is just 
crap, with '200' HTML responses to .gif requests, and 
therefore accepting all 'gif' files is not a good 
idea :("

Sourceforge and open source software links: beware of 
the CVS sections, and try to keep HTTrack out of the 
webCVS systems.  If you don't, you'll end up copying 
thousands of links you probably don't want.  (The 
source is usually available without going through 
CVS.)
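
As a rough illustration of the kind of exclusion meant here, the sketch below drops webCVS-looking URLs from a list of links using wildcard patterns. This is plain Python, not HTTrack's own filter engine; the patterns and the function names are made up for the example, and real webCVS paths vary from site to site.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion patterns; adjust them to the site being mirrored.
EXCLUDE_PATTERNS = ["*/cvs/*", "*cvsweb*", "*viewcvs*"]

def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if the lowercased URL matches any exclusion pattern."""
    lowered = url.lower()
    return any(fnmatchcase(lowered, pattern) for pattern in patterns)

def filter_links(urls, patterns=EXCLUDE_PATTERNS):
    """Keep only the URLs that no exclusion pattern matches."""
    return [url for url in urls if not is_excluded(url, patterns)]
```

The same idea carries over to whatever exclusion rules the mirroring tool offers: decide up front which path fragments mark the CVS browser, and reject those links before they are followed.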

Message boards: avoid them with HTTrack if you can...they 
create thousands of links too, many of them recursive, so 
if you do mirror one, a max link depth is very important.
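
To show why a depth limit helps with this kind of site, here is a small sketch of a breadth-first walk over a toy link graph that stops following links past a given depth. It is not how HTTrack works internally; the `crawl` function and the `board` pages are invented for the example.

```python
from collections import deque

def crawl(links, start, max_depth):
    """Breadth-first walk of a link graph, at most max_depth hops from start.

    `links` maps each page to the pages it links to (a toy stand-in for a site).
    """
    seen = {start}
    queue = deque([(start, 0)])
    order = []
    while queue:
        page, depth = queue.popleft()
        order.append(page)
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links.get(page, []):
            if target not in seen:  # skip pages already queued (handles loops)
                seen.add(target)
                queue.append((target, depth + 1))
    return order

# Toy board: each thread page links to a "next" page and back to the index.
board = {
    "index": ["thread?p=1"],
    "thread?p=1": ["thread?p=2", "index"],
    "thread?p=2": ["thread?p=3", "index"],
    "thread?p=3": ["thread?p=4", "index"],
}
```

With this toy board, `crawl(board, "index", 2)` visits only the index and the first two thread pages, while a large limit walks the whole chain. The `seen` set stops the loops back to the index, but only the depth limit bounds how far the "next page" chain is followed.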

BTW, Xavier, how can I check what type of response a 
given server gives to a request, as in the case of 
home.att.net?  Can you mention a tool?  (Command line 
on Windows/DOS is fine, or a Linux tool if that's all 
you know of.)
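
For what it's worth, one way to look at this without a dedicated tool is a few lines of Python using only the standard library. The sketch below sends a HEAD request and returns the status code and Content-Type header, and a second helper compares that header with what the URL's extension suggests. The function names are made up for the example, and it assumes the server answers HEAD the same way it would answer GET, which is not guaranteed.

```python
import mimetypes
import urllib.request
from urllib.parse import urlparse

def fetch_status_and_type(url, timeout=10):
    """Send a HEAD request and return (status code, Content-Type header).

    Note: urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    """
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.headers.get("Content-Type", "")

def type_matches_extension(url, content_type):
    """Compare the Content-Type header with what the URL's extension suggests."""
    expected, _ = mimetypes.guess_type(urlparse(url).path)
    if expected is None:
        return True  # unknown extension: nothing to compare against
    actual = content_type.split(";")[0].strip().lower()
    return actual == expected
```

Calling `fetch_status_and_type` on a .gif URL and passing the result to `type_matches_extension` would flag the case described above, where a request for a .gif comes back as a 200 response with an HTML content type.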
 





