HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: 18 hours for a complete failure ?
Author: Charles
Date: 04/22/2012 13:41
 
The CSS file only controls how web pages are displayed in browsers. I don't
know why HTTrack didn't catch the file.

I have used the Windows version of HTTrack for about eight years but I have
not mirrored every type of website there is - so I am not an expert; maybe
just an advanced user when it comes to the types of websites that I have dealt
with.

The website you mirrored is large, and if it is an active website - which it
appears to be - content is possibly being added while the mirroring is going
on. I don't know what kind of page and link shifting is taking place when
content is added to the site.

I have no idea what your plans are for the mirror, but if I were mirroring a
large site like this, if possible I would look for ways to shorten the
mirroring process. For instance, I would ask myself if I really needed all of
the links next to where "Tags:" is written. If the links next to Tags: are
needed then fine. But getting the tags links is causing a whole lot of
redundant copying.

Getting the "Author:" links is also causing redundant copying but I can see
where I would want those links more than the tags links.
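
As a rough way to preview what leaving out the tag links would skip, here is a
small Python sketch. It assumes, purely for illustration, that the tag pages
live under a path containing /tag/ and it uses made-up example.com URLs; the
real site's layout may be different. Python's fnmatch is only standing in for
HTTrack's own scan rule matching, which is not identical.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion pattern; the real site's tag URLs may look different.
EXCLUDE_PATTERNS = ["*/tag/*"]


def is_excluded(url, patterns=EXCLUDE_PATTERNS):
    """Return True if the URL matches any of the exclusion patterns."""
    return any(fnmatchcase(url, pattern) for pattern in patterns)


def split_links(urls, patterns=EXCLUDE_PATTERNS):
    """Split URLs into (kept, skipped) lists according to the patterns."""
    kept = [u for u in urls if not is_excluded(u, patterns)]
    skipped = [u for u in urls if is_excluded(u, patterns)]
    return kept, skipped


if __name__ == "__main__":
    # Made-up links standing in for an article, an author page and a tag page.
    sample = [
        "http://www.example.com/news/article-1.html",
        "http://www.example.com/author/charles/",
        "http://www.example.com/tag/ps3-news/",
    ]
    kept, skipped = split_links(sample)
    print("kept:", kept)
    print("skipped:", skipped)
```

Adding "*/author/*" to the pattern list would drop the author links as well, if
those turn out not to be needed either.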


The HTTrack default options, at least in the Windows version of HTTrack, are
not 'general purpose' options. Different websites can need different options.
There are just too many different websites, and variations of website designs,
for one set of default options to fit them all.

On a site like this PS3 site, in "Flow Control" I would allow only three
connections max. And as William Roeder mentioned, for browser ID, "MSIE 6.0"
should be used. And I always use "no robots.txt rules".

Another thing that William Roeder mentioned was, "Did you use the near flag
(get non-html files related) so you get css/js no matter where stored".
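
For anyone using the command line version rather than the Windows GUI, the
settings mentioned above can be sketched roughly as follows. This Python snippet
only builds and prints the argument list; it does not run anything. The URL and
output folder are placeholders, the browser ID string is just one example that
contains "MSIE 6.0", and the flags (-c for connections, -F for browser ID, -s0
for ignoring robots.txt, -n for near, -O for the output path) are as listed in
the HTTrack command line help, so it is worth checking them against
"httrack --help" for your version.

```python
import shlex

# Example browser ID containing "MSIE 6.0"; any similar string could be used.
DEFAULT_USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)"


def build_httrack_args(url, out_dir, connections=3,
                       user_agent=DEFAULT_USER_AGENT):
    """Build an httrack argument list from the settings discussed above."""
    return [
        "httrack", url,
        "-O", out_dir,          # where the mirror and logs are written
        f"-c{connections}",     # maximum number of simultaneous connections
        "-F", user_agent,       # browser ID sent to the server
        "-s0",                  # never follow robots.txt rules
        "-n",                   # near: also get non-html files related to a page
    ]


if __name__ == "__main__":
    # Placeholder URL and folder, not the site discussed in this thread.
    args = build_httrack_args("http://www.example.com/", "./mirror")
    print(shlex.join(args))
```

To actually start the mirror, the list could be passed to subprocess.run(args,
check=True) on a machine where httrack is installed.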


I cannot tell you why the 'page not found' web pages in your mirror were not
copied. Looking at the links on the original website, they should have been.
 