HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Mirroring problems
Author: Xavier Roche
Date: 05/07/2005 13:59
> 1 - A mirror process that generates external pages and uses
> limit options other than depth/filters (for example maximum
> size or maximum links, at least) leaves some of the links to
> URLs that weren't grabbed as relative links instead of
> external.html?link=<external-link>. It seems that once the
> limit is reached, the process does not finish up properly.

Yes, it breaks links. "Low-level" limits (that is, "depth" or "max size"
limits) break links, and there is no workaround yet. The best way is to use
scan rules, which were designed not to break links (at least when using regular
rules, and not "size" rules), and which are generally a more powerful way to
control the mirror scope ("depth" is a very poor way to bound a mirror,
especially when most content is located in a "deep" structure and the first
levels contain a lot of irrelevant content).
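
As a rough illustration of bounding a mirror with scan rules instead of a depth or size limit, here is a minimal sketch. The site, the output directory and the two excluded patterns are hypothetical and only serve as an example; the script prints the command line for review rather than running it.

```shell
# Hypothetical site and output directory, for illustration only.
SITE="http://www.example.com/"
OUT="./mirror"

# Build the httrack command line with scan rules only:
#   "-www.example.com/forum/*"  skip a section assumed to be irrelevant
#   "-*.zip"                    skip archive files
# No depth (-rN) or size (-mN / -MN) limit is given on purpose.
set -- httrack "$SITE" -O "$OUT" "-www.example.com/forum/*" "-*.zip"

# Print the command for review; to actually start the mirror, run: "$@"
printf '%s\n' "$*"
```

The rules are quoted so that the shell does not expand the wildcards before httrack sees them.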

> I've tested this on 3.3.2 and 3.3.3, sometimes getting lots of "404" pages.
> 2 - Seeing that I was going to have to fix the "404" pages, I used the
> --testsite option, but it seems that only links are followed. A missing
> image is not caught by this test.
> Is there a chance of a fix for these two problems?
> Thanks in advance,
>   Joao Luzio
