HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Broken Links in my Mirror
Author: William Roeder
Date: 03/11/2010 01:15
 
> ananda/lineage/jesus-christ4246.html
> ananda/lineage/jesus-christ.html.txt
> ananda/lineage/jesus-christd5aa.html
> ananda/lineage/jesus-christ.html/rwsmsh/index.html

Look at the page source and you'll see
class="addthis_button_facebook external"
addthis:url=http://www.ananda.org/ananda/lineage/jesus-christ.html/rwsmsh/
The same goes for the share, email, etc. buttons.
So if you have Options -> Links -> "Attempt to detect all links" checked, that's
where rwsmsh comes from.
Because of that, and since you are using site structure, HTTrack has to create
the directory ...lineage/jesus-christ.html/rwsmsh/, since that is what the URL
states. You believe jesus-christ.html is a file; it's not, it's a directory.
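
To make the file-versus-directory clash concrete, here is a rough Python sketch. It is not HTTrack's actual naming code; the helper local_path is hypothetical and simply copies the URL path into a local path, adding index.html for URLs that end in a slash.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def local_path(url: str) -> PurePosixPath:
    """Hypothetical mapping of a URL onto a mirror path that keeps the site's layout."""
    path = urlparse(url).path
    if path.endswith("/"):
        # A URL ending in "/" names a directory, so the page needs an index file.
        path += "index.html"
    return PurePosixPath(path.lstrip("/"))


page = local_path("http://www.ananda.org/ananda/lineage/jesus-christ.html")
share = local_path("http://www.ananda.org/ananda/lineage/jesus-christ.html/rwsmsh/")

print(page)   # the page itself, saved as a file
print(share)  # the AddThis URL, saved below a directory with the same name

# The same name is needed once as a file and once as a directory.
clash = page in share.parents
print(clash)
```

On most file systems one name cannot be both a file and a directory, so the mirror has to store one of the two under a different name.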

Look in 4246.html and you'll see:
Mirrored from www.ananda.org/ananda/lineage/jesus-christ.html?utm_source=addthis%26utm_medium=socialmedia%26utm_campaign=Share
which the log says came from rwsmsh/, so parameters are also involved.
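
A small Python sketch of why the parameters matter (standard library only; the http:// scheme is added here as an assumption). The two URLs share one path but are different URLs, so a mirror has to keep two copies under different names, which is presumably why a file like jesus-christ4246.html appears.

```python
from urllib.parse import parse_qs, urlsplit

plain = "http://www.ananda.org/ananda/lineage/jesus-christ.html"
shared = plain + "?utm_source=addthis%26utm_medium=socialmedia%26utm_campaign=Share"

a, b = urlsplit(plain), urlsplit(shared)

same_path = a.path == b.path   # both point at jesus-christ.html
same_url = plain == shared     # but the query string makes them distinct URLs

# %26 is an encoded "&", so the whole tail is parsed as a single parameter.
params = parse_qs(b.query)

print(same_path, same_url)
print(params)
```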

Add a filter -*/rwsmsh/* and you'll get what you want.
Also, most links are of the absolute href="/image/..." type, and since HTTrack
only goes down by default, most of those links will not be retrieved. Add a
filter to override:
+www.ananda.org/* -*/rwsmsh/*
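
As a rough illustration of how the two rules combine, here is a Python approximation. It is not HTTrack's matcher: it assumes that rules are read in order with the last matching rule winning, and it uses fnmatch-style wildcards as a stand-in. The helper allowed and the file name logo.png are made up for the example.

```python
from fnmatch import fnmatchcase
from typing import Sequence


def allowed(url: str, rules: Sequence[str], default: bool = False) -> bool:
    """Approximate +/- filter rules: the last rule that matches decides."""
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict


rules = ["+www.ananda.org/*", "-*/rwsmsh/*"]

# An absolute link elsewhere on the site is now accepted (hypothetical file name).
image_ok = allowed("www.ananda.org/image/logo.png", rules)

# The AddThis URL is rejected because the later "-" rule also matches it.
share_ok = allowed("www.ananda.org/ananda/lineage/jesus-christ.html/rwsmsh/", rules)

print(image_ok, share_ok)
```

The order matters in this sketch: with the two rules swapped, the "+" rule would match last and the rwsmsh URL would be accepted again.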
 

