Hi!
I'm trying to archive a blog, say the base URL is
<http://myjazzworld.blogspot.com/>
using HTTrack (latest version).
My scan rules are:
-*
+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*
-mime:application/foobar
+*myjazzworld.blogspot.com/2009/*
+*myjazzworld.blogspot.com/2008/*
+*myjazzworld.blogspot.com/2007/*
Options "Attempt to detect all links" and "Get HTML files first!" are
enabled.
I've tried depth = 2, 3 and 4 (gigabytes of downloads and hours of crawling!).
All I get is 2009, a few months of 2008 and nothing from 2007, even though there is
plenty of older content on the site and those folders are well within 3 clicks of
(and referenced on) the home page.
Should I go for infinite depth? Is there any other way? Other options?
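In case it helps to see it spelled out, here is roughly what I think the command-line
equivalent of my settings would be, with a deeper recursion depth. I'm guessing at how
the GUI options map to flags (-r for depth, -%P for "attempt to detect all links",
-p7 for "get HTML files first"), so treat it as a sketch rather than exactly what I ran:

  # depth 6, extended link parsing, HTML files first (my reading of the option docs)
  httrack "http://myjazzworld.blogspot.com/" -O "./myjazzworld" -r6 -%P -p7 \
    "-*" "+*.png" "+*.gif" "+*.jpg" "+*.css" "+*.js" \
    "-ad.doubleclick.net/*" "-mime:application/foobar" \
    "+*myjazzworld.blogspot.com/2009/*" "+*myjazzworld.blogspot.com/2008/*" \
    "+*myjazzworld.blogspot.com/2007/*"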
Also: is there any way to save only the images from a certain domain in the same
directory as the HTML files?
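From reading the docs, I'm wondering if the "Structure" / build option (-N on the
command line, I believe) is what I'm after; my understanding is that -N3 puts HTML and
images/other files together under web/, but I may be misreading it. Something like the
following, with filters that only accept images from the one domain, is what I had in
mind (again just a sketch; the meaning of -N3 and the filter patterns are my assumptions):

  # -N3: HTML and images/other files together in web/ (my reading of the -N structure codes)
  httrack "http://myjazzworld.blogspot.com/" -O "./myjazzworld" -N3 \
    "-*" "+*myjazzworld.blogspot.com/*.html" \
    "+*myjazzworld.blogspot.com/*.jpg" "+*myjazzworld.blogspot.com/*.gif" \
    "+*myjazzworld.blogspot.com/*.png"

Is that the right idea, or is there a cleaner way?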
Thanks!! Zaz