Hmm, I think I see the problems now. Firstly, that tumblr page relies heavily
on javascript, and there do not appear to be conventional links for
non-javascript browsers to follow. httrack tries to parse javascript but often
has trouble with it, so you need to give httrack a list of links it can follow.
For convenience I've extracted all the links for you and put them here:
<http://pastebin.com/Ty6rwhi0>
Copy them into a .txt file, one link per line. Below the Web Addresses box in
httrack there's a "URL list" feature, where you specify the .txt file. You can
leave the Web Addresses box empty. I know this seems complicated, but it is the
only workaround I can think of to download that site.
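
If the links get mangled when you copy them out of the browser, a small script can tidy them into the one-link-per-line format that the URL list expects. This is only a rough sketch: the function name `write_url_list` and the file name `urls.txt` are made up for illustration, and it assumes the pasted text contains plain http/https links separated by whitespace.

```python
import re
from pathlib import Path
from typing import List

# Anything that starts with http:// or https:// and runs up to the next whitespace.
URL_PATTERN = re.compile(r"https?://\S+")


def write_url_list(raw_text: str, out_path: str) -> List[str]:
    """Pull the links out of pasted text and save them one per line."""
    seen = set()
    urls = []
    for match in URL_PATTERN.findall(raw_text):
        # Drop trailing punctuation that sometimes gets picked up from a paste.
        url = match.rstrip(".,;>)")
        if url not in seen:  # skip duplicates, keep the original order
            seen.add(url)
            urls.append(url)
    Path(out_path).write_text("\n".join(urls) + "\n", encoding="utf-8")
    return urls
```

You would call it as `write_url_list(pasted_text, "urls.txt")` and then point the "URL list" field at the resulting file.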
In addition, the filters in your pic are probably causing errors as well. This one:
<http://fablefaser.tumblr.com/archive?*=*>>;;
should be
+http://fablefaser.tumblr.com/archive?*=*
and you should remove
+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*
-mime:application/foobar
as the other filters override these anyway.
>Will using command line only work better perhaps
No, they are identical apart from the user interface. Stick with the GUI
unless you are batch scripting crawls.
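
If you ever do want to script it, the same setup can be expressed as a command line. Below is a rough sketch that only builds and prints the command rather than running it. It assumes httrack's `-%L` option (read start URLs from a text file, one per line) and `-O` (output directory); the names `build_httrack_command`, `urls.txt` and `./fablefaser-mirror` are placeholders I made up.

```python
import shlex


def build_httrack_command(url_list_file, output_dir, filters):
    """Assemble an httrack argument list; nothing is executed here."""
    # -%L: take the start URLs from a text file, one URL per line.
    # -O:  directory where the mirror is written.
    args = ["httrack", "-%L", url_list_file, "-O", output_dir]
    # Filters are passed as plain arguments, e.g. "+pattern" or "-pattern".
    args.extend(filters)
    return args


cmd = build_httrack_command(
    "urls.txt",            # hypothetical file holding the extracted links
    "./fablefaser-mirror",  # hypothetical output directory
    ["+http://fablefaser.tumblr.com/archive?*=*"],
)
# Print a shell-quoted version that can be pasted into a terminal or batch script.
print(" ".join(shlex.quote(arg) for arg in cmd))
```

The quoting matters because the filter contains `?` and `*`, which a shell would otherwise try to expand.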