Re: HTTrack (Tumblr) Method - HTTrack Website Copier Forum

Subject: Re: HTTrack (Tumblr) Method

Author:

Date: 02/09/2012 12:46

Hmm, I think I see the problems now. Firstly, that Tumblr page relies heavily
on JavaScript, and there do not appear to be conventional links for
non-JavaScript browsers to follow. HTTrack tries to parse JavaScript but often
has trouble with it, so you need to give it a list of links it can follow.
For convenience I've extracted all the links for you and put them here:
<http://pastebin.com/Ty6rwhi0>. Copy them into a .txt file, then look below the
Web Addresses box in HTTrack, where there's a "URL list" feature that lets you
specify the .txt file. You can leave the Web Addresses box empty. I know this
seems complicated, but it is the only workaround I can think of to download
that site.
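
If the pasted list ends up with blank lines, stray text or duplicates, a small script can tidy it before you point the "URL list" box at it. This is only a rough sketch: the function names and the `urls.txt` file name are my own placeholders, and it assumes the list has one URL per line.

```python
from pathlib import Path
from urllib.parse import urlsplit


def clean_url_list(raw_text):
    """Return the unique http(s) URLs found in raw_text, one per line, in order."""
    seen = set()
    urls = []
    for line in raw_text.splitlines():
        url = line.strip()
        if not url:
            continue  # skip blank lines
        try:
            parts = urlsplit(url)
        except ValueError:
            continue  # skip lines that cannot be parsed as a URL
        if parts.scheme not in ("http", "https") or not parts.netloc:
            continue  # skip anything that is not an absolute web address
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


def write_url_list(urls, path):
    """Write the URLs to a text file, one per line."""
    Path(path).write_text("\n".join(urls) + "\n", encoding="utf-8")
```

You would paste the links into a string (or read them from a file), call `clean_url_list` on it, and then `write_url_list(urls, "urls.txt")` to get the file to select in HTTrack.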

In addition, the filters in your pic are probably causing errors as well:

<http://fablefaser.tumblr.com/archive?*=*>>;;

should be

+http://fablefaser.tumblr.com/archive?*=*

and remove 

+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*
-mime:application/foobar

as the other filters override them anyway.


>Will using command line only work better perhaps

No, they are identical apart from the user interface. Stick with the GUI
unless you are batch scripting crawls.
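
If you do end up scripting crawls, a wrapper along these lines might be a starting point. Treat it as a sketch under assumptions: it relies on the `--list` (URL list file) and `-O` (output path) options as I understand them from the HTTrack command-line documentation, the function names and file names are placeholders of mine, and it needs `httrack` to be on your PATH. Check `httrack --help` on your install before relying on it.

```python
import subprocess


def build_httrack_command(url_list_file, output_dir, filters=()):
    """Build the argument list for one httrack run (nothing is executed here)."""
    # Assumed options: --list reads URLs from a text file, -O sets the mirror path.
    command = ["httrack", "--list", url_list_file, "-O", output_dir]
    # Scan rules such as "+http://example.com/*" are passed as plain arguments.
    command.extend(filters)
    return command


def run_crawl(url_list_file, output_dir, filters=()):
    """Run httrack with the built command and return its exit code."""
    command = build_httrack_command(url_list_file, output_dir, filters)
    return subprocess.run(command, check=False).returncode
```

For example, `run_crawl("urls.txt", "mirror", ["+http://fablefaser.tumblr.com/archive?*=*"])` would start the same crawl as the GUI settings described above. Passing the arguments as a list avoids the shell trying to expand the `*` and `?` in the filter.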
