HTTrack Website Copier
Free software offline browser - FORUM
Subject: My settings for downloading a Tumblr blog
Author: Maria
Date: 02/12/2018 05:54
 
My settings for downloading a Tumblr blog with HTTrack
 
Title: myhostname
 
URL: <http://myhostname.tumblr.com>
 
URL list (.txt): Beforehand, I extracted a list of media links with
TumblrThree, an application for ripping Tumblr media: the raw images hosted
on <http://data.tumblr.com/> (links which I renamed to start with
<http://s3.amazonaws.com/data.tumblr.com/>), plus audio and video from
<https://a.tumblr.com/>, <https://www.tumblr.com/audio_file/myhostname> and
<https://vt.tumblr.com>. TumblrThree extracted the links with the https
protocol, so I used Notepad++ to rename them all to start with http, and then
put all the links in a single txt file.
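
The renaming done by hand in Notepad++ could also be scripted. The following is only a minimal sketch of that step, assuming the extracted links sit one per line in a text file; the function names and the file names in the usage note are hypothetical and not part of TumblrThree or HTTrack.

```python
import re


def normalize_link(url):
    """Rewrite one extracted media link as described in the post."""
    url = url.strip()
    # Switch the scheme from https to http.
    url = re.sub(r"^https://", "http://", url)
    # Point raw-image links at the s3.amazonaws.com path instead.
    url = re.sub(
        r"^http://data\.tumblr\.com/",
        "http://s3.amazonaws.com/data.tumblr.com/",
        url,
    )
    return url


def build_url_list(lines):
    """Normalize links, skipping blank lines and duplicates, keeping order."""
    seen = set()
    result = []
    for line in lines:
        if not line.strip():
            continue
        link = normalize_link(line)
        if link not in seen:
            seen.add(link)
            result.append(link)
    return result


def write_url_list(src_path, dst_path):
    """Read raw links from src_path and write the cleaned list to dst_path."""
    with open(src_path, encoding="utf-8") as src:
        links = build_url_list(src)
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write("\n".join(links) + "\n")
```

Calling write_url_list("extracted_links.txt", "url_list.txt") would then produce the single txt file to select in the URL list field.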
 
Tab Scan Rules:
 
+*.png +*.gif +*.jpg +*.css +*.js -ad.doubleclick.net/*
-mime:application/foobar
+http://myhostname.tumblr.com/* +http://static.tumblr.com/*
+http://assets.tumblr.com/* +http://media.tumblr.com/*
+http://*.media.tumblr.com/* -*?*=* -*=*
+http://myhostname.tumblr.com/archive?*=* +http://www.tumblr.com/photo/*
+http://myhostname.tumblr.com/post/*
+http://s3.amazonaws.com/data.tumblr.com/*
+http://a.tumblr.com/* +http://www.tumblr.com/audio_file/myhostname/*
+http://vt.tumblr.com/*
-en.wikipedia.org/*  -www.google-analytics.com/*
--disable-security-limits
--max-rate 5000000
--assume php=text/html
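
The order of the rules above matters: according to HTTrack's filter documentation, rules are read from first to last and a later matching rule takes priority over an earlier one. That is why -*?*=* can exclude query URLs in general while the later +http://myhostname.tumblr.com/archive?*=* lets the archive pages back in. The sketch below is a simplified model of that behaviour, not HTTrack's actual matcher: it assumes only "*" is a wildcard, ignores the http:// prefix, and uses hypothetical function names.

```python
import re


def strip_scheme(text):
    """Drop a leading http:// so rules with and without it compare equally."""
    return text[len("http://"):] if text.startswith("http://") else text


def rule_to_regex(pattern):
    """Translate a rule pattern to a regex; only '*' acts as a wildcard."""
    parts = [re.escape(part) for part in strip_scheme(pattern).split("*")]
    return re.compile(".*".join(parts))


def is_allowed(url, rules, default=False):
    """Apply rules in order; the last rule that matches decides the verdict."""
    verdict = default
    target = strip_scheme(url)
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if rule_to_regex(pattern).fullmatch(target):
            verdict = (sign == "+")
    return verdict


# A subset of the scan rules from the post, in the same order.
RULES = [
    "+http://myhostname.tumblr.com/*",
    "-*?*=*",
    "-*=*",
    "+http://myhostname.tumblr.com/archive?*=*",
    "+http://myhostname.tumblr.com/post/*",
]
```

Under this model, an archive URL with a query string is accepted, while any other URL on the blog that contains "=" is rejected.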
 
Tab Spider: no robots.txt
 
Tab Limits:
 
Max transfer rate (B/s): 0 or a big number like 999999 or 5000000
Max connections / seconds: 50
 
Tab Flow Control:
 
Number of connections: 50
Retries: 5
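
For anyone using the command-line httrack instead of the GUI tabs, the settings above might map to an argument list like the one built below. This is a sketch under assumptions: the short options (-O, -%L, -s0, -A, -%c, -c, -R) are taken from the httrack help text and should be checked against `httrack --help` for your version, and the function names, output directory and list file name are hypothetical.

```python
import shlex


def build_httrack_args(start_url, output_dir, url_list, scan_rules):
    """Assemble an httrack argument list mirroring the GUI settings above."""
    args = [
        "httrack",
        start_url,
        "-O", output_dir,             # mirror destination (hypothetical path)
        "-%L", url_list,              # URL list (.txt), one link per line
        "-s0",                        # Spider: no robots.txt
        "-A5000000",                  # Max transfer rate (B/s)
        "-%c50",                      # Max connections / seconds
        "-c50",                       # Number of connections
        "-R5",                        # Retries
        "--disable-security-limits",  # needed for the high limits above
    ]
    # Scan rules are passed as plain arguments after the options.
    args.extend(scan_rules)
    return args


def format_command(args):
    """Quote each argument so the command can be pasted into a shell."""
    return " ".join(shlex.quote(arg) for arg in args)
```

For example, format_command(build_httrack_args("http://myhostname.tumblr.com", "mirror", "url_list.txt", ["+*.png", "-*?*=*"])) returns a single shell-safe command string.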
 
I may have included some unnecessary scan rules, but that's it. I'm still
experimenting with the program.
 
Maria
 