HTTrack Website Copier
Free software offline browser - FORUM
Subject: help solve this challenge
Author: James Clear
Date: 10/15/2016 02:33
 
A friend needed some reference material / stats for school and

was trying to grab some or all of

<http://www.indiegogo.com/>

but could not, so he asked me to help.

I tried and could not either.

Now it has become more of a personal curiosity and a challenge: why does it
fail, and how can it be done? Please help me decipher it.

Life is long, and I can learn something for the future.

So:

#1 I needed to change my user agent to something like

Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like
Gecko) Chrome/53.0.2785.143 Safari/537.36

(otherwise I was getting <https://www.indiegogo.com/explore#/browse/landing>)
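
For anyone who wants to reproduce the user-agent change outside WinHTTrack, here is a minimal sketch using Python's standard urllib. It only builds the request and prints the header that would be sent, without contacting the site. The helper name build_request is made up for illustration.

```python
import urllib.request

# The browser-like agent string quoted above, joined back onto one line.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36"
)


def build_request(url, user_agent=USER_AGENT):
    """Return a Request object carrying the given User-Agent header.

    Hypothetical helper: nothing is sent over the network here.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


request = build_request("https://www.indiegogo.com/")
# urllib stores header names capitalized, so the lookup key is "User-agent".
print(request.get_header("User-agent"))
```

Whether the site treats such a request the same way it treats a real Chrome session is an open question; the agent string is only one of the things a server can check.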

Then I was getting sent to

<https://www.indiegogo.com/distil_r_blocked.html?Ref=/campaign_collections/home-decor-diy-items>

which says:

1. You're a power user moving through this website with super-human speed.

2. You've disabled JavaScript in your web browser.

So I needed to change the WinHTTrack project settings:

Set options > Limits > Max connections / seconds to 0.01
(one new connection about every 100 seconds)

and

Flow Control > Number of connections to 1 (and uncheck Persistent
connections (Keep-Alive))
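
To put numbers on that limit: 0.01 connections per second works out to one new connection every 100 seconds, a little under 2 minutes. The sketch below does the arithmetic, assuming the limit behaves as a simple rate with one connection per page and nothing in parallel. How HTTrack schedules connections internally is not something I have checked, and the page count of 50 is a made-up figure.

```python
def seconds_between_connections(connections_per_second):
    """Convert a 'max connections / second' rate into the gap between connections."""
    if connections_per_second <= 0:
        raise ValueError("rate must be positive")
    return 1.0 / connections_per_second


def estimated_minutes(page_count, connections_per_second):
    """Rough time for page_count pages, one new connection each, none in parallel."""
    return page_count * seconds_between_connections(connections_per_second) / 60.0


# Gap implied by the 0.01 setting, in seconds.
print(seconds_between_connections(0.01))
# Hypothetical crawl of 50 pages at that rate, in minutes.
print(round(estimated_minutes(50, 0.01), 1))
```

At this rate even a one-level-deep copy of a page with many links would take hours, so the setting is only practical for a small set of pages.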

BUT I am still not able to rip the site.

After about 2 minutes

WinHTTrack stops and says it noticed that the mirror is EMPTY,

with the log showing

... 20:04:39 Error:  "File Cache Entry Not Found" (-1) at link
»www.indiegogo.com/distil ··· 15000141 (from »www.indiegogo.com/campai
··· iy-items)
HTTrack Website Copier/3.48-22 mirror complete in 2 minutes 58 seconds : 3
links scanned, 2 files written (37608 bytes overall), 2 files updated [12841
bytes received at 72 bytes/sec], 37608 bytes transferred using HTTP
compression in 2 files, ratio 30%
(1 errors, 0 warnings, 0 messages)

hmmmmmmmm

I notice that if I go to <https://www.indiegogo.com/>
in, say, Chrome,
right-click on any link and select

Save Link As

it saves the HTML just as I want it.

How can we do this with WinHTTrack? What settings need to be changed?
ONE LEVEL DEEP would be good enough,
or maybe even giving it a list of exact links to download. How?
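
To make the "list of exact links" idea concrete, here is a minimal sketch of doing it outside HTTrack: fetch each URL from a list one at a time, with the browser-like user agent and a long pause between requests, and save the HTML. It assumes plain HTTP requests are enough, which is untested. The block page quoted above mentions JavaScript, so the site may still refuse requests like these. The names fetch_html, local_name and save_pages are made up for illustration, and the 100-second default pause mirrors the 0.01 connections/second setting.

```python
import time
import urllib.request
from pathlib import Path
from urllib.parse import urlsplit

USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36"
)


def fetch_html(url, user_agent=USER_AGENT, timeout=30):
    """Download one page with a browser-like User-Agent and return its bytes."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def local_name(url):
    """Turn a URL into a flat file name built from its host and path."""
    parts = urlsplit(url)
    stem = (parts.netloc + parts.path).strip("/").replace("/", "_") or "index"
    return stem + ".html"


def save_pages(urls, out_dir, delay_seconds=100.0, fetch=fetch_html, sleep=time.sleep):
    """Fetch each URL in order, one connection at a time, pausing between requests.

    fetch and sleep are parameters so the loop can be tried without network access.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # wait before every request except the first
        target = out / local_name(url)
        target.write_bytes(fetch(url))
        saved.append(target)
    return saved


# Example call (performs real requests, so it is left commented out):
# save_pages(["https://www.indiegogo.com/"], "indiegogo_copy")
```

If the saved files turn out to contain the blocked page rather than the campaign HTML, the user agent and the slow pace alone are not what the site is checking.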
OR, as an alternative:

What other website ripper could I use successfully?
Or what Chrome site ripper could be used?
Thank you so much. Looking forward to an interesting discussion.
 