| Well, a friend needed some reference material / stats for school and
was trying to grab some or all of
<http://www.indiegogo.com/>
but could not, so he asked me to help.
I tried and could not either.
Now it has become more of a personal curiosity and challenge ... I want to see
why it fails and how to make it work. Please help me decipher this.
Life is long ... I can learn something for the future ...
So:
#1 I needed to change my user agent to something like
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like
Gecko) Chrome/53.0.2785.143 Safari/537.36
(otherwise I kept getting <https://www.indiegogo.com/explore#/browse/landing> ).
Then I was getting
<https://www.indiegogo.com/distil_r_blocked.html?Ref=/campaign_collections/home-decor-diy-items>
which says:
1. You're a power user moving through this website with super-human speed.
2. You've disabled JavaScript in your web browser.
So I changed these settings in the WinHTTrack project:
Set options > Limits > Max connections / seconds to 0.01
(one new connection every 100 seconds, so a bit under 2 minutes)
and
Flow Control > Number of connections to 1 (and unchecked the Persistent
connections (Keep-Alive) check box)
BUT I am still not able to rip the site.
After about 2 minutes WinHTTrack stops and says it noticed that the mirror is EMPTY,
with the log showing:
... 20:04:39 Error: "File Cache Entry Not Found" (-1) at link
»www.indiegogo.com/distil ··· 15000141 (from »www.indiegogo.com/campai
··· iy-items)
HTTrack Website Copier/3.48-22 mirror complete in 2 minutes 58 seconds : 3
links scanned, 2 files written (37608 bytes overall), 2 files updated [12841
bytes received at 72 bytes/sec], 37608 bytes transferred using HTTP
compression in 2 files, ratio 30%
(1 errors, 0 warnings, 0 messages)
hmmmmmmmm
I notice that if I go to <https://www.indiegogo.com/>
in, say, Chrome,
right-click on any link and select
Save Link As,
it saves the HTML the way I want it.
How can we do the same with WinHTTrack? What settings need to be changed?
ONE LEVEL DEEP would be good enough,
or maybe even giving it a list of exact links to download. How?
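To make the "list of exact links" idea concrete, here is a rough Python sketch of what I have in mind. It is not a WinHTTrack feature, just the standard library: it sends the same browser-like user agent as above and fetches one link at a time with a 100 second pause, which matches the 0.01 connections per second setting. The names `fetch`, `filename_for` and `save_all` are my own, made up for illustration. Also an assumption on my part: since the block page mentions JavaScript, a plain HTTP client like this may well be blocked the same way WinHTTrack is.

```python
import time
import urllib.request
from pathlib import Path
from urllib.parse import urlparse

# Browser-like agent string copied from the post above.
USER_AGENT = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/53.0.2785.143 Safari/537.36"
)


def fetch(url, timeout=30):
    """Download one page, sending the browser-like User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def filename_for(url):
    """Turn a URL into a flat, filesystem-safe .html file name."""
    parsed = urlparse(url)
    stem = (parsed.netloc + parsed.path).strip("/").replace("/", "_")
    return (stem or "index") + ".html"


def save_all(urls, out_dir, delay_seconds=100.0, fetcher=fetch, sleep=time.sleep):
    """Fetch each URL in turn, one at a time, pausing between requests."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # 100 s matches 0.01 connections per second
        target = out_path / filename_for(url)
        target.write_bytes(fetcher(url))
        saved.append(target)
    return saved
```

Calling `save_all(list_of_links, "indiegogo_pages")` would write one .html file per link into that folder. The `fetcher` and `sleep` parameters are only there so the loop can be tried out without touching the network or actually waiting.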
OR, as an alternative:
what other website ripper can I use successfully?
Or what Chrome site ripper could be used?
Thank you so much, looking forward to an interesting discussion ... | |