HTTrack Website Copier
Free software offline browser - FORUM
Subject: js "INVALID_ACCESS_ERR" + server "Session Expired"
Author: Albretch Müller
Date: 01/03/2017 22:52
 
// __ javascript "INVALID_ACCESS_ERR" and server end "Session Expired" errors
...

 I am a teacher, tired of repeatedly having to go through the NYC Regents site
each time I need a previous exam (the exams are public domain anyway).

 They keep all those files behind deep JavaScript obfuscation that generates
dynamic, one-time links to them, even changing the order in which the files
are displayed!

 wget doesn't have JavaScript capabilities, so I hoped httrack would help me
out with this. However, I am not sure my run-time options for getting those
PDF files are correct. I have tried a number of combinations, from allowing
only HTML and PDF files to downloading everything, but httrack hasn't been
able to download the PDF files of the previous Regents exams (it seems to
download only the PDF files that have direct, explicit links to them).
 
 Also, I have been looking for a "referrer" option, but I can't find one.
~
// __ my script:

_NM="<name>"
_ODIR="<output directory>"
_LOG_DIR="<...>"
# _USR is used below for --user-agent but was never set; placeholder value
_USR="<user agent>"

_LOG="${_LOG_DIR}/${_NM}_$(basename "${_ODIR}")_$(date +%Y%m%d%H%M%S).log"

# quoted: otherwise the shell splits the assignment at each "&"
_START_URL="http://nysl.cloudapp.net/awweb/guest.jsp?smd=2&cl=library1_lib&nid=16/17/25/513/18961/18962/9537"

# the two mime filters are passed as separate arguments
 httrack  \
  --extra-log  \
  --debug-log  \
  --verbose  \
  --extended-parsing=N  \
  --near  \
  --test  \
  -U \
  --user-agent "${_USR}"  \
  --robots=0   \
  "${_START_URL}"  \
  "+mime:text/html" "+mime:application/pdf"   \
  "-r6" >> "${_LOG}" 2>&1
~
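
 One thing worth checking in a script like this is how the shell itself reads a start URL that contains "&". The following is only a minimal sketch under assumptions: the URL is a made-up one on the reserved example.invalid domain, and each assignment is run in a child sh so the effect can be compared side by side. It does not call httrack at all.

```shell
#!/bin/sh
# Hypothetical URL, for illustration only (example.invalid never resolves).

# Unquoted: the shell ends the assignment at the first '&', runs it as a
# background job, and reads 'cl=library1_lib' as a separate assignment,
# so _START_URL is never set in the shell that goes on to print it.
unquoted_result=$(sh -c 'unset _START_URL
_START_URL=http://example.invalid/guest.jsp?smd=2&cl=library1_lib
printf "%s" "$_START_URL"')

# Quoted: the whole string, including '&', ends up in the variable.
quoted_result=$(sh -c 'unset _START_URL
_START_URL="http://example.invalid/guest.jsp?smd=2&cl=library1_lib"
printf "%s" "$_START_URL"')

printf 'unquoted: [%s]\n' "$unquoted_result"
printf 'quoted:   [%s]\n' "$quoted_result"
```

 If the "unquoted" line prints empty brackets, a crawler started from that variable would receive no start URL at all, which is worth ruling out before looking at the crawler's own options.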
 Please let me know what is wrong with it and how I can troubleshoot it.
Otherwise, if httrack is not capable of handling this kind of JavaScript
obfuscation, is there another application you would suggest?
 