Hi William, thanks for the response. I have got some way further as a result
of your help, but have not managed to mirror the course.
Httrack seems to log in OK and seems to scan the term309 server - it looks as
though it is doing the job properly. However, looking at the log file it has
only scanned help files on the term309 server.
Browsing the mirrored result now shows the normal page reached after logging
in (portal.univ.ac.uk/index.html), which lists all the courses I have taken
(i.e. httrack has successfully logged in to the first server).
The link to the course I want to mirror shows as:
i.e. an external link from the mirror and clicking it takes me to the term309
server, not to the mirror copy.
The mirror 'index of locally available sites' lists a local term309, but
clicking through shows a login redirect page (telling the user to go back to
portal and log in).
So it seems httrack is not logged in for the second url, the term309 server.
Thinking about it, there are two ways httrack can get to term309 : firstly via
the url I gave it, which appears to have a login failure, or secondly via the
link on the page listing all the courses (shown above).
It does not appear to follow the link either. Is there a way I can get it to
follow the link :
I am guessing it is something to do with the fact that the links are all
showing as php ?
[Answers to your other questions below]
> >
> <http://[user:pass]@portal.univ.ac.uk/login/index.php>
> > ?>postfile:[myc]hts-post0
> > This seems to login fine, but only gets files
> from
> Does the browser bring up a pop up (http) to log in
> at that url
> or is it a form.
It is a form.
> <http://httrack.kauler.com/help/Authentication>
> if you open just your term309... don't you also get
> the same login?
No. It asks you to log in on the portal server. It won't allow direct login to
this server.
> > portal.univ.ac.uk. I ONLY want to get the
> specific
> > course on term309.portal.univ.ac.uk. I have
> After log in add that url
Thought I had done that when I set up these two urls :
> > experimented with Scan Rules without success, eg
> :
> Don't post what you tried, post the actual command
> line used (log file line two)
(For obvious reasons I have replaced my username and password here as well as
the actual server address)
> > +*.png +*.gif +*.jpg +*.css +*.js
> These should be last. Actually you should use just
> the near flag (get non-html related) as these will
> not get things like getImage.php?ID=xx
Please excuse my ignorance: what is the near flag? Could this be the reason it
is not getting .php?ID=xx ?
> > +*[name].http://term309.portal.univ.ac.uk/*
> invalid filter
> Eliminate everything but
> -*.portal.univ.ac.uk/*
> +term309.portal.univ.ac.uk/*
This worked and allowed me to get much further than before.
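To check my understanding of why the order of those two rules matters, here is
a rough sketch. It assumes, as I read the HTTrack filter documentation, that
rules are evaluated first to last and the last matching rule wins. It uses
Python's fnmatch as a stand-in for HTTrack's own wildcard matching, so it is
only an approximation, and the example paths (course/view.php,
other.portal.univ.ac.uk) are made up for illustration.

```python
from fnmatch import fnmatchcase


def allowed(url, rules, default=True):
    """Return the verdict of the last rule whose pattern matches url.

    Each rule is '+pattern' (accept) or '-pattern' (reject).
    fnmatch only approximates HTTrack's wildcard syntax.
    """
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")  # a later match overrides an earlier one
    return verdict


# The two rules from the reply, in the suggested order.
RULES = ["-*.portal.univ.ac.uk/*", "+term309.portal.univ.ac.uk/*"]

# Hypothetical URLs, with the scheme left off as in the filters.
print(allowed("term309.portal.univ.ac.uk/course/view.php", RULES))
print(allowed("other.portal.univ.ac.uk/help/index.html", RULES))

# Same rules reversed: the broad exclusion now comes last and wins.
print(allowed("term309.portal.univ.ac.uk/course/view.php", RULES[::-1]))
```

If that assumption is right, the term309 rule has to come after the broad
exclusion, otherwise the exclusion overrides it.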
> > I have also set all php and asp types to be
> > text/html and disabled robot.txt
> WHY?
Experimenting to see if php was the problem. I have unset the MIME types.
Robots.txt appeared in the first error log.
> > The logfile shows :
> > Error: "Not Found" (404) at link
> Not an HTT problem, bad links on the site or a login
> problem.