HTTrack Website Copier
Free software offline browser - FORUM
Subject: How to grab a website like this.
Author: Kenny Siu
Date: 07/26/2012 21:06
 
Hello, this is my first post here. I have already been through several long
waits and failed attempts before asking. I have also read the user manual,
but I really can't understand it.

My goal is clear: to download everything on one particular website, excluding
any external links or banners outside the domain. The site requires HTTP
authentication, and I am a legitimate member of it.

Whenever I try to download the site, HTTrack starts without problems, but
after some time it begins to show a lot of "error:....." and "warning......."
messages. They seem to go on forever: they had not stopped after many hours,
not even after I had slept for ten hours. I don't know exactly what is
happening because I am not a website expert, but I can "smell" a problem.

For security reasons I can't give out my login information for the site, but
I don't think its structure is complicated to work with. I will try to
describe what the site looks like.



1. Most of the files I am interested in are located under
<http://www.DOMAIN.com/members/>...............

2. The main page of the members area is
<http://www.DOMAIN.com/members/index_eng.php>

Is this a PHP structure? What is PHP?

3. Under the main page there are a few sub-pages.
They mostly have the same structure. For example,

<http://www.DOMAIN.com/members/videos_eng.php>
<http://www.DOMAIN.com/members/gallery_eng.php>

In fact, videos and galleries make up the majority of the website.

4. Under the video section there are some sub-pages that sort the videos into
a clear arrangement. The links for these sub-pages are

<http://www.DOMAIN.com/members/videos_eng.php?page=1>
<http://www.DOMAIN.com/members/videos_eng.php?page=2>
<http://www.DOMAIN.com/members/videos_eng.php?page=3>
.........

Each page has some downloadable video links,
but the files are stored in a very "deep" directory.

e.g. <http://www.DOMAIN.com/members/video/142/download/video142.mov>

All other downloadable videos have a similar link structure, with 142
replaced by other numbers.


5. The gallery section is the most complicated.
Under it there are quite a number of introductory
thumbnails, each leading to a large gallery collection.

The links for these thumbnails are difficult for me to understand. For example,
<http://www.DOMAIN.com/members/gallery/gallery_eng.php?img_gallery_id=362&return=%2Fmembers%2Fgallery_eng.php>

All other thumbnails there have a link with the same structure as above.

Clicking any of these thumbnails leads to another page containing hundreds of
smaller thumbnails. These have a link structure like, for
example,

<http://www.DOMAIN.com/members/gallery/viewimage_eng.php?img_gallery_entry_id=67731>

Clicking one of those small thumbnails leads to a larger version of the
image. At the top of the HTML page for the larger image there are backward,
forward and return buttons that lead to other images.





Although I am not very familiar with web pages, I don't think this is a
complicated site. However, I can't get it to work. I've tried adding "rules"
to include mov, jpg, etc. I have also left the max depth blank and set it to
download only within the domain. I am sure it should not go to external links.
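
To make the intent of those "rules" concrete, here is a rough Python model of
an include/exclude list written in the "+pattern" / "-pattern" style. It is a
sketch under stated assumptions, not HTTrack's real matcher: the matching is
done with Python's fnmatch, the rule list is a hypothetical one for this site,
and "the last matching rule wins" is how this sketch orders things.

```python
from fnmatch import fnmatchcase

# Hypothetical rule list: refuse everything, then allow the members area.
RULES = [
    "-*",
    "+www.DOMAIN.com/members/*",
]


def allowed(url: str, rules=RULES) -> bool:
    """Return True if the last rule that matches the URL is a '+' rule."""
    target = url.split("://", 1)[-1]  # drop "http://" before matching
    verdict = False                   # nothing matched: do not download
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(target, pattern):
            verdict = (sign == "+")   # later rules override earlier ones
    return verdict


for url in (
    "http://www.DOMAIN.com/members/video/142/download/video142.mov",
    "http://www.DOMAIN.com/members/gallery/viewimage_eng.php?img_gallery_entry_id=67731",
    "http://ads.example.net/banner.gif",  # made-up external banner
):
    print(allowed(url), url)
```

With a list like this, every link under /members/ is accepted whatever its
file type, so separate rules for mov or jpg would not be needed in this model.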



Please help! Any help would be highly appreciated! Thank you!



 



