> 1) Site with session id.
> (This has already been mentioned by Julio
> and Peter Drakes.)
> It seems there is no solution so far.
Changing session IDs is really hard to handle; I will
try to find a way to avoid this problem, but due to
the post-load system, this isn't obvious (see below).
> 2) In the option 'Expert Only' item 'Travel
> Mode', the smallest scope
> of downloading is 'Stay in same directory'.
> If a site has the following paging
> structure, like
>
> <http://www.aaaa.com/news.asp>
> (2-1)
>
> <http://www.aaaa.com/history.asp>
> (2-2)
>
> <http://www.aaaa.com/product.asp>
> (2-3)
> This assumes that there are several sections
> in that site. So, for the
> 'history' section, there are serial pages
> like:
> <http://www.aaaa.com/history.asp?subid=0001> (2-4)
> <http://www.aaaa.com/history.asp?subid=0002> (2-5)
> ...
>
> My question is: I only want the 'history'
> part, including its main
> page and each of its content pages (2-2), (2-4),
> (2-5), ...
> So far, the only solution is
> using 'Exclude' in the 'Scan
> Rules' option tab, in which I need to put every link I
> do not want, like
> excluding '*news*' and '*product*'. If the site
> has a large number of
> sections, I will go crazy excluding
> all the others.
>
> Do you have any good solutions?
-* +www.aaaa.com/history.asp?* +*.gif +*.jpg +*.css +*.js
might do the trick?
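With the command-line version, the same scan rules can be passed as trailing arguments, roughly like this (the URL and output directory are only placeholders taken from the example above):

httrack "http://www.aaaa.com/history.asp" -O ./history-mirror "-*" "+www.aaaa.com/history.asp?*" "+*.gif" "+*.jpg" "+*.css" "+*.js"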
> 3) Use 2)'s example.
> How can I get only one page, like (2-2)?
To only get first-level pages (the ones typed
into the URL list):
-*
To get other ones:
-* +www.aaaa.com/history.asp?subid=0002
> 4) Use 2)'s example.
> How can I download only pages like (2-4) and
> (2-5) without
> downloading page (2-2)?
> WinHTTrack does not have the logic:
>
> 'Include' the links containing 'history.asp?subid=',
> but at the same time
> 'Exclude' the links containing 'history.asp'.
That is, -* +*history.asp?* -*history.asp*[] ?
> 5) How to download the content from a linking
> page?
> E.g. if a page
> <http://www.aaaa.com/article_links.html> has various
> external links to other sites for articles,
> how can I download
> only the other sites' articles without the other
> content of the linked
> sites?
Maybe by using an external depth of 1 or 2, but this isn't a
very good idea, as it will take all external links.
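With the command-line version this would be something like the sketch below; --ext-depth is the long form of the %eN option as far as I remember, so please double-check against httrack --help:

httrack "http://www.aaaa.com/article_links.html" -O ./articles-mirror --ext-depth=1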
> 6) I found that if I use WinHTTrack to download
> dynamic web pages
> (like asp, jsp, pl, ... with a dynamic server
> behind them), it actually
> parses the pages one by one -- very slow.
Use the --assume option; that is, with 3.05, use the "MIME"
tab in the options to force MIME types (this will speed up
the download!)
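On the command line the equivalent would look roughly like this (asp=text/html is only the assumption that fits the example site above):

httrack "http://www.aaaa.com/" -O ./mirror --assume asp=text/html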
> It seems the setting of 'Number of
> connections' in the
> 'Flow control' option tab does not work in this
> case. The problem is
> that each dynamic page's response time
> is normally quite
> long, so the whole download by
> WinHTTrack from such
> dynamic sites is very, very, very slow.
Yes, with asp pages WITHOUT mime definitions (the
engine has to detect the file type each time)
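So, combining the MIME hint with several simultaneous connections should then actually help; a rough sketch (-cN sets the number of connections, and 8 is just an example value):

httrack "http://www.aaaa.com/" -O ./mirror --assume asp=text/html -c8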
> I hope the points mentioned above can give you some ideas.
> If
> WinHTTrack
> can handle them, I think it would be the
> real king of
> site downloaders.
Eheh :)