HTTrack Website Copier
Free software offline browser - FORUM
Subject: feature idea...explorer-like selection of website
Author: Haudy Kazemi
Date: 01/02/2002 00:46
 
Hello,

I've come up with an idea for managing the downloading 
of a website.  I haven't seen, nor do I know of, any 
programs that implement this.  (Does anyone else?)

Anyway, the idea is that the offline browser program 
(like HTTrack) downloads a page (say cnn.com's home 
page), then stops and waits for you to select which 
links on that page to continue downloading (say the 
Space and World news sections).  Then it'd download 
those pages (Space and World) and present the next 
sublevel of links.

This would all be presented in an Explorer tree-like 
manner; pretend <http://cnn.com/> is the root of the 
tree (like C:\ or My Computer), with each folder being 
a link from the root.  Expanding a folder would reveal 
subfolders of additional links.  Now, to include a 
folder and its subfolders, you'd check the box next to 
that folder...i.e. if you check the box next to the 
root, you'd download the whole website (subfolders 
inherit the checkboxes); if you check it next to 
http://cnn.com/space, you'd get the root home page 
plus the Space section and its subsections.  (The idea 
being that you get the specific section you're 
interested in, and the pages along the path back to 
the root.)

If any of you have seen how you select which folders 
to share in programs like Napster, Limewire, etc., 
you'll see where I get the checkboxes-next-to-an-
Explorer-tree idea.  Now just replace those Explorer 
folders with URLs.
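
To make the tree part concrete, here's a minimal sketch 
in Python (purely my own illustration; HTTrack itself is 
written in C, and none of these names come from it) of a 
link-tree node whose checkbox state is inherited by its 
children unless they set their own:

class LinkNode:
    """One node in the Explorer-like tree: a URL plus its child links."""

    def __init__(self, url, parent=None):
        self.url = url
        self.parent = parent
        self.children = []
        self.checked = None   # None = inherit from parent; True/False = explicit

    def add_child(self, url):
        child = LinkNode(url, parent=self)
        self.children.append(child)
        return child

    def is_checked(self):
        """A node is selected if the nearest ancestor (or itself) with
        an explicit setting is checked - subfolders inherit the box."""
        node = self
        while node is not None:
            if node.checked is not None:
                return node.checked
            node = node.parent
        return False          # default: download nothing

# Checking /space selects it and everything underneath it:
root = LinkNode("http://cnn.com/")
space = root.add_child("http://cnn.com/space/")
world = root.add_child("http://cnn.com/world/")
space.checked = True
print(space.is_checked())   # True
print(space.add_child("http://cnn.com/space/news.html").is_checked())  # True
print(world.is_checked())   # False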

Implementation comments (my opinions):
-the default should be to download nothing, except 
what is checkboxed and what is underneath the 
checkboxed levels.
-the default should start at the root of a website 
(probably <http://abc.com/>, but maybe not...)
-the default should start with the root collapsed, and 
only sections specifically clicked on get expanded to 
show their subfolders (sublinks).  This is probably 
important, because you can only see the sublinks by 
downloading the HTML files containing those links, and 
if you're doing that you've already started mirroring 
the site.  Thus only download the HTML files needed 
for explicitly expanded folders and subfolders 
(sketched below).
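
Here's a rough sketch of that "expand on demand" part, 
using a plain HTTP fetch and a very naive link extractor 
(again just my own Python illustration; the real link 
parsing an offline browser does is far more thorough):

import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collects the href targets of anchor tags on one page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def expand(url):
    """Fetch a page only when its folder is expanded, and return the
    absolute URLs it links to (the next sublevel of the tree)."""
    with urllib.request.urlopen(url) as resp:
        html = resp.read().decode("utf-8", errors="replace")
    collector = LinkCollector()
    collector.feed(html)
    return [urljoin(url, href) for href in collector.links]

# Only the page behind an explicitly expanded folder ever gets fetched:
# for link in expand("http://cnn.com/"):
#     print(link)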

-this idea can probably be built on top of the HTTrack 
filters relatively easily, requiring at most an 
interface change or add-on.  It's conceivable to me 
that one could create a helper program for HTTrack 
that creates the proper filters to use based on which 
branches of the tree are checkboxed, and then the user 
copies/pastes those filters into HTTrack.  That's the 
main problem with using HTTrack as-is to do this: 
setting up the needed filters by hand is hard, yet 
automating it would enhance the power of HTTrack a 
lot.  To be the most site- and net-friendly, I guess 
you'd exclude everything in general, and then start 
including specific branches as they are checkboxed 
(a sketch of such a helper is below).
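
If I have the filter idea roughly right (a "-*" rule to 
exclude everything in general, then "+" rules for the 
branches you want), such a helper could just walk the 
checked branches and emit the rules.  A made-up sketch; 
the exact patterns would probably need tweaking to match 
what HTTrack really expects:

from urllib.parse import urlparse

def scan_rules(checked_urls):
    """Turn the checked branches of the tree into HTTrack-like filter
    rules: exclude everything, then include each checked branch and
    everything underneath it."""
    rules = ["-*"]                       # net-friendly default: nothing
    for url in checked_urls:
        parts = urlparse(url)
        prefix = parts.netloc + parts.path.rstrip("/")
        rules.append("+" + prefix + "/*")
    return rules

# Checking only the Space section of the earlier example:
print(" ".join(scan_rules(["http://cnn.com/space/"])))
# -> "-* +cnn.com/space/*"

The output of such a helper is what the user would 
copy/paste into HTTrack's filter settings.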

Last idea (it came to me after I typed the above, and 
hasn't been thought through much yet): it might be 
better to give three options instead of a simple 
checkbox (Include, Neutral, Exclude), or some variation 
thereof (e.g. Allow, Neutral, Deny).  That's how 
security is handled in NT, with permissions being 
Neutral (and inherited) unless explicitly Allowed, and 
with Deny overruling both Allow and Neutral.
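
One way that resolution could work (my own sketch of the 
"Deny beats Allow beats Neutral" rule, not anything 
HTTrack does today):

INCLUDE, NEUTRAL, EXCLUDE = "include", "neutral", "exclude"

def effective_state(states_along_path):
    """Resolve a node's state from the settings on the path
    root -> ... -> node.  An explicit Exclude anywhere on the path
    overrules everything; otherwise an explicit Include wins; a path
    that is Neutral all the way down is not downloaded."""
    if EXCLUDE in states_along_path:
        return EXCLUDE
    if INCLUDE in states_along_path:
        return INCLUDE
    return NEUTRAL

# /space is Included, but /space/ads is explicitly Excluded:
print(effective_state([NEUTRAL, INCLUDE]))           # include
print(effective_state([NEUTRAL, INCLUDE, NEUTRAL]))  # include (inherited)
print(effective_state([NEUTRAL, INCLUDE, EXCLUDE]))  # exclude (Deny wins)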

If any of you read this over, I'd love to hear your 
comments.  Sorry it's so long, but I wanted to make it 
as clear as I could for anyone who knows how to 
implement it, or is interested in doing so :)

-Haudy Kazemi
 