So, HTTrack downloads too much and WebReaper too little.
As a novice, you probably used the default settings.
The default settings may need adjustment.
HTTrack's default setting is to download an entire site
(or at least to try, by following links).
If you don't want that, you must change the settings.
The site has more links than you may realize.
(In Netscape, hit Ctrl-I>>Links or View>>PageInfo>>Links to list them.)
It appears that you want ONLY:
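If you don't have Netscape handy, a rough way to see the same thing is to count the links on a saved copy of the page, grouped by host. This is only a sketch: the SAMPLE markup and the helper names (LinkCollector, hosts_linked_from) are made up for illustration and are not taken from the real Yahoo page.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(urljoin(self.base_url, value))


def hosts_linked_from(html, base_url):
    """Return a Counter mapping each linked host to its number of links."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return Counter(urlparse(link).netloc for link in collector.links)


# Hypothetical page fragment, standing in for a saved copy of the help page.
SAMPLE = """
<a href="/help/us/trav/index.html">Help</a>
<a href="http://login.yahoo.com/">Sign In</a>
<a href="http://search.yahoo.com/">Search</a>
"""

counts = hosts_linked_from(SAMPLE, "http://help.yahoo.com/help/trav")
for host, n in sorted(counts.items()):
    print(host, n)
```

Every host that shows up here is one a mirroring tool may wander into unless a rule tells it not to.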
1. Top Questions
2. Help Categories
And NOT:
1. Browse Help
2. Navigation Tabs
3. Account Sign In
4. Search
5. Yahoo Tips
6. Legal Notices
7. Other Yahoo Services
If this is the case, try setting:
1. URL(s) Project Web Addresses:
<http://help.yahoo.com/help/trav>
2. Option>>Limit>>mirror depth = 4 (INTERNAL ONLY)
(External Mirror Depth = 0)
3. Option>>Scan Rules to:
-*
-help.yahoo.com/*
+us.rd.yahoo.com/travel/clks/help/tq/1/*
+help.yahoo.com//help/us/trav/flights/flights-*.html
+rd.yahoo.com/travel/clks/help/*.html
+help.yahoo.com/help/us/trav/*.html
-us.rd.yahoo.com/travel/clks/help/nav/*
-www.yahoovacationstore.com/*
-us.rd.yahoo.com/travel/nav/*
-login.yahoo.com/*
-edit.yahoo.com/*
-us.ard.yahoo.com/
-srd.yahoo.com/*
-help.yahoo.com/help/us/tips.html
-privacy.yahoo.com/*
-docs.yahoo.com/*
-search.cc.yahoo.com/*
-search.travel.yahoo.com/*
-travel.yahoo.com/*
-search.yahoo.com/*
Use Copy (Ctrl-C) & Paste (Ctrl-V) to put these rules into Scan Rules.
The site and the task you chose are complex in structure.
Most sites are simpler in comparison.
It may be simpler to download each help category separately.
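To see why the order of the scan rules above matters, here is a rough Python sketch of how such a list can be evaluated. It assumes, as I understand HTTrack's filter documentation, that rules are read top to bottom and the last matching rule wins. It uses Python's fnmatch as a stand-in for HTTrack's own wildcard matching, so it is an approximation and not HTTrack's real code. The RULES list is a shortened subset of the rules above, and the flights-01.html and login.yahoo.com/config addresses are made-up test values.

```python
from fnmatch import fnmatchcase

# Shortened subset of the scan rules above, kept in the same order.
RULES = [
    "-*",
    "-help.yahoo.com/*",
    "+help.yahoo.com/help/us/trav/*.html",
    "-login.yahoo.com/*",
    "-help.yahoo.com/help/us/tips.html",
]


def is_allowed(url, rules=RULES):
    """Return True if the last rule matching `url` is a '+' rule."""
    allowed = True  # assumption: a link matched by no rule is not blocked
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            allowed = (sign == "+")  # later matches override earlier ones
    return allowed


# Addresses are written without "http://", like the rules themselves.
print(is_allowed("help.yahoo.com/help/us/trav/flights/flights-01.html"))  # True
print(is_allowed("login.yahoo.com/config"))                               # False
print(is_allowed("help.yahoo.com/help/us/tips.html"))                     # False
```

The leading -* blocks everything, the + lines let the wanted help pages back in, and the - lines that follow knock out the leftovers. Move -* to the bottom of the list and, under the same assumption, nothing would be downloaded at all.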