Sorry for the accidental post.
### Rule definitions need revising!! ###
A complete URL or a lower-level, more specific definition should be given more
priority than an ignore rule built from an upper, incomplete wildcard
definition or a less complete URL. Never erase the lower, deeper, more
complete level. Yes, it seems to me that HTTrack does scan the lower, more
complete URL, but the result is NOTHING. Does HTTrack erase those files?
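For example (using the made-up site.com URL from my example further down),
this is the kind of pair I mean; the second, more complete rule should win
over the first, broad ignore rule. As far as I understand, the order of the
scan rules already matters (the last matching rule seems to be the one taken),
so maybe this only goes wrong for dynamic URLs:

  -http://www.site.com/asp?*
  +http://www.site.com/asp?dynamic=3267+en+keyword;want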
You will probably advise me that the "just scan links" option for experts
would help solve this, but that is more of a trick or tip, and then the output
log needs to be more user-friendly, like the logs of mass downloaders such as
FlashGet or GetRight.
Most recent sites always have a direct link back to a not fully defined
"top level" directory. This dynamic construction makes rule definitions silly
and ugly. When I just want something like one page - a complete copy of
<http://www.site.com/asp?dynamic=3267+en+keyword;want> - I end up with tons of
ignore rules:
-http://www.site.com/index.asp?=a*
-http://www.site.com/index.asp?=b*
.... +http://www.site.com/asp?d=* .... and so on until ...?=z*. HTTrack
stops at the top of a very big site and tests all the ASPs or PHPs for
minutes. Great job, but a dynamic-page cache option with a TTL would help
(of course with a URL-based switch to avoid bad results, since a result might
change depending on the referer...). At least, the dynamic-page cache seems
not to be working, or does nothing.
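Just to show what I mean: if deeper, more complete rules won over shallower
ones, the whole a-to-z ignore list above could collapse into something like
this (site.com is still a made-up placeholder, and the extra include rules for
images and stylesheets are only my guess at what a complete copy of one page
needs):

  -http://www.site.com/*
  +http://www.site.com/asp?dynamic=3267+en+keyword;want
  +http://www.site.com/*.gif
  +http://www.site.com/*.jpg
  +http://www.site.com/*.css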
Well... am I missing some important rule features? If so, sorry for that...
please be kind.
By the way, just for one page, I use a browser-integrated mass downloader's
"download all on this page" feature to see what there is to get. It is very
handy, but the resulting rule set is very ugly. Yes, I know "just one page" is
technically difficult.
Well, as a workaround, I put the must-have item URLs on a start page, but the
auto-created index becomes ugly... Please rewrite the code to give priority to
deeper URLs over upper ones.
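For reference, that workaround is nothing more than a hand-written page like
this, used as the project's start page (the URL is the same made-up
placeholder as above):

  <html>
    <body>
      <!-- must-have items, listed by hand -->
      <a href="http://www.site.com/asp?dynamic=3267+en+keyword;want">wanted page</a>
    </body>
  </html>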
HTTrack is one of the most useful tools - IT WORKS ANYWAY, great job!! - a
must-have tool! But let me "DO MORE THAN LESS" *grin*.