Hi there,
I've been using HTTrack, and it seems I can only crawl the index page, not
the internal pages. The links are also changed: index.html is appended to
the end of the URL.
I need to crawl the collections subfolder online, for example:
<http://domain.com/collections>. The collections folder has internal pages
such as <http://domain.com/collections/inventory/>.
It seems I can only retrieve <http://domain.com/collections>, but not
the internal pages.
Here's my code:
httrack http://domain.com/collections -o "/data/temp/httrack" "+*.domain.com/collections/*" -a -B -r5 -K -D -I0 -c4 -T15 -E43200 -v
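In case it helps with diagnosing this, here is a rough sketch of which URLs I would expect the filter above to cover. It uses Python's `fnmatch` as a stand-in, which is only an approximation and not HTTrack's own matcher. It also assumes the pattern is compared against the URL without its scheme. The helper names `strip_scheme` and `in_scope` and the `www.` test URL are made up for illustration.

```python
from fnmatch import fnmatchcase

# The scan rule from the command above, without the leading "+".
FILTER = "*.domain.com/collections/*"


def strip_scheme(url: str) -> str:
    """Drop a leading http:// or https:// so only host and path remain.

    Assumption: the filter is compared against the URL without its scheme.
    """
    for scheme in ("http://", "https://"):
        if url.startswith(scheme):
            return url[len(scheme):]
    return url


def in_scope(url: str, pattern: str = FILTER) -> bool:
    """Approximate a wildcard filter with shell-style matching (fnmatch).

    This is NOT HTTrack's matcher, just a way to reason about the pattern.
    """
    return fnmatchcase(strip_scheme(url), pattern)


if __name__ == "__main__":
    urls = [
        "http://domain.com/collections",
        "http://domain.com/collections/inventory/",
        "http://www.domain.com/collections/inventory/",  # hypothetical host
    ]
    for url in urls:
        print(f"{url} -> {in_scope(url)}")
```

Under this approximation, the literal dot in `*.domain.com` only matches when something like `www.` comes before `domain.com`, so it may be worth checking whether the pattern covers the bare host at all.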
Cheers and thanks. I would appreciate any help with this.