I am having this same issue trying to crawl our organization's public website.
I am using the latest WinHTTrack Website Copier, version 3.48-9, 64-bit
installer.
Using the default "can only go down" and "stay on same address" settings, when
I attempt to capture only this specific page and its children:
<http://www.ll.mit.edu/about/facilities.html>
HTTrack crawls up the directory structure and captures other children of
"about", not just "facilities" and its children.
How do I prevent this? Can the filter posted in reply to the original post be
used with the GUI version, or only with the command line? If it can be used
with the GUI version, where in the options would I specify it? I do not see an
obvious place for it.
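
In case it helps clarify what I am asking, this is the kind of scan rule I had
in mind. It is only my rough guess at the syntax from the HTTrack
documentation, not the exact text of the rule posted in that reply:

    -*
    +http://www.ll.mit.edu/about/facilities*

(i.e., exclude everything by default, then re-include only the facilities page
and anything under it, since I understand the later matching rule takes
precedence). On the command line I assume these would be appended as quoted
arguments after the start URL, but I do not know where the equivalent goes in
the GUI.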
Thank you,
Marisa Bruhns