> Often I would like to mirror a site, but only from a
> particular line of code to another particular line of
> code. For example, there might be a news source I would
> like to copy daily to read off-line, but only the actual
> news - not the headers and footers and advertisements and
> sidebars, etc. Site Scooper [1] seems to be capable of
> this - could this feature be added to HTTrack?
Err, not automatically, no. You can create a project and
capture only what you need using filters (scan rules), but
there is not yet any "automatic scheduled" download.
This could be done, however, with a command-line .bat file
and the Windows scheduler:
- first, create your project (example: "FooBar Daily News")
- then, create an update.bat file which contains:
C:
cd "C:\My Web Sites\FooBar Daily News\"
"C:\Program Files\WinHTTrack\httrack.exe" --quiet --update
- then, using the scheduled tasks (start / preferences), add
a new task, for example every day, selecting the update.bat
file
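
For anyone who would rather schedule a script than a .bat file, here is a minimal Python sketch of the same update step. It assumes the install path and project folder used in the example above; the function names are made up for illustration, and only the --quiet and --update options already shown are used.

```python
import subprocess

# Paths taken from the update.bat example; adjust them to your own install.
HTTRACK_EXE = r"C:\Program Files\WinHTTrack\httrack.exe"
PROJECT_DIR = r"C:\My Web Sites\FooBar Daily News"


def build_update_command(httrack_exe=HTTRACK_EXE):
    """Return the argument list equivalent to the last line of update.bat."""
    return [httrack_exe, "--quiet", "--update"]


def run_update(project_dir=PROJECT_DIR, httrack_exe=HTTRACK_EXE):
    """Run httrack from inside the project folder and return its exit code."""
    # cwd= plays the role of the "C:" and "cd" lines in the batch file.
    completed = subprocess.run(build_update_command(httrack_exe), cwd=project_dir)
    return completed.returncode


if __name__ == "__main__":
    raise SystemExit(run_update())
```

The scheduled task would then point at this script instead of update.bat.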
> Additionally, Plucker [2] has the ability to mirror a
> page 'as if' it were the referring page itself. For
> example, to view the Zippy the Pinhead comic strip at the
> San Francisco Gate newspaper Web site, you have to convince
> their server that you are 'at' a particular URL. If you
> try and mirror it without 'being' the referring URL, you
> only get a 'this image not available' graphic instead of
> the comic strip. Check it out yourself and it will make
> more sense than my description. Anyway, would it be
> possible to add this feature to HTTrack?
Err, httrack always sends the HTTP Referer header according
to the HTTP RFC, so it should work. But httrack does not
send a Referer for the FIRST URL.
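
To illustrate what "sending a referer" means at the HTTP level, here is a small Python sketch (not HTTrack code) that attaches a Referer header to a single request. The URLs are placeholders and the function names are hypothetical; it only shows the idea of telling the server which page you are supposedly coming from.

```python
import urllib.request


def build_request(url, referer):
    """Build a GET request that claims to come from the page `referer`."""
    return urllib.request.Request(url, headers={"Referer": referer})


def fetch(url, referer):
    """Download `url` while sending the given Referer header."""
    with urllib.request.urlopen(build_request(url, referer)) as response:
        return response.read()


if __name__ == "__main__":
    # Placeholder addresses: the image URL and the page that links to it.
    data = fetch("http://www.example.com/comics/strip.gif",
                 "http://www.example.com/comics/index.html")
    print(len(data), "bytes received")
```

A server that checks the referer would compare this header against the page it expects the image to be embedded in.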