> I'm having trouble limiting file types
> that are harvested. I'm interested in
> downloading only .jpeg and .gif files.
> I typically will enter the URLs of,
> for conversation's sake, 10 sites dedicated to
> animal photography. I'm looking to mass
> download as many pictures as possible. I
> have no need for ANY other file types other
> than JPEG AND GIF! What would the command line look
> like for this? Also, I have a very fast machine and
> a high bandwidth connection.
First, even if you have a fast connection, take care NOT to
overload the remote server and NOT to hog the bandwidth
of other users; I suggest you use Options/Limits/Maximum
transfer rate and maximum number of connections.
Then, you have to fetch the HTML content so that links can be
detected; if the HTML pages end in .htm or .html, use:
-* +*.gif +*.jpg +*.jpeg +www.yoursite.com/*.htm +www.yoursite.com/*.html
(-* excludes everything first; the + filters then allow the
images, plus the pages on your site that link to them.)
Of course, if you have more than one site, you'll have to
add the proper filters for each site, such as:
-* +*.gif +*.jpg +*.jpeg +www.yoursite.com/*.htm +www.yoursite.com/*.html
+www.yoursite2.com/*.htm +www.yoursite2.com/*.html ...
But, again: limit your bandwidth and, if possible, do this
outside working hours.
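
Since you asked what the command line would look like, here is a rough Python sketch that builds the filter list for several sites and prints a matching httrack command. The helper names (build_filters, build_command), the ./mirror output folder and the limit values (2 connections, 25000 bytes/s) are only illustrative assumptions. The -O, -cN and -AN options are given as I recall them from httrack's command-line help, so please check them against `httrack --help` before relying on them.

```python
import shlex
from typing import List, Sequence


def build_filters(
    sites: Sequence[str],
    image_exts: Sequence[str] = ("gif", "jpg", "jpeg"),
    page_exts: Sequence[str] = ("htm", "html"),
) -> List[str]:
    """Build scan rules: exclude everything, allow images, allow pages per site."""
    filters = ["-*"]  # start by excluding everything
    filters += [f"+*.{ext}" for ext in image_exts]  # allow the wanted image types
    for site in sites:
        # HTML pages are still needed so that links to images can be found
        filters += [f"+{site}/*.{ext}" for ext in page_exts]
    return filters


def build_command(
    sites: Sequence[str],
    output_dir: str = "./mirror",  # hypothetical output folder
    max_connections: int = 2,      # placeholder limit, adjust to taste
    max_rate: int = 25000,         # placeholder limit in bytes per second
) -> List[str]:
    """Assemble an httrack argument list (option names assumed, see note above)."""
    urls = [f"http://{site}/" for site in sites]
    limits = ["-O", output_dir, f"-c{max_connections}", f"-A{max_rate}"]
    return ["httrack", *urls, *limits, *build_filters(sites)]


if __name__ == "__main__":
    cmd = build_command(["www.yoursite.com", "www.yoursite2.com"])
    # shlex.join quotes the filters so the shell does not expand the '*'
    print(shlex.join(cmd))
```

The script only prints the command and does not run anything, so you can review the filters and the limits before starting the mirror.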