Hi All,
I think it would be a useful feature to allow external files to be stored in
the base structure of the scrape.
E.g. I want to scrape a forum (mainly for the pictures) and store it in the
base structure for later cataloging.
The base site is www.somesite.com/forum/subforum4/
Within that are various posts:
www.somesite.com/forum/subforum4/subject1.html
www.somesite.com/forum/subforum4/subject2.html
www.somesite.com/forum/subforum4/subject3.html
www.somesite.com/forum/subforum4/subject4.html
Which contain images hosted on various hosting sites:
www.somehost.com/get_image.php?image=12345
www.pichost.com/a/b/c/d/somepic.jpg
etc.
I'd like to be able to specify filters like the following, where a "++" prefix
would mark external files to be stored in the base structure:
-*
+www.somesite.com/forum/subforum4/subject*.html
++www.somehost.com/get_image.php?image=*
++www.pichost.com/*/*/*/*/*.jpg
which would result in the files being saved like so:
www.somesite.com/forum/subforum4/subject23/abcde.jpg
www.somesite.com/forum/subforum4/subject23/get_image_9f34.jpg
etc.
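To make the idea concrete, here is a rough Python sketch of the mapping I have in mind. It is only an illustration, not anything the tool does today. The "++" patterns, the function name local_path, the 4-character hash suffix for dynamic URLs, and the default ".jpg" extension are all my own assumptions.

```python
import hashlib
import posixpath
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical "++" patterns from this request (not an existing filter syntax).
# Note: fnmatch treats "?" as a single-character wildcard, which also
# matches the literal "?" in the URL.
EXTERNAL_PATTERNS = [
    "www.somehost.com/get_image.php?image=*",
    "www.pichost.com/*/*/*/*/*.jpg",
]


def local_path(page_url, file_url, default_ext=".jpg"):
    """Return where file_url would be saved when linked from page_url.

    Returns None if file_url matches none of the "++" patterns.
    """
    if not any(fnmatchcase(file_url, p) for p in EXTERNAL_PATTERNS):
        return None

    # The URLs here have no scheme, so prefix "//" to get host and path parsed.
    page = urlsplit("//" + page_url)
    target = urlsplit("//" + file_url)

    # Directory named after the referring page:
    # .../subject23.html -> .../subject23
    page_dir = posixpath.splitext(page.netloc + page.path)[0]

    stem, ext = posixpath.splitext(posixpath.basename(target.path))
    if target.query:
        # Dynamic URL: append a short hash of the query so names stay unique,
        # and fall back to the assumed default extension.
        digest = hashlib.md5(target.query.encode()).hexdigest()[:4]
        stem, ext = f"{stem}_{digest}", default_ext

    return posixpath.join(page_dir, stem + ext)


page = "www.somesite.com/forum/subforum4/subject23.html"
print(local_path(page, "www.pichost.com/a/b/c/d/abcde.jpg"))
print(local_path(page, "www.somehost.com/get_image.php?image=12345"))
```

The point is that the save location is derived from the page that links to the file, not from the file's own host, so everything from one post ends up together.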
Is this implementable? :)