Hi,
I have a list of websites which I wish to crawl, downloading specific file
types only.
For the Action I am using "Get Separated Files".
For the websites I have a .txt file with one site per line:
www.mysite.com
www.hersite.co.uk
www.hissite.net
etc
I only want to download certain files, mainly PDF, TXT, DOC and Excel; no CSS,
HTML files etc.
My filters box is currently set up like this:
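To make the goal concrete, here is a rough Python sketch of the selection I am after. It only illustrates intent and says nothing about how the crawler evaluates its filters. The helper name `is_wanted`, the sample URLs, and the reading of "Excel" as .xls/.xlsx are my own assumptions.

```python
from urllib.parse import urlparse
import posixpath

# Extensions to keep (assumption: "Excel" means .xls and .xlsx)
WANTED = {".pdf", ".txt", ".doc", ".docx", ".xls", ".xlsx"}

def is_wanted(url: str) -> bool:
    """Return True if the URL path ends in one of the wanted extensions."""
    path = urlparse(url).path              # drops any ?query or #fragment
    ext = posixpath.splitext(path)[1].lower()
    return ext in WANTED

# Hypothetical URLs, for illustration only
urls = [
    "http://www.mysite.com/docs/report.pdf",
    "http://www.mysite.com/index.html",
    "http://www.hersite.co.uk/style.css",
    "http://www.hissite.net/data/Sheet.XLSX?download=1",
]
wanted = [u for u in urls if is_wanted(u)]
print(wanted)
```

So the .pdf and .xlsx links should be fetched, and the .html and .css ones skipped.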
+*.png +*.gif +*.jpg *.pdf *.txt *.doc *.docx
+www.*.com/*.html +*.zip +*.pdf
+www.*.co.uk/*.html +*.zip +*.pdf
+www.*.net/*.html +*.zip +*.pdf
+www.*.org/*.html +*.zip +*.pdf
But I still do not seem to be getting what I am after. I know, for example,
that www.mysite.com does have PDFs on it, so why won't it find and get them?
Any help would be greatly appreciated. Thanks