Hi, I would like to know whether the following is possible. I want to crawl a
website, but only download files of a specific type (PDF) from pages whose
body contains a particular text string. For example, if "text_string" is
found on a page, then download all PDF files linked from that page. If the
text string is not found, do nothing and move on to the next page.
Any help in this regard will be greatly appreciated.
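
To make the question concrete, here is a rough sketch of the logic described above, assuming Python and only its standard library. The names `LinkAndTextParser`, `analyse_page`, and `crawl`, the `pdfs` output folder, and the 50-page limit are all made up for illustration. It treats any link whose path ends in `.pdf` as a PDF and only follows links on the same host, which may not suit every site.

```python
import os
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen, urlretrieve


class LinkAndTextParser(HTMLParser):
    """Collects <a href> targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.text_parts.append(data)


def analyse_page(html, base_url, text_string):
    """Return (pdf_urls, other_urls); pdf_urls is empty if text_string is absent."""
    parser = LinkAndTextParser()
    parser.feed(html)
    # Resolve relative links and drop "#fragment" parts.
    urls = [urldefrag(urljoin(base_url, href))[0] for href in parser.links]
    pdfs = [u for u in urls if urlparse(u).path.lower().endswith(".pdf")]
    others = [u for u in urls if u not in pdfs]
    matched = text_string in "".join(parser.text_parts)
    return (pdfs if matched else []), others


def crawl(start_url, text_string, out_dir="pdfs", max_pages=50):
    """Breadth-first crawl of one host, saving PDFs from matching pages."""
    os.makedirs(out_dir, exist_ok=True)
    site = urlparse(start_url).netloc
    queue, seen = [start_url], set()
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            with urlopen(url, timeout=10) as resp:
                html = resp.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # page could not be fetched: skip it
        pdfs, others = analyse_page(html, url, text_string)
        for pdf in pdfs:
            name = os.path.basename(urlparse(pdf).path)
            try:
                urlretrieve(pdf, os.path.join(out_dir, name))
            except OSError:
                pass  # download failed: move on
        queue.extend(u for u in others if urlparse(u).netloc == site)
```

It would be started with something like `crawl("http://example.com/", "text_string")`, where the URL is a placeholder. A real crawler should probably also respect robots.txt and pause between requests, which this sketch does not do.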