| Mercola's site is a tremendous loss. I am retired and, due to the challenges of
aging, my health is beginning to fail. I credit Mercola's old site for keeping
me alive and I still have much more research to do but it is gone now.
During the 48 hours before the website turned into a pumpkin, I spent hours
with various Linux OS tools trying to scrape the Mercola website. I ran into
the same roadblocks as you found and did not have time to learn enough about
scraping (including programmable spiders) to archive the Mercola site.
All pages that I scraped contained the email signup bribe page so I believe
none of the actual content was scraped (ASP creates and serves a page that
looks like a popup on top of content, but the page is actually a facade and
there is no actual content.)
With HTTrack, I think a user must import either POST data or cookies from
their browser to get ASP to serve the desired page but I never figured out how
to do it. I believe that Dr. Mercola's old site was the most extensive and
valuable health resource on the web. Due to actions related to personal threats
that I do not understand, he has taken down his life's work!
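
For anyone who wants to try the cookie idea mentioned above, here is a minimal sketch of what I think it would look like in Python rather than HTTrack. It assumes you have exported your browser's cookies to a Netscape-format cookies.txt file (several browser extensions can do this). The function names are my own, and I have no way to confirm that the site would have served the real article once the cookies were sent.

```python
import http.cookiejar
import urllib.request


def load_browser_cookies(path):
    """Load cookies from a Netscape-format cookies.txt file exported from a browser."""
    jar = http.cookiejar.MozillaCookieJar(path)
    # Keep session cookies and already-expired cookies too; the export may contain both.
    jar.load(ignore_discard=True, ignore_expires=True)
    return jar


def make_opener(jar, user_agent="Mozilla/5.0"):
    """Build a URL opener that sends the loaded cookies with every request."""
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Assumption: a browser-like User-Agent may be needed; the exact value is a guess.
    opener.addheaders = [("User-Agent", user_agent)]
    return opener


def fetch(opener, url, timeout=30):
    """Download one page and return its raw bytes."""
    with opener.open(url, timeout=timeout) as resp:
        return resp.read()
```

Usage would be along the lines of `fetch(make_opener(load_browser_cookies("cookies.txt")), url)`, then checking whether the saved page holds the article text or only the signup page.
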
Much of the site is still here in this archive, though many of its new scrapes
result in redirected pages:
<https://archive.is/articles.mercola.com>
If either Admin or JD has achieved any success in archiving this life-saving
resource, would you be willing to share it with me in some manner?
I did manage to scrape some .pdf files of his articles, but it's nowhere near
a comprehensive library of his work.
| |