Hello!
I'm wondering which options I need to configure to download a website
together with the comments that come from the "disqus" service.
Currently I get the main content of the website, but the "disqus" content is
missing.
The commands I used are:
httrack "http://mac.appstorm.net/quick-look/life-a-new-option-for-journal-keepers/" \
  -n -c8 -R3 -v -O ./mac/ \
  "+disqus.com/*" "+*.disqus.com/*" \
  "+macappstorm.disqus.com/*" "+tempest.services.disqus.com/*" \
  "+disquscdn.com/*" "+referrer.disqus.com/*" "+*.disquscdn.com/*" \
  "+a.disquscdn.com/*" "+envato.s3.amazonaws.com/*"

httrack "http://mac.appstorm.net/quick-look/life-a-new-option-for-journal-keepers/" \
  -%P -n -c8 -R3 -v -O ./mac/ \
  "+disqus.com/*" "+*.disqus.com/*" \
  "+macappstorm.disqus.com/*" "+tempest.services.disqus.com/*" \
  "+disquscdn.com/*" "+referrer.disqus.com/*" "+*.disquscdn.com/*" \
  "+a.disquscdn.com/*" "+envato.s3.amazonaws.com/*"
After inspecting the calls in Firebug, I added "+" filters for the domains I
observed, but that hasn't helped so far.
The page I'm testing on is:
<http://mac.appstorm.net/quick-look/life-a-new-option-for-journal-keepers/>
The Disqus content seems to be loaded asynchronously (lazy-loaded) in its own
frame.
The frame's call, as inspected, is:
disqus.com/embed/comments/?base=default&disqus_version=a4d38e7c&f=macappstorm&t_i=64848%20http%3A%2F%2Fmac.appstorm.net%2F%3Fp%3D64848&t_u=http%3A%2F%2Fmac.appstorm.net%2Fquick-look%2Flife-a-new-option-for-journal-keepers%2F&t_e=Life%3A%20A%20New%20Option%20for%20Journal-Keepers&t_d=Life%3A%20A%20New%20Option%20for%20Journal-Keepers&t_t=Life%3A%20A%20New%20Option%20for%20Journal-Keepers&s_o=default&l=#2
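To make that call easier to read, here is a minimal Node.js sketch that decodes its query parameters. The `decodeFrameCall` helper is only for illustration, and the `http://` prefix is an assumption added so the URL parser accepts the string (the call as inspected had no scheme).

```javascript
// Sketch: decode the query parameters of the observed Disqus frame call.
const frameCall =
  'disqus.com/embed/comments/?base=default&disqus_version=a4d38e7c&f=macappstorm' +
  '&t_i=64848%20http%3A%2F%2Fmac.appstorm.net%2F%3Fp%3D64848' +
  '&t_u=http%3A%2F%2Fmac.appstorm.net%2Fquick-look%2Flife-a-new-option-for-journal-keepers%2F' +
  '&t_e=Life%3A%20A%20New%20Option%20for%20Journal-Keepers' +
  '&t_d=Life%3A%20A%20New%20Option%20for%20Journal-Keepers' +
  '&t_t=Life%3A%20A%20New%20Option%20for%20Journal-Keepers' +
  '&s_o=default&l=#2';

// Hypothetical helper: split the call into host, path, decoded parameters
// and fragment.
function decodeFrameCall(call) {
  // Assumption: prefix a scheme so the WHATWG URL parser can handle it.
  const url = new URL('http://' + call);
  const params = {};
  for (const [key, value] of url.searchParams) {
    params[key] = value; // searchParams already percent-decodes the values
  }
  return { host: url.hostname, path: url.pathname, params, fragment: url.hash };
}

const decoded = decodeFrameCall(frameCall);
console.log(decoded);
```

The parameters carry the forum shortname (`f`), the post identifier (`t_i`), the page URL (`t_u`) and the page title, so the call appears to be specific to each page rather than a fixed address.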
I'm not familiar with how Disqus works here, but looking at the page source it
seems the actual URL gets assembled on the fly:
(function() {
    var dsq = document.createElement('script');
    dsq.type = 'text/javascript';
    dsq.async = true;
    dsq.src = '//' + disqus_shortname + '.' + 'disqus.com' +
        '/embed.js?pname=wordpress&pver=2.74';
    (document.getElementsByTagName('head')[0] ||
        document.getElementsByTagName('body')[0]).appendChild(dsq);
})();
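As a rough sketch of what that loader resolves to, the snippet below rebuilds the value assigned to `dsq.src`. Two things are assumptions on my side: that `disqus_shortname` is `macappstorm` (taken from the `f=` parameter of the frame call above), and that the protocol-relative `//` resolves to `http:` because the page itself is served over http. The `embedScriptUrl` helper is hypothetical.

```javascript
// Sketch: reproduce the URL the inline loader assigns to dsq.src.
// Assumption: shortname and page protocol are passed in by hand, since
// outside a browser there is no disqus_shortname variable or page location.
function embedScriptUrl(shortname, pageProtocol) {
  // Same concatenation as in the page source.
  const src = '//' + shortname + '.' + 'disqus.com' +
    '/embed.js?pname=wordpress&pver=2.74';
  // A protocol-relative URL inherits the protocol of the embedding page.
  return pageProtocol + src;
}

const scriptUrl = embedScriptUrl('macappstorm', 'http:');
console.log(scriptUrl);
// prints "http://macappstorm.disqus.com/embed.js?pname=wordpress&pver=2.74"
```

So the script address itself looks static once the shortname is known. The frame call with its per-page parameters is presumably built later, when that script runs in the browser.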
I found the following thread in this forum:
<http://forum.httrack.com/readmsg/26627/26620/index.html?q=javascript>
Is this the same situation? Is a request whose URL is assembled on the fly by
JavaScript simply not feasible for HTTrack?
Thanks for confirming if it is not possible to fetch this sort of embedded
Disqus comments,
and even more thanks if there is a way to get them downloaded with the
"correct" options!
Merry XMAS
cherokee