Hello dear Xavier,
it's me again!
Hmmm - I guess there is a major difference between retrieving HTML (on the one
hand) and retrieving an image (on the other hand).
Retrieving an image with the Perl code and Firefox (see the snippet below,
which uses the Firefox part of Mechanize) seems to be much, much smarter than,
for example, doing it with HTTrack (the famous tool). With the little Perl
snippet we are able to get proper rendering, including interpretation of
CSS/JS. An automated regular browser such as Firefox does a good job here.
On a side note: for this fetching job the little Perl snippet is far more
powerful than HTTrack, since this is not something HTTrack does easily.
HTTrack can only mirror (parts of) websites; it cannot do any rendering at
all, nor interpret CSS/JS.
[PHP]
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;

my $mech = WWW::Mechanize::Firefox->new();

# Read one URL per line and let Firefox render each page.
open(my $input, '<', 'urls.txt') or die "Can't open file: $!";
while (my $url = <$input>) {
    chomp $url;
    $mech->get($url);
    # PNG screenshot of the fully rendered page (CSS/JS applied by Firefox).
    my $png = $mech->content_as_png();
}
close($input);
[/PHP]
Well: there is absolutely no need to fetch the HTML contents.
Caching the image is done easily with the Perl snippet (see the sketch below),
and therefore HTTrack is absolutely not the tool I should take into
consideration.
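Just to illustrate the caching part, here is a minimal sketch - I am assuming
MD5-hashed URLs as file names and the same urls.txt input, so adjust the
naming scheme to your taste:
[PHP]
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize::Firefox;
use Digest::MD5 qw(md5_hex);

my $mech = WWW::Mechanize::Firefox->new();

open(my $input, '<', 'urls.txt') or die "Can't open file: $!";
while (my $url = <$input>) {
    chomp $url;
    next unless $url;
    $mech->get($url);
    my $png  = $mech->content_as_png();   # raw PNG bytes of the rendered page
    my $file = md5_hex($url) . '.png';    # assumed naming scheme - pick your own
    open(my $out, '>:raw', $file) or die "Can't write $file: $!";
    print {$out} $png;
    close($out);
}
close($input);
[/PHP]
That way every rendered page ends up as a PNG on disk - exactly the kind of
job HTTrack cannot do.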
What do you think!?
Greetings