HTTrack Website Copier
Free software offline browser - FORUM
Subject: Problem getting images
Author: -
Date: 11/22/2011 12:57
 
I'm working on a compression benchmark and want to grab PNG images from the
Alexa top 100 websites, yet I can't make it work even with number 1,
Google.

What I'm running is 
httrack "google.com" -B %H -e -n -t -%e1R1 -%P "+*.png"

As you can see, I've tried just about every brute-force option, but I still get
an empty page. Weird. The log shows:
12:55:54	Warning: 	Redirected link is identical because of 'URL Hack' option:
<http://google.com/robots.txt> and www.google.com/robots.txt

12:55:54	Warning: 	File has moved from <http://google.com/robots.txt> to
<http://www.google.com/robots.txt>

12:55:55	Warning: 	Redirected link is identical because of 'URL Hack' option:
<http://google.com/> and www.google.com/

12:55:55	Warning: 	File has moved from <http://google.com/> to
<http://www.google.com/>

12:55:55	Error: 	"Unable to get server's address: Unknown error" (-5) after 2
retries at link %h/robots.txt (from primary/primary)

12:56:01	Error: 	"Unable to get server's address: Unknown error" (-5) after 2
retries at link %h/ (from primary/primary)


If I add www:
httrack "www.google.com" -B %H -e -n -t -%e1R1 -%P "+*.png"
it grabs way too much, but still skips the main logo.
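
In case it clarifies what I'm aiming for, this is roughly the kind of
invocation I would expect to do the job: start from the www host, keep the
depth shallow, let HTTrack pick up non-HTML files near the pages it parses,
and filter on PNG. It's an untested sketch, and the output directory name is
just a placeholder:

httrack "http://www.google.com/" -O "./google-png" -n -r2 "+*.png" -v

My thinking is that a depth of 2 plus the near-files option (-n) should be
enough to catch the logo on the front page without crawling everything, but I
may be misunderstanding how the filters interact with the other options.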

Could anybody suggest a fix, please?
 