> Neither this solution (using --near) nor the solution
> suggested by Xavier Roche (using -* +*.gif +*.jpg ...etc)
> works.
>
> All I want is that downloading <http://www.google.com>
> should get me:
>
> index.html
> images/logo.gif
>
> That's it. But no combination of flags I use in httrack
> (on Windows or Linux!) seems to get me that.
>
> The --near seemed very promising, but doesn't seem to do
> the trick.
>
> What am I missing? Can someone try it out with the
> extremely simple requirement above in mind and tell me
> what the magic incantation is?
> Best regards,
>
> Sitaram
Well, I just tried getting www.google.com with WinHTTrack
3.23 (beta), let it go for about a minute, and it easily
crawled 40+ pages with images, including that logo.gif
mentioned above. However, I noticed these problems (some
of which look like bugs to me...):
1.) You may have needed to be more general about the
domain (use google.com instead of www.google.com); however,
the following results suggest there is more at play...
2.) Even when I included the more generic google.com in
the project's site list, the local copy had problems with
the links right above the search box (Images, Groups,
Directory, News-New!). When browsing the local copy,
clicking these links sent me to the web copy of the pages,
even though those pages had actually been saved locally and
were ready to use. The link shown on mouseover, and the one
obtained via a right-click Copy Shortcut, was correct, but
not the link followed when it was actually clicked.
Perhaps this is a JavaScript issue? The source of these
html pages still appeared to contain a lot of
www.google.com URLs that should have been rewritten to
local URLs.
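
For what it's worth, the kind of invocation I would
experiment with for the minimal result described above is
roughly the following (the depth value and the scan rule
are guesses on my part, not a tested recipe; a deeper -r2
or a broader filter may be needed to pick up the logo):

  httrack http://www.google.com/ -O ./google-mirror -r1 --near "+*.google.com/images/*"

Here -O sets the output directory, -r1 limits the mirror
depth to the start page, --near asks HTTrack to also fetch
non-html files referenced by a downloaded page (which
should cover the logo image), and the "+..." scan rule
whitelists the images path.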