> I want to spider a site built with IBM WebSphere, and so
> far I haven't succeeded.
> The problem with such a site is that there are no standard
> links between the pages: all the links are javascript
> links. Moreover, the homepage uses frames, and the URLs of
> the frames are complex and computed by a javascript script.
Yuk. I will never understand why people use products that
are so badly designed. Using javascript to produce links
inside a web site is a really stupid way of doing things
IMHO - especially when standard and simple technologies
such as plain HTML links can be used.
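To make the problem concrete (a made-up example), a crawler
can follow the first of these links but has no reliable way
to discover the second, because the target URL only exists
once the script actually runs in a browser:

  <a href="news.html">News</a>
  <a href="javascript:openPage('news' + lang)">News</a>

(openPage() and its argument are hypothetical here.)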
This is bad, because no crawler will ever succeed in
crawling this site:
- offline browsers will never be able to copy the site
- search engines will never index the site (a pretty uncool
feature, huh?)
- and of course disabled/blind people will never get the
chance to read it, because most braille systems just cannot
cope with javascript
> If HTTrack cannot spider such a site, do you know of any
> other application?
No. I don't think that this is possible, except for very
simple cases (httrack can already handle quite simple
cases, I mean _really_ simple ones). Analyzing javascript
sites to rebuild their structure is a REALLY hard thing to
do - I mean even harder than interpreting javascript.
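To give an idea of what "_really_ simple" means, here is a
minimal sketch in Python (not HTTrack's actual code) of the
kind of static scan that can work: it only picks up URLs
that sit as plain string literals inside the javascript.
The pattern and the file extensions are assumptions for
illustration.

import re
from urllib.parse import urljoin

# Quoted literals that look like page URLs, e.g. 'news.html'.
JS_URL_RE = re.compile(
    r"""['"]([^'"<>\s]+\.(?:html?|php|asp))['"]""",
    re.IGNORECASE)

def extract_js_links(page_html, base_url):
    # Resolve each literal against the page's own URL.
    return [urljoin(base_url, m)
            for m in JS_URL_RE.findall(page_html)]

print(extract_js_links(
    '<a href="javascript:openPage(\'news.html\')">News</a>',
    "http://www.example.com/"))
# -> ['http://www.example.com/news.html']

A scan like this is blind to anything computed at run time
(string concatenation, frame URLs built by a script, and so
on) - to get those right you would have to actually execute
the javascript, which is exactly the hard part.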