> as a foreign language teacher i frequently use the web as
> the world's largest corpus of sample texts. whenever i'm
> interested in finding out about the usage of a word, i
> simply turn to google, search for the word and have a look
> at its context(s).
> however, i feel that my work could be easier if there was a
> possibility to save all the pages that google finds in
> plain vanilla ascii to one text file and then parse it with
> a concordance software such as winconc.
> is it possible to include such a feature into a future
> version of httrack?
Well, this is a very specific feature, and I won't
implement it in the near future (many other pending
features have to be implemented first).
If I understood the idea correctly, the goal is to:
- fetch "all" pages found by Google (how many?)
- convert them to plain ASCII
- merge them all into one single file
This would require some advanced scripting; even though it
is beyond the scope of an offline browser like httrack,
it may be feasible using bash-style scripting.
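
As a rough illustration of the last two steps (convert to plain ASCII, merge into one file), here is a minimal Python sketch rather than a bash one. It assumes the pages have already been saved locally as .html files by some other tool; fetching the search results is not covered. The names html_to_ascii, merge_pages, "mirror" and "corpus.txt" are hypothetical, chosen only for this example, and the text extraction is deliberately crude (it drops script/style content and any non-ASCII characters).

```python
from html.parser import HTMLParser
from pathlib import Path


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text found outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self.parts.append(data.strip())


def html_to_ascii(html):
    """Hypothetical helper: strip tags, then drop non-ASCII characters."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = "\n".join(parser.parts)
    return text.encode("ascii", errors="ignore").decode("ascii")


def merge_pages(html_dir, out_file):
    """Hypothetical helper: merge every .html file under html_dir
    into one plain-text file and return the number of pages merged."""
    pages = sorted(Path(html_dir).rglob("*.html"))
    with open(out_file, "w", encoding="ascii") as out:
        for page in pages:
            # Assumes UTF-8 pages; undecodable bytes are replaced, then dropped.
            html = page.read_text(encoding="utf-8", errors="replace")
            out.write(html_to_ascii(html) + "\n\n")
    return len(pages)
```

Calling merge_pages("mirror", "corpus.txt") on a directory of saved pages would then produce a single text file that a concordance program could load.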