Hi there,
I am trying to make a copy of a phpBB forum. Does anyone here have experience with this?
I need to harvest and collect some data. Why do I need to collect the data,
you may ask? I am a researcher and I want to do some socio-ethnographic
research, and for that I need to harvest the data.
Harvest is an integrated set of tools to gather, extract, organize, search,
cache, and replicate relevant information. I need to gather information from
a phpBB2 board.
Can HTTrack be tailored to harvest information and digest it into different
formats?
I need to fetch data from an online forum (a phpBB board) and store it
locally in a MySQL database.
Is this possible with HTTrack?
In short, I need to collect some of the data from a site.
I look forward to hearing from you.
How would I build a kind of spider for this? I do not need all of the data, but
for the empirical research that I have in mind I need all the threads, with the
following data:
poster (username)
timestamp
thread number
thread title
message body (of course)
and a few other small data fields.
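To make this more concrete, here is a minimal sketch of the record I have in mind and how it could be stored locally. Everything in it is an assumption for illustration only: the table name `posts`, the column names, and the sample row are made up, and it uses Python's built-in `sqlite3` module as a stand-in for MySQL so that it runs without a database server.

```python
import sqlite3
from dataclasses import dataclass, astuple


@dataclass
class Post:
    # Hypothetical field names, mirroring the list of data I need.
    thread_number: int
    thread_title: str
    username: str
    timestamp: str
    message_body: str


def store_posts(conn, posts):
    """Create the (hypothetical) posts table if needed and insert the rows."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS posts (
               thread_number INTEGER,
               thread_title  TEXT,
               username      TEXT,
               timestamp     TEXT,
               message_body  TEXT
           )"""
    )
    conn.executemany(
        "INSERT INTO posts VALUES (?, ?, ?, ?, ?)",
        [astuple(p) for p in posts],
    )
    conn.commit()


if __name__ == "__main__":
    # sqlite3 in memory stands in for the local MySQL database.
    conn = sqlite3.connect(":memory:")
    store_posts(conn, [Post(1, "Example thread", "alice", "2007-01-01 12:00", "Hello")])
    for row in conn.execute("SELECT * FROM posts"):
        print(row)
```

With a real MySQL server the same idea should carry over, but the driver and the placeholder syntax would differ.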
What do you think? Is this possible with HTTrack, or should I write a
Perl script and run it against the forum?
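If a script turns out to be the better route, I imagine the extraction step would look roughly like the sketch below. It is written in Python rather than Perl, purely for illustration, and it only parses HTML that has already been saved locally (for example by a mirror). The CSS class names `name`, `postdetails`, and `postbody` are my guess at how the board's template marks up a post and would have to be checked against the actual page source; the helper names `PostExtractor` and `extract_posts` are made up.

```python
from html.parser import HTMLParser


class PostExtractor(HTMLParser):
    # Assumed mapping from the template's span classes to my field names.
    FIELDS = {"name": "username", "postdetails": "details", "postbody": "message_body"}

    def __init__(self):
        super().__init__()
        self.posts = []
        self._current = {}
        self._field = None   # field currently being collected, if any
        self._depth = 0      # nesting depth of <span> inside that field

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self._field is not None:
            self._depth += 1  # nested span inside a field we are collecting
            return
        field = self.FIELDS.get(dict(attrs).get("class"))
        if field is None:
            return
        if field == "username" and self._current:
            # A new username span is taken as the start of the next post.
            self.posts.append(self._current)
            self._current = {}
        self._field, self._depth = field, 1

    def handle_endtag(self, tag):
        if tag == "span" and self._field is not None:
            self._depth -= 1
            if self._depth == 0:
                self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] = self._current.get(self._field, "") + data

    def close(self):
        super().close()
        if self._current:
            self.posts.append(self._current)
            self._current = {}


def extract_posts(html):
    """Return one dict per post found in a saved topic page."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return [{k: v.strip() for k, v in p.items()} for p in parser.posts]
```

The thread number and thread title are not covered here; they would have to come from the page URL and title, which I have not looked at yet.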
I have the beginnings of a script; see here:
forums.devshed.com/perl-programming-6/minor-change-in-lwp-need-ideas-how-to-accomplish-388061.html
See also phpbbdoctor.com/doc_tables.php for a list of the tables. I do not need
all of them.
I look forward to hearing from you. I would like to see how I can solve this
with HTTrack. Mail me at floobee@web.de. I hope to hear from you soon.
Regards,
butcher