HTTrack Website Copier
Free software offline browser - FORUM
Subject: Web site structure not hierarchical -> problem
Author: oboe
Date: 04/05/2007 16:04
 
Hello, 

Using the latest version of HTTrack, I want to download the files of the French
Bulletin officiel des impôts (BOI for short), which is located at
<http://alize.finances.gouv.fr/dgiboi/boi2001/boi.htm>

As you can see, there is no home page, and the files can be accessed from any
of the .../boi[yyyy]/boi.htm pages. I give HTTrack the
<http://alize.finances.gouv.fr/dgiboi/boi2001/boi.htm> page as the one from
which to begin. I have set HTTrack's filter options to fetch PDF files and to
go to outside level 3, but that is not enough. For instance, the mirror gets
down to page <http://alize.finances.gouv.fr/dgiboi/boi2007/4FEPUB/4fe_a.htm>
but I need it to go one level further, to the 4th, in order to get files such as
<http://alize.finances.gouv.fr/dgiboi/boi2007/4FEPUB/textes/4a207/4a207.pdf>

In other words, the PDF files located deep in the other years' folders are not
downloaded. How can I get them without making one mirror per year?
Does anyone have a clue?
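
To make the layout concrete, here is a small Python sketch of what I mean by
the per-year entry pages. It only builds the list of .../boi[yyyy]/boi.htm
start URLs and a candidate HTTrack command line; it does not run anything.
The year range 2001-2007, the output folder name, the depth value and the
helper names are my own assumptions, and I have not verified that this
combination of options solves the problem.

```python
import shlex

# Base address taken from the URLs quoted above.
BASE = "http://alize.finances.gouv.fr/dgiboi"


def start_urls(first_year=2001, last_year=2007):
    """Return one boi.htm entry page per year (the year range is an assumption)."""
    return [f"{BASE}/boi{year}/boi.htm" for year in range(first_year, last_year + 1)]


def httrack_command(urls, output_dir="boi-mirror", depth=6):
    """Build a candidate command line (hypothetical helper, untested settings).

    -O sets the output path, -rN sets the mirror depth, and an argument
    starting with "+" is an include filter.
    """
    pdf_filter = "+alize.finances.gouv.fr/dgiboi/*.pdf"
    return ["httrack", *urls, "-O", output_dir, f"-r{depth}", pdf_filter]


if __name__ == "__main__":
    # shlex.join quotes the "*" in the filter so a shell does not expand it.
    print(shlex.join(httrack_command(start_urls())))
```

Giving every yearly boi.htm page as a starting address in a single project
would at least avoid keeping one mirror per year.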
 



