You can certainly add inclusive filters.
Like: +<web root>/ViewPdf?artid=*
With a project this large, an incomplete mirror is a real possibility.
Are there Article Link/Index pages?
Are article URLs predictable?
It may be better to break this into many projects.
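
If the article URLs do turn out to be predictable, one way to split the job is to generate the URLs yourself and write them out in batches, one batch per project. Below is a minimal Python sketch of that idea. The base URL, the ID range, the batch size and the output file names are all placeholders I made up, so substitute the real ones. It also assumes artid is a plain running number, which may not be true for your site.

```python
from pathlib import Path

# Hypothetical URL pattern: replace example.com with the real web root.
BASE_URL = "http://example.com/ViewPdf?artid={}"


def batch_urls(first_id, last_id, batch_size):
    """Yield lists of article URLs, at most batch_size URLs per list."""
    ids = range(first_id, last_id + 1)
    for start in range(0, len(ids), batch_size):
        yield [BASE_URL.format(i) for i in ids[start:start + batch_size]]


def write_batches(first_id, last_id, batch_size, out_dir="url_lists"):
    """Write each batch to its own text file, one URL per line."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    batches = batch_urls(first_id, last_id, batch_size)
    for n, urls in enumerate(batches, start=1):
        path = out / f"batch_{n:03d}.txt"
        path.write_text("\n".join(urls) + "\n")
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Assumed ID range 1..10000, split into projects of 1000 articles each.
    for p in write_batches(1, 10000, 1000):
        print(p)
```

Each resulting file holds one URL per line, so it could be used as the URL list for a separate project.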
Is there a simple way to verify the completeness of the mirror? (Do you know the
total file size of the articles?)
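
If you do know the expected article count or total size, a rough check is to count the files in the mirror folder and add up their sizes, then compare. Here is a small Python sketch of that. It assumes the articles end up on disk with a .pdf extension, which depends on how the server names them and on your HTTrack settings, so adjust the suffix if needed. The folder name in the usage part is hypothetical.

```python
from pathlib import Path


def mirror_summary(mirror_dir, suffix=".pdf"):
    """Return (file_count, total_bytes) for files under mirror_dir
    whose names end in suffix (compared case-insensitively)."""
    count = 0
    total = 0
    for path in Path(mirror_dir).rglob("*"):
        if path.is_file() and path.name.lower().endswith(suffix):
            count += 1
            total += path.stat().st_size
    return count, total


if __name__ == "__main__":
    # "my_mirror" is a placeholder for the project's output folder.
    count, total = mirror_summary("my_mirror")
    print(f"{count} files, {total} bytes")
```

Matching numbers do not prove every file is intact, but a large gap is a quick sign that the mirror stopped early or the filters missed something.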
HTTrack follows links & downloads based on:
domain/URL/file-name/file-type/file-size
Are articles identifiable by these aspects?
What do the sites/articles look like?