> > > Does the first page mirrored show the results
> > > page?
> > Yes. I ran a test limit of 100 on the site
> > description above. The page displays correctly, but
> > when I click on an image link (or record link) it
> > takes me to an ARC Time-Out Page. The same happens
> > for the record link.
> Those were not-mirrored pages. You can prevent
> those with options -> Build -> No external pages
Okay. Did it.
>
> > > what files aren't you getting?
> > The linked image files (gif). Scan Rule is set to
> > get *gif, *jpg, etc.
> Do you mean +*gif +*jpg?
Yes, sorry. +*gif, etc.
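
For reference, here is a rough sketch of how include rules such as +*gif and +*jpg pick out URLs. It uses Python's fnmatch as a stand-in, not HTTrack's own matcher, so treat it only as an approximation. The function name `included`, the example.org host, and the "last matching rule wins" ordering are assumptions made for illustration.

```python
from fnmatch import fnmatchcase

def included(url, rules):
    """Apply '+pattern' / '-pattern' rules in order.

    Returns True or False for the last rule that matched,
    or None when no rule applies to the URL at all.
    (Assumed semantics for illustration, not HTTrack's code.)
    """
    verdict = None
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if sign not in "+-":
            # A bare "*gif" without a sign is not a valid rule here.
            raise ValueError("rule must start with + or -: " + rule)
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict

# Hypothetical host; the path is the one quoted in this thread.
url = "http://example.org/arc/laf/nara/images/select/iconLAButt_2.gif"

print(included(url, ["+*gif", "+*jpg"]))
print(included("http://example.org/index.html", ["+*gif", "+*jpg"]))
```

The leading sign is what turns a pattern into an include or exclude rule, which is why "*gif" and "+*gif" are not the same thing.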
>
> > > What does the log say? Did you set the log to
> > > debug?
> > I don't know if the log is set to debug (I'm
> options -> log -> create log files -> Select box
Yes. Set by default.
>
> Some of the links look like:
> src="/arc/laf/nara/images/select/iconLAButt_2.gif"
> The default is to scan only downward from the starting
> url.
> Try options -> experts -> Travel mode= up/down
>
Okay. Did it.
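
As a minimal sketch of why a "downward only" scan would skip that image: the src begins with "/", so it resolves against the site root and can land outside the starting directory. The start URL below is hypothetical (example.org and /arc/basic/ are placeholders, not the real site), and `is_below` is an illustrative helper, not HTTrack's actual logic.

```python
from urllib.parse import urljoin, urlsplit
import posixpath

def is_below(start_url, link_url):
    """True if link_url is on the same host and at or below
    the directory that contains start_url."""
    start, link = urlsplit(start_url), urlsplit(link_url)
    if start.netloc != link.netloc:
        return False
    start_dir = posixpath.dirname(start.path).rstrip("/") + "/"
    return link.path.startswith(start_dir)

# Hypothetical starting page; only the src value comes from this thread.
START = "http://example.org/arc/basic/results.html"
SRC = "/arc/laf/nara/images/select/iconLAButt_2.gif"

resolved = urljoin(START, SRC)   # root-relative: replaces the whole path
print(resolved)
print(is_below(START, resolved))
```

Under these assumptions the image sits in a sibling branch of the starting directory, so reaching it needs a mode that can go up as well as down.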
I tried several tests (errors = 0). The initial page (the captured URL page)
displays correctly, but the links on that page consistently send me to a
"Search Not Available" page.
The links are associated with (a) a thumbnail gif (or image icon); and (b) a
minor text record description. Clicking the gif or image icon will get a large
version of the image on the NARA site. Clicking on the short text record
description will get a full text record on the NARA site.
These two items, when clicked locally, send me to a "Search Not Available"
page.
HTTrack is mentioned on the Archives site at:
<http://www.archives.gov/records-mgmt/bulletins/2005/2005-02b.html>
In the first paragraph of that page, there is a link to general and technical
specifications for harvesting (see Appendix B and C). I hope this helps.
Thanks for your continued help. I'm not so sure we're going to be able to
make this work, but I'm willing to keep trying if you are.