Bill et al,
Thanks for your help. R5 and Robots were used as precautions, but neither is
needed for or related to my original issue. I believe the real issue is that all
sub dirs below <http://www.readynas.com/download/GPL/> are generated by PHP
code, i.e. <http://www.readynas.com/download/GPL/index.php?dir=>.
Bill, unless you can provide examples of code, I believe HTTrack is not
capable of mirroring the sub dirs. I've read your comments throughout the
forums. You are very knowledgeable, but seem to only comment on things. I
would be grateful for guidance on a solution if one is possible.
This may help others:
Using HTTrack's %[], I have developed code to create a skeleton directory of
the PHP-generated sub dirs below ".../GPL/" and created an index.html for each.
The links in the web pages from the created index.html files point back to my
own server.
CODE: {httrack "http://www.readynas.com/download/GPL/" -O
"./ReadyNAS_GPL_Backup" +* -*.zip -*.bz2 -*.tar -*.gz -N "%h%p/%[dir]/%n.%t"
-n -v}
NOTICE: Sometimes I've received 'Error: "Not Found" (404) at link
www.readynas.com/download/GPL/'. I believe folks at ReadyNAS are starting to
block this kind of access.
RESULTS: The links from the index.html of the ".../GPL/" dir work fine.
However, the links from the index.html of any of its sub dirs are coded for
the local server.
It would be nice if I could figure out how to make the links within the
index.html files of the sub dirs refer back to the original server, but I
believe the www.readynas.com PHP code is causing this.
IDEA: What are your thoughts? If all links within each index.html referred back
to the host server, I might be able to use Apache to set up a local site, then
use HTTrack to create a full local mirror.
QUESTION: Does anyone have code to list the URLs embedded in the index.html
files that I've created? I've tried sed, awk, bash, and get_links.py with no
success.
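SKETCH: To make the question concrete, this is roughly the kind of thing I am
after. It is only a minimal sketch using Python's standard html.parser and
urllib.parse modules, and it has not been run against the real ReadyNAS pages.
The names LinkCollector and list_links, and the script name list_links.py, are
my own placeholders. It assumes the links sit in ordinary href/src attributes;
the optional base URL only helps if the hrefs are still relative to the
original page rather than already rewritten for the local mirror.

```python
#!/usr/bin/env python3
"""List the URLs found in href/src attributes of an HTML file (sketch)."""
import sys
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href and src attribute values in document order."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def list_links(html_text, base_url=None):
    """Return the links in html_text, resolved against base_url if given."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    if base_url is None:
        return parser.links
    # urljoin leaves absolute URLs alone and resolves relative ones.
    return [urljoin(base_url, link) for link in parser.links]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python3 list_links.py index.html [base_url]
    base = sys.argv[2] if len(sys.argv) > 2 else None
    with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
        for link in list_links(fh.read(), base):
            print(link)
```

Run as "python3 list_links.py index.html" to print the links as written in the
file, or add "http://www.readynas.com/download/GPL/" as a second argument to
print them resolved against the original server.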
Thanks