HTTrack Website Copier
Free software offline browser - FORUM
Subject: "Global Travel Mode" and Rules
Author: Stefan Wagner
Date: 07/20/2007 12:07
 
Hello,

I want to spider a set of documents on various domains. I have some web
addresses and a list of rules, and I set the "global travel mode" to "go
everywhere on the web". Now my rules don't seem to apply anymore. Could that be?
Here are my rules:
---------------
-*
+www.fh-nuernberg.de/institutionen/fachbereiche/informatik/*
+www.fh-nuernberg.de/seitenbaum/home/fachbereiche/informatik/*
+linux-wiki.informatik.fh-nuernberg.de/*
+*fbi.informatik.fh-nuernberg.de/*
+virtuohm.fh-nuernberg.de/pruefungsanmeldung/*
+*.informatik.fh-nuernberg.de/*
+jobboerse.fh-nuernberg.de/*
+www.fh-nuernberg.de/institutionen/fachbereiche/allgemeinwissenschaften/*
+www.fh-nuernberg.de/seitenbaum/home/fachbereiche/allgemeinwissenschaften/*
+www.ai.fh-nuernberg.de/*
+www.fh-nuernberg.de/aw/profs/*
-mime:*/* +mime:text/html +mime:text/text +mime:application/pdf
+*.pdf +*.html
-*.zip -*.tar -*.tgz -*.gz -*.rar -*.z -*.exe
-*.gif -*.jpg -*.png -*.tif -*.bmp
-*.css -*.js
-----------------

The URL to spider is
<http://www.fh-nuernberg.de/institutionen/fachbereiche/informatik/sitemap/page.html>

All documents linked from here that match the rules should be spidered,
but HTTrack also fetches URLs which are outside my area of interest.
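
A possible explanation can be sketched in Python, assuming that a scan rule declared later takes priority over earlier ones, so that the last matching rule decides. The function name `is_accepted`, the scheme stripping, the use of `fnmatchcase` as a stand-in for HTTrack's wildcard matching, and the `www.example.com` URLs are all assumptions made for illustration, not HTTrack internals. Only a subset of the rules is used and the mime rules are left out.

```python
from fnmatch import fnmatchcase

# Subset of the scan rules from the post, in their original order.
RULES = [
    "-*",
    "+www.fh-nuernberg.de/institutionen/fachbereiche/informatik/*",
    "+*.informatik.fh-nuernberg.de/*",
    "+*.pdf",
    "+*.html",
    "-*.zip",
    "-*.gif",
]


def strip_scheme(url):
    """Drop a leading scheme; assumes rules are matched against host/path."""
    for prefix in ("http://", "https://"):
        if url.startswith(prefix):
            return url[len(prefix):]
    return url


def is_accepted(url, rules=RULES):
    """Return the verdict of the last matching rule (assumed semantics)."""
    target = strip_scheme(url)
    accepted = False  # hypothetical default when no rule matches at all
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(target, pattern):
            accepted = (sign == "+")  # a later match overrides earlier ones
    return accepted


if __name__ == "__main__":
    for url in (
        "http://www.fh-nuernberg.de/institutionen/fachbereiche/informatik/sitemap/page.html",
        "http://www.example.com/index.html",  # hypothetical off-site page
        "http://www.example.com/archive.zip",  # hypothetical off-site archive
    ):
        print(url, is_accepted(url))
```

Under that assumption, the late `+*.html` and `+*.pdf` rules re-admit any HTML or PDF URL on any host, which would match the off-site fetches once the travel mode no longer restricts the crawl. Dropping those two generic rules and relying on the host-specific ones might be worth trying.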

Any hints or ideas? 

Ciao!
  Stefan
 