HTTrack Website Copier
Free software offline browser - FORUM
Subject: Limiting HTTrack to One Domain
Author: Jake
Date: 01/21/2015 18:50
 
I've searched the internet and this forum for an answer to this problem, but
I have not been able to find one that works.

I am trying to download only text (.html, .php, etc., but not .jpg, .mov,
etc.) from a single domain. Let's say it's MyDomain.com.

So, I open the newest version of HTTrack, for Windows x64, with GUI. I make a
new project, and in the Scan Rules I have the following:

+*http://mydomain.com/*
+*.htm +*.html +*.txt +*.pdf +*.asp +*.jsp

Followed by all the filetypes listed in the GUI, but changed to "-":
-*.gif -*.jpg -*.jpeg -*.png -*.tif -*.bmp
-*.zip -*.tar -*.tgz -*.gz -*.rar -*.z -*.exe
-*.mov -*.mpg -*.mpeg -*.avi -*.asf -*.mp3 -*.mp2 -*.rm -*.wav -*.vob -*.qt
-*.vid -*.ac3 -*.wma -*.wmv


This does indeed get only the text files, but it downloads them from every
linked domain as well. Meaning that if "mydomain.com" links to
"otherdomain.org/somepage.html", HTTrack will make a new directory called
"otherdomain.org" and stick "somepage.html" inside it. How do I prevent
this? I would like HTTrack to follow every link within my domain, but no links
that lead to other domains.
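
To make the behaviour above easier to reason about, here is a minimal Python
sketch. It is not HTTrack code. It assumes that scan rules are checked in
order and that the last matching rule decides, which is how the HTTrack filter
documentation describes it, and it approximates HTTrack's "*" wildcard with
Python's fnmatch. The function name is_allowed, the default of "refuse when
nothing matches", and the scoped_rules list are all hypothetical and have not
been tested against HTTrack itself.

```python
from fnmatch import fnmatchcase


def is_allowed(url, rules, default=False):
    """Return the verdict of the LAST rule that matches `url`.

    Assumption: later rules override earlier ones. `default` is what
    happens when no rule matches at all (simplified to "refuse" here).
    """
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        # fnmatch's "*" matches any run of characters, including "/".
        if fnmatchcase(url.lower(), pattern.lower()):
            verdict = (sign == "+")
    return verdict


# A subset of the rules from the post, in the same order.
posted_rules = [
    "+*http://mydomain.com/*",
    "+*.htm", "+*.html", "+*.txt", "+*.pdf", "+*.asp", "+*.jsp",
    "-*.gif", "-*.jpg", "-*.jpeg", "-*.png",
]

# Hypothetical alternative: refuse everything first, then allow the
# wanted filetypes only when the URL also contains the domain.
scoped_rules = [
    "-*",
    "+*mydomain.com/*.htm", "+*mydomain.com/*.html",
    "+*mydomain.com/*.txt", "+*mydomain.com/*.pdf",
]

external = "http://otherdomain.org/somepage.html"
internal = "http://mydomain.com/docs/page.html"

print(is_allowed(external, posted_rules))  # "+*.html" has no domain part
print(is_allowed(external, scoped_rules))
print(is_allowed(internal, scoped_rules))
```

Under these assumptions, "+*.html" matches pages on any host, which would
explain why "otherdomain.org/somepage.html" gets downloaded. Tying each "+"
pattern to the domain, after a leading "-*", is one way to express "only these
filetypes, only on this domain".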

I have looked at various forum posts, like this one:
<http://forum.httrack.com/readmsg/16341/16336/index.html>

I have tried what they suggested, but have found nothing that actually
limits HTTrack to the one domain.
 