HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: can not mirror www.w3cschool.com
Author: William Roeder
Date: 10/19/2010 17:53
 
> +*.png +*.gif +*.jpg +*.css +*.js
> -ad.doubleclick.net/* -mime:application/foobar
> +*.asp
Adding +*.asp allows mirroring .asp files from ANYWHERE, potentially mirroring
the entire internet.
Don't use filters this way. If you want everything, just use the near flag (get
non-html files related to a page).
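
To illustrate why an unanchored filter is so broad, here is a rough sketch using Python's fnmatch as a stand-in for wildcard matching. This is NOT HTTrack's filter engine, only an approximation of how a leading * behaves. The URLs and the host-scoped pattern are made up for the illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical links a crawler might come across (not taken from any real log).
urls = [
    "www.w3schools.com/html/default.asp",
    "www.example.com/shop/cart.asp",
]

def matches(pattern, url):
    # fnmatch-style wildcard: "*" matches any run of characters, "/" included.
    return fnmatchcase(url, pattern)

# Unanchored pattern: nothing ties it to a host, so every .asp URL qualifies.
broad = [u for u in urls if matches("*.asp", u)]

# Host-scoped pattern (an assumed alternative): only .asp URLs on that one host.
scoped = [u for u in urls if matches("www.w3schools.com/*.asp", u)]

print(broad)
print(scoped)
```

The unanchored pattern accepts both URLs, while the scoped one accepts only the URL on the named host.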

> This is the rules, all the other setting are
> default,however I can not mirror
> "www.w3schools.com",many of its pages are asp.
asp is irrelevant. You cannot get server-side files from the public side of a
web server, only the HTML output and related files.

> However this site can be
> mirrored---http://www.w3school.com.cn/ Its page
> are asp also.
That's because that site doesn't have a robots.txt blocking /images etc., which
you would have known had you looked at the log file.
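
For anyone wanting to see how a robots.txt rule like that ends up blocking files, here is a minimal sketch with Python's standard urllib.robotparser. The robots.txt content, the example.com URLs, and the "ExampleBot" agent name are all hypothetical; the real file on either site may differ, so check the site's own robots.txt and the mirror's log.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, assumed for illustration only.
robots_txt = """\
User-agent: *
Disallow: /images/
"""

parser = RobotFileParser()
# parse() takes the file's lines, so no network access is needed here.
parser.parse(robots_txt.splitlines())

# "ExampleBot" is a made-up user agent; the "*" rules apply to it.
image_allowed = parser.can_fetch("ExampleBot", "http://www.example.com/images/logo.gif")
page_allowed = parser.can_fetch("ExampleBot", "http://www.example.com/html/default.asp")

print("image allowed:", image_allowed)
print("page allowed:", page_allowed)
```

A crawler that honours these rules would fetch the page but skip everything under /images/, which is the kind of thing that shows up in the log.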
 