> I'm trying to download;
> <http://forums.station.sony.com/mxo/forums/list.m>
> I've set the rules;
> +http://forums.station.sony.com/mxo/posts/*
> +http://forums.station.sony.com/mxo/forums/*
> +*.css +*.js -ad.doubleclick.net/*
> -mime:application/foobar
> +*.gif +*.jpg +*.png +*.tif +*.bmp
> +*.zip +*.tar +*.tgz +*.gz +*.rar +*.z +*.exe
>
> But the problem is that HTTrack doesn't pick up the
> posts.
You don't need the gif/jpg/etc. filters, since you aren't filtering anything
out. Just set
Links -> get non-HTML
You don't need the +*/mxo/forums/* either, since that is your starting point.
You do need the +*/mxo/posts/*, since the default is to go down only and the
posts are up and over from the forums.
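To see why the posts fall outside a "go down only" crawl, here is a rough Python sketch of that kind of scope check. This is not HTTrack's code, just an illustration of the idea; the function name is_below and the page names show/1.m and posts/list.m are made up for the example.

```python
from urllib.parse import urlsplit


def is_below(start_url, candidate_url):
    """Return True if candidate_url is on the same host and lives in, or
    under, the directory of start_url (a simplified "down only" rule)."""
    start = urlsplit(start_url)
    cand = urlsplit(candidate_url)
    if start.netloc.lower() != cand.netloc.lower():
        return False
    # Directory part of the start path, e.g. "/mxo/forums/" for ".../forums/list.m"
    start_dir = start.path.rsplit("/", 1)[0] + "/"
    return cand.path.startswith(start_dir)


START = "http://forums.station.sony.com/mxo/forums/list.m"

# Hypothetical page under the starting directory: in scope.
print(is_below(START, "http://forums.station.sony.com/mxo/forums/show/1.m"))  # True

# Hypothetical page under /mxo/posts/: a sibling directory, so out of scope
# unless a filter such as +*/mxo/posts/* lets it in.
print(is_below(START, "http://forums.station.sony.com/mxo/posts/list.m"))  # False
```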
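You don't get anything else because the page has:
<meta name="robots" content="nofollow" />
That tells the spider not to follow the links on the page. To get past it, set
Options -> Spider = no robots.txt (that setting covers the meta robots tag too).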
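If you want to check whether a page carries that tag before you start a mirror, something like the following Python sketch would do it. It only uses the standard library; the class and function names are my own, and the sample page string is a stand-in for the real HTML you would fetch.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() != "robots":
            return
        for part in attr.get("content", "").split(","):
            part = part.strip().lower()
            if part:
                self.directives.add(part)


def robots_directives(html):
    """Return the set of robots meta directives present in an HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    parser.close()
    return parser.directives


# Stand-in page containing the tag quoted above.
page = '<html><head><meta name="robots" content="nofollow" /></head></html>'
print("nofollow" in robots_directives(page))  # True
```

If "nofollow" shows up, a spider that honors the tag will stop at that page no matter what filters you set.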