Hello,
I'm trying to have HTTrack download podcasts from this site:
<http://learnrussianstepbystep.com/language/learn-russian/>
I only want mp3 files, but from what I've read, I need to get the html files
since HTTrack must download these to find the links. After having no success
with include filters, I've settled on this filter instead:
-*.ai -*.aif -*.aifc -*.aiff -*.asf -*.asr -*.asx -*.au -*.avi -*.axs
-*.bcpio -*.bin -*.bmp -*.cat -*.cdf -*.cdf -*.cer -*.class -*.clp -*.cmx
-*.cod -*.cpio -*.crd -*.crl -*.crt -*.csh -*.dcr -*.der -*.dir -*.dll -*.dms
-*.doc -*.dot -*.dvi -*.dxr -*.eps -*.etx -*.evy -*.exe -*.fif -*.flr -*.flv
-*.gif -*.gtar -*.gz -*.hdf -*.hlp -*.hqx -*.hta -*.ico -*.ief -*.iii -*.ins
-*.isp -*.jfif -*.jpe -*.jpeg -*.jpg -*.js -*.latex -*.lha -*.lsf -*.lsx
-*.lzh -*.m13 -*.m14 -*.m3u -*.man -*.mdb -*.me -*.mid -*.mny -*.mov -*.movie
-*.mp2 -*.mp3 -*.mpa -*.mpe -*.mpeg -*.mpg -*.mpp -*.mpv2 -*.ms -*.msg -*.mvb
-*.nc -*.oda -*.p10 -*.p12 -*.p7b -*.p7c -*.p7m -*.p7r -*.p7s -*.pbm -*.pdf
-*.pfx -*.pgm -*.pko -*.pma -*.pmc -*.pml -*.pmr -*.pmw -*.png -*.pnm -*.pot
-*.ppm -*.pps -*.ppt -*.prf -*.ps -*.pub -*.qt -*.ra -*.ram -*.ras -*.rgb
-*.rmi -*.roff -*.rtf -*.scd -*.setpay -*.setreg -*.sh -*.shar -*.sit -*.snd
-*.spc -*.spl -*.src -*.sst -*.stl -*.sv4cpio -*.sv4crc -*.svg -*.swf -*.t
-*.tar -*.tcl -*.tex -*.texi -*.texinfo -*.tgz -*.tif -*.tiff -*.tr -*.trm
-*.tsv -*.uls -*.ustar -*.vcf -*.vrml -*.wav -*.wcm -*.wdb -*.wks -*.wmf
-*.wps -*.wri -*.wrl -*.wrz -*.xaf -*.xbm -*.xla -*.xlc -*.xlm -*.xls -*.xlt
-*.xlw -*.xof -*.xpm -*.xwd -*.z -*.zip
+*.mp3
It's ridiculous, I know, but it basically excludes all common web file types and
then includes mp3 files again. This has worked before with some success. My external
mirror limit is 0, as I do not want to spider a ton of external sites.
The upshot is that nothing except the index page is downloaded. Why?
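
To show what I expect the filter above to do, here is a rough Python sketch of my understanding that a later rule overrides an earlier one. It is only an illustration, not HTTrack's actual code: `is_allowed` is a hypothetical helper, `fnmatch` stands in for HTTrack's own wildcard matching, the rule list is shortened, and the `default` verdict for URLs that match no rule is my assumption.

```python
from fnmatch import fnmatch


def is_allowed(url, rules, default=True):
    """Return True if url passes the rules; the last matching rule wins.

    Hypothetical helper: each rule is '+pattern' or '-pattern'.
    """
    verdict = default  # assumed verdict when no rule matches (e.g. HTML pages)
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatch(url, pattern):
            # A later matching rule overrides any earlier verdict.
            verdict = (sign == "+")
    return verdict


# Shortened version of the filter list, keeping the same order.
rules = ["-*.gif", "-*.jpg", "-*.mp3", "-*.zip", "+*.mp3"]

for url in (
    "http://example.com/audio/lesson1.mp3",
    "http://example.com/img/logo.gif",
    "http://example.com/lesson1/",
):
    print(url, is_allowed(url, rules))
```

If that reading is right, the mp3 files should be accepted by the final `+*.mp3` and the HTML pages should pass because no rule excludes them, which is why the result confuses me.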