HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Get only .mp3 files from specific site
Author: Darren
Date: 04/08/2019 17:37
 
I have just done a similar thing with the free Desert Island Discs archive on
bbc.co.uk.
  
I too could not get this to work without some probably unnecessary messing
about.

First, I scraped the 'base' URL using HTTrack to collect the programme names
(in your case, episode numbers).

In your case, scrape
<https://absoluteradio.co.uk/schedule/the-christian-oconnell-breakfast-show-2/episodes/oct-2008/>
etc.

Abort the scrape after some time (or check your existing scrape, if you did
not delete it), then look for the folder: podcast.timlradio.co.uk/oconnell/

Hopefully, you will find lots of episode numbers there, for example: 2767,
2815, etc.

Do a DOS DIR of that folder to a text file to get a list of the episode
numbers (for example, dir /b > episodes.txt, where the file name is up to you).
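
If you would rather not tidy up the DIR output by hand, the same listing can
be sketched in Python. This is only a rough sketch under assumptions: the
folder path is relative to wherever your HTTrack project saved the mirror, the
entries are named after the episode number (with or without an extension), and
the function name and the output file name episodes.txt are made up for the
example.

```python
import re
from pathlib import Path
from typing import Iterable, List


def extract_episode_numbers(names: Iterable[str]) -> List[int]:
    """Keep only entries whose name (minus any extension) is purely numeric."""
    numbers = set()
    for name in names:
        stem = Path(name).stem  # "2767.mp3" -> "2767", "2767" -> "2767"
        if re.fullmatch(r"[0-9]+", stem):
            numbers.add(int(stem))
    return sorted(numbers)


if __name__ == "__main__":
    # Assumed location of the mirrored folder inside the HTTrack project.
    folder = Path("podcast.timlradio.co.uk/oconnell")
    if folder.is_dir():
        found = extract_episode_numbers(p.name for p in folder.iterdir())
        # Hypothetical output file: one episode number per line.
        Path("episodes.txt").write_text("\n".join(map(str, found)) + "\n")
        print(f"Wrote {len(found)} episode numbers to episodes.txt")
    else:
        print(f"Folder not found: {folder}")
```

Anything in the folder that is not a plain number (index pages, temporary
files and so on) is skipped, and duplicates are dropped.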

Create a list of URLs, one for each episode number in the list above, in the
format:
<https://absoluteradio.co.uk/schedule/the-christian-oconnell-breakfast-show-2/episodes/EPISODE_NUMBER>
for example:
<https://absoluteradio.co.uk/schedule/the-christian-oconnell-breakfast-show-2/episodes/2767>
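
Building that list by hand gets tedious with a lot of episodes, so here is a
minimal Python sketch of the same step. The base URL is the one from the
example above; the function name is invented for illustration, and the two
numbers at the bottom are just the ones mentioned in this post, to be replaced
with your own list.

```python
from typing import Iterable, List

BASE = (
    "https://absoluteradio.co.uk/schedule/"
    "the-christian-oconnell-breakfast-show-2/episodes/"
)


def build_episode_urls(numbers: Iterable[str], base: str = BASE) -> List[str]:
    """Append each non-empty episode number to the base URL."""
    urls = []
    for raw in numbers:
        number = raw.strip()  # tolerate stray spaces and newlines
        if number:
            urls.append(base + number)
    return urls


if __name__ == "__main__":
    # Replace these with the episode numbers from your own listing.
    for url in build_episode_urls(["2767", "2815"]):
        print(url)
```

The printed lines can be pasted straight into HTTrack as the list of web
addresses.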

Start a new project in HTTrack, name it, and paste your list of URLs under Web
Addresses: (URL).
Then go to Set options, Scan Rules and add this to limit the downloads:

+*.mp3
+https://absoluteradio.co.uk/
-*.gif -*.jpg -*.jpeg -*.png -*.tif -*.bmp
-*.zip -*.tar -*.tgz -*.gz -*.rar -*.z -*.exe
-*.mov -*.mpg -*.mpeg -*.avi -*.asf -*.mp2 -*.rm -*.wav -*.vob -*.qt -*.vid
-*.ac3 -*.wma -*.wmv
-*.html
-*.html.tmp
-*.txt
-*.ini
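
I did all of this through the GUI, but if you prefer the command-line httrack,
something like the Python sketch below should put together an equivalent
command. Treat it as an assumption-laden sketch: it assumes httrack is on your
PATH, that your URLs are saved one per line in a file (urls.txt is a made-up
name), and that "mirror" is where you want the output. It uses -%L to read the
URL list, -O for the output path, and passes the scan rules as extra
arguments. It only prints the command so you can check it before running it
yourself.

```python
import shlex
from typing import List

# The scan rules from this post, unchanged.
RULES_TEXT = """
+*.mp3
+https://absoluteradio.co.uk/
-*.gif -*.jpg -*.jpeg -*.png -*.tif -*.bmp
-*.zip -*.tar -*.tgz -*.gz -*.rar -*.z -*.exe
-*.mov -*.mpg -*.mpeg -*.avi -*.asf -*.mp2 -*.rm -*.wav -*.vob -*.qt -*.vid
-*.ac3 -*.wma -*.wmv
-*.html
-*.html.tmp
-*.txt
-*.ini
"""


def build_httrack_command(url_list_file: str, output_dir: str,
                          rules_text: str = RULES_TEXT) -> List[str]:
    """Return an argument list: URL file, output path, then one rule each."""
    rules = rules_text.split()  # rules are separated by spaces or newlines
    return ["httrack", "-%L", url_list_file, "-O", output_dir, *rules]


if __name__ == "__main__":
    # Hypothetical file and folder names; adjust to your own project.
    cmd = build_httrack_command("urls.txt", "mirror")
    # Shown with POSIX-style quoting for inspection only; not executed here.
    print(shlex.join(cmd))
```

Passing the list to subprocess.run(cmd) would start the download, and avoids
having to worry about how your shell quotes the * characters.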

I hope that is clear, even if it is long-winded.

Darren
 