I have just done a similar thing with the free Desert Island Discs archive on
bbc.co.uk.
I too could not get this to work without some probably unnecessary messing
about.
First, I scraped the 'base' URL using HTTrack to get the programme names (in
your case, episode numbers).
In your case, scrape
<https://absoluteradio.co.uk/schedule/the-christian-oconnell-breakfast-show-2/episodes/oct-2008/>
and so on.
Abort the scrape after some time (or check your existing scrape, if you did
not delete it) and then look for the folder: podcast.timlradio.co.uk/oconnell/
Hopefully, you will find lots of episode numbers there, for example: 2767,
2815, etc.
Do a DOS DIR to a text file to get a list of the episode numbers (for
example, dir /b > episodes.txt run from inside that folder).
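If you would rather not use DIR, a few lines of Python can pull out the same list. This is only a rough sketch, not something I ran against the Absolute Radio scrape: the function name, the episodes.txt filename and the relative folder path are made up for illustration, and it assumes each episode shows up as a file or sub-folder whose name (ignoring any extension) is just digits.

```python
import re
from pathlib import Path


def list_episode_numbers(folder):
    """Return sorted, de-duplicated episode numbers found in `folder`.

    Assumption: each episode appears as a file or sub-folder whose name,
    ignoring the last extension, is all digits (e.g. "2767" or "2767.mp3").
    """
    numbers = set()
    for entry in Path(folder).iterdir():
        stem = entry.stem  # name without the last extension
        if re.fullmatch(r"[0-9]+", stem):
            numbers.add(int(stem))
    return sorted(numbers)


if __name__ == "__main__":
    # Hypothetical path: point this at the folder inside your own HTTrack project.
    mirror = Path("podcast.timlradio.co.uk/oconnell")
    if mirror.is_dir():
        episodes = list_episode_numbers(mirror)
        # One episode number per line, much like the DIR-to-text-file step.
        Path("episodes.txt").write_text("".join(f"{n}\n" for n in episodes))
        print(f"Wrote {len(episodes)} episode numbers to episodes.txt")
    else:
        print(f"Folder not found: {mirror}")
```

Anything that is not purely numeric (index pages and so on) is skipped, so the output should need less tidying than a raw DIR listing.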
Create a list of URLs in the format
<https://absoluteradio.co.uk/schedule/the-christian-oconnell-breakfast-show-2/episodes/>
followed by an episode number from the list above,
for example:
<https://absoluteradio.co.uk/schedule/the-christian-oconnell-breakfast-show-2/episodes/2767>
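Building the URL list by hand gets tedious with a lot of episodes, so here is a minimal Python sketch of that step. The function name build_urls and the BASE constant are my own; the base address is the one from the example above, and the two episode numbers are just the ones I mentioned earlier.

```python
BASE = (
    "https://absoluteradio.co.uk/schedule/"
    "the-christian-oconnell-breakfast-show-2/episodes/"
)


def build_urls(numbers, base=BASE):
    """Append each episode number to the base address, skipping blank entries."""
    urls = []
    for n in numbers:
        text = str(n).strip()  # tolerate stray spaces/newlines from a text file
        if text:
            urls.append(base + text)
    return urls


if __name__ == "__main__":
    # Example numbers only; in practice, read them from your own list.
    for url in build_urls(["2767", "2815"]):
        print(url)
```

The printed lines can be pasted straight into HTTrack, one URL per line.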
Start a new project in HTTrack, name it, and paste your list of URLs under Web
Addresses: (URL)
Go to Set options, Scan Rules, and add this to limit the downloads:
+*.mp3
+https://absoluteradio.co.uk/
-*.gif -*.jpg -*.jpeg -*.png -*.tif -*.bmp
-*.zip -*.tar -*.tgz -*.gz -*.rar -*.z -*.exe
-*.mov -*.mpg -*.mpeg -*.avi -*.asf -*.mp2 -*.rm -*.wav -*.vob -*.qt -*.vid
-*.ac3 -*.wma -*.wmv
-*.html
-*.html.tmp
-*.txt
-*.ini
I hope that is clear, even if it is long-winded.
Darren