That is a PHP site, so the thousands of pages you "see" are really just a few
PHP pages that serve different content based on the data fed to them after the
'?'.
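
To make that concrete, here is a small Python sketch showing two addresses that hit the same PHP script and differ only in the part after the '?'. The host name and the "pid" parameter are made up for illustration; the post does not say what the real query string looks like.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical URLs: the host and the "pid" parameter name are assumptions.
urls = [
    "http://www.example.com/ws/index.php?pid=101",
    "http://www.example.com/ws/index.php?pid=102",
]

for url in urls:
    parts = urlsplit(url)
    # Same script path every time; only the query data changes.
    print(parts.path, parse_qs(parts.query))
```

Both lines print the same path, which is why a mirror tool has to treat each query string as its own "page".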
But HTTrack can still do it. It will just make hundreds of files, as if they
were actual HTML pages. (Don't override the auto-naming. The file names will be
random, but the content will be there.)
In this case:
2016_election.php is just the listings
2016_election_speeches.php is each candidate's speech listings
ws/index.php is the actual page with the speech
So if your root was "2016_election.php", it would not get the speeches, which
are in the "ws" folder.
You need to add "+ws/index.php*" to your filters.
Also add "+2016_election_speeches.php*" for good measure.
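
If you run HTTrack from the command line instead of the GUI, the same setup might look roughly like the sketch below. It only builds and prints the command, it does not run anything. The host name and the output folder are placeholders I made up; the two filters are the ones suggested in this post, each written with a leading "+", and `-O` is HTTrack's option for the output path.

```python
import shlex

# Hypothetical start URL: the real site's host is not given in this post.
START_URL = "http://www.example.com/2016_election.php"

# Filters based on this post; a leading "+" tells HTTrack to accept matching links.
FILTERS = ["+ws/index.php*", "+2016_election_speeches.php*"]


def build_httrack_command(start_url, output_dir, filters):
    """Return the argument list for an httrack run (nothing is executed)."""
    return ["httrack", start_url, "-O", output_dir, *filters]


command = build_httrack_command(START_URL, "./mirror", FILTERS)
# shlex.join quotes the filters so the shell does not expand the "*".
print(shlex.join(command))
```

Quoting matters here: without it, the shell may try to expand the "*" before HTTrack ever sees the filter.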
The speech text is in a <span> with class="displaytext".
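
Once the pages are mirrored, something like this could pull the speech out of each saved file. It is only a sketch using Python's standard html.parser; the sample HTML is invented, and the real pages may nest things differently, so treat the class and function names here as my own.

```python
from html.parser import HTMLParser


class DisplayTextExtractor(HTMLParser):
    """Collect the text inside <span class="displaytext"> ... </span>."""

    def __init__(self):
        super().__init__()
        self.depth = 0    # > 0 while inside the target span (counts nested spans)
        self.chunks = []  # text pieces found inside the target span

    def handle_starttag(self, tag, attrs):
        if tag != "span":
            return
        if self.depth > 0:
            self.depth += 1  # a nested span inside the speech
        elif "displaytext" in (dict(attrs).get("class") or "").split():
            self.depth = 1   # entering the speech span

    def handle_endtag(self, tag):
        if tag == "span" and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        if self.depth > 0:
            self.chunks.append(data)


def extract_speech(html):
    parser = DisplayTextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks).strip()


# Invented sample page, standing in for one mirrored ws/index.php file.
sample = ('<html><body><span class="displaytext"><p>Thank you all.</p>'
          '<p>Good night.</p></span><span>footer</span></body></html>')
print(extract_speech(sample))
```

For a real run you would read each mirrored file from disk and pass its contents to extract_speech.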
Hope that helps