HTTrack Website Copier
Free software offline browser - FORUM
Subject: Help with regexp
Author: Adam
Date: 06/12/2009 17:01
 
I am writing a script in which a user enters the URL of
a forum thread, and httrack then captures that thread.

At this particular forum, each thread is available in two
styles:

'Normal' view:

<http://www.somesite.com/thread-topic.html>     (First page of thread)
<http://www.somesite.com/thread-topic-2.html>   (Second page of thread)
<http://www.somesite.com/thread-topic-3.html>   (Third page of thread)

'Printable' view:

For the same thread, the URLs become:

<http://www.somesite.com/thread-topic-print.html>     (First page of thread)
<http://www.somesite.com/thread-topic-print-2.html>   (Second page of thread)
<http://www.somesite.com/thread-topic-print-3.html>   (Third page of thread)
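
To make the URL scheme above concrete, here is a rough Python sketch that splits a thread URL into its base, its view and its page number. The pattern and the helper name parse_thread_url are illustrative assumptions, not anything httrack provides, and it assumes every thread URL follows exactly the four shapes listed above.

```python
import re

# Hypothetical pattern for the URL shapes listed above:
#   <base>.html, <base>-N.html, <base>-print.html, <base>-print-N.html
# Caveat: a thread slug that itself ends in "-<digits>" would be misread
# as a page number.
THREAD_URL = re.compile(
    r"^(?P<base>.+?)(?P<print>-print)?(?:-(?P<page>\d+))?\.html$"
)


def parse_thread_url(url):
    """Return (base, printable, page) for a thread URL."""
    match = THREAD_URL.match(url)
    if match is None:
        raise ValueError("not a recognised thread URL: " + url)
    printable = match.group("print") is not None
    # A URL without a page suffix is the first page of the thread.
    page = int(match.group("page") or 1)
    return match.group("base"), printable, page
```

For example, parse_thread_url("http://www.somesite.com/thread-topic-print-2.html") gives the base "http://www.somesite.com/thread-topic", the printable flag set, and page 2.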


Currently my script works as long as the 'normal' view
of a thread is required. In that case the capture
succeeds with the following call to httrack:

httrack.exe http://www.somesite.com/thread-topic.html
-N1 -O "c:\site" -*
+http://www.somesite.com/thread-topic*.html
-http://www.somesite.com/thread-topic-print*.html


But if a user wants to capture the 'printable' version of the thread instead,
the above won't work, because the last filter excludes the '-print' pages.

Is there some sort of regular expression that I can specify so that regardless
of whether a user chooses the 'Normal' or 'Printable' view of a thread, just
the relevant pages are retrieved?
Thanks
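
One possible direction, offered only as a sketch: rather than looking for a single expression, the script could pick the filters according to the view the user asked for. The function below assumes the thread's base URL (without any "-print", page suffix or ".html") and the chosen view are already known. build_httrack_args is a made-up helper name, and the filters use only the plain "*" wildcard. It also assumes that later filters take precedence over earlier ones, which is worth checking against the httrack filter documentation.

```python
def build_httrack_args(base, printable, out_dir):
    """Return an httrack argument list for one view of one thread.

    base      -- thread URL without "-print", page suffix or ".html" (assumed)
    printable -- True for the 'Printable' view, False for the 'Normal' view
    out_dir   -- output directory passed to -O
    """
    if printable:
        start = base + "-print.html"
        # Only the printable pages: <base>-print.html, <base>-print-2.html, ...
        filters = ["-*", "+" + base + "-print*.html"]
    else:
        start = base + ".html"
        # Allow every page of the thread, then exclude the printable ones.
        # Note: "<base>*.html" also matches other threads whose name starts
        # with the same text.
        filters = ["-*", "+" + base + "*.html", "-" + base + "-print*.html"]
    return ["httrack.exe", start, "-N1", "-O", out_dir] + filters
```

The returned list can be handed to something like subprocess.run(), which avoids having to quote the output directory by hand.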
 