HTTrack Website Copier
Free software offline browser - FORUM
Subject: Regular Expression Filter Ranges?
Author: Peter
Date: 06/25/2011 05:49
 
Hi,

I'm having a problem with filter ranges.

I downloaded an entire thread of a forum using the following filter, where
t=123456 was the number of the thread:

-*
+*xyz.com/showthread.php?t=123456&page=*[0-9]

and I used the first page of the thread as the starting page as follows:

<http://xyz.com/showthread.php?t=123456&page=1>

This worked perfectly and downloaded 559 pages of the thread. After some time
had passed, I wanted to download only the newly created pages in the thread,
so I tried the following filter:

-*
+*xyz.com/showthread.php?t=123456&page=*[560-563]

But this did not work at all, I suppose because HTTrack still does not use
standard regular expressions, so I had to use this filter instead:

-*
+*xyz.com/showthread.php?t=123456&page=*[561,562,563]

and I used page 560 of the thread as the starting page as follows:

<http://xyz.com/showthread.php?t=123456&page=560>

Now suppose I needed to download only pages 560 to 1000. It would take me
ages to write the filter for that, as I would have to include the number of
each page from 560 to 1000 in the filter.
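
As a stopgap, the long rule list could be generated rather than typed by hand.
The following is only a sketch in Python: the function name page_filters is
made up for this example, and it assumes that HTTrack accepts one "+" rule per
page and treats a rule without a trailing wildcard as matching the end of the
URL. I have not confirmed either assumption against HTTrack itself.

```python
def page_filters(base, first, last):
    """Build scan rules: one exclude-all line, then one include line per page.

    `base` is the URL pattern up to and including "page=" (hypothetical helper,
    not part of HTTrack).
    """
    lines = ["-*"]  # exclude everything by default
    for page in range(first, last + 1):  # inclusive of `last`
        lines.append(f"+*{base}{page}")
    return lines


if __name__ == "__main__":
    rules = page_filters("xyz.com/showthread.php?t=123456&page=", 560, 1000)
    # Print the rules so they can be pasted into the scan rules box.
    print("\n".join(rules))
```

The output can be pasted into the scan rules, but that is still a list of
hundreds of lines, which is what a real range syntax would avoid.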

So my question is this: is there currently a way to use a range filter for
pages [560 to 1000] without having to include each number from 560 to 1000 in
the filter expression?
If there is currently no way to do this, does the developer of this software
intend to ever include the ability to use standard regular expressions within
the scan filters? If standard regular expressions were included, this software
would be the best website downloader in the world.
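
For reference, this is roughly what the range would look like in standard
regular expression syntax, sketched with Python's re module and not with
HTTrack's filter language. Even in standard syntax, [560-563] is a character
class (a single character from 0 to 6), not the numbers 560 to 563, so a
numeric range has to be spelled out digit by digit. The names PAGE_RANGE,
CHAR_CLASS and in_range are made up for this illustration.

```python
import re

# Matches 560-599, then 600-999, then exactly 1000.
PAGE_RANGE = re.compile(r"(?:5[6-9][0-9]|[6-9][0-9]{2}|1000)")

# In standard regex this is a character class: one of 5, 6, 0-5, 6, 3,
# i.e. any single digit from 0 to 6. It is not a numeric range.
CHAR_CLASS = re.compile(r"[560-563]")


def in_range(page: str) -> bool:
    """Return True if `page` is a number from 560 to 1000 inclusive."""
    return PAGE_RANGE.fullmatch(page) is not None


if __name__ == "__main__":
    for candidate in ("559", "560", "1000", "1001"):
        print(candidate, in_range(candidate))
```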

Thanks and I'm hoping to hear from you soon.

Regards,
Peter
 