HTTrack Website Copier
Free software offline browser - FORUM
Subject: Message board "pagination" issues
Author: Michael
Date: 02/12/2010 01:03
 
When mirroring a multi-page thread, the page number on the live site sometimes
does not match the page number of the HTTrack output file, which I set with:

-N "%h%p/%n%[page:-:::].%t"

I suspect this has something to do with the way page links are displayed. On
shorter threads you see every page number (1, 2, 3, 4, 5), but on longer
threads you see something like (1, 2, 3, ..., 31, 32, 33).

If it does mis-number them, the first three and last three are usually
correct.

Also, in some other cases, an additional page will appear, like this:

page.html
page-.html
page-1.html
page-2.html
...
page-33.html
page-34.html

In reality, only 33 pages exist.  I'm not concerned with the files without
the numbers, as they are duplicates of the first page and can easily be dealt
with.  I cannot, however, think of a way to determine whether or not the
final file is a duplicate of another page, especially since the extra file
may not correspond to the last physical page (it could be somewhere in the
middle).
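
One possible way to spot the extra file is to compare the mirrored pages by
content hash.  This is only a rough sketch: the function name
find_duplicate_pages and the "*.html" pattern are my own choices, not anything
HTTrack provides, and it assumes the duplicate pages are byte-for-byte
identical.  Pages that differ only by a timestamp or session ID would not be
caught this way.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_pages(mirror_dir, pattern="*.html"):
    """Return groups of files under mirror_dir whose bytes are identical.

    Hypothetical helper: each group is a list of paths (relative to
    mirror_dir) that share the same SHA-256 digest.
    """
    root = Path(mirror_dir)
    by_digest = defaultdict(list)
    for path in sorted(root.rglob(pattern)):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            by_digest[digest].append(path.relative_to(root).as_posix())
    # Keep only digests shared by two or more files.
    return [names for names in by_digest.values() if len(names) > 1]
```

Calling find_duplicate_pages on the mirror folder would list each set of
identical files, so the unnumbered copies of page 1 and any numbered
duplicate should show up in the same pass.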

And finally, on some *really* big threads, not all of the pages are pulled. 
For instance, on a 67 page thread, I got pages 1 through 15, and 52 through
67.
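
To check which pages are missing after a run, something like the following
could work.  It is a minimal sketch that assumes the output files follow the
page-N.html naming shown in the listing above and that the real page count is
known from the live thread.  The names missing_pages and last_page are mine.

```python
import re

# Matches the trailing "-<number>.html" part of names like "page-33.html".
PAGE_FILE = re.compile(r"-(\d+)\.html$")


def missing_pages(filenames, last_page):
    """Return the page numbers in 1..last_page with no matching file name."""
    found = set()
    for name in filenames:
        match = PAGE_FILE.search(name)
        if match:
            found.add(int(match.group(1)))
    return [n for n in range(1, last_page + 1) if n not in found]
```

Feeding it the file names from the mirror folder and the real page count
gives the list of page numbers that still need to be fetched.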

*** Interesting side note (I'm learning as I'm posting):

Is it a coincidence that 15 pages on each side were pulled?  Furthermore,
pages 15, 16, 52, and 53 are "bad" in that they contain only an error message
from the server.

See my previous separate post regarding this:

<http://forum.httrack.com/readmsg/23022/index.html>
 