On some attempts at mirroring a multi-page thread, the pages on the live site
do not match up with the HTTrack output pages, whose names are specified by:
-N "%h%p/%n%[page:-:::].%t"
I suspect this has something to do with the way page links are displayed. On
shorter threads you see the full list (1, 2, 3, 4, 5), but on longer threads
you see something like (1, 2, 3, ..., 31, 32, 33).
When pages are mis-numbered, the first three and last three are usually
correct.
Also, in some other cases, an additional page appears, like this:
page.html
page-.html
page-1.html
page-2.html
...
page-33.html
page-34.html
In reality, only 33 pages exist. I'm not concerned with the files without
numbers, as they are duplicates of the first page and can easily be dealt
with. I cannot, however, think of a way to determine whether or not the final
file is a duplicate of another page, especially considering that this file
may not match up with the last physical page (it could be somewhere in the
middle).
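One possible way to check for duplicates after the fact is to compare file
contents rather than names. The following is only a rough sketch in Python,
not anything HTTrack provides. It assumes the output files sit in one
directory and match a "page*.html" pattern (my example names, not necessarily
yours), and that HTML comments are the only thing that differs between two
copies of the same page, so it strips them before hashing. If the forum
embeds timestamps or session IDs in the page body, this will miss duplicates.

```python
import hashlib
import re
from collections import defaultdict
from pathlib import Path

# Assumption: HTML comments (for example a mirror footer carrying a URL and
# date) are the only per-file noise, so they are removed before hashing.
COMMENT_RE = re.compile(rb"<!--.*?-->", re.DOTALL)


def find_duplicate_pages(directory, pattern="page*.html"):
    """Return groups of file names whose content matches once comments are stripped.

    `pattern` is a hypothetical naming scheme; adjust it to the real output.
    """
    groups = defaultdict(list)
    for path in sorted(Path(directory).glob(pattern)):
        body = COMMENT_RE.sub(b"", path.read_bytes())
        digest = hashlib.sha256(body).hexdigest()
        groups[digest].append(path.name)
    # Only groups with two or more files are duplicates of each other.
    return [names for names in groups.values() if len(names) > 1]
```

Calling find_duplicate_pages on the mirror directory returns a list of groups,
each group being the file names that share the same content. A group
containing page-34.html and some other numbered page would show which real
page the extra file duplicates.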
And finally, on some *really* big threads, not all of the pages are pulled.
For instance, on a 67 page thread, I got pages 1 through 15, and 52 through
67.
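To see exactly which pages were skipped, the numbered files can be compared
against the expected range. Again this is just a sketch under assumptions:
the "page-N.html" naming is taken from the listing above, and the helper name
and the expected last page number are things you would supply yourself.

```python
import re
from pathlib import Path

# Assumed naming scheme: page-<number>.html, as in the listing above.
PAGE_RE = re.compile(r"^page-(\d+)\.html$")


def missing_pages(directory, expected_last):
    """Return the page numbers in 1..expected_last that have no output file."""
    found = set()
    for path in Path(directory).iterdir():
        match = PAGE_RE.match(path.name)
        if match:
            found.add(int(match.group(1)))
    return [n for n in range(1, expected_last + 1) if n not in found]
```

For the 67 page thread described above, missing_pages(directory, 67) would
list every page number between the two blocks that were pulled.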
*** Interesting side note (I'm learning as I'm posting):
Is it a coincidence that 15 pages on each side were pulled? Further, pages
15, 16, 52, and 53 are "bad" in that they only contain an error message from
the server.
See my previous separate post regarding this:
<http://forum.httrack.com/readmsg/23022/index.html>