I see! I'm surprised that it "recycled" the same post request, but I suppose
that makes sense. Sorry for the misunderstanding.
>1) Always post the actual command line used (or log file line two) so we know
what you did, not what you think you did. no command, no flags posted.
>Don't alter the domain, no mind readers here - can't see the robots.txt or
the actual page.
No. This is a CS student's private, half-assed, disregarded toy board, running
a hopelessly obsolete and insecure forum program, from which I'm trying to
archive my old posts before it randomly vanishes someday. I'm not going to
trigger its destruction by pointing the spambots, or a crowd of well-meaning
troubleshooting strangers, at it to establish accounts so they can see the
index page. Httrack logs in and grabs the index page, so I think I Just Might
have the domain correct, and here is the entirety of robots.txt:
User-agent: *
Disallow: /
>>The server's robots.txt locks off addresses beginning with / . I have tried
the ignore robots option with both my current and starting filters.
The command line I tried 5 minutes ago, fixed from my misconception about
character escapes and not posted at midnight:
winhttrack -qwC2%Pns2u1j0%s%uN0%I0p3DaK0c1H0%kf2A20000%c1%f#f -F "Mozilla/4.5
(compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by
HTTrack Website Copier/3.x [XR&CO'2007], %s -->" -%l "en, en, *"
-O1 E:\httrack\Oldboard -* +*.png +*.gif +*.jpg +*.css +*.js
-ad.doubleclick.net/* -mime:application/foobar +*board=1.* +*topic=*
>Are you sure the url is ..../index.php/?board=... and not the normal
It is not. That's from me somehow being misinformed that the / was the escape
character in httrack's filter parsing engine, and seeing whether ?s needed to
be escaped:
>>That's just my latest line with paranoid escapes on the question marks - I
started with -* +*board=1.* +*topic=*
>>The board uses...URLs of form .../index.php?board=1.0, 1.25, 1.50 etc.
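For what it's worth, here is a rough Python sketch of how I understand the starting filters (-* +*board=1.* +*topic=*) to apply to those URLs. It is not HTTrack's actual matcher: it assumes only * is a wildcard, that ? is an ordinary character needing no escape, and that a later rule overrides an earlier one. The host example.invalid and the action=profile URL are made up for illustration.

```python
import re

def wildcard_to_regex(pattern):
    # Assumption: "*" matches any run of characters; everything else,
    # including "?", is treated as a literal character.
    body = "".join(".*" if ch == "*" else re.escape(ch) for ch in pattern)
    return re.compile(body + r"\Z")

def is_allowed(url, filters):
    # Assumption: rules are read left to right and the last matching
    # rule decides, so "-*" first and the "+" rules after it.
    allowed = False
    for rule in filters:
        sign, pattern = rule[0], rule[1:]
        if wildcard_to_regex(pattern).match(url):
            allowed = (sign == "+")
    return allowed

FILTERS = ["-*", "+*board=1.*", "+*topic=*"]

if __name__ == "__main__":
    # Hypothetical URLs in the same shape as the board's.
    for url in [
        "example.invalid/index.php?board=1.25",
        "example.invalid/index.php?board=2.0",
        "example.invalid/index.php?topic=42.0",
        "example.invalid/index.php?action=profile",
    ]:
        print(url, is_allowed(url, FILTERS))
```

If that model is right, the unescaped filters already pick up board 1 and every topic link, and the extra / before the ? buys nothing.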