HTTrack Website Copier
Free software offline browser - FORUM
Subject: Bug: Escaped newline in URL
Author: Joao Luzio
Date: 09/28/2005 15:40
 
I've seen a page that has some escaped newlines and tabs in its URLs. HTTrack
doesn't handle them well.

AFAIK, browsers ignore those extra characters.
HTTrack seems to read the URL with the newlines included (I guess), so the
links return 404 "not found".
In the cache the newline is written out as well, so an entry that should be
one line spans several (which messed up my parser :/).
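
To illustrate the behaviour I'd expect, here is a minimal sketch in Python (not
HTTrack's actual code). It assumes the tabs and newlines show up either raw or
as HTML character references (e.g. &#10;) inside the href value, and it follows
the WHATWG URL parsing rule that tabs and newlines are removed from the input.
The name clean_href is made up for this example.

```python
import html

# Tab, line feed and carriage return are dropped from URL input by browsers.
_IGNORED = {ord(c): None for c in "\t\n\r"}


def clean_href(raw_href: str) -> str:
    """Return the href roughly as a browser would see it before resolving it."""
    # Turn character references such as &#10; or &#9; into real characters.
    decoded = html.unescape(raw_href)
    # Trim surrounding whitespace, then remove tabs/newlines anywhere inside.
    return decoded.strip(" \t\n\r\f").translate(_IGNORED)


if __name__ == "__main__":
    # Hypothetical link, only for demonstration.
    print(clean_href("http://example.com/some/\n\tpath/?a=1"))
```

With cleanup like this applied before the link is requested and before it is
written to the cache, each cache entry would stay on a single line.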

The test URL for it is:
<http://www.cm-chamusca.pt/chamusca/concelho/informacaogeografica/?wbcmode=presentationunpublished>
A mirror with depth 2 reproduces the problem.

Greetings,
    Joao Luzio
 