HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: problem in new.txt url
Author: Xavier Roche
Date: 05/29/2005 17:06
 
> The results for some URLs in the cache come out like this:
> www.bn.ptservicos-ao-publico/sp-emprestimo-interbibliotecas.html
> The correct one should be:
> www.bn.pt/servicos-ao-publico/sp-emprestimo-interbibliotecas.html
> This happens with this site. I think, but I'm not sure, that it is because
> of the link on the www.bn.pt site that points to this page:
> 
> In www.bn.pt the link for
> servicos-ao-publico/sp-emprestimo-interbibliotecas.html is this:
> <a href="../servicos-ao-publico/sp-emprestimo-interbibliotecas.html">

The problem is a malformed URL (a "../" link in a top-level page, which has no
parent directory to go up to) that httrack does not handle correctly.

The bug is located in htslib.c, function fil_simplifie(), around line 2360:

..
      if (rollid > 1) {
        rollid--;
        b = rollback[rollid - 1];
      } else {
        rollid = 0;
>>>     b = f;
      }

This should be:

      if (rollid > 1) {
        rollid--;
        b = rollback[rollid - 1];
      } else {
        rollid = 0;
>>>     b = f /* after the / */ + 1;
      }
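
To illustrate why the one-character offset matters, here is a minimal
standalone sketch. It is not the actual fil_simplifie() code:
simplify_path(), keep_root_slash and MAX_SEGMENTS are hypothetical names
made up for this example, and it only collapses ".." segments of a path
that starts with "/".

```c
#include <assert.h>
#include <string.h>

#define MAX_SEGMENTS 128  /* hypothetical limit for this sketch */

/*
 * Hypothetical sketch, NOT httrack's fil_simplifie().
 * Collapses ".." segments of an absolute path in place.
 * rollback[] remembers where each written segment starts, so that ".."
 * can rewind the write cursor b to the start of the previous segment.
 *   keep_root_slash = 1 mimics the fix          (b = f + 1)
 *   keep_root_slash = 0 mimics the old behavior (b = f)
 * Returns 0 on success, -1 if the path has too many segments (the
 * buffer is then left in an unspecified state).
 */
static int simplify_path(char *f, int keep_root_slash) {
  char *rollback[MAX_SEGMENTS];
  int rollid = 0;
  char *a, *b;

  assert(f != NULL && f[0] == '/');
  a = f + 1;  /* read cursor, just after the leading '/' */
  b = f + 1;  /* write cursor, never ahead of the read cursor */

  while (*a != '\0') {
    char *end = strchr(a, '/');
    int has_slash = (end != NULL);
    size_t len;

    if (end == NULL)
      end = a + strlen(a);
    len = (size_t) (end - a);

    if (len == 2 && a[0] == '.' && a[1] == '.') {
      if (rollid > 0) {
        /* drop the previously written segment */
        b = rollback[--rollid];
      } else {
        /* nothing left to drop: already at the top level */
        b = keep_root_slash ? f + 1 /* after the / */ : f;
      }
    } else {
      if (rollid == MAX_SEGMENTS)
        return -1;
      rollback[rollid++] = b;
      memmove(b, a, len);  /* regions may overlap */
      b += len;
      if (has_slash)
        *b++ = '/';
    }
    a = has_slash ? end + 1 : end;
  }
  *b = '\0';
  return 0;
}
```

Called on "/../servicos-ao-publico/sp-emprestimo-interbibliotecas.html" with
keep_root_slash set to 1, the sketch leaves
"/servicos-ao-publico/sp-emprestimo-interbibliotecas.html". With 0, the
leading slash is overwritten and the result is
"servicos-ao-publico/sp-emprestimo-interbibliotecas.html", which, once
appended to the host name, gives the kind of URL quoted in the report.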

I'll merge this fix in the next alpha-release.

Thanks for the bug report, by the way!
 