HTTrack Website Copier
Free software offline browser - FORUM
Subject: ibiblio.org URL mis-written
Author: Haudy Kazemi
Date: 03/03/2003 00:46
 
Hello

After crawling a project with WinHTTrack using this list of 
URLs:
<http://www.ibiblio.org/obp/> 
<http://www.ibiblio.org/obp/books/> 
<http://www.ibiblio.org/obp/electricCircuits/> 

I found that many pages under the following URL path were 
skipped or mis-written:
www.ibiblio.org/obp/thinkCS/

Missed were:
<http://www.ibiblio.org/obp/thinkCS/java.html>
<http://www.ibiblio.org/obp/thinkCS/cpp.html>

Miswritten was:
<http://www.ibiblio.org/obp/thinkCS/python.html>
(I could browse to these files manually in my filesystem, 
but could not reach them through the normal HTML link 
structure.)

Also, this page (which is the intro page to most of the 
others listed above) is missing its background colors:
<http://www.ibiblio.org/obp/thinkCS/index.html>

Here is a relevant clip from hts-log.txt. It shows some 
files being written correctly (dsl.html) and others 
incorrectly (java.html and others):

06:51:49	Info: 	engine: transfer-status: link recorded: www.ibiblio.org/obp/electricCircuits/Devel/dsl.html -> I:/web-archive problematic/www.ibiblio.org obp 20030302/www.ibiblio.org/obp/electricCircuits/Devel/dsl.html

06:51:49	Debug: 	File checked by cache: www.ibiblio.org

06:51:49	Info: 	engine: save-name: local name: www.ibiblio.org/obp/images/nopic.png -> www.ibiblio.org/obp/images/nopic.png

06:51:49	Info: 	engine: save-name: local name: www.ibiblio.org/obp/images/wilson.jpg -> www.ibiblio.org/obp/images/wilson.jpg

06:51:50	Info: 	engine: check-html: www.ibiblio.org/obp/thinkCS/

06:51:50	Info: 	engine: save-name: local name: www.ibiblio.org/obp/thinkCS/www.ibiblio.org/obp/thinkCS/main.css -> www.ibiblio.org/obp/thinkCS/www.ibiblio.org/obp/thinkCS/main.css

06:51:50	Info: 	engine: save-name: local name: www.ibiblio.org/obp/images/howtothink.gif -> www.ibiblio.org/obp/images/howtothink.gif

06:51:50	Info: 	engine: save-name: local name: www.ibiblio.org/obp/thinkCS/www.ibiblio.org/obp/thinkCS/python.html -> www.ibiblio.org/obp/thinkCS/www.ibiblio.org/obp/thinkCS/python.html

06:51:50	Info: 	engine: save-name: local name: www.ibiblio.org/obp/thinkCS/www.ibiblio.org/obp/thinkCS/java.html -> www.ibiblio.org/obp/thinkCS/www.ibiblio.org/obp/thinkCS/java.html

06:51:50	Info: 	engine: save-name: local name: www.ibiblio.org/obp/thinkCS/www.ibiblio.org/obp/thinkCS/cpp.html -> www.ibiblio.org/obp/thinkCS/www.ibiblio.org/obp/thinkCS/cpp.html
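
In the bad entries the host and directory path appear twice 
in a row in the local name. In case it helps anyone check 
their own log for the same symptom, here is a rough Python 
sketch. It is not part of HTTrack. It assumes each log 
entry sits on one line in the real hts-log.txt (the forum 
wrapped the lines above) and that the "save-name: local 
name: A -> B" format is as shown in the clip. The function 
names and the file name in the usage comment are my own 
choices for illustration.

```python
import re

# Pattern for "save-name" entries, based only on the log clip above.
SAVE_NAME = re.compile(r"save-name: local name:\s*(\S+)\s*->\s*(\S+)")


def doubled_prefix(path):
    """Return True if a leading run of path segments repeats immediately.

    Example: host/obp/thinkCS/host/obp/thinkCS/java.html
    """
    parts = path.split("/")
    # Try every prefix length n for which the prefix could repeat right after itself.
    for n in range(1, len(parts) // 2 + 1):
        if parts[:n] == parts[n:2 * n]:
            return True
    return False


def find_doubled(log_lines):
    """Yield (source, local) for save-name entries whose local name is doubled."""
    for line in log_lines:
        match = SAVE_NAME.search(line)
        if match and doubled_prefix(match.group(2)):
            yield match.group(1), match.group(2)


# Usage (hypothetical file location):
# with open("hts-log.txt", encoding="utf-8", errors="replace") as f:
#     for source, local in find_doubled(f):
#         print(local)
```

This is only a heuristic: it flags any local name whose 
first segments repeat back to back, so a site that really 
has such a directory layout would be flagged too.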

The full new.txt and hts-log.txt files are here:
<http://www.kazemizadeh.net/httrack/ibiblio.org>
 