HTTrack Website Copier
Free software offline browser - FORUM
Subject: Forcing HTTrack to rename some files
Author: wunderwaffe
Date: 10/24/2015 17:16
 
Hello, I'm trying to mirror a website and am having some difficulties. I have
some questions that I hope can be answered. First, let me describe the
situation I have to deal with:

The website contains a lot of pictures, which I'm trying to mirror. The site is
updated daily, which is the reason I chose to store ALL files in the cache, in
case they are updated. However, the website has some problems of its own: it
sometimes SCRAMBLES a user's request, e.g. if I click a link to picture1.jpg,
it returns picture2.jpg or link1.htm. I found out about this problem while
using two Firefox profiles.

When HTTrack mirrors the pages, the site sometimes returns a wrong image or a
link, and this is logged as an error or a warning when the mirrored file has
the wrong size, e.g. 25 KB downloaded, 39 KB expected. When this happens, I
usually delete the wrong files from the cache and UPDATE the mirror again
until the site returns the right ones.

This WORKS fine until one of the links contains accented characters, e.g. â, é
and so on. HTTrack downloads files with accented characters just fine (they
are renamed to another name with special characters), unless the site returns
a wrong image or link, which requires me to delete the file from the cache.
Deleting these files from the cache causes the cache to become corrupt,
forcing HTTrack to create a new cache. The cache is big (it's currently about
400 MB, and I expect it might grow beyond 5 GB by the time I'm done, as I
mirror one link at a time). I tried putting the encoded name into the scan
rule, e.g. +Art%C3%A9ga instead of +Artéga, but it seems to have the same
effect. So my questions are:


1. Can I force HTTrack to rename files with accented characters to a name of
my choosing, and save them in the cache under the new name? E.g. Artéga.jpg
to Artega.jpg.

2. Then, when I want to update the website, will HTTrack take the renamed
files from the cache instead of downloading and renaming new files every time
the site is updated?

3. Can HTTrack update the files from a corrupted cache? (In case questions 1
and 2 don't have a solution.)
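
To make the two name forms above concrete, here is a minimal Python sketch of
the transformations involved: the UTF-8 percent-encoding that turns Artéga
into Art%C3%A9ga, and the accent stripping that would turn Artéga.jpg into
Artega.jpg. This is a standalone illustration, not an HTTrack option or API;
the function names percent_encode and ascii_fold are made up for the example,
and whether HTTrack can apply such a rename itself is exactly what question 1
asks.

```python
import unicodedata
from urllib.parse import quote, unquote


def percent_encode(name: str) -> str:
    """Percent-encode a name as UTF-8, as it would appear inside a URL."""
    return quote(name, safe="")


def ascii_fold(name: str) -> str:
    """Strip accents: decompose characters (NFKD), then drop combining marks."""
    decomposed = unicodedata.normalize("NFKD", name)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "\u00e9" is the precomposed letter e-acute, written as an escape so the
# example does not depend on how this file itself is encoded.
print(percent_encode("Art\u00e9ga"))    # Art%C3%A9ga
print(unquote("Art%C3%A9ga"))           # back to the accented name
print(ascii_fold("Art\u00e9ga.jpg"))    # Artega.jpg
```

Note that ascii_fold only handles letters that decompose into a base letter
plus an accent (such as â or é); characters without such a decomposition are
left unchanged.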

I really hope someone can answer my questions.
 