HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Use the renamed file in the html files
Author: Dash
Date: 04/25/2025 21:14
 
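I used this script to fix a downloaded site. It reads the URL-to-local-file mapping from `hts-cache/new.txt` and replaces the original URLs in the mirrored files with the local paths. Run it from the mirror's root folder (the one that contains `hts-cache`):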

```python
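import csv
import os

# Columns of hts-cache/new.txt, counted from the end of each row.
url_idx = -3    # original URL
local_idx = -2  # local file HTTrack saved it as

# Build a map: original URL -> local file.
mapping = {}
with open('./hts-cache/new.txt', newline='') as csvfile:
    for idx, d in enumerate(csv.reader(csvfile, delimiter='\t')):
        if idx == 0 or len(d) < 3:
            continue  # skip the header row and short/blank rows
        mapping[d[url_idx]] = d[local_idx]

for root, _, files in os.walk("."):
    for fname in files:
        # Only rewrite text files that can contain links.
        if os.path.splitext(fname)[1] not in (".css", ".html", ".js", ".readme"):
            continue
        # Site-specific filter: only touch files whose name contains "en-us".
        if "en-us" not in fname:
            continue
        fullpath = os.path.join(root, fname)
        # Leave HTTrack's own cache and log files alone.
        if "/hts-cache/" in fullpath or "hts-log" in fullpath:
            continue
        with open(fullpath, 'r') as fp:
            filedata = fp.read()
        replaced = False
        for url, loc in mapping.items():
            quoted = f'"{url}"'
            # Only double-quoted occurrences are replaced, so test for those.
            if quoted not in filedata:
                continue
            # Link relative to the directory of the file being rewritten,
            # since that is how a browser resolves relative links.
            relpath = "./" + os.path.relpath(loc, root)
            filedata = filedata.replace(quoted, f'"{relpath}"')
            print(f"{fullpath=} {url=} {relpath=}")
            replaced = True
        if not replaced:
            continue
        with open(fullpath, 'w') as fp:
            fp.write(filedata)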
```
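
Since the script overwrites files in place, it may be worth trying the substitution on a string first. Below is a minimal sketch of just the replacement step as a pure function. The name `rewrite_links`, the `page_dir` parameter, and the sample URL and file names are all made up for illustration; they do not come from HTTrack or from a real `new.txt`. Each path is resolved against the directory of the page being rewritten.

```python
import os


def rewrite_links(text, mapping, page_dir):
    """Swap double-quoted URLs in text for local paths.

    mapping maps original URL -> local file (as loaded from new.txt).
    page_dir is the directory of the page the text came from.
    Returns the new text and a list of (url, relative_path) replacements.
    """
    hits = []
    for url, loc in mapping.items():
        quoted = f'"{url}"'
        if quoted not in text:
            continue
        # Use forward slashes so the result is a valid link on any OS.
        rel = "./" + os.path.relpath(loc, page_dir).replace(os.sep, "/")
        text = text.replace(quoted, f'"{rel}"')
        hits.append((url, rel))
    return text, hits


# Hypothetical sample data, not taken from a real mirror.
mapping = {"https://example.com/img/logo.png": "example.com/img/logo3f2a.png"}
page = '<img src="https://example.com/img/logo.png">'
new_page, hits = rewrite_links(page, mapping, "example.com/docs")
print(new_page)
```

If the printed result looks right for a few pages, the same function could be called from the `os.walk` loop with `page_dir` set to `root`.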
 