We are attempting to turn our site into a static copy to host externally. In
testing we ran into a problem where HTTrack sometimes rewrites a file's
extension to .html instead of keeping its native type.
So we will get
<link type="text/css" rel="stylesheet" href="modules/user/user.html?e">
instead of
<link type="text/css" rel="stylesheet" href="modules/user/user.css?e">
Not sure why it is replacing the type, or why it does so only on some runs.
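To see which runs are affected, one option is to scan the finished mirror for stylesheet links that ended up pointing at .html files. The following is only a rough sketch: the function name and its directory argument are made up for illustration, and it assumes a grep that supports -r and -o (GNU or BSD grep) and double-quoted attributes as in the example above.

```shell
#!/bin/sh
# Sketch: list stylesheet <link> tags in a mirror whose href ends in .html,
# optionally followed by a query string such as ?e.
# find_rewritten_stylesheets and its argument are hypothetical names;
# pass the folder that was given to httrack -O.
find_rewritten_stylesheets() {
    mirror_dir=$1
    # 1) print every <link ...> tag, prefixed with the file it came from
    # 2) keep only the stylesheet links
    # 3) keep only those whose href ends in .html (plus optional ?query)
    grep -rioE '<link[^>]*>' "$mirror_dir" \
        | grep -i 'rel="stylesheet"' \
        | grep -iE 'href="[^"]*\.html(\?[^"]*)?"'
}
```

Running find_rewritten_stylesheets "<folder>" after each crawl prints one line per affected tag and nothing when the run came out clean, which makes it easier to compare a good run with a bad one.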
HTTrack version 3.48-22
httrack -%F "" "<site>" -O "<folder>" -s0 +*.css +*.js <fileTypes>
-www.google-analytics.com/* -*.com/* -*/* +www.<site>.com/* -c32 -N
"%h%p/%n.%t" --disable-security-limits --max-rate 999999 -q --ext-depth=0
#-%F Remove Footer with timestamp
#-O Specify output location
#-s0 ignore robots.txt
#+*.css +*.js Include these file types and map to new location
#-www.google Don't download data from site - keep as a link to that
#-*.com -*/* In fact don't download from any external sites.
#+www.a-p.com/* now that other sites are excluded, add back our site.
#-c32 Max number of connections / simultaneous downloads
#-N "%h%p/%n.%t" Specify format of file
# %h%p host and path
# %n.%t file name, dot, file extension. %N removes the dot
# (aka indexhtml), so don't use it.
#--disable-security-limits Ignore max connection speed in httrack
#--max-rate Specify max connection speed.
#-q quiet mode, no questions. needed to run as a script.
#--ext-depth It's supposed to prevent crawls of external sites.
# Doesn't work by itself.
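For readability, the same invocation can be laid out as a small shell function with each option next to its note. This is only a sketch under a few assumptions: mirror_site and its three parameters are hypothetical stand-ins for the <site>, +www.<site>.com/* and <folder> placeholders, and the extra <fileTypes> filters are left out because they are not shown here. The filter patterns are quoted so the shell passes them to httrack literally instead of expanding them against files in the current directory.

```shell
#!/bin/sh
# Sketch: the httrack command from this post, one option group per line.
# mirror_site, site_url, site_filter and out_dir are hypothetical names.
mirror_site() {
    site_url=$1      # start URL, "<site>" in the original command
    site_filter=$2   # e.g. '+www.<site>.com/*'
    out_dir=$3       # "<folder>" in the original command

    set -- -%F ""                            # empty footer string (no timestamp)
    set -- "$@" "$site_url" -O "$out_dir"    # start URL and output location
    set -- "$@" -s0                          # never follow robots.txt
    set -- "$@" '+*.css' '+*.js'             # include these file types
    set -- "$@" '-www.google-analytics.com/*' '-*.com/*' '-*/*'   # exclude external sites
    set -- "$@" "$site_filter"               # add our own site back in
    set -- "$@" -c32                         # simultaneous connections
    set -- "$@" -N '%h%p/%n.%t'              # host + path / name . type
    set -- "$@" --disable-security-limits --max-rate 999999
    set -- "$@" -q --ext-depth=0             # no questions; external depth 0

    httrack "$@"
}
```

It would be called as mirror_site "<site>" '+www.<site>.com/*' "<folder>". Building the argument list with set -- keeps the function POSIX sh compatible and avoids long backslash-continued lines.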