I am trying to download a few files from a staging area at Apache. This area is
organised as a hierarchy of folders, and I want only part of it, so I point
directly to the required folder and pass the -D option to stay below this
level. However, httrack still goes upwards and attempts to download a lot more
files if I don't put the final / at the end of the URL. Here is the example:
httrack https://dist.apache.org/repos/dist/dev/commons/dbcp -O /tmp/dist -D
Mirror launched on Sat, 17 May 2014 11:53:18 by HTTrack Website
Copier/3.48-5+libhtsjava.so.2 [XR&CO'2014]
mirroring https://dist.apache.org/repos/dist/dev/commons/dbcp with the wizard
help..
[snip]
In the end I get not only the dbcp folder, but also all its siblings. Nothing
is fetched from two levels above, though, only entries at the same level as
the initial URL.
(lehrin) luc% ls -l /tmp/dist/dist.apache.org/repos/dist/dev/commons/
total 4
drwxr-xr-x 4 luc luc 100 mai 17 11:53 beanutils
drwxr-xr-x 4 luc luc 120 mai 17 11:53 codec
drwxr-xr-x 4 luc luc 100 mai 17 11:53 collections
drwxr-xr-x 4 luc luc 100 mai 17 11:53 compress
drwxr-xr-x 4 luc luc 100 mai 17 11:53 configuration
drwxr-xr-x 4 luc luc 120 mai 17 11:53 dbcp
drwxr-xr-x 2 luc luc 60 mai 17 11:53 dev_plugins
drwxr-xr-x 4 luc luc 120 mai 17 11:53 email
drwxr-xr-x 4 luc luc 100 mai 17 11:53 exec
drwxr-xr-x 4 luc luc 100 mai 17 11:53 fileupload
drwxr-xr-x 4 luc luc 100 mai 17 11:53 imaging
-rw-r--r-- 1 luc luc 1762 mai 17 00:59 index.html
drwxr-xr-x 4 luc luc 100 mai 17 11:53 lang
drwxr-xr-x 4 luc luc 120 mai 17 11:53 logging
drwxr-xr-x 4 luc luc 100 mai 17 11:53 math
drwxr-xr-x 3 luc luc 80 mai 17 11:53 plugins-test
drwxr-xr-x 4 luc luc 100 mai 17 11:53 pool
drwxr-xr-x 4 luc luc 120 mai 17 11:53 proxy
drwxr-xr-x 4 luc luc 100 mai 17 11:53 TEST_PLS_IGNORE
drwxr-xr-x 4 luc luc 100 mai 17 11:53 weaver
(lehrin) luc%
If I do use a / as the last character, it works OK:
httrack https://dist.apache.org/repos/dist/dev/commons/dbcp/ -O /tmp/dist -D
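
One plausible reading of the difference (an assumption on my part, not something checked against the HTTrack sources) is that the "level" to stay below is taken as everything up to the last / in the URL path, the same way relative links are resolved. Under that assumption, a URL without the final / names a file called dbcp inside commons/, so "stay below" means "stay below commons/". A minimal Python sketch of that rule, where base_directory is a hypothetical helper written only for illustration:

```python
from urllib.parse import urlsplit


def base_directory(url: str) -> str:
    """Return the directory part of a URL path (hypothetical helper).

    Everything up to and including the last "/" counts as the directory;
    a final segment without a trailing "/" is treated as a file name.
    """
    path = urlsplit(url).path
    return path[: path.rfind("/") + 1]


# Without the final "/", the directory is the parent folder (commons/).
print(base_directory("https://dist.apache.org/repos/dist/dev/commons/dbcp"))
# With the final "/", the directory is the dbcp folder itself.
print(base_directory("https://dist.apache.org/repos/dist/dev/commons/dbcp/"))
```

If that is indeed the rule being applied, it would match what I see: siblings of dbcp are fetched, but nothing from higher up.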
I suggest this case should either be explained in the man page, or (better for
me) a specific check should be added. When I end a URL with
some/path/with/folder, I clearly don't want
some/path/with/thousands-of-other-sibling-folders, even if I don't put a
final /.
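
Until something like that exists, a workaround on the user side could be to normalise the URL before handing it to httrack. The sketch below is only an illustration: ensure_trailing_slash is a hypothetical helper, and it is only appropriate when you already know the URL points to a folder, since it would also add a / after a real file name.

```python
from urllib.parse import urlsplit, urlunsplit


def ensure_trailing_slash(url: str) -> str:
    """Append "/" to the URL path if it is missing (hypothetical helper).

    Only use this for URLs known to be folders; the query string and
    fragment, if any, are left untouched.
    """
    parts = urlsplit(url)
    path = parts.path
    if not path.endswith("/"):
        path += "/"
    return urlunsplit(parts._replace(path=path))


# The result can then be passed to httrack in place of the original URL.
print(ensure_trailing_slash("https://dist.apache.org/repos/dist/dev/commons/dbcp"))
```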