I'm attempting to crawl the product pages of the Massimo Dutti website. I'm
starting from a catalog page that contains a list of the products.
The problem is that httrack doesn't follow any of the product pages that match
the filter I declared. I also told httrack that it can go both up and down the
directory structure by passing the -B option.
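As a rough illustration of what the two filters are intended to do, here is a small Python sketch. It is not HTTrack's implementation: it approximates the scan rules with `fnmatch` wildcards and assumes, as the HTTrack filter documentation describes, that a later rule takes priority over an earlier one. The function name `is_allowed` and the product URL are hypothetical; only the catalog URL comes from the command in the log.

```python
from fnmatch import fnmatchcase


def is_allowed(url, filters, default=False):
    """Approximate HTTrack-style scan rules: the last matching rule wins."""
    allowed = default
    for rule in filters:
        sign, pattern = rule[0], rule[1:]
        # "*" matches any run of characters, including "/"
        if fnmatchcase(url, pattern):
            allowed = (sign == "+")
    return allowed


# The two filters passed on the command line
FILTERS = ["-*", "+*product/duttifr/fr/*"]

# Start URL from the post
catalog = ("www.massimodutti.com/webapp/wcs/stores/servlet/category/"
           "duttifr/fr/massimoduttisales/256524/New")
# Hypothetical product URL, made up for illustration only
product = ("www.massimodutti.com/webapp/wcs/stores/servlet/product/"
           "duttifr/fr/example/123")

print(is_allowed(catalog, FILTERS))
print(is_allowed(product, FILTERS))
```

Under these assumptions, a link containing `product/duttifr/fr/` is accepted while everything else is excluded, which is the behaviour expected from the crawl. Whether the catalog page actually exposes such links in its HTML is a separate question that this sketch does not address.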
Below is the log file.
HTTrack3.46-nossl launched on Sun, 29 Jul 2012 13:23:48 at
<http://www.massimodutti.com/webapp/wcs/stores/servlet/category/duttifr/fr/massimoduttisales/256524/New>
-* +*product/duttifr/fr/*
(httrack
<http://www.massimodutti.com/webapp/wcs/stores/servlet/category/duttifr/fr/massimoduttisales/256524/New>
-O /Users/ecdiddy/websites/massimo2 -%v -B -* +*product/duttifr/fr/* )
Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive
information,
such as username/password authentication for websites mirrored in this
project
do not share these files/folders if you want these information to remain
private
No files purged
HTTrack Website Copier/3.46 mirror complete in 2 seconds : 2 links scanned, 1
files written (68858 bytes overall), no files updated [1224 bytes received at
612 bytes/sec], 81 bytes transfered using HTTP compression in 2 files, ratio
124%, 1.5 requests per connection
(No errors, 0 warnings, 0 messages)
note: the hts-log.txt file, and hts-cache folder, may contain sensitive
information,
such as username/password authentication for websites mirrored in this
project
do not share these files/folders if you want these information to remain
private
No files purged
HTTrack Website Copier/3.46 mirror complete in 2 seconds : 2 links scanned, 1
files written (84546 bytes overall), no files updated [646 bytes received at
323 bytes/sec], 81 bytes transfered using HTTP compression in 1 files, ratio
100%
(No errors, 0 warnings, 0 messages)