HTTrack Website Copier
Free software offline browser - FORUM
Subject: Website copy problem - www.officialcharts.com
Author: Rich71
Date: 08/03/2020 12:15
 
Hello,

I'm trying to copy chart listings from
<https://www.officialcharts.com/charts/singles-chart>

I particularly want the charts from the 1980s, whose URLs contain the chart
date in YYYYMMDD format, like these:

<https://www.officialcharts.com/charts/singles-chart/19800217/>
<https://www.officialcharts.com/charts/singles-chart/19800224/>
<https://www.officialcharts.com/charts/singles-chart/19800302>
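
For illustration, the weekly date pattern in these URLs could be generated directly rather than reached by crawling. This is only a minimal Python sketch. It assumes the 1980s charts are dated every Sunday (which the three examples above suggest but do not prove) and that the site accepts URLs built this way. The function name `weekly_chart_urls` is made up for this example.

```python
from datetime import date, timedelta

# Base path taken from the example URLs above.
BASE = "https://www.officialcharts.com/charts/singles-chart/"


def weekly_chart_urls(start: date, end: date) -> list[str]:
    """Return one chart URL per week from start to end (inclusive).

    Assumption: chart dates fall every 7 days, counted from `start`.
    """
    urls = []
    day = start
    while day <= end:
        # Format the date as YYYYMMDD, matching the URLs in the post.
        urls.append(f"{BASE}{day:%Y%m%d}/")
        day += timedelta(days=7)
    return urls


if __name__ == "__main__":
    # 6 January 1980 is the first Sunday of the decade.
    for url in weekly_chart_urls(date(1980, 1, 6), date(1989, 12, 31)):
        print(url)
```

The printed list could be saved one URL per line and handed to HTTrack as a URL list, so the mirror would not depend on following links back from 2020 to reach the 1980s pages.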

My first copy started to work and captured 2020, 2019, 2018 and so on, but
then stopped at 2016.  A second attempt got no further than 2020.

I've tried a scan rule to limit the copy to URLs that contain 198:

 (+*[name].https://www.officialcharts.com/charts/singles-chart/198/*
-*/*198*)
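
One way to reason about how an include rule and an exclude rule interact is to model them outside HTTrack. The Python sketch below is only an approximation: it uses `fnmatch` wildcards, not HTTrack's real filter engine, it assumes that the last matching rule wins, and the simplified patterns and the helper `allowed` are hypothetical stand-ins rather than the exact rule above.

```python
from fnmatch import fnmatchcase


def allowed(url, rules):
    """Apply +/- wildcard rules in order; the last matching rule decides.

    Returns True (included), False (excluded) or None (no rule matched).
    This approximates scan-rule behaviour; it is not HTTrack's own matcher.
    """
    verdict = None
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict


if __name__ == "__main__":
    page_1980 = "https://www.officialcharts.com/charts/singles-chart/19800217/"
    page_2020 = "https://www.officialcharts.com/charts/singles-chart/20200730/7501/"

    # Include rule first, broad exclude second: the exclude matches last.
    include_then_exclude = ["+*/charts/singles-chart/198*", "-*/*198*"]
    # Same two rules in the opposite order.
    exclude_then_include = ["-*/*198*", "+*/charts/singles-chart/198*"]

    print(allowed(page_1980, include_then_exclude))
    print(allowed(page_1980, exclude_then_include))
    print(allowed(page_2020, include_then_exclude))
```

Under these assumptions, a trailing exclude pattern containing 198 cancels the include for the very pages that are wanted, while the 2020 pages match neither rule and fall through to the default behaviour.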

Any ideas how to achieve this?  The error log is below.

Thanks,

Richard



HTTrack3.49-2+htsswf+htsjava launched on Sun, 02 Aug 2020 21:42:43 at
<https://www.officialcharts.com/charts/singles-chart/> +*.png +*.gif +*.jpg
+*.jpeg +*.css +*.js -ad.doubleclick.net/* -mime:application/foobar
+*[name].https://www.officialcharts.com/charts/singles-chart/198/* -*/*198*
(winhttrack -qir10C2%Ps2u1%s%uN0%I0p3DaK0H0%kf2A25000%f#f -F "Mozilla/4.5
(compatible; HTTrack 3.0x; Windows 98)" -%F "<!-- Mirrored from %s%s by
HTTrack Website Copier/3.x [XR&CO'2014], %s -->" -%l "en, *"
<https://www.officialcharts.com/charts/singles-chart/> -O1 "d:\My Web
Sites\charts" +*.png +*.gif +*.jpg +*.jpeg +*.css +*.js -ad.doubleclick.net/*
-mime:application/foobar
+*[name].https://www.officialcharts.com/charts/singles-chart/198/* -*/*198* )
Information, Warnings and Errors reported for this mirror:
note: the hts-log.txt file, and hts-cache folder, may contain sensitive
information,
 such as username/password authentication for websites mirrored in this
project
 do not share these files/folders if you want these information to remain
private
21:42:52 Warning:  Note: due to <https://www.officialcharts.com> remote
robots.txt rules, links beginning with these path will be forbidden:
/aspnet_client/, /bin/, /config/, /data/, /macroScripts/, /umbraco/,
/umbraco_client/, /usercontrols/, /xslt/, /views/ (see in the options to
disable this)
21:45:17 Error:  "Bad Request" (400) at link
<https://www.googletagmanager.com/gtm.js?id=> (from
<https://www.officialcharts.com/charts/singles-chart/>)
21:45:17 Warning:  Moved Permanently for
<https://www.officialcharts.com/charts/singles-chart/20200730/7501/>
21:45:17 Warning:  File has moved from
<https://www.officialcharts.com/charts/singles-chart/20200730/7501/> to
/charts/singles-chart/20200724/7501
21:45:27 Error:  "Not Found" (404) at link
<https://www.officialcharts.com/css/grabbing.png/> (from
<https://www.officialcharts.com/css/style.css>)
21:45:28 Error:  "Not Found" (404) at link
<https://www.officialcharts.com/css/ajaxloader.gif/> (from
<https://www.officialcharts.com/css/style.css>)
21:46:01 Warning:  Moved Permanently for
<https://www.officialcharts.com/charts/singles-chart/20200723/7501/>
21:46:01 Warning:  File has moved from
<https://www.officialcharts.com/charts/singles-chart/20200723/7501/> to
/charts/singles-chart/20200717/7501
21:46:02 Warning:  Found for
<https://www.officialcharts.com/charts/singles-chart/20200806/7501/>
21:46:02 Warning:  File has moved from
<https://www.officialcharts.com/charts/singles-chart/20200806/7501/> to
/charts/singles-chart/
21:46:59 Warning:  Moved Permanently for
<https://www.officialcharts.com/charts/singles-chart/20200716/7501/>
21:46:59 Warning:  File has moved from
<https://www.officialcharts.com/charts/singles-chart/20200716/7501/> to
/charts/singles-chart/20200710/7501
21:48:14 Warning:  Moved Permanently for
<https://www.officialcharts.com/charts/singles-chart/20200709/7501/>
21:48:14 Warning:  File has moved from
<https://www.officialcharts.com/charts/singles-chart/20200709/7501/> to
/charts/singles-chart/20200703/7501
21:48:21 Error:  "Not Found" (404) at link
<https://d35iaml2i6ojwd.cloudfront.net/img/small?url=https://m.media-amazon.com/images/I/51suGI+N0YL._SL75_.jpg>
(from <https://www.officialcharts.com/charts/singles-chart/20200710/7501/>)
21:50:48 Warning:  Moved Permanently for
<https://www.officialcharts.com/charts/singles-chart/20200702/7501/>
21:50:48 Warning:  File has moved from
<https://www.officialcharts.com/charts/singles-chart/20200702/7501/> to
/charts/singles-chart/20200626/7501
HTTrack Website Copier/3.49-2 mirror complete in 8 minutes 11 seconds : 231
links scanned, 219 files written (4221562 bytes overall), 3 files updated
[417332 bytes received at 849 bytes/sec], 3224 bytes transferred using HTTP
compression in 1 files, ratio 55%, 1.8 requests per connection
(4 errors, 13 warnings, 0 messages)


 