HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Can't scrape a WebSite
Author: Nijaz
Date: 08/04/2020 23:34
 
Forbidden (403) can mean that you are banned. To check, try opening the
website in a web browser now. If it loads, then you are not banned. If you
are banned, then you will need a new IP address, via some proxy or VPN, like
Windscribe or Webshare proxy.

If you are not banned, then maybe the website blocks only certain user
agents, not the IP. In that case, change the user agent in one of the tabs
in HTTrack's options. You can use the one from your web browser: in Firefox,
open the menu (press Alt if it is hidden), go to Help, then Troubleshooting
Information, and copy the User Agent string into HTTrack.

Or use this one, which works for me:
Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0
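
If you want to test this before changing anything in HTTrack, here is a rough Python sketch of the check described above. It is only an illustration, not something HTTrack itself does: the function names are made up, and Python's default user agent is only a stand-in for "some non-browser client". It requests the page once without and once with the browser user agent, then compares the two status codes.

```python
import urllib.error
import urllib.request

# User agent string quoted in the post above.
BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) "
              "Gecko/20100101 Firefox/78.0")


def status_for(url, user_agent=None, timeout=10):
    """Return the HTTP status code for a GET request to url.

    If user_agent is None, urllib sends its own default user agent.
    """
    headers = {"User-Agent": user_agent} if user_agent else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; the code is on the error.
        return err.code


def classify(default_status, browser_status):
    """Rough heuristic (an assumption, not a rule) for the kind of block."""
    if default_status != 403:
        return "not blocked"
    if browser_status != 403:
        return "user agent blocked"
    return "ip or other block"
```

You would call it as classify(status_for(url), status_for(url, BROWSER_UA)) with your own url. If the result is "user agent blocked", changing the user agent in HTTrack is likely to help. If it is "ip or other block", a different user agent alone probably will not be enough.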

To include videos (.mp4 files), just add this line to the scan rules:
+*.mp4

And I recommend replacing the + sign in +*.js with -, so that it becomes
-*.js, because JavaScript is usually not needed to view websites offline.
Good luck!
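
For anyone using the command-line httrack instead of the GUI, the same settings can be put together roughly like this. This is a sketch under assumptions: the URL and the output folder are placeholders, and the function name is made up. It uses -O for the output path, -F for the user agent, and passes the scan rules +*.mp4 and -*.js as plain arguments.

```python
import shlex

# User agent string quoted earlier in this post.
BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) "
              "Gecko/20100101 Firefox/78.0")


def build_httrack_command(url, output_dir, user_agent=BROWSER_UA):
    """Build an httrack argument list with the settings from this post."""
    return [
        "httrack", url,
        "-O", output_dir,   # where the mirror is written
        "-F", user_agent,   # user agent sent in the HTTP headers
        "+*.mp4",           # scan rule: include mp4 videos
        "-*.js",            # scan rule: skip JavaScript files
    ]


if __name__ == "__main__":
    # Placeholder URL and folder; replace them with your own.
    command = build_httrack_command("https://example.com/", "./mirror")
    # Print a copy-pasteable shell command (shlex.join needs Python 3.8+).
    print(shlex.join(command))
```

The list can also be handed to subprocess.run directly, which avoids having to quote the user agent string by hand.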
 