HTTrack Website Copier
Free software offline browser - FORUM
Subject: Anti-crawler protection preventing clone
Author: Abe
Date: 08/06/2023 01:47
 
I'm trying to clone this entire website for someone who lost access to their
hosting login (for reasons I won't get into) and needs a local copy of
everything:
<https://www.cowpatch.com/>

It looks like it was built with WordPress.

I'm using HTTrack from the terminal on a Mac. Here's the command I'm running:

httrack https://www.cowpatch.com/ -O "/Users/abrahamfeinberg/websites/cowpatch2"

It gets index.html, but there seems to be some kind of anti-crawler check
stopping it from getting the other pages on the site. Every other HTML page
it downloads displays the following message instead of the real content:

Anti-Crawler Protection is checking your browser and IP … for spam bots. You
will be automatically redirected to the requested page after 3 seconds. Don't
close this page. Please, wait for 3 seconds to pass to the page. Anti-Spam by
CleanTalk
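
To see how many of the saved pages are this check page rather than real content, a small scan of the output folder could help. This is only a rough sketch: the function name find_interstitial_pages is made up for illustration, and the marker string is copied from the message quoted above, so it assumes that exact wording appears as plain text in the saved HTML.

```python
from pathlib import Path

# Assumption: this phrase, taken from the quoted message, appears verbatim
# in every saved check page. Adjust it if the saved HTML words it differently.
MARKER = "Anti-Crawler Protection is checking your browser"


def find_interstitial_pages(mirror_dir, marker=MARKER):
    """Return the .html files under mirror_dir whose text contains marker."""
    hits = []
    for path in sorted(Path(mirror_dir).rglob("*.html")):
        # errors="replace" keeps the scan going on pages with odd encodings.
        text = path.read_text(encoding="utf-8", errors="replace")
        if marker in text:
            hits.append(path)
    return hits
```

Calling find_interstitial_pages("/Users/abrahamfeinberg/websites/cowpatch2") would list the files that need to be fetched again. It does not get past the protection itself.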

My take on this is that because HTTrack doesn't wait the three seconds, it
ends up saving this redirection page instead of the real one. Is there a
workaround? Some option I can use to fix this?
Any help is greatly appreciated!
 