HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: Deleted Messages on Server
Author: Xavier Roche
Date: 12/21/2003 16:29
> I recently copied a website that is a password protected 
> message forum. I used the Capture URL feature to record 
> username and password. After completing the mirror, many, 
> but not all, of the messages that I had posted on the 
> forum were deleted from the website, but were intact on my 
> copy that I had just made.
> Is there a setting or filter that I should have used to 
> prevent this?
This is definitely a design bug on the server, because 
regular URLs (which generate GET requests) should not have 
side effects on the database. In particular, "delete" or 
"move" actions should always be triggered by POSTed forms, 
so that regular crawlers do not mess up the forum when 
running.
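
To illustrate the point about GET and POST, here is a minimal sketch in Python of how a forum could refuse destructive actions on GET. The handler, its arguments and the in-memory store are all hypothetical; this is not the code of any real forum software.

```python
def handle(method, action, store, msg_id):
    """Hypothetical forum request handler (illustration only).

    `store` is a plain dict standing in for the message database.
    Returns an HTTP-like status code.
    """
    if action == "delete":
        if method != "POST":
            # 405 Method Not Allowed: a crawler following a GET link
            # cannot change anything.
            return 405
        store.pop(msg_id, None)
        return 200
    # Plain display is read-only, so GET is fine here.
    return 200 if msg_id in store else 404
```

With a server written this way, a crawler that follows every link only ever reads pages; the deletion needs a form submission.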

Anyway, the mandatory analysis to do BEFORE crawling a 
forum is to list which kinds of links make up the forum, 
such as:
- links to display a regular page, such as

- links to display the next page or previous page following 
a regular page, such as
which is, in this example, identical to:

- links to make an action, such as delete or reply
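
The link analysis above can be roughed out with a few lines of Python before crawling. This is only a sketch: the query parameter names (`action`, `next`, `prev`) and the example host are assumptions made for illustration, and a real forum will use its own names.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical parameter names; adapt them to the forum being analysed.
ACTION_VALUES = {"delete", "reply"}
NAV_PARAMS = {"next", "prev"}

def classify(url):
    """Sort a forum link into 'action', 'navigation' or 'regular'."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if ACTION_VALUES.intersection(query.get("action", [])):
        return "action"        # must never be followed by a crawler
    if NAV_PARAMS.intersection(query):
        return "navigation"    # duplicate view of a regular page
    return "regular"           # the only kind worth mirroring

for link in (
    "http://forum.example.com/view.php?id=12",
    "http://forum.example.com/view.php?id=12&next=1",
    "http://forum.example.com/view.php?id=12&action=delete",
):
    print(classify(link), link)
```

Running this over the links found on one or two forum pages shows quickly which URL patterns need an exclusion rule.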

Here, you'll have to use scan rules such as:


Such rules get all regular forum pages, except the 
"delete/reply" links, and except the "previous" and "next" 
pages, which would otherwise cause every page to be fetched 
in 3 identical versions, wasting 3X the bandwidth.

You can also, optionally, include images or related files 
that could be located outside the forum:

+*.gif +*.jpg +*.png +*.css +*.js
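
To sanity-check a rule set before launching the mirror, something like the following Python sketch can help. It only approximates scan rules: it uses Python's `fnmatch` wildcards rather than HTTrack's own matcher, and it assumes that when several rules match, the last one wins. The forum host and the `action=`, `next=` and `prev=` patterns are made up for the example; only the image and stylesheet rules come from the list above.

```python
from fnmatch import fnmatchcase

def allowed(url, rules, default=False):
    """Approximate scan-rule check: the last matching rule decides."""
    verdict = default
    for rule in rules:
        sign, pattern = rule[0], rule[1:]
        if fnmatchcase(url, pattern):
            verdict = (sign == "+")
    return verdict

RULES = [
    "+*forum.example.com/*",   # hypothetical forum host
    "-*action=delete*",        # hypothetical action links
    "-*action=reply*",
    "-*next=*",                # hypothetical next/previous links
    "-*prev=*",
    "+*.gif", "+*.jpg", "+*.png", "+*.css", "+*.js",
]

for url in (
    "http://forum.example.com/view.php?id=12",
    "http://forum.example.com/view.php?id=12&action=delete",
    "http://img.example.org/logo.gif",
):
    print(allowed(url, RULES), url)
```

If a "delete" or "reply" link comes out as allowed here, the rules need fixing before the real crawl touches the server.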
