HTTrack Website Copier
Free software offline browser - FORUM
Subject: Virtual vs. Real File Names
Author: Willy
Date: 12/16/2016 09:05
 
While documenting our web site (HTTrack is awesome for this!), we've
discovered something odd that our sysadmin team doesn't want to answer.
Right now the site is internal only (eventually public, provided we can answer
all of this), so I have to rely on explaining things as best I can.
BTW, this is for Windows.

We're using the 64-bit (x64) HTTrack Windows GUI (jeez, is it fast/efficient!).

Many of the files exist in a number of (virtual?) versions, and a file can be
referenced by a version name -and- by a "real" filename.

Example:

For version 6, the "virtual" version tag is (v=vs.60), which is what we want
to document, but things such as (v=vs.52) should be ignored.
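
To make the tag rule concrete, here is a minimal Python sketch of how we'd
recognize the tag. The names (VERSION_TAG, WANTED, version_of, is_unwanted)
are made up for illustration and have nothing to do with HTTrack itself; it
also assumes the tag always looks like (v=vs.NN).

```python
import re

# Matches a tag such as "(v=vs.60)" or "(v=vs.99.9)" and captures the version part.
VERSION_TAG = re.compile(r"\(v=vs\.([0-9.]+)\)")

WANTED = "60"  # assumption: "(v=vs.60)" is the tag for version 6


def version_of(address):
    """Return the version string inside the tag, or None if the address is untagged."""
    match = VERSION_TAG.search(address)
    return match.group(1) if match else None


def is_unwanted(address):
    """True only when the address carries a tag for some other version."""
    version = version_of(address)
    return version is not None and version != WANTED
```

Untagged addresses are deliberately not treated as unwanted here, because
their version isn't known yet.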

What we don't want to do is walk the entire website and sort things out later.
That slows things down (in a BIG way!) and adds extra work. We'd prefer to
have the machine do as much as possible, rather than write extra code to
sort things out after walking entire pieces of the system.

Suppose we have a filename like this:


Case #A:  .../.../library/aa1899(v=vs.52)

We don't care about it because it's version 5.2

Case #B:  .../.../library/aa187916.aspx

It appears to be a fixed filename (without a version), so its version isn't
truly known until we tinker with the filename.

If we try inserting the (v=vs.60) tag into the filename:

Case #B:  .../.../library/aa187916(v=vs.60).aspx

and loading it doesn't throw an error, we know it's really a 6.0 file and
will continue using it...just like HTTrack does in regular processing.
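
Done by hand outside HTTrack, the "tinkering" would look roughly like the
Python sketch below. The function names are hypothetical, and the probe
assumes the server answers HEAD requests sensibly and that "doesn't throw an
error" means a 2xx status; neither is confirmed for our site.

```python
import urllib.request


def with_version_tag(address, tag="(v=vs.60)"):
    """Insert the tag between the file's base name and its extension."""
    head, slash, name = address.rpartition("/")
    stem, dot, ext = name.rpartition(".")
    if not dot:
        # No extension in the last path segment: just append the tag.
        return address + tag
    return f"{head}{slash}{stem}{tag}{dot}{ext}"


def loads_without_error(url, timeout=10):
    """True if the server answers a HEAD request with a success status."""
    try:
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # OSError covers HTTP errors, connection failures and timeouts;
        # ValueError covers addresses that are not valid URLs.
        return False
```

This sketch ignores query strings and fragments, which our addresses don't
seem to use.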

Shortened explanation: 

If a file's address already has a version

 (v=vs.99.9)

and it's not the one we want, we skip it, which HTTrack already does. If
it's v6.0, either as found OR after we modify the address, we treat it as a
valid hit, such as:

aa187916(v=vs.60).aspx
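
Put together, the rule we're after is roughly the following Python sketch.
Again, resolve and probe are hypothetical names; probe stands in for "does
this address load without an error", and nothing here is HTTrack's own API.

```python
import re

VERSION_TAG = re.compile(r"\(v=vs\.([0-9.]+)\)")


def resolve(address, probe, wanted="60"):
    """Return the address to keep, or None to skip it.

    `probe` is any callable that reports whether an address loads without error.
    """
    match = VERSION_TAG.search(address)
    if match:
        # Already tagged: keep it only if it is the wanted version.
        return address if match.group(1) == wanted else None
    # Untagged: build the tagged variant and keep it only if it loads.
    tag = f"(v=vs.{wanted})"
    head, slash, name = address.rpartition("/")
    stem, dot, ext = name.rpartition(".")
    candidate = f"{head}{slash}{stem}{tag}{dot}{ext}" if dot else address + tag
    return candidate if probe(candidate) else None


# Stand-in probe for illustration: pretend only this address exists.
known_good = {".../.../library/aa187916(v=vs.60).aspx"}
print(resolve(".../.../library/aa1899(v=vs.52)", known_good.__contains__))
print(resolve(".../.../library/aa187916.aspx", known_good.__contains__))
```

With the stand-in probe, the first call is skipped (None) and the second one
comes back as the tagged 6.0 address.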

So how do we tinker with the address/file before we allow HTTrack to determine
if it's a hit or not?  In event-logic terms, we would need to mess around with
the file/path in a _before_ step, ahead of the _during_ step where HTTrack
performs its magic.

Thanks!
 