While documenting our web site (HTTrack is awesome for this!), we've
discovered something odd that our sysadmin team doesn't want to answer.
Right now the site is internal only (eventually public), so I'll have to
explain things as best I can (provided we can answer all of this).
BTW, this is on Windows, using the 64-bit (x64) HTTrack Windows GUI
(jeez, is it fast/efficient!).
Many of the files come in a number of (virtual?) versions, and a file can be
referenced by a version name -and- by a "real" filename.
For version 6, the "virtual" version tag is (v=vs.60), which is what we want
to document, but things such as (v=vs.52) should be ignored.
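To make the tag rule concrete, here is a rough Python sketch of the sorting we have in mind. It is not anything HTTrack provides: the names `VERSION_TAG`, `WANTED`, and `classify` are made up for illustration, and it assumes every tag looks like (v=vs.NN) with NN being digits only.

```python
import re

# Assumption: a version tag always has the form "(v=vs.NN)", NN = digits.
VERSION_TAG = re.compile(r"\(v=vs\.(\d+)\)")

WANTED = "60"  # the version we want to document (6.0)


def classify(url: str) -> str:
    """Return 'wanted', 'other', or 'untagged' for a URL or path."""
    match = VERSION_TAG.search(url)
    if match is None:
        return "untagged"  # no tag at all; the version is unknown so far
    return "wanted" if match.group(1) == WANTED else "other"
```

Anything that comes back as 'other' can be dropped right away; 'untagged' addresses are the ones that still need a closer look.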
What we don't want to do is walk the entire website and sort things out later.
That slows things down (in a BIG way!) and adds extra work. We'd prefer to
have the machine do as much as possible, rather than write extra code to
sort things out after walking entire pieces of the system.
Suppose we have a filename like this:
Case #A: .../.../library/aa1899(v=vs.52)
We don't care about it because it's version 5.2.
Case #B: .../.../library/aa187916.aspx
It appears to be a fixed filename (without a version), so its version isn't
truly known until we tinker with the filename.
If we insert the (v=vs.60) tag into the filename:
Case #B: .../.../library/aa187916(v=vs.60).aspx
and loading it doesn't throw an error, we know that in reality it's a 6.0 file
and will continue using it...just like HTTrack in regular processing.
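The "tinker and see if it loads" step could look roughly like the Python sketch below. This is only an illustration of the idea, not an HTTrack feature: `with_version_tag`, `http_exists`, and `is_version_6` are hypothetical names, and it assumes the tag always goes between the base name and the extension and that a missing version produces a real HTTP error (not an error page served with status 200).

```python
import posixpath
import urllib.request
from typing import Callable
from urllib.parse import urlsplit, urlunsplit

TAG = "(v=vs.60)"  # the version tag we want to document


def with_version_tag(url: str, tag: str = TAG) -> str:
    """Insert the tag between the file's base name and its extension.

    Only meant for addresses that carry no version tag yet.
    """
    parts = urlsplit(url)
    directory, filename = posixpath.split(parts.path)
    stem, ext = posixpath.splitext(filename)
    new_path = posixpath.join(directory, stem + tag + ext)
    return urlunsplit(parts._replace(path=new_path))


def http_exists(url: str, timeout: float = 10.0) -> bool:
    """True if a HEAD request for `url` succeeds with a 2xx status."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (OSError, ValueError):
        # HTTP errors, network failures, timeouts, or a malformed URL.
        return False


def is_version_6(url: str, exists: Callable[[str], bool] = http_exists) -> bool:
    """True if the tagged form of an untagged `url` loads without error."""
    return exists(with_version_tag(url))
```

For example, `with_version_tag("library/aa187916.aspx")` gives `library/aa187916(v=vs.60).aspx`. The `exists` parameter is there so the check can be swapped for something else (a cached lookup, for instance) without touching the rewrite logic.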
If a file's address already has a version and it's not the one we want, we
skip it, which HTTrack already does. If it's v6.0, either before it's examined
OR after we modify it, we treat it like a valid hit.
So how do we tinker with the address/file before we allow HTTrack to determine
if it's a hit or not? In event terms, we would need to mess around with the
file/path _before_ HTTrack performs its magic _during_ the crawl.