I have a problem that doesn't have to do directly with
this program, but since I ran into it using this program I
thought I might ask.
I ran into the problem using your hts-cache/new.txt file.
I was using this file to scan all the links on a page and
list them. I would then open this file and go to column I
which lists all the URLs scanned.
Then I would save the new.txt file as a .csv file, open
it in a spreadsheet, and cut and paste the column with
the URLs into its own txt file (pretty much just
extracting the URLs out of your log file).
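For illustration, the spreadsheet step could probably be skipped with a few lines of Python. This is only a rough sketch: it assumes the fields in new.txt are separated by tabs and that the URL is the ninth field (column I in the spreadsheet). The function name, the default column index, and the delimiter are all assumptions for the example, not something taken from the program's documentation.

```python
import csv


def extract_column(lines, column_index=8, delimiter="\t"):
    """Return one field from every row that is long enough to have it.

    column_index=8 corresponds to spreadsheet column I (assumed URL column).
    The tab delimiter is an assumption about the log layout.
    """
    # QUOTE_NONE makes quote characters inside URLs count as ordinary text.
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    return [row[column_index] for row in reader if len(row) > column_index]


def extract_column_from_file(path, column_index=8, delimiter="\t"):
    """Same as extract_column, but reads the rows from a file on disk."""
    with open(path, newline="", encoding="utf-8", errors="replace") as handle:
        return extract_column(handle, column_index, delimiter)
```

Calling extract_column_from_file("hts-cache/new.txt") would then give a plain list of URLs with nothing cut off. If the file has a header line, its ninth field comes through as the first entry and would need to be dropped.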
Well, that's where I ran into a problem. On one of my
search and scans, the URLs ended up being too long for a csv
file. It cuts the end of the address off.
Here is an example of an address...
http://patimg1.uspto.gov/.piw?Docid=06686531&homeurl=http%
3A%2F%2Fpatft.uspto.gov%2Fnetacgi%2Fnph-Parser%3FSect1%
3DPTO2%2526Sect2%3DHITOFF%2526u%3D%2Fnetahtml%2Fsearch-
adv.htm%2526r%3D5%2526f%3DG%2526l%3D50%2526d%3DPTXT%2526p%
3D1%2526p%3D1%2526S1%3D(((('electric%252Bbass'%252BOR%
252B'electric%252Bbasses')%252BOR%252B'bass%252Bguitar')%
252BOR%252B'bass%252Bguitars')%252BAND%252B(84%2F$.CIOR.%
252Bor%252B84%2F$.CIXR.%252Bor%252B84%2F$.CIUX,CIDX.))%
2526OS%3D%252B(%252522electric%252Bbass%252522%252Bor%252B%
252522electric%252Bbasses%252522%252Bor%252B%252522bass%
252Bguitar%252522%252Bor%252B%252522bass%252Bguitars%
252522)%252Band%252Bccl%2F84%2F$%2526RS%3D((((%
252522electric%252Bbass%252522%252BOR%252B%252522electric%
252Bbasses%252522)%252BOR%252B%252522bass%252Bguitar%
252522)%252BOR%252B%252522bass%252Bguitars%252522)%252BAND%
252BCCL%2F84%2F$)
&PageNum=&Rtype=&SectionNum=&idkey=3C3DDBF631F6
Part of the reason the address is so long is that it
contains a homeurl field, which holds all the parameters
of the original query. If I take out the homeurl field, the
address still works but looks like this:
http://patimg1.uspto.gov/.piw?Docid=06686531&PageNum=&Rtype=&SectionNum=&idkey=3C3DDBF631F6
OK, so if I could find a program that takes a list of URLs
in text format and strips out the homeurl field, I
could really use it.
I COULD do it manually, but the list I have is over 2000
URLs long and it is only one of many searches. If I did
this manually it would take days.
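To make the request concrete, here is a minimal sketch of the kind of program I mean, written in Python. It assumes one URL per line in a plain text file and that the homeurl value contains no raw "&" (in the example above every "&" inside it is percent-encoded). The function names and file names are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_param(url, name="homeurl"):
    """Return url with every query field called `name` removed.

    The remaining fields are left exactly as they were (no re-encoding).
    """
    parts = urlsplit(url)
    kept = [
        field
        for field in parts.query.split("&")
        # Compare only the part before "=", ignoring upper/lower case.
        if field.split("=", 1)[0].lower() != name.lower()
    ]
    return urlunsplit(parts._replace(query="&".join(kept)))


def clean_file(src_path, dst_path, name="homeurl"):
    """Read one URL per line from src_path and write cleaned URLs to dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            url = line.strip()
            if url:  # skip blank lines
                dst.write(strip_param(url, name) + "\n")
```

Something like clean_file("urls.txt", "urls_short.txt") would handle the whole 2000-line list in one pass. The query is split by hand instead of being parsed and rebuilt, so the fields that stay are not altered.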
Can anyone help? I usually search the web for this kind of
stuff, but I don't even know where to look. Please point me
in the right direction.
Joe