I plan to download a large archive of links which I've compiled, which looks like this:
website.com/stories/Story1
website.com/stories/Story2
website.com/stories/Story3
There are about 4,000,000 links in total, which I have split into 5 batches, but I am finding that when I resume an interrupted project it parses all the links again, and actually grabbing the pages themselves is also taking a long time.
I have excluded the file extensions .gif, .jpg and .jpeg, and increased the connection limit.
Is there anything else I can do, such as downloading the .html pages straight away instead of parsing?
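
Something like this rough sketch is what I have in mind, since I already know every URL and don't need any link discovery (assuming the list is saved as one URL per line in a file like urls.txt, that the output goes into a pages/ folder, and that http:// is prepended because my list has no scheme; the names are just placeholders):

    # Sketch: fetch each page directly from a pre-compiled URL list,
    # skipping any parsing/crawling step entirely.
    import os
    import urllib.request

    OUTPUT_DIR = "pages"      # hypothetical output folder
    os.makedirs(OUTPUT_DIR, exist_ok=True)

    with open("urls.txt") as f:          # hypothetical list file, one URL per line
        for line in f:
            url = line.strip()
            if not url:
                continue
            # Use the last path segment as the output filename
            name = url.rstrip("/").split("/")[-1] or "index"
            out_path = os.path.join(OUTPUT_DIR, name + ".html")
            if os.path.exists(out_path):
                # Skip pages already saved so an interrupted run can resume
                continue
            try:
                # The list has no scheme, so prepend http:// before fetching
                with urllib.request.urlopen("http://" + url, timeout=30) as resp:
                    data = resp.read()
                with open(out_path, "wb") as out:
                    out.write(data)
            except Exception as e:
                print("failed:", url, e)

The idea is that resuming is just a matter of checking which files already exist, rather than re-parsing millions of links. Is there a way to get the same behaviour with the existing setup, or is a separate downloader like this the better approach?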