HTTrack Website Copier
Free software offline browser - FORUM
Subject: Re: How to download blogspot blog? Here's how
Author: Robinson
Date: 08/07/2012 18:59
 
Actually, I had set max depth to 3 and external depth to 0. A depth of 3 is
enough to reach every entry through the BLOG ARCHIVE on the right side, since
the archive is organized as Year, Month, Entries.
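
For anyone running HTTrack from the command line instead of the GUI, here is a rough Python sketch that assembles an equivalent command. As far as I know the depth settings map to the -rN (mirror depth) and -%eN (external links depth) options, with -O for the output path. The function name, the example URL and the output directory are made up for illustration.

```python
def build_httrack_command(start_url, output_dir, max_depth=3,
                          external_depth=0, filters=()):
    """Return an argv list for an HTTrack run (hypothetical helper).

    Assumes the httrack command-line options -O (output path),
    -rN (mirror depth) and -%eN (external links depth).
    """
    cmd = [
        "httrack",
        start_url,
        "-O", output_dir,
        f"-r{max_depth}",          # max depth, 3 in the post
        f"-%e{external_depth}",    # external depth, 0 in the post
    ]
    # Scan rules such as "-*?showComment=*" are passed as plain arguments.
    cmd.extend(filters)
    return cmd


if __name__ == "__main__":
    # Placeholder blog address and output folder, not from the thread.
    print(build_httrack_command("http://example.blogspot.com/", "./mirror"))
```

The list can be handed to subprocess.run(cmd, check=True) if httrack is installed and on the PATH.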

Yes, I noticed that the .tmp files are renamed to .html (I posted while it was
still downloading).

Since you mention that, I guess the extreme duplication of blogspot entries
may be due to the "?showComment=" URLs processed by HTTrack.

I think the problem is explained here:
<http://blogger-hits.blogspot.com/2009/06/remove-duplicate-content-because-of.html>
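
To illustrate why these URLs count as duplicates, here is a small Python sketch that drops the showComment query parameter, so that every per-comment link collapses back to the address of the post itself. The function name is my own, and the sample URL is the one from the blog post quoted further down.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def strip_show_comment(url):
    """Return the URL without its showComment query parameter."""
    parts = urlsplit(url)
    # Keep every query parameter except showComment (compared case-insensitively).
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() != "showcomment"
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


if __name__ == "__main__":
    comment_url = (
        "http://nogoomfm.blogspot.com/2009/05/blog-post_19.html"
        "?showComment=1242753180000"
    )
    print(strip_show_comment(comment_url))
```

Every comment link on a post normalizes to the same page, which is the duplication the filter is meant to avoid.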

Added to the filters:
-*?showComment=* 
-*.rar -*.zip -*.mov -*.mpg -*.mpeg -*.avi -*.asf -*.mp3 -*.mp2 -*.rm -*.wav
-*.vob -*.qt -*.vid -*.ac3 -*.wma -*.wmv

These can also be added to the filters to exclude more duplicated pages:
-*?max-results=*
-*search?updated-min=*
-*search?updated-max=*
-*/feeds/*
-*?comments*
-*/comments
-*/search/label/*

The last filter matches the HTML pages that list the entries with a given
label. These always contain duplicated entries.
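
To check which URLs the exclusion filters would catch before starting a long download, a rough Python approximation of the matching can help. This is only a sketch, not HTTrack's real matcher: it assumes that "*" is the only wildcard, that "?" is a literal character, and that matching ignores case. The function names and the example.blogspot.com addresses are hypothetical.

```python
import re

# Exclusion patterns from the post, without the leading "-".
EXCLUDE_PATTERNS = [
    "*?showComment=*",
    "*?max-results=*",
    "*search?updated-min=*",
    "*search?updated-max=*",
    "*/feeds/*",
    "*?comments*",
    "*/comments",
    "*/search/label/*",
]


def pattern_to_regex(pattern):
    """Translate a filter pattern into a compiled regex.

    Assumption: only "*" is a wildcard (any run of characters);
    everything else, including "?", is matched literally.
    """
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    # Case-insensitive matching is an assumption, not confirmed behavior.
    return re.compile(".*".join(literal_parts), re.IGNORECASE)


_COMPILED = [pattern_to_regex(p) for p in EXCLUDE_PATTERNS]


def is_excluded(url):
    """Return True if any exclusion pattern matches the whole URL."""
    return any(rx.fullmatch(url) for rx in _COMPILED)


if __name__ == "__main__":
    for url in [
        "http://example.blogspot.com/2009/05/blog-post_19.html",
        "http://example.blogspot.com/2009/05/blog-post_19.html?showComment=1",
        "http://example.blogspot.com/search/label/news",
        "http://example.blogspot.com/feeds/posts/default",
    ]:
        print(is_excluded(url), url)
```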


This is an example of how to download a blogspot blog with HTTrack.


/*****************
 Remove Duplicate content because of "showcomments=" links in Blogger
Posted On Friday, June 26, 2009 at 5:44 AM by Abdelrahman Ellithy

Blogger creates a separate link for every comment on your blog, so a post such as:
<http://nogoomfm.blogspot.com/2009/05/blog-post_19.html>
may also be reachable at URLs such as:
<http://nogoomfm.blogspot.com/2009/05/blog-post_19.html?showComment=1242753180000>

This is a major problem for popular, heavily commented posts and blogs.
Here is how you can solve it using the rel="canonical" attribute:

1- Log in to your blogger Dashboard.
2- Choose Layout (Edit HTML template) and download the template as a backup.
3- Find :

</head>


4- Put before it the following code :

<!-- 100fm6.com block duplicate content START -->
<b:if cond='data:blog.pageType == "item"'>
<link expr:href='data:blog.url' rel='canonical'/>
</b:if>
<!-- 100fm6.com block duplicate content END -->



5- Save your template.
Google uses the rel='canonical' hack in its blog too.

*****************/
 