Canonical Tag – HTTPS & HTTP duplicate content issues

HTTP and HTTPS duplicate content is still one of the most difficult problems for many webmasters to solve, and I still get queries about it. Last week I helped another big SEO company solve this problem. I have written about http and https duplicate content plenty of times, and most of those articles have held a top 10 position in Google: http://www.google.com/q=http+https+duplicate+content. There have been many changes in Google's algorithms in the last few years. My articles still solve part of the problem, but they are not perfect. Let me write the perfect article now :). This will kill my importance though.

What is the http and https problem?

  1. Duplicate content: http and https are treated as 2 different sites. One is served on port 443 and the other on port 80, so technically they are different. These are typical canonicalization problems.
  2. With 2 different sites live, links can point to either one, so your link power is divided. You may be wasting your energy going in 2 directions. "May be" is the key phrase here, as we don't know how smart search engines are.
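
To see why the two versions count as separate sites, here is a minimal Python sketch that compares the scheme, host and port of two URLs. The `origin` helper and the example.com URLs are my own illustration, not something any search engine publishes; it only shows that the same path lives at two different addresses.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes discussed in this post.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return (scheme, host, port) for a URL, filling in the default port."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


http_origin = origin("http://example.com/page.html")
https_origin = origin("https://example.com/page.html")

print(http_origin)                  # ('http', 'example.com', 80)
print(https_origin)                 # ('https', 'example.com', 443)
print(http_origin == https_origin)  # False
```

Same host and same path, but a different scheme and port, so a crawler can reach the same content at two distinct URLs.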

Prior to this post, I suggested at SEOMOZ.org and SEOforClients.com using robots.txt to solve duplicate content issues. That is no longer the best available option.

Go for Canonical tags for http and https

I am attaching the ppt by Matt Cutts; look at slide #9, which says:

Q: Can I use this to suggest http://example.com be the canonical url instead of https://example.com?
A: Yes, absolutely
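
As a rough sketch of what that answer means in practice, the Python helper below builds the canonical link element for whatever URL was requested, always pointing at the http version. The function name `canonical_link_tag`, the example.com URLs, and the choice to drop explicit ports and fragments are my assumptions for illustration; your CMS or template engine will have its own way of emitting the tag.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(requested_url, preferred_scheme="http"):
    """Build a <link rel="canonical"> element pointing at the preferred scheme.

    Assumes an absolute URL served on the default port; the fragment is dropped.
    """
    parts = urlsplit(requested_url)
    canonical = urlunsplit(
        (preferred_scheme, parts.hostname or "", parts.path or "/", parts.query, "")
    )
    # Escape the URL so characters such as & are safe inside the attribute.
    return '<link rel="canonical" href="{}" />'.format(escape(canonical, quote=True))


# The same tag comes out whether the page was requested over http or https.
print(canonical_link_tag("https://example.com/products?id=7"))
print(canonical_link_tag("http://example.com/products?id=7"))
```

The tag goes in the `<head>` of the page, and both the http and https copies of a page should carry the same one.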

The canonical tag takes care of the duplicate content and, most importantly, the issue of link division. The canonical tag is now even supported across multiple domains. Now you have full control over what is ranked. Official docs for canonical tags:

  1. Google’s canonical tag document, Canonical tag help section by Google
  2. Yahoo’s blog post on canonical tag
  3. Bing’s webmaster section does have some canonical posts

Should I rank http or https?

My pick is http, and make sure that all your links are relative. Only the signup or checkout pages need to be on https; you can also block these sections using robots.txt so that the signup and checkout pages are not even indexed. I insist on http to avoid any scary browser popup when moving from https to http or vice versa. With some browser settings, there were popups for the http to https shift as well.
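
To check that a robots.txt really blocks the signup and checkout sections, something like the following Python sketch can help. The `/signup/` and `/checkout/` paths and the `is_crawlable` helper are hypothetical; swap in the paths your site actually uses. It relies on the standard library's `urllib.robotparser`, which only approximates how a real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt blocking the transaction section for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /signup/
Disallow: /checkout/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="*"):
    """Return True if the rules above allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


print(is_crawlable("https://example.com/checkout/cart"))  # False
print(is_crawlable("http://example.com/products/shoes"))  # True
```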

Final advice for the http and https duplicate content fix

  • Divide the website into 2 parts: a browsing section and a transaction section. The transaction section opens only on https (add a 301 redirect for these pages; some people put them on a different subdomain), while the browsing section can open on both http and https. Block the transaction section with robots.txt.
  • Give all pages an http:// canonical tag so that only http ranks and all link value is consolidated under http.
  • Make all links relative so that someone on https does not get an http link. Otherwise it will pop up scary messages for non-technical users.
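
The three points above can be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the `/signup/` and `/checkout/` prefixes, the `plan_response` and `to_relative` helpers, and the tuple they return are all hypothetical, and a real site would do this in its web server config or framework.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical path prefixes that make up the transaction section.
TRANSACTION_PREFIXES = ("/signup/", "/checkout/")


def plan_response(scheme, host, path):
    """Return (status, redirect_location, canonical_url) for a request."""
    if path.startswith(TRANSACTION_PREFIXES):
        if scheme != "https":
            # Transaction pages open only on https: 301 to the https URL.
            return (301, "https://" + host + path, None)
        # Already on https; these pages are blocked by robots.txt instead.
        return (200, None, None)
    # Browsing pages open on both schemes but always declare http as canonical.
    return (200, None, "http://" + host + path)


def to_relative(href, site_host):
    """Turn an absolute link to our own host into a root-relative link."""
    parts = urlsplit(href)
    if not parts.netloc or parts.hostname != site_host:
        return href  # already relative, or an external link: leave untouched
    return urlunsplit(("", "", parts.path or "/", parts.query, parts.fragment))


print(plan_response("http", "example.com", "/checkout/cart"))
print(plan_response("https", "example.com", "/products/shoes"))
print(to_relative("http://example.com/products?id=7", "example.com"))
```

With relative links, a visitor who is on https stays on https while browsing, and a visitor on http who clicks through to checkout is moved to https by the 301.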

PPT by Matt Cutts on Canonical tag


(This ppt is old; the canonical tag is now supported across multiple domains. See http://googlewebmastercentral.blogspot.com/2009/12/handling-legitimate-cross-domain.html)
