As we outlined, duplicate content can be created in many ways. Internal duplication of material requires specific tactics to achieve the best possible results from an SEO perspective. In many cases, the duplicate pages have no value to either users or search engines. If that is the case, try to eliminate the problem altogether by fixing the implementation so that each page is referred to by only one URL. In addition, 301-redirect the old URLs to the surviving URLs (redirects are discussed in more detail in the “Redirects” section later in this chapter) to help the search engines discover what you have done as rapidly as possible, and to preserve any link authority the removed pages may have had.
If that process proves to be impossible, there are many options, as we will outline in “Content Delivery and Search Spider Control”. Here is a summary of the simplest solutions for dealing with a variety of scenarios:
• Use robots.txt to block search engine spiders from crawling the duplicate versions of pages on your site.
• Use the canonical tag. This is the next best solution to eliminating the duplicate pages.
• Use the Robots NoIndex meta tag to tell the search engines not to index the duplicate pages.
Be aware, however, that if you use robots.txt to prevent a page from being crawled, then using NoIndex or NoFollow on the page itself is pointless: since the spider cannot read the page, it will never see the NoIndex or NoFollow tag. With these tools in mind, here are some specific duplicate content scenarios:
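To make these three options concrete, here is an illustrative sketch of each; the domain, directory, and URLs shown are hypothetical placeholders, not recommendations for any particular site.

```text
# robots.txt — block crawling of a (hypothetical) duplicate directory
User-agent: *
Disallow: /print/

<!-- Canonical tag, placed in the <head> of a duplicate page,
     pointing at the preferred URL -->
<link rel="canonical" href="http://www.example.com/products/widget" />

<!-- Robots NoIndex meta tag, placed in the <head> of a duplicate page;
     "follow" still lets the spider follow the page's links -->
<meta name="robots" content="noindex, follow" />
```

Note that these are alternatives, not complements: as discussed above, a page blocked in robots.txt will never have its canonical or NoIndex tags read.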
1. HTTPS pages:
If you make use of SSL (encrypted communications between the browser and the web server) and you have not converted your entire site, you will have some pages on your site that begin with https: instead of http:. The problem arises when the links on your https: pages point back to other pages on the site using relative instead of absolute links, so that (for example) the link to your home page becomes https://www.seomartweb.com instead of http://www.seomartweb.com.
If you have this type of issue on your site, you may want to use the canonical tag, which we describe in “Content Delivery and Search Spider Control”, or 301 redirects to resolve problems with these types of pages. An alternative solution is to change the links to absolute links (http://www.seomartweb.com/content instead of “/content”), which also makes life more difficult for content thieves who scrape your site.
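One way to implement the 301 option here is with an Apache mod_rewrite rule that sends https: requests for non-secure pages back to their http: versions. This is a sketch only: it assumes an Apache server with mod_rewrite enabled, uses the example domain from the text, and the /checkout/ path standing in for whatever section of the site genuinely needs SSL is hypothetical.

```apacheconf
# .htaccess sketch: 301-redirect stray https: URLs back to http:,
# except for the (hypothetical) /checkout/ section that requires SSL
RewriteEngine On
RewriteCond %{HTTPS} on
RewriteCond %{REQUEST_URI} !^/checkout/
RewriteRule ^(.*)$ http://www.seomartweb.com/$1 [R=301,L]
```

The exclusion condition matters: without it, the rule would also redirect the pages that legitimately need to stay on https:.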
2. CMSs that create duplicate content:
Sometimes sites have many versions of identical pages because of limitations in the CMS, which addresses the same content with more than one URL. These are often unnecessary duplications with no end-user value, and the best practice is to figure out how to eliminate the duplicate pages and 301-redirect the eliminated pages to the surviving pages. Failing that, fall back on the other options listed at the beginning of this section.
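As one illustration, a common CMS duplication is the home page resolving at both / and /index.php. A hedged .htaccess sketch for collapsing that duplicate with a 301 (assuming an Apache server and a placeholder domain):

```apacheconf
# 301-redirect direct browser requests for /index.php to the root URL.
# Matching on THE_REQUEST (the literal request line) avoids a redirect
# loop when the CMS internally rewrites / to index.php.
RewriteEngine On
RewriteCond %{THE_REQUEST} ^[A-Z]+\ /index\.php[?\ ]
RewriteRule ^index\.php$ http://www.example.com/ [R=301,L]
```

The same pattern generalizes to other CMS-generated duplicates, such as session-ID or tracking-parameter URLs.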
3. Print pages or multiple sort orders:
Many sites offer print pages to provide the user with the same content in a more printer-friendly format, and some e-commerce sites offer their products in multiple sort orders (such as size, color, brand, and price). These pages do have end-user value, but they have no additional value to the search engines and will appear to be duplicate content. For that reason, use one of the options listed previously in this subsection, or set up a print CSS style sheet such as the one outlined in this post by Yoast (http://yoast.com/added-print-css-style-sheet/).
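The print-stylesheet approach avoids a separate print URL entirely: the same page is simply styled differently when printed. A minimal sketch (the selectors and file path are hypothetical placeholders):

```html
<!-- Inline version: rules that apply only when the page is printed;
     the selectors (#sidebar, #nav, .ads) are placeholders for the
     elements you want hidden on paper -->
<style media="print">
  #sidebar, #nav, .ads { display: none; }
  body { color: #000; background: #fff; }
</style>

<!-- Or link a separate print stylesheet from the page <head> -->
<link rel="stylesheet" href="/css/print.css" media="print" />
```

Because no second URL exists, there is no duplicate page for the search engines to find.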
4. Duplicate content in blogs and multiple archiving systems (pagination, etc.):
Blogs present some interesting duplicate content challenges. A blog post can appear on many different pages, such as the home page of the blog, the permalink page for the post, date archive pages, and category pages. Each instance of the post duplicates the others. Few publishers attempt to address the duplication between the blog home page and the permalink page, and this is common enough that the search engines likely deal reasonably well with it. However, it may make sense to show only excerpts of the post on the category and/or date archive pages.
5. User-generated duplicate content (repostings, etc.):
Many sites implement structures for obtaining user-generated content, such as a blog, forum, or job board. This can be a great way to develop large quantities of content at a very low cost. The challenge is that users may choose to submit the same content on your site and in several other sites at the same time, resulting in duplicate content among those sites. It is hard to control this, but there are two things you can do to reduce the problem:
• Have clear policies that notify users that the content they submit to your site must be unique and cannot be, or have been, posted to other sites. This is difficult to enforce, of course, but it still helps to communicate your expectations.
• Implement your forum in a different and unique way that demands different content. Instead of having only the standard fields for entering data, include fields that are likely to elicit content different from what users post on other sites, but that will still be interesting and valuable for site visitors to see.