Duplicate content isn’t just caused by content scrapers. Scrapers do create problems for your own content, but more often webmasters generate their own duplicate content issues without realizing it. If you’re still cleaning up your site because each Google Panda update inflicts more pain, review the common causes of duplicate content below and rid your website of them.
The Unlucky 7
- You reused substantially the same content and swapped out certain words to target a different product or keyword.
- You submitted content from your website to an article directory.
- You offer the same content for every color or size option a product has.
- Your search filter options or content management system are generating multiple URLs for the same page.
- You used someone else’s content on your website.
- Your site doesn’t resolve the different variations of your home page, e.g. yoursite.com and yoursite.com/index.html.
- Your site doesn’t resolve the non-www and www versions of your website.
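The first cause on the list, pages that differ only by a few swapped words, can be measured roughly before a search engine does it for you. The sketch below is one simple approach, not the method Google or Copyscape uses: it compares two pieces of text by the overlap of their three-word sequences. The function names and the sample sentences are made up for illustration.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a, text_b, size=3):
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical product descriptions that differ by a single word.
    red = "our red widget is durable and easy to clean"
    blue = "our blue widget is durable and easy to clean"
    print(round(similarity(red, blue), 2))
```

A score close to 1.0 suggests the two pages are near copies of each other. Where to draw the line is a judgment call, so treat any threshold you pick as a starting point rather than a rule.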
Detecting the Issues
There are a few ways to find the duplicate content you need to remove or resolve.
In your Google Webmaster Tools account, the first place to look is the HTML suggestions (later renamed HTML Improvements). If you click on the results shown for duplicate titles and descriptions, you’ll be presented with a list of URLs that have these factors in common. Sometimes the URLs that show up here are the same product in different colors or sizes; sometimes they are multiple search results for the same page. You may not uncover every issue by looking here, but it makes a good starting point.
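If you have your own list of pages with their titles and descriptions (from a crawl or an export, for instance), you can run the same kind of check yourself. This is a minimal sketch that assumes the data is already in a Python dictionary; the function name and the example.com URLs are hypothetical, and it does not talk to Webmaster Tools in any way.

```python
from collections import defaultdict


def find_duplicate_meta(pages):
    """Group URLs that share the same title and description.

    pages maps each URL to a (title, description) pair.
    Returns a list of URL groups, each with two or more URLs.
    """
    groups = defaultdict(list)
    for url, (title, description) in pages.items():
        # Ignore case and surrounding whitespace when comparing.
        key = (title.strip().lower(), description.strip().lower())
        groups[key].append(url)
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    # Hypothetical crawl results.
    pages = {
        "http://www.example.com/shirt?color=red": ("Cotton Shirt", "A soft cotton shirt."),
        "http://www.example.com/shirt?color=blue": ("Cotton Shirt", "A soft cotton shirt."),
        "http://www.example.com/about": ("About Us", "Who we are."),
    }
    for group in find_duplicate_meta(pages):
        print(group)
```

Each group it prints is a set of URLs worth reviewing together, much like the list Webmaster Tools gives you.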
Resolving the Issues
- Once you clean up this list, you can uncover further issues by running your pages through the premium version of Copyscape, making sure that your non-www and www page versions are resolved, and rewriting any content that is substantially similar to other pages, whether on your own site or someone else’s.
- If you have an /index, /index.php, /index.htm or /index.html version of your home page, that page should carry a rel="canonical" link tag pointing to your preferred version of the home page.
- Choose whether you want the non-www or www version of your website to appear in search results, then add the appropriate 301 redirect to your .htaccess file or server configuration file.
- If you used someone else’s content, you can either link to the source or add a noindex robots meta tag to the page so it will be taken out of the search results.
- Block search engines from the search filter results on pages that offer viewing options, for example through your robots.txt file.
- If your website generates multiple URLs for the same page, be sure to resolve them with the rel="canonical" link tag.
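Several of the fixes above come down to deciding on one preferred URL for each page. The sketch below shows one way to work that URL out and print the matching canonical tag. It rests on a few assumptions you would need to adjust: the preferred host (www.example.com), the list of filter parameters, and the function names are all placeholders, and it strips index file names in any folder, not only on the home page.

```python
import html
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

PREFERRED_HOST = "www.example.com"  # assumption: you chose the www version
INDEX_FILES = {"index", "index.php", "index.htm", "index.html"}
FILTER_PARAMS = {"color", "size", "sort", "view"}  # hypothetical filter options


def strip_www(host):
    """Return the host name without a leading "www."."""
    return host[4:] if host.startswith("www.") else host


def canonical_url(url):
    """Map the variations of a URL to a single preferred version."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Resolve the non-www and www versions to the preferred host.
    if strip_www(host) == strip_www(PREFERRED_HOST):
        host = PREFERRED_HOST
    # Treat /index.html and friends as the bare folder URL.
    path = parts.path or "/"
    head, _, last = path.rpartition("/")
    if last.lower() in INDEX_FILES:
        path = head + "/"
    # Drop parameters that only change how the same page is displayed.
    kept = [(key, value)
            for key, value in parse_qsl(parts.query, keep_blank_values=True)
            if key.lower() not in FILTER_PARAMS]
    return urlunsplit((parts.scheme, host, path, urlencode(kept), ""))


def canonical_tag(url):
    """Build the link tag to place in the page's head section."""
    href = html.escape(canonical_url(url), quote=True)
    return '<link rel="canonical" href="%s">' % href


if __name__ == "__main__":
    print(canonical_tag("http://example.com/index.html"))
    print(canonical_tag("http://www.example.com/shoes?color=red&page=2"))
```

The page parameter is kept on purpose here, since page two of a listing is not the same content as page one. Whether a given parameter belongs in FILTER_PARAMS depends on your own site.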
You may not see the fruits of your labor right away, but if you notice an improvement after the next Panda update, you’ll know you’ve made progress.
Theresa Happe works with BuyDomains.com where you can buy and sell domains, get an appraisal or park your domain.