Copy Cope-Ups: What Is Duplicate Content and How to Fix It
You’ve heard the term duplicate content before, but what does it actually mean? Duplicate content is copy on one page that is identical or very similar to copy on another page, whether on your own website or on someone else’s. In short, it’s the same content placed at different URLs.
As a business owner, should this concern you? If you’re trying to rank higher in Google, fixing duplicate content issues should be a priority. Search engines avoid showing the same content twice in their results, so when several URLs carry identical copy they have a hard time deciding which one to rank.
Why does duplicate content happen? Most of the time it comes from site technicalities such as session IDs, URL parameters, or pagination.
Causes of duplicate content
You may be surprised to learn that most website owners don’t deliberately copy content from another page. Most causes of duplicate content are technical. Here are some of them:
HTTP vs. HTTPS, WWW vs. Non-WWW
When your site is accessible both with and without WWW, it can cause duplicate content. Search engines treat the two versions as separate pages and won’t know which one to prioritize. The same goes for HTTP and HTTPS. If you haven’t migrated your HTTP site to a secure one, or have only moved recently, chances are search engines will be unsure which version should rank higher.
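To make the four variants concrete, here is a minimal Python sketch that maps each of them to a single preferred address. It assumes, purely for illustration, that the preferred version is HTTPS with www on the example domain used in this article; the names preferred_url and strip_www are hypothetical helpers, not part of any SEO tool.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption for illustration: HTTPS with www is the preferred version.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.yourbusiness.com"


def strip_www(host: str) -> str:
    """Drop a leading 'www.' so both host variants compare equal."""
    return host[4:] if host.startswith("www.") else host


def preferred_url(url: str) -> str:
    """Map http/https and www/non-www variants of a URL to one preferred form."""
    parts = urlsplit(url)
    if strip_www(parts.netloc.lower()) != strip_www(PREFERRED_HOST):
        return url  # belongs to another site, so leave it untouched
    return urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", parts.query, parts.fragment)
    )
```

All four spellings of the same article collapse to one URL, which is the behavior you would want from your server’s redirects. The sketch ignores ports and other edge cases.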
URL parameters
Some URL parameters don’t change the content of a page. For example, affiliate or e-commerce links add tracking parameters that change your URL. You might see one URL like http://www.yourbusiness.com/articleproduct/ and another like http://www.yourbusiness.com/articleproduct/?source=amzn. To search engines they are different URLs, yet they serve the same content.
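As a rough illustration of why tracking parameters create duplicates, the Python sketch below strips them from a URL so that every tracked variant maps back to the same address. The parameter list is an assumption made up for this example; your own site may use different names.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that only track visitors and never change the page.
TRACKING_PARAMS = {"source", "utm_source", "utm_medium", "utm_campaign"}


def strip_tracking(url: str) -> str:
    """Return the URL without tracking parameters, keeping every other parameter."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )
```

Parameters that do change the content, such as a color or size filter, are kept on purpose, because those URLs are not duplicates.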
Comment pagination
Do you allow comment pagination on your website? Some content management systems split your audience’s comments across pages, and each page gets its own URL that repeats the article. For example, http://www.yourbusiness.com/article01/comment-page-1, comment-page-2, comment-page-3, and so forth.
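A small Python sketch shows how those paginated comment URLs relate to the article itself. It assumes the comment-page-N pattern from the example above; other content management systems may build these URLs differently, and article_url is a hypothetical helper name.

```python
import re

# Matches a trailing "/comment-page-<number>" segment, with or without a final slash.
COMMENT_PAGE = re.compile(r"/comment-page-\d+/?$")


def article_url(url: str) -> str:
    """Map a paginated comment URL back to the article it belongs to."""
    return COMMENT_PAGE.sub("/", url)
```

The address this returns is the one a paginated comment page would normally name as its preferred version.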
Unique session IDs
Copied or scraped content
Many website owners work hard to keep their content fresh and original. Nonetheless, other websites sometimes copy your content without your permission. Copied content is also especially common on e-commerce sites: instead of writing their own product descriptions, owners simply copy and paste the same descriptions onto their sites.
Printer-friendly web pages
Printer-friendly pages are handy for readers, but they can work against your SEO: they present search engines with two versions of a page, at two URLs, that carry the same content.
How to fix duplicate content issues
Duplicate content can get your web pages filtered out of search results because search engines can’t tell which page to index. To fix duplicate content, here’s what you can do.
Specify a canonical URL
Adding a rel=canonical element to your web page tells search engines which version of the content you prefer, so that they crawl, index, and rank that one. Search engines generally treat the canonical URL as the true and correct version. On a WordPress site, you can set your canonical URL with Yoast SEO.
You can also visit your search engine’s webmaster tools to set the preferred URLs for crawlers to index.
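If you want to see what the canonical element looks like, or check which canonical URL a page declares, the following Python sketch may help. It uses only the standard library; canonical_tag and find_canonical are hypothetical helper names, and the markup it prints is the usual link element that goes in the head of a page.

```python
from html import escape
from html.parser import HTMLParser


def canonical_tag(url: str) -> str:
    """Build the link element that declares the preferred URL of a page."""
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


class CanonicalFinder(HTMLParser):
    """Remember the first rel=canonical link seen in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html: str):
    """Return the canonical URL a page declares, or None if it declares none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical
```

Running find_canonical over each duplicate URL of a page is a quick way to confirm they all point to the same preferred version.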
Run plagiarism tests
If your website has been copied or scraped, you can run several tests to find out which sites plagiarized your content. You can always try tools like Copyscape, Plagium, Plagiarism X, or Grammarly Premium.
You can even use Google to spot websites using your content. Enter a keyword, sentence, or phrase from your page, ideally wrapped in quotation marks for an exact match, and look for duplicates around the web.
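Once you have a suspect page, you can estimate how much of your copy it reuses. The Python sketch below compares two texts by their overlapping three-word sequences (a Jaccard similarity over word shingles). This is a generic technique chosen for illustration, not how Copyscape or the other tools above work internally, and the sample sentences are made up.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Split text into overlapping runs of `size` consecutive words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of two texts: 0.0 means no overlap, 1.0 means identical."""
    first, second = shingles(a, size), shingles(b, size)
    if not first or not second:
        return 0.0
    return len(first & second) / len(first | second)


original = "Our handmade leather wallet holds eight cards and folds flat in your pocket."
copied = "Our handmade leather wallet holds eight cards and folds flat in your bag."
unrelated = "Shipping takes three to five business days within the country."

print(round(similarity(original, copied), 2))
print(similarity(original, unrelated))
```

A score close to 1 suggests the text was lifted almost word for word, while a score near 0 suggests the pages only share a topic.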
Use 301 for redirects
If you’ve restructured your URLs, you can send visitors from the old address to the new one with a 301 redirect. It signals a permanent move from one URL to another, which can be beneficial for SEO.
Other types of redirects are 302 (temporarily moved) and meta refresh, where you wait a couple of seconds before you’re redirected to the new URL. However, neither signals a permanent move as clearly, so both can be harmful to SEO.
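To show what a 301 looks like in practice, here is a minimal Python sketch built on the standard library’s http.server. The paths in REDIRECTS are hypothetical, and a real site would normally configure redirects in its web server or CMS rather than in a script like this.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler

# Hypothetical mapping from retired paths to the pages that replaced them.
REDIRECTS = {
    "/old-article/": "/article01/",
    "/print/article01/": "/article01/",
}


def resolve(path: str):
    """Return (status, location): a 301 with its target, or 200 with no target."""
    target = REDIRECTS.get(path)
    if target is None:
        return HTTPStatus.OK, None
    return HTTPStatus.MOVED_PERMANENTLY, target


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer retired paths with a permanent redirect and serve everything else."""

    def do_GET(self):
        status, location = resolve(self.path)
        self.send_response(status)
        if location:
            # The Location header tells browsers and crawlers where the page now lives.
            self.send_header("Location", location)
            self.end_headers()
        else:
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.end_headers()
            self.wfile.write(b"page content\n")
```

Defining the handler does not start a server. Passing it to http.server.HTTPServer would let you watch the 301 response and its Location header in your browser’s developer tools.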
URL parameter handling in webmaster tools
If you’re having issues with www and non-www, or http and https, the URL settings in your webmaster tools can help tell the search engine which version of your URLs to crawl.
The process can be tedious, especially if you’ve submitted your site to several search engines like Bing, Yandex, and more. Whether you choose www or non-www, keep your URLs consistent everywhere you use them.
Use noindex for duplicate pages
When you publish duplicate content on purpose, you can add a noindex directive so that search engines still crawl the page but leave the URL out of their index. This lets search engines “understand” your content without indexing it, which means it won’t count against you. However, use the noindex option sparingly.
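A noindex directive usually takes the form of a robots meta tag in the head of the page. The Python sketch below shows that tag and a small check for whether a page carries it; is_noindexed is a hypothetical helper, and the sketch only looks at the meta tag, not at HTTP headers.

```python
from html.parser import HTMLParser

# The robots meta tag: keep the page out of the index but still follow its links.
NOINDEX_TAG = '<meta name="robots" content="noindex, follow">'


class RobotsMeta(HTMLParser):
    """Collect the directives listed in any robots meta tag on the page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def is_noindexed(html: str) -> bool:
    """Return True if the page asks search engines not to index it."""
    parser = RobotsMeta()
    parser.feed(html)
    return "noindex" in parser.directives
```

A check like this is handy for making sure noindex has not crept onto pages you do want in search results.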
Use Hreflang tag for multinational sites
If you’re running an international site, the hreflang tag is critical, especially for businesses with subdomains and ccTLDs. So, if you have www.yourbusiness.com, www.yourbusiness.au, and www.business.jn, hreflang tags in the head of each page tell search engines which version is meant for which audience, so the near-identical pages aren’t treated as duplicates.
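As a sketch of what those tags look like, the Python snippet below builds the hreflang link elements for two of the example domains. The language-region codes, the https scheme, and the choice of fallback page are assumptions made for illustration; use the codes that match your actual audiences.

```python
from html import escape

# Assumed language-region codes for two of the example domains.
ALTERNATES = {
    "en-us": "https://www.yourbusiness.com/",
    "en-au": "https://www.yourbusiness.au/",
}


def hreflang_tags(alternates: dict, default: str = None) -> list:
    """Build one link element per regional version, plus an optional fallback."""
    tags = [
        f'<link rel="alternate" hreflang="{escape(code, quote=True)}" '
        f'href="{escape(url, quote=True)}">'
        for code, url in sorted(alternates.items())
    ]
    if default:
        # x-default names the page to show when no listed version matches the visitor.
        tags.append(
            f'<link rel="alternate" hreflang="x-default" '
            f'href="{escape(default, quote=True)}">'
        )
    return tags


for tag in hreflang_tags(ALTERNATES, default="https://www.yourbusiness.com/"):
    print(tag)
```

Every regional version should carry the same full set of tags, including one that points to itself, so that the versions reference each other.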