Duplicate content vs Duplicate URL

[SEO Case Study] How Duplicate Content can affect Website Rankings & User Experience

You might have read many times that duplicate content may lead to a penalty, but I will show you how it affects page authority and the preventive measures for duplicate URLs. This article is a case study with examples from some industry-leading websites. Let me break it down from the start.

Duplicate Content vs Duplicate URL

Duplicate Content

When the same content appears more than once on the internet, whether on the same website or on different websites, it is known as duplicate content.

Duplicate URL

When a single webpage is accessible or inaccessible through multiple URLs within the same website, it is known as a duplicate URL. I say accessible or inaccessible because the webpage behind a given URL may be live or left alone as a 404.

Difference between Duplicate Content and Duplicate URL

They might look the same, but:

  • Duplicate content can happen across websites, while duplicate URLs happen within a single website and the URLs are almost identical.
  • Duplicate content contains all or at least some of the content that is already available on the internet, while duplicate URLs serve the same content, in an accessible or inaccessible state, within one website.
  • Duplicate content can lead to penalization by Google, while duplicate URLs won’t cause penalization in most cases but have more effect on ranking shifts and on the dilution of page authority.

How will Duplicate URLs affect the ranking?

If you have multiple (duplicate) URLs, Google will choose one URL as the canonical version and crawl that; all other URLs will be considered duplicates and crawled less often.

Duplicate URLs can dilute page authority.

Example:
www.example.com
example.com
https://www.example.com
https://example.com
http://www.example.com
http://example.com

The URLs listed above all point to the same webpage content. If any of them is not canonicalized or redirected to one single version, it is a duplicate URL.

All the above URLs are treated as different pages. If 10 backlinks link to www.example.com and 12 backlinks link to example.com, the link juice is passed to two different pages and not to one.

Even though both pages provide the same information to users, search engines will treat them as different webpages if they aren’t canonicalized.

This may also lead to content duplication, but most sites won’t be flagged for content duplication within a website. Search engines are now smart enough to detect the duplicates and rank the most authoritative version, unless the intent of the duplication is deceptive or manipulative.
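To make the split concrete, here is a minimal sketch in Python. The `page_key` helper and the backlink counts are my own illustration (the counts are the 10 and 12 from the example above), and the normalization rule is a simplification, not how any search engine actually works.

```python
from collections import Counter
from urllib.parse import urlsplit


def page_key(url: str) -> str:
    """Collapse the scheme and a leading 'www.' so variants of one page share a key."""
    if "://" not in url:
        url = "http://" + url  # assume a scheme so urlsplit can find the host
    parts = urlsplit(url)
    host = parts.netloc.lower().removeprefix("www.")
    return host + parts.path.rstrip("/")


# Hypothetical backlink counts per URL variant.
backlinks = {"http://www.example.com": 10, "http://example.com": 12}

# Without canonicalization, each variant keeps its own share.
split_view = Counter(backlinks)

# With every variant consolidated into one version, the counts add up.
merged_view = Counter()
for url, count in backlinks.items():
    merged_view[page_key(url)] += count

print(split_view)   # two separate entries: 10 and 12
print(merged_view)  # one entry holding all 22
```

The point is only the arithmetic: two pages with 10 and 12 links each, versus one page with 22.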

Tools Used:

I have used very simple and free tools to analyze the effect of duplicate URLs and to find them easily.

  1. Monitor Backlinks
  2. Google Sheets
  3. Merkle Technical SEO tool
  4. Uptime Robot

Why should you worry about Duplicate URLs?

As we already know, duplicate URLs are bad for user experience and might divide the link juice. Let me show you the example of Wappalyzer.

Case 1 – Wappalyzer

Have you ever used wappalyzer.com? Wappalyzer is a cross-platform utility that uncovers the technologies used on websites. It detects content management systems, e-commerce platforms, etc.

Duplicate URL: https://wappalyzer.com

Error: Invalid SSL

You might receive “This site can’t be reached” or “Connection is not protected”, unless they have resolved the problem by the time you are reading this article. So I’ll add a screenshot of the error.

[Screenshot: URL error]

But normally you can still reach the webpage by searching on Google because, per Google’s statement, “it will find the authoritative version and index it”.

[Screenshot: URL inactivity - Wappalyzer]

Check out the working version – www.wappalyzer.com (It works, doesn’t it?)

URLs are distinct when they differ in scheme, and the same applies to the subdomain. So here https://wappalyzer.com acts as a separate version, one that is unreachable and unmaintained.

Learn about the URL parts.

Coming back to why you need to worry about duplicate URLs.

Let’s take a look at their backlink profile just for the non-working URL.
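As a rough illustration of those parts, here is a small Python sketch using the standard library. The `url_parts` helper is my own, and its way of separating the subdomain from the domain is a naive assumption that only works for simple domains like the ones in this article.

```python
from urllib.parse import urlsplit


def url_parts(url: str) -> dict:
    """Split a URL into the parts that make two addresses distinct."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    labels = host.split(".")
    # Naive assumption: the last two labels are the registered domain.
    # This breaks for suffixes such as .co.uk; it is only for illustration.
    return {
        "scheme": parts.scheme,
        "subdomain": ".".join(labels[:-2]),
        "domain": ".".join(labels[-2:]),
        "path": parts.path,
    }


for url in ("https://wappalyzer.com", "https://www.wappalyzer.com"):
    print(url_parts(url))
```

The two addresses share the scheme and the domain but differ in the subdomain (empty versus `www`), which is enough to make them two separate URLs.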

Backlink stats from Monitorbacklinks

That’s 2.4K backlinks from 615 domains. The list includes 13 backlinks from Drupal, 1808+ from GitHub and more, and that is for just this one URL version.

Whenever a user clicks one of these backlinks, they are directed to the “unreachable” page, so there might be a huge dropout of users who might otherwise have considered the premium service. This is on top of the loss of link juice. By fixing the webpage they can improve the user experience, increase traffic and might even see an increase in conversions.

It’s the same for search engine bots: when a bot follows a backlink to the target URL, it records the same error. So the page authority is not passed in this case. It doesn’t matter how many backlinks point to this dead page.

Case 2 – BigCommerce

BigCommerce is one of the leading e-commerce solution providers.

Duplicate URL – https://bigcommerce.com/

Error: No status code/Not reachable – Blank page.

The URL doesn’t work and doesn’t throw any error, just a blank page. Checking it with the technical SEO tool or inspecting the HTTP headers gives the same result.

Let’s take a look at their backlink profile

[Screenshot: backlinks of the BigCommerce duplicate URL, backlink stats from Monitorbacklinks]

I have been monitoring this for quite a long period. Previously there were 200k+ backlinks pointing to this URL, but I think the webmasters of the linking websites have since changed many of them to the working one.

[Screenshot: duplicate URL inactivity - BigCommerce]

But anyway, the duplicate URL still receives link juice, and many websites still point to it, including Neil Patel’s blog. Even if they didn’t use the URL for marketing or had no backlinks to it, fixing it would be a good choice because their business is based online.

And the sad part is that one of their premium themes (Peak by Pixel Union), which costs $195, has the unreachable duplicate URL in the footer.

Case 3 – Teem

This is quite different from the other websites mentioned above because they don’t seem to have a duplicate URL but an error in the canonical URL.

URL – https://www.teem.com/

Error – Canonical URL

If you dig into their source code, or use an extension such as MozBar, you can spot the canonical URL of the webpage. I found it using my simple solution, which is attached at the end of the article.

[Screenshot: canonical URL error. Find the canonical tag yourself by looking at the screenshot.]

They may have confused rel="publisher" with rel="canonical", but search bots will read the tag as declaring the Google+ profile to be the canonical version of the page.
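If you want to check this kind of mistake without a browser extension, here is a minimal sketch using Python's built-in HTML parser. The markup, the profile address and the page URL below are made up for illustration; they are not copied from Teem's page.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonicals(html: str) -> list:
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


def canonical_mismatches(html: str, page_url: str) -> list:
    """Return canonical hrefs that do not match the page's own URL."""
    own = page_url.rstrip("/")
    return [href for href in find_canonicals(html) if href.rstrip("/") != own]


# Hypothetical markup where the canonical tag carries the publisher's address.
sample = """
<head>
  <link rel="publisher" href="https://plus.google.com/+ExampleProfile">
  <link rel="canonical" href="https://plus.google.com/+ExampleProfile">
</head>
"""

print(canonical_mismatches(sample, "https://www.example.com/"))
```

An empty list means the canonical tag points at the page itself; anything else deserves a second look.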

How to fix it?

Link juice is counted separately for each distinct URL, even when the URLs serve the same page. You have 2 options to fix it.

Canonical vs Redirect – You can canonicalize the other URLs to one single version, but before canonicalization you must fix the webpage so that it shows no error to end users and bots. If you have duplicate URLs within your website, I suggest you 301 redirect them to one single version so you won’t have multiple versions to monitor all the time.

Prefer a canonical tag when the issue is duplicate content.

Now you’re worried about duplicate URLs, right?
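To show what "301 redirect them to one single version" means in practice, here is a small Python sketch of the decision a server would make. The preferred scheme and host are assumptions I picked for the example, and `redirect_target` is a hypothetical helper, not part of any web server.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Assumption for this sketch: the one version we want everyone to land on.
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "www.example.com"


def redirect_target(url: str) -> Optional[str]:
    """Return where a 301 should send this URL, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    bare = PREFERRED_HOST.removeprefix("www.")
    if host not in (bare, "www." + bare):
        return None  # not one of our own variants, leave it alone
    target = urlunsplit(
        (PREFERRED_SCHEME, PREFERRED_HOST, parts.path or "/", parts.query, "")
    )
    return None if target == url else target


for variant in ("http://example.com", "http://www.example.com",
                "https://example.com", "https://www.example.com/"):
    print(variant, "->", redirect_target(variant))
```

Every variant except the preferred one gets a single destination, so there is only one live version left to monitor.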

How to find Duplicate URLs?

You can simply check the status of each URL by typing it into the browser, but I have created a simple worksheet with the URL parts separated that helps you find duplicate URLs much more easily. Please feel free to make a copy of it.

[Screenshot: Duplicate URL checker - DigitalGasm]

Get a copy from here – Duplicate URL checker

The sheet is built around the URL variations of a webpage that most often go wrong. Just enter the domain name in cell C2, then drag down to the last cell. The sheet will automatically give you the Status Code, Redirected URL and Canonical URL for each variation. You can also change the subdomain or filename to match the website.

It is good if only two URL variations return a “200” status code: one with a trailing slash (/) at the end and one without.

Let me know in the comment section if you find any other websites with this problem; I will add them here along with your LinkedIn profile. And subscribe to my newsletter to stay posted.
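If you prefer a script over the sheet, the same check can be sketched in Python. This is not how the worksheet is implemented; the function names are my own, the variation list is an assumption (scheme, www and trailing slash), and `looks_clean` applies the rule of thumb above by treating the slash and no-slash forms as one address.

```python
import urllib.error
import urllib.request
from itertools import product
from typing import Callable, Optional


def url_variants(domain: str) -> list:
    """Build the scheme / www / trailing-slash variations of a domain."""
    bare = domain.removeprefix("www.")
    return [
        f"{scheme}://{host}{slash}"
        for scheme, host, slash in product(
            ("http", "https"), (bare, "www." + bare), ("", "/")
        )
    ]


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # do not follow; the 3xx then surfaces as an HTTPError


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status code of url, or None if it is unreachable."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # 3xx, 4xx and 5xx responses end up here
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, invalid SSL, timeout, ...


def live_variants(domain: str,
                  fetch: Callable[[str], Optional[int]] = fetch_status) -> list:
    """List the variations that answer 200 themselves instead of redirecting."""
    return [url for url in url_variants(domain) if fetch(url) == 200]


def looks_clean(live: list) -> bool:
    """True when at most one address (with and without '/') answers 200."""
    return len({url.rstrip("/") for url in live}) <= 1


# Offline demo with made-up statuses; drop the fetch argument to hit the network.
fake_statuses = {
    "https://www.example.com": 200,
    "https://www.example.com/": 200,
    "https://example.com": 200,  # a live duplicate that should be redirecting
}
live = live_variants("example.com", fetch=lambda url: fake_statuses.get(url, 301))
print(live, "clean" if looks_clean(live) else "duplicate URL found")
```

Passing the fetcher in as an argument keeps the demo offline; calling `live_variants("yourdomain.com")` with the default fetcher makes real requests.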

References:

https://support.google.com/webmasters/answer/139066?hl=en

https://eikhart.com/blog/google-sheets-http-status-codes

https://www.contentkingapp.com/academy/http-status-codes/

