What is duplicate content and how to detect it?

In the web and SEO world, duplicate content is text that is identical, or very similar, to text found at a different URL.

It doesn’t matter whether the two URLs belong to the same website or to two or more different websites: it is considered duplicate content either way.

And this, my friend, is something that can harm your project if you don’t keep it under control. Even if you think you are not generating copied content, other websites may be doing so, either by being “inspired” by your content or by blatantly plagiarizing it.

So let’s look at some ways you can avoid this type of content. But you’re probably wondering what actually happens when content is duplicated, so let’s start there 😉

How can duplicate content harm you?

Apart from the obvious, which is that someone copying your content and publishing it on another website is full-fledged plagiarism and copyright infringement, duplicate content can hurt your SEO.

If Google detects duplicate content on your website, whether within the site itself or because the content of one or more URLs is similar to that of other websites, two things can happen: either it does not show that content in the search results, or it penalizes the website if it judges the duplication to be a habitual practice.

When search engines detect more than one URL with the same (or similar) content, they may not know which one to show, so they may choose to show none of them, or only the one they indexed first.

How to avoid duplicate content

There are several ways to try to avoid duplicate content. The first is obvious: do not copy the texts of other websites that you find out there.

But as obvious as it may be, you can’t imagine the number of people who do it, and not just newcomers to the internet, but people who supposedly make a living selling web design services, online marketing, etc. Unbelievable, yes.

Another way to avoid it is to be careful with the links that point to the different URLs of your website.

If your website is built with WordPress, by default its URLs load with a slash at the very end (called a trailing slash).

Example: //wordpressjournal.net/

If you click on the link, you will see that it opens with the trailing slash (/). Now, copy and paste this URL in the browser bar: https://lawebdetuvida.com/desenador-web-wordpress

As you can see, it does not have the slash at the end, but when the page is opened, it automatically redirects to the same URL with the slash at the end.

This is fine, but there are cases in which the URL loads both ways, with and without the slash, and Google could consider this duplicate content.

In that case, you will have to create a redirect so that the URLs load either with the slash or without it, but not both.
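To make the idea concrete, here is a minimal Python sketch of that rule: pick one form and work out where a given URL should redirect. It is only an illustration, not what WordPress does internally. The function names `canonical_url` and `redirect_target` and the example.com addresses are made up for this example, and on a real site the redirect itself would normally be handled by your server or a plugin.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit


def canonical_url(url: str, trailing_slash: bool = True) -> str:
    """Return the single form of `url` that the site should serve.

    Hypothetical helper: `trailing_slash` selects which of the two
    forms (with or without the final slash) is the preferred one.
    """
    parts = urlsplit(url)
    path = parts.path or "/"

    # Leave file-like paths (e.g. /sitemap.xml) untouched.
    last_segment = path.rsplit("/", 1)[-1]
    if "." in last_segment:
        return urlunsplit(parts._replace(path=path))

    if trailing_slash and not path.endswith("/"):
        path += "/"
    elif not trailing_slash and path != "/":
        # Strip the final slash, but never reduce the root path to nothing.
        path = path.rstrip("/") or "/"

    # The query string and fragment are kept exactly as they were.
    return urlunsplit(parts._replace(path=path))


def redirect_target(url: str, trailing_slash: bool = True) -> Optional[str]:
    """Return the URL to redirect to, or None if `url` is already canonical."""
    target = canonical_url(url, trailing_slash)
    return None if target == url else target
```

With the default settings, `redirect_target("https://example.com/web-design")` returns the version with the slash, and it returns `None` for a URL that is already in the chosen form, so no redirect is needed.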

You also have to be careful when creating internal links. If your URLs have the slash at the end, create the link with it, and vice versa.

Another similar problem could be that your website loads with both http:// and https:// after you have installed an SSL certificate, which generates a duplicate of each URL.

Lastly, and more difficult to control, you need to make sure that other websites do not publish content taken from your articles.

There are many “clever” people who, instead of writing their own content, copy the content of others and publish it as is on their websites. Some even modify it by an incredible 1% to try to make it “not noticeable” (both have been done to me).

As I say, it is more difficult to control, because you can’t track every website in existence to see whether one of them is copying your content, but there are some ways to detect it, which I will tell you about below.

Tools to detect duplicate content

There are various tools and ways to try to detect duplicate content, either within your website or outside of it. I’m going to show some of them.

SEMrush

Among the tools that SEMrush includes is an SEO audit, which tells you whether duplicate content has been detected within your website.

Copyscape

Copyscape is a well-known tool that, given a URL, searches the internet for texts that match the one at that URL and shows you a list of websites that have the same or very similar content.

In fact, at the time of writing this, I put the URL of my web design service page into Copyscape and found a couple of websites that have lifted my text and copied it practically as is onto their service pages.

“Pofresionality” above all, yes sir.

Plagium

Plagium is another popular tool for finding duplicate content on the internet, but instead of entering a URL as with the previous tool, you paste the piece of text you want to search for, and the tool crawls the internet to find the same or similar text.
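If you are curious about how “the same or similar” can be measured, here is a rough Python sketch of one common approach: split each text into overlapping groups of three words and see how many of those groups the two texts share. I don’t know how Copyscape or Plagium work internally, so take this as an illustration under my own assumptions, not as their method. The function names `shingles` and `similarity` are invented for the example.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Split text into overlapping word sequences ("shingles") of `size` words."""
    # Lowercase and keep only word characters, so case and punctuation are ignored.
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Very short texts become a single shingle (or none, if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(a: str, b: str, size: int = 3) -> float:
    """Jaccard similarity of the two shingle sets.

    Returns a value from 0.0 (nothing in common) to 1.0 (same shingles).
    """
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa and not sb:
        return 1.0  # two empty texts are treated as identical
    return len(sa & sb) / len(sa | sb)
```

A long text copied with only a few words changed still scores close to 1.0, which is why the “1% modification” trick mentioned earlier does not really hide anything.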

 
