Thin content was one of the first SEO issues Google targeted with its Panda algorithm update in 2011. That update rocked the entire industry and kick-started the search giant’s war against low-quality content.
It also made life increasingly difficult for black hat SEOs trying to game the SERPs. However, there are plenty of genuine, technical SEO reasons why you might end up with thin content on your website. In this article, we explain exactly what thin content is, how to find it on your site and what you need to do about it.
Google describes thin content as having “little or no added value”. This is the description you’ll see if you’re unlucky enough to get a manual action warning in Google Search Console, informing you that you’ve been penalised for having thin content on your site.
You definitely don’t want one of those.
The question at this point is: what kind of content does Google consider to have “little or no added value”?
Back in the early Panda days, Google was mostly targeting deceptive uses of thin content – for example:
In this case, we are looking at low-quality content, often created by basic machine concatenation, and offering limited, if any, value. For example, grabbing a news story in Spanish and then running it through Google Translate before adding it to your site – a big no-no.
We are starting to see examples of machines (or ‘robots’) writing high-value content, and this will become more prevalent as AI and machine learning continue to improve. This doesn’t count as thin content, but you would still want a human editor to review it before publishing.
Affiliate websites offering useful, comprehensive purchase advice have nothing to fear from Google. However, pages filled with affiliate links that offer no useful or relevant information for the end user are prime targets for getting hit by a search penalty.
If you’re in the affiliate game, stick to the following guidelines:
If you systematically add content to your website from external sources, you’re also at risk of a thin content penalty. There are a number of ways in which content is copied (or scraped) from other sources, a few of the more common ones being:
Doorway pages are a means of spamming the search engine results pages (SERPs) with very thin content that targets a specific term, or a close group of terms, with the purpose of funnelling that traffic to another website or destination.
This creates a poor search experience and adds unwanted steps between the user and their desired end result. Often, doorway pages leave the user on a lower-quality, less relevant page than they needed, forcing them to search again to find the content they were after.
Essentially, if your content is copied from anywhere else, generated by software or you’re creating pages with little or no content, you could be in trouble. Even if you’re not trying to be deceptive (for example, reposting relevant news stories), you have to question why Google would choose to rank your page when it’s simply repeating content that’s already available – it has nothing new or valuable to offer.
As Google explains over at Search Console Help:
“One of the most important steps in improving your site’s ranking in Google search results is to ensure that it contains plenty of rich information that includes relevant keywords, used appropriately, that indicate the subject matter of your content.
“However, some webmasters attempt to improve their pages’ ranking and attract visitors by creating pages with many words but little or no authentic content. Google will take action against domains that try to rank more highly by just showing scraped or other cookie-cutter pages that don’t add substantial value to users.”
It all comes down to adding substantial value to the end user because this is what Google aims to deliver as a search engine.
For more info on thin content, take a look at this video from Google’s former head of web spam, Matt Cutts:
It’s not a particularly recent video but everything Matt Cutts says is still relevant today.
While the most publicised danger of thin content is getting hit by a Google search penalty, your problems run much deeper than this if you’ve got too much of it. If Google’s algorithms can tell you’re using thin content deceptively, then you can bet the majority of users who visit your site can see it as soon as they land on the page.
Whatever your objectives are with the page, you’re not going to convince many people to take action this way. You’ll struggle to keep users on the page, encourage them to engage with your brand or inspire them to convert.
Essentially, this is the real danger of thin content: your marketing objectives are going to fall flat.
Now, in terms of the Google Search penalties, these can be pretty devastating and it helps to understand how Google’s Panda algorithm works.
The Google Panda update was first released in 2011 with the purpose of de-valuing low-value and thin websites, to stop them from appearing so prominently in SERPs.
The other, lesser communicated, side of this update was the additional ranking gains (tied to content quality signals) rewarding websites creating high-quality content.
Google Panda updates can impact (remember, this ‘impact’ can be positive or negative) a single page, a whole topic or theme, multiple themes, or entire websites.
The Panda filter applies a number of perceived content quality criteria as well as questions that the Google Quality Raters would be asking themselves when manually viewing content – things like:
Related reading: The SEO’s guide to Google quality raters
The above is just a starting point for protecting your website and content from Panda.
It is important to get a second opinion on your content. Be objective and honest with yourself and your team about the quality of what is being produced, and how it needs to improve.
While the penalties for having too much thin content can be severe, there are quite a lot of scenarios where you’re naturally going to end up with content that could fall into this category.
If you have a search function on your website, the results pages are going to offer very little or no original content. This can’t be helped, of course. The purpose of a search results page is to show snippets of other pages across your site and help users choose the most relevant option.
Solution: Prevent Google from crawling results pages by adding a Disallow rule for them in your robots.txt file.
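As a sketch, assuming your internal search results live under a `/search/` path (adjust to your own URL structure), the robots.txt rule would look like:

```
User-agent: *
Disallow: /search/
```

One caveat: Disallow blocks crawling rather than guaranteeing removal from the index, so a blocked page can still appear in results if other sites link to it.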
In many cases, it’s perfectly reasonable to have a photo or video gallery on your website. You might be a wedding photographer, a marquee hire company or a business with a bunch of video case studies to show off.
If the purpose of this page is to allow visitors to browse your photos or videos and choose which ones they want to view, this causes some thin content issues. You probably don’t want a load of text getting in the way on the gallery page itself and your problems get worse if each image or video has its own dedicated page.
Solution: This really depends on how you structure your gallery. You might choose to create content for your gallery page and no-index the individual image/video pages, for example. Or you might take the opposite approach and create unique content for each image/video and no-index the gallery page.
Alternatively, you could create a carousel that displays all images/videos on the same URL – it all depends on what you want to rank for and the kind of content you’re planning to create.
Shopping cart pages aren’t there to provide users with valuable content; they’re designed to help people manage orders and complete purchases. Technically, we’re in thin content territory here but the fix is pretty simple.
Solution: Once again, keep Google from indexing these pages – this time by adding a noindex robots meta tag to them.
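Note that robots.txt can only block crawling; to keep a page that does get crawled out of the index, the standard approach is a robots meta tag placed in the page’s head:

```html
<head>
  <!-- Tells search engines not to include this page in their index -->
  <meta name="robots" content="noindex">
</head>
```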
Duplicate pages are a natural part of managing a website. Moving over to HTTPS from HTTP creates duplicates, as does having www and non-www versions of your domain. Managing multilingual websites and recreating pages for multiple locations can also result in duplicates.
Technically, duplicate content isn’t quite the same thing as thin content but the two do overlap in certain cases.
Solution: Mark the page version you want to rank with canonical tags, use 301 redirects if you’re sending users to a different URL and use hreflang tags for international languages/locations.
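As an illustration (example.com and the locales here are placeholders, not a prescribed setup), the canonical and hreflang tags for a page with UK and US English variants might look like:

```html
<!-- The version you want Google to treat as the primary page -->
<link rel="canonical" href="https://www.example.com/page/">
<!-- Language/region alternates for international targeting -->
<link rel="alternate" hreflang="en-gb" href="https://www.example.com/page/">
<link rel="alternate" hreflang="en-us" href="https://www.example.com/us/page/">
```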
In many cases, thin content isn’t detrimental to the user experience at all. In fact, it’s sometimes better to forget about content and simply deliver the functionality users need – eg: shopping carts.
Luckily, keeping these pages safe from search penalties is relatively simple. By no-indexing pages, telling Google which version to index (canonical tags) and/or using 301 redirects to send users to the right place, non-deceptive thin content shouldn’t be a problem.
This is one of the most common scenarios where thin and/or duplicate content occurs on a website. This is especially true if you’re selling multiple versions of the same or very similar product.
Naturally, brands try to avoid having duplicate content across these pages but it’s difficult to say the same thing in a hundred different ways.
It becomes a battle of thin content vs duplicate content and this causes a lot of confusion for website owners, SEOs and marketers in general.
The truth is, duplicate content is the lesser of two evils here and it’s better to provide users with comprehensive product details – even if they’re the same or similar – than publishing pages with very little (albeit unique) content.
Here’s what Google’s Andrey Lipattsev had to say about duplicate product pages during a Q&A on duplicate content with fellow Googler John Mueller.
“And even, that shouldn’t be the first thing people think about. It shouldn’t be the thing people think about at all. You should think, I have plenty of competition in my space, what am I going to do? And changing a couple of words is not going to be your defining criteria to go on. You know, the thing that makes or breaks a business.”
More to the point, there is no search penalty for duplicate content but there is for thin content.
So, when it comes to product pages, don’t worry too much about duplicate content for very similar products or variations of the same product. Instead, focus on optimising for the best experience and giving Google any clues you can about which page to prioritise in terms of indexing.
Here are some tips:
The key takeaway from the Q&A on duplicate content is that when pages are similar (or the same), Google is looking for a way to differentiate between them and product descriptions are just one of the hundreds of factors it looks at.
There are a number of ways to discover thin content – low word counts, duplication and limited value – and a few of the more common approaches are covered below.
Using Copyscape (and other free tools), you can crawl the web to look for any content that has been copied from your domain, as well as any content added to your own site over the years that was copied (in part or in full) from external sites.
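Alongside tools like Copyscape, a quick way to compare two passages for near-duplication is Python’s standard-library difflib (a rough sketch – the sample strings here are illustrative, and real plagiarism tools use far more sophisticated matching):

```python
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-1 ratio of how similar two passages of text are."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

original = "Thin content offers little or no added value to the user."
scraped = "Thin content offers little or no added value for users."

score = similarity(original, scraped)  # well above 0.8 for near-duplicates
```

A ratio close to 1.0 suggests the passages are near-identical; low scores mean the texts diverge substantially.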
You can also use Google search operators to manually check Google for instances of content copying/scraping or duplication.
Here’s an example of what you need to do:
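As a generic illustration, one common approach is to search an exact sentence from your page in quotes and exclude your own domain with the -site: operator (yourdomain.com is a placeholder):

```
"an exact sentence lifted from your page" -site:yourdomain.com
```

Any results returned are pages outside your site carrying that exact phrase.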
Here’s an example of the above in action – in this case, checking for any duplication of content from a post I created for Search Engine Journal:
As you can see, the first site appearing is the originator website, and as this content is opinion-driven, it is intended to be distributed, shared socially and used on other websites.
An important aspect of this is the purpose of the content, whether it’s to drive traffic back to the main website, encourage shares or something else.
I’ve been using our machine learning software Apollo Insights for nearly ten years. One of the ways in which I use the data is to locate pages that are not contributing towards total site success.
You can see this in action below (the ‘Page Activity’ widget):
Another metric I use Apollo Insights for is locating content with a limited word count.
Although more words doesn’t always mean better quality content, in most cases a page with very few words is unlikely to be providing the depth of user and search value needed to deliver an optimum search experience.
You can see this below using a deep data grid – in this case I am looking at depth of content based on expected content structural elements, things like presence of multiple levels of header tags, and checking that the page is active and real:
Remaining with Apollo, ‘Auditor’ tells me how many pages have fewer words on them than I would expect from a high-quality website page. I can also look at the bigger picture and combine this knowledge with items like: external linking, framed content, pages orphaned off from the main website and much more.
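A simple word-count audit like the one described above can be sketched in Python. To be clear, this is an illustrative assumption, not Apollo’s implementation – the function names, the 300-word threshold and the sample pages are all hypothetical:

```python
import re

def word_count(html):
    """Count words after stripping HTML tags (a rough proxy for page copy)."""
    plain = re.sub(r"<[^>]+>", " ", html)
    return len(re.findall(r"\b\w+\b", plain))

def flag_thin_pages(pages, threshold=300):
    """Return URLs whose visible copy falls below the word-count threshold.
    A real audit would also weigh structure (headings, media) and intent."""
    return [url for url, html in pages.items() if word_count(html) < threshold]

# Hypothetical crawl output: URL -> raw HTML body
pages = {
    "/guide": "<h1>Complete guide</h1>" + "<p>word </p>" * 400,
    "/tag/old": "<p>Just a few words here.</p>",
}
print(flag_thin_pages(pages))  # -> ['/tag/old']
```

Pages the audit flags aren’t automatically bad – they’re candidates for a human review against the quality questions discussed earlier.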
The first stage in fixing thin content is understanding what high-quality and value-enhancing content looks like in the first place. The example below is from Think With Google: ‘The Customer Journey to Online Purchase’.
Some of the key points which flag this as high quality for me include:
Using external comparisons is a great way to set a minimum benchmark for your own content quality. The goal is to create content on your website that is far better than any other example available online.
Once you identify what ‘good’ looks like in your niche, you want to move towards creating ‘great’ content. At this stage, you need to find the content that doesn’t work at present (see previous section on ‘finding thin content’) and boost the content so that it can contribute more towards total site success, as well as its own standalone value.
“You will also need to find new opportunities for effective content creation. Don’t limit your content value by re-purposing alone, there is always an opportunity to create something amazing with digital content.”
Other tactics for creating new quality content include:
If you would like to chat about thin content, how we can help identify and fix it, or simply want to make your existing content work harder for you, then contact us at our London or Portsmouth agency offices.
Lee has been working in the online arena, leading digital departments since the early 2000s, and oversees all our delivery services at Vertical Leap, having joined back in 2010. Lee joined our company Operations Team in May 2019.

Before working at Vertical Leap, Lee completed a degree in Business Management & Communications at Winchester University, headed up the online development and direct marketing department for an international financial services company for ~7 years, and set up and ran a limited company providing website design, development and digital marketing solutions. Lee had his first solely authored industry book (Tactical SEO) published in 2016, with two further industry books published in 2019, and can regularly be seen contributing as an expert to industry websites including State of Digital, Search Engine Journal, The Drum and many others.

Lee has a passion for management in the digital industry and loves to see the progression of others through personal learning, training and development. Outside the office he looks to help others while challenging himself, having skydived, bungee jumped and abseiled (despite a fear of heights), with many more fundraising and voluntary events completed and on the horizon. As a husband and dad, Lee loves to spend time with his family and friends. His hobbies include exercising, trying new experiences, eating out and playing countless team sports, as well as watching films (gangster movies in particular – “forget about it”).