<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Technical SEO Insights - Actionable Articles and In-Depth Research</title>
	<atom:link href="https://vladsand.com/category/technical-seo/feed/" rel="self" type="application/rss+xml" />
	<link>https://vladsand.com/category/technical-seo/</link>
	<description>SEO Expert, Author and Lecturer</description>
	<lastBuildDate>Sun, 19 Oct 2025 02:17:23 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://vladsand.com/wp-content/uploads/2025/01/favicon-150x150.png</url>
	<title>Technical SEO Insights - Actionable Articles and In-Depth Research</title>
	<link>https://vladsand.com/category/technical-seo/</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Is Your Server Blocking AI Bots and LLM Crawlers?</title>
		<link>https://vladsand.com/is-your-server-blocking-ai-bots-and-llm-crawlers/</link>
		
		<dc:creator><![CDATA[Vlad Sand]]></dc:creator>
		<pubDate>Wed, 08 Oct 2025 00:13:54 +0000</pubDate>
				<category><![CDATA[AI SEO]]></category>
		<category><![CDATA[Technical SEO]]></category>
		<guid isPermaLink="false">https://vladsand.com/?p=1602</guid>

					<description><![CDATA[<p>Everyone has been talking lately about AI SEO, GEO, and LLMs, and how to rank, and so on. But one very important fact is often...</p>
<p>The post <a href="https://vladsand.com/is-your-server-blocking-ai-bots-and-llm-crawlers/">Is Your Server Blocking AI Bots and LLM Crawlers?</a> appeared first on <a href="https://vladsand.com">VladSand</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Everyone has been talking lately about AI SEO, GEO, and LLMs, and how to rank, and so on.</p>



<p>But one very important fact is often overlooked: the biggest issue right now for websites, when it comes to indexing and being processed by LLMs (Large Language Models), is not about optimization. The real problem is that <strong>LLMs simply cannot “see” most websites</strong>.</p>



<h2 class="wp-block-heading"><strong>Why AI Crawlers Can’t Access Many Websites</strong></h2>



<p>This happens because, while the SEO world is keeping up with AI and LLM trends, some web hosting providers have unfortunately remained stuck in the “Middle Ages” when it comes to AI.</p>



<p>Even at the time of writing this article, they are still blocking AI bots the same way they block aggressive, invasive crawlers such as MJ12bot, SemrushBot, and LinkpadBot (the typical bots used by SEO and online marketing tools).</p>



<h2 class="wp-block-heading"><strong>How to Check if Your Server is Blocking LLM and AI Bots</strong></h2>



<p>First, we need to identify which LLM bots are relevant at the moment.</p>



<p>Among the most relevant bots currently are:</p>



<ul class="wp-block-list">
<li><strong>OpenAI bots</strong> (GPTBot, ChatGPT-User &amp; OAI-SearchBot)<br></li>



<li><strong>Perplexity</strong> (PerplexityBot, Perplexity-User)<br></li>



<li><strong>Anthropic</strong> (ClaudeBot, Claude-User, Claude-SearchBot)<br></li>
</ul>



<h3 class="wp-block-heading"><strong>Practical Methods to Test Whether Bots Are Being Blocked</strong></h3>



<h4 class="wp-block-heading"><strong>Method 1: Using Screaming Frog to Test AI Bot Access</strong></h4>



<p>This is the simplest method. You just need a Screaming Frog license, and in <strong>Configuration &gt; User Agent</strong>, you can select the bot you want to check to see whether your site (or server) is blocking it:</p>



<figure class="wp-block-image size-large"><img fetchpriority="high" decoding="async" width="1024" height="646" src="https://vladsand.com/wp-content/uploads/2025/10/configuration-user-agent-gptbot.jpg-1024x646.webp" alt="configuration user agent: GPTBot in ScreamingFrog" class="wp-image-1606" srcset="https://vladsand.com/wp-content/uploads/2025/10/configuration-user-agent-gptbot.jpg-1024x646.webp 1024w, https://vladsand.com/wp-content/uploads/2025/10/configuration-user-agent-gptbot.jpg-300x189.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/configuration-user-agent-gptbot.jpg-768x485.webp 768w, https://vladsand.com/wp-content/uploads/2025/10/configuration-user-agent-gptbot.jpg.webp 1306w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>If you receive a 406 error</strong>, it means <strong>the server is blocking GPTBot</strong> from crawling your site.</p>



<figure class="wp-block-image size-full"><img decoding="async" width="834" height="134" src="https://vladsand.com/wp-content/uploads/2025/10/406-error-gpt-bot.jpg.webp" alt="406 Error in Screaming Frog for GPTBot" class="wp-image-1608" srcset="https://vladsand.com/wp-content/uploads/2025/10/406-error-gpt-bot.jpg.webp 834w, https://vladsand.com/wp-content/uploads/2025/10/406-error-gpt-bot.jpg-300x48.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/406-error-gpt-bot.jpg-768x123.webp 768w" sizes="(max-width: 834px) 100vw, 834px" /></figure>



<p>If the bot starts crawling your site and <strong>returns status 200</strong> for the pages it visits, that means <strong>the server is not blocking GPTBot</strong>.</p>



<p>Here I’ll make a quick note regarding what OpenAI states on its official site: <strong>their three main bots (GPTBot, OAI-SearchBot, and ChatGPT-User)</strong> each have different roles.</p>



<p>Specifically:</p>



<ul class="wp-block-list">
<li><strong>GPTBot</strong> – is used for training AI models (OpenAI states it does not affect appearance in search results).<br></li>



<li><strong>OAI-SearchBot</strong> – is used for indexing for ChatGPT Search. This is theoretically the most important if you want your site to be crawled, processed, and shown as a source in ChatGPT.<br></li>



<li><strong>ChatGPT-User</strong> – this bot needs actual access when users click on links or perform live requests, such as “check if there are blue balloons on website.com”. If this bot is blocked, GPT won’t be able to visit the page.<br></li>
</ul>



<p>As for the other bots, <strong>Claude and Perplexity</strong>, you can check them in Screaming Frog in the same way as OpenAI’s bots.</p>



<h4 class="wp-block-heading"><strong>Method 2: Testing AI Bots with curl (HEAD Request)</strong></h4>



<p>This method is more direct and doesn’t require any special software. Open either the <strong>Mac Terminal or Windows Command Prompt</strong> and run the following header checks for each bot:</p>



<p><strong>OpenAI – GPTBot</strong></p>



<pre class="wp-block-code"><code>curl -I --ssl-no-revoke -A "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot" https://example.com</code></pre>



<p><strong>OAI-SearchBot</strong></p>



<pre class="wp-block-code"><code>curl -I --ssl-no-revoke -A "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot" https://example.com</code></pre>



<p><strong>ChatGPT-User</strong></p>



<pre class="wp-block-code"><code>curl -I --ssl-no-revoke -A "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot" https://example.com</code></pre>



<p><strong>PerplexityBot</strong></p>



<pre class="wp-block-code"><code>curl -I --ssl-no-revoke -A "Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://www.perplexity.ai/bot)" https://example.com</code></pre>



<p><strong>Perplexity-User</strong></p>



<pre class="wp-block-code"><code>curl -I --ssl-no-revoke -A "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user" https://example.com</code></pre>



<p><strong>ClaudeBot</strong></p>



<pre class="wp-block-code"><code>curl -I --ssl-no-revoke -A "Mozilla/5.0 (compatible; ClaudeBot/1.0; +https://www.anthropic.com/claude)" https://example.com</code></pre>



<p><strong>Claude-User</strong></p>



<pre class="wp-block-code"><code>curl -I --ssl-no-revoke -A "Mozilla/5.0 (compatible; Claude-User/1.0; +https://www.anthropic.com/claude)" https://example.com</code></pre>



<p><strong>Claude-SearchBot</strong></p>



<pre class="wp-block-code"><code>curl -I --ssl-no-revoke -A "Mozilla/5.0 (compatible; Claude-SearchBot/1.0; +https://www.anthropic.com/claude)" https://example.com</code></pre>



<p>To understand <strong>what this curl command does</strong> and where this information comes from: there’s no magic or hack involved.</p>



<p>The major companies behind these bots (OpenAI, Anthropic, Perplexity, etc.) publicly provide all the information about their crawlers, including their full user-agent strings, exactly as they appear in the HTTP headers of real requests.</p>



<p>If you’re wondering about the “<strong>revoke</strong>” part: the <strong>--ssl-no-revoke</strong> flag is included because some Windows versions throw SSL errors when they cannot check the certificate’s revocation status.</p>
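<p>If you want to test all of these bots in one go, the individual curl commands above can be generated with a short script. This is only a convenience sketch: the user-agent strings are the published ones shown above, and <code>https://example.com</code> is a placeholder for your own domain.</p>

```python
# Generate one curl HEAD-request command per AI bot user-agent,
# ready to paste into a terminal.
BOTS = {
    "GPTBot": "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; GPTBot/1.1; +https://openai.com/gptbot",
    "OAI-SearchBot": "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; OAI-SearchBot/1.0; +https://openai.com/searchbot",
    "ChatGPT-User": "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; ChatGPT-User/1.0; +https://openai.com/bot",
    "PerplexityBot": "Mozilla/5.0 (compatible; PerplexityBot/1.0; +https://www.perplexity.ai/bot)",
    "Perplexity-User": "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko); compatible; Perplexity-User/1.0; +https://perplexity.ai/perplexity-user",
    "ClaudeBot": "Mozilla/5.0 (compatible; ClaudeBot/1.0; +https://www.anthropic.com/claude)",
    "Claude-User": "Mozilla/5.0 (compatible; Claude-User/1.0; +https://www.anthropic.com/claude)",
    "Claude-SearchBot": "Mozilla/5.0 (compatible; Claude-SearchBot/1.0; +https://www.anthropic.com/claude)",
}

def curl_commands(site: str) -> list[str]:
    """Build one `curl -I` header check per bot, tagged with the bot name."""
    return [f'curl -I --ssl-no-revoke -A "{ua}" {site}  # {name}'
            for name, ua in BOTS.items()]

for cmd in curl_commands("https://example.com"):
    print(cmd)
```

<p>Run the printed commands one by one and note which bots get a 200 and which get blocked (406, 403, etc.).</p>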



<h4 class="wp-block-heading"><strong>Case Study: Real Bot Access Test on VladSand.com</strong></h4>



<p><strong>OAI-SearchBot gets through (status 200):</strong></p>



<figure class="wp-block-image size-large"><img decoding="async" width="1024" height="153" src="https://vladsand.com/wp-content/uploads/2025/10/oai-search-bot-example.jpg-1024x153.webp" alt="OAI-SearchBot has a 200 Status in curl" class="wp-image-1652" srcset="https://vladsand.com/wp-content/uploads/2025/10/oai-search-bot-example.jpg-1024x153.webp 1024w, https://vladsand.com/wp-content/uploads/2025/10/oai-search-bot-example.jpg-300x45.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/oai-search-bot-example.jpg-768x115.webp 768w, https://vladsand.com/wp-content/uploads/2025/10/oai-search-bot-example.jpg.webp 1440w" sizes="(max-width: 1024px) 100vw, 1024px" /></figure>



<p>As you can see, the GPT crawler that indexes content has access and returns a 200 status.</p>



<p><strong>ChatGPT-User (status 406) – blocked, unfortunately, in my case:</strong></p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="102" src="https://vladsand.com/wp-content/uploads/2025/10/chatgpt-user-bot-406-not-acceptable.jpg-1024x102.webp" alt="406 error in curl ChatGPT-User" class="wp-image-1653" srcset="https://vladsand.com/wp-content/uploads/2025/10/chatgpt-user-bot-406-not-acceptable.jpg-1024x102.webp 1024w, https://vladsand.com/wp-content/uploads/2025/10/chatgpt-user-bot-406-not-acceptable.jpg-300x30.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/chatgpt-user-bot-406-not-acceptable.jpg-768x76.webp 768w, https://vladsand.com/wp-content/uploads/2025/10/chatgpt-user-bot-406-not-acceptable.jpg.webp 1369w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>ChatGPT-User is blocked at the server level by the hosting company (I’m currently in a battle with them, explaining why it needs to be unblocked), but hopefully they will eventually fix this.</p>



<h2 class="wp-block-heading"><strong>How to Unblock AI and LLM Bots on Your Website</strong></h2>



<p>If you&#8217;re using shared hosting (like most websites), the only way to unblock these bots is to <strong>contact your hosting provider</strong> and ask them to whitelist the relevant User-Agents or IP ranges. Fortunately, this is usually a quick fix once they understand the issue.</p>



<p>Most major AI companies also <strong>publish their official IP ranges</strong>, which can be useful if your firewall or hosting provider uses IP-based filtering. Here are the official resources:</p>



<p><strong>GPTBot</strong>: <a href="https://openai.com/gptbot.json" target="_blank" rel="noreferrer noopener">https://openai.com/gptbot.json</a></p>



<p><strong>OAI-SearchBot</strong>: <a href="https://openai.com/searchbot.json" target="_blank" rel="noreferrer noopener">https://openai.com/searchbot.json</a></p>



<p><strong>ChatGPT-User</strong>: <a href="https://openai.com/chatgpt-user.json">https://openai.com/chatgpt-user.json</a></p>



<p><strong>PerplexityBot</strong>: <a href="https://www.perplexity.ai/perplexitybot.json" target="_blank" rel="noreferrer noopener">https://www.perplexity.ai/perplexitybot.json</a></p>



<p><strong>Perplexity-User</strong>: <a href="https://www.perplexity.ai/perplexity-user.json" target="_blank" rel="noreferrer noopener">https://www.perplexity.ai/perplexity-user.json</a></p>



<p><em><strong>*Anthropic (Claude) </strong>has not provided official details about the IP ranges it uses.</em></p>



<p>If you’re on a <strong>dedicated server or VPS</strong>, you can manually adjust the server configuration, for example, by adding specific <strong>ModSecurity rules</strong> or firewall exceptions to explicitly allow AI crawlers.</p>
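<p>The exact rule depends on your stack, so take this as an illustration only: assuming Apache with ModSecurity, a whitelist exception might look like the snippet below. The rule id is arbitrary (pick one outside your existing ranges), and keep in mind that user-agent strings can be spoofed, so UA-based whitelisting is a blunt instrument.</p>

```apacheconf
# Illustrative ModSecurity exception: skip rule processing for requests
# whose User-Agent matches a known AI crawler.
SecRule REQUEST_HEADERS:User-Agent \
    "@rx (GPTBot|OAI-SearchBot|ChatGPT-User|PerplexityBot|Perplexity-User|ClaudeBot|Claude-User|Claude-SearchBot)" \
    "id:1009999,phase:1,pass,nolog,ctl:ruleEngine=Off"
```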



<h2 class="wp-block-heading"><strong>Other Common Reasons AI Bots Are Blocked</strong></h2>



<p>Here I focused on blocking at the server level, because in most cases this is where access for these bots is cut off. However, they can also be blocked in other ways, sometimes without you even realizing it:</p>



<ul class="wp-block-list">
<li>through the robots.txt file<br></li>



<li>through external firewalls (Cloudflare, Wordfence, etc.)<br></li>



<li>through security plugins<br></li>



<li>or implicitly through IP blocking (if your hosting provider blocks certain countries, the IPs of OpenAI / Perplexity / Anthropic may be rejected even if the User-Agent is valid)</li>
</ul>
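<p>A quick robots.txt sanity check helps here too. robots.txt is allow-by-default, so these bots only need explicit records if a broader rule (such as a blanket <code>Disallow: /</code>) would otherwise catch them; a minimal example of explicitly allowing them:</p>

```
# Allow AI crawlers; a group with multiple User-agent lines
# applies its rules to every bot named in it
User-agent: GPTBot
User-agent: OAI-SearchBot
User-agent: ChatGPT-User
User-agent: PerplexityBot
User-agent: Perplexity-User
User-agent: ClaudeBot
User-agent: Claude-User
User-agent: Claude-SearchBot
Allow: /
```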
<p>The post <a href="https://vladsand.com/is-your-server-blocking-ai-bots-and-llm-crawlers/">Is Your Server Blocking AI Bots and LLM Crawlers?</a> appeared first on <a href="https://vladsand.com">VladSand</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>How to Decide Which Filters (Facets) to Index on an E-Commerce Website</title>
		<link>https://vladsand.com/how-to-decide-which-filters-facets-to-index-on-an-e-commerce-website/</link>
		
		<dc:creator><![CDATA[Vlad Sand]]></dc:creator>
		<pubDate>Fri, 03 Oct 2025 22:44:31 +0000</pubDate>
				<category><![CDATA[E-Commerce SEO]]></category>
		<category><![CDATA[Keyword Research]]></category>
		<category><![CDATA[Technical SEO]]></category>
		<guid isPermaLink="false">https://vladsand.com/?p=1558</guid>

					<description><![CDATA[<p>The indexability of filters and the correct way to choose which filters should be indexed is a challenging topic for webmasters, SEO specialists, and SEO...</p>
<p>The post <a href="https://vladsand.com/how-to-decide-which-filters-facets-to-index-on-an-e-commerce-website/">How to Decide Which Filters (Facets) to Index on an E-Commerce Website</a> appeared first on <a href="https://vladsand.com">VladSand</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>The indexability of filters and the correct way to choose <strong>which filters should be indexed</strong> is a challenging topic for webmasters, SEO specialists, and SEO agencies alike.</p>



<p>In today’s article, we’ll focus on <strong>filters that already exist on your website</strong>, but you’re not sure which ones to make indexable, without harming your site’s indexing and without risking damage to its SEO structure.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="683" src="https://vladsand.com/wp-content/uploads/2025/10/common-category-filters-facets-1024x683.webp" alt="common category filters (facets)" class="wp-image-1560" srcset="https://vladsand.com/wp-content/uploads/2025/10/common-category-filters-facets-1024x683.webp 1024w, https://vladsand.com/wp-content/uploads/2025/10/common-category-filters-facets-300x200.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/common-category-filters-facets-768x512.webp 768w, https://vladsand.com/wp-content/uploads/2025/10/common-category-filters-facets.webp 1536w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<h2 class="wp-block-heading"><strong>Why Choosing Filters to Index Is So Difficult</strong></h2>



<p>Choosing which filters to index can be a tricky exercise because there are multiple factors to consider when deciding which filters should be indexable.</p>



<p>One of the most common issues is platform limitations. Some e-commerce platforms, whether mainstream or custom, are configured in a way that <strong>makes filters either impossible to index</strong> correctly or impossible to index at all.</p>



<h2 class="wp-block-heading"><strong>Common Indexing Challenges for Filters</strong></h2>



<h3 class="wp-block-heading"><strong>JavaScript-Generated Filters</strong></h3>



<p>We’re not talking here about how URLs are generated, but about the fact that, when a filter is applied, nothing changes in the browser’s address bar. No new URL is created; everything happens through JavaScript on the page.</p>



<p>In this situation, filters have no SEO value and don’t contribute to organic visibility. They only partially help improve the site’s UX.</p>



<h3 class="wp-block-heading"><strong>Either All Filters Are Indexed or None</strong></h3>



<p>This is a very common situation: the platform generates filter URLs, but it either indexes all the filters on the site (or within a specific category) or none of them.</p>



<p>This can harm your site more than it helps. Imagine your website has 100 categories with 30 filters each: that’s 3,000 filter pages indexed individually, on top of the categories themselves.</p>



<p>You might think: “What’s wrong with that? I’ll have more indexed filters, more ranking opportunities, more URLs.”</p>



<h4 class="wp-block-heading"><strong>The Risks of Indexing Every Filter</strong></h4>



<ul class="wp-block-list">
<li><strong>You will index filters with little or no content</strong> (e.g., 1–2 products). This creates thin content across your site, which could result in tens of thousands of thin pages. This can lead to algorithmic penalties, such as those from Google Panda.<br></li>



<li><strong>You’ll end up indexing filters with no SEO value</strong>, such as <em>price, free shipping, new arrivals, discount only, in stock, out of stock</em>, etc. Nobody searches for these filters, so they generate extra content with zero search intent. Search engines may interpret these pages as low-value, which can also negatively affect your crawl budget (the number of pages Google is willing and able to crawl on your site within a specific time period).<br></li>
</ul>



<h3 class="wp-block-heading"><strong>Lack of Customization for Filter Pages</strong></h3>



<p>Let’s assume you have filter URLs and can choose which ones to index, but you cannot customize them. This means the filter pages cannot have their own visible titles (H1) or unique meta titles in the format “category + filter” or “filter + category.”</p>



<p>If you can’t do this, all your indexable filter pages will end up having the same visual title and meta title as the parent category.</p>



<p><strong>Example:</strong><strong><br></strong>Instead of having pages like:</p>



<ul class="wp-block-list">
<li>example.com/women-shoes/filter/red → <em>Red Women Shoes</em><em><br></em></li>



<li>example.com/women-shoes/filter/blue → <em>Blue Women Shoes</em><em><br></em></li>



<li>example.com/women-shoes/filter/black → <em>Black Women Shoes</em><em><br></em></li>
</ul>



<p>You’ll have all these URLs using the same meta title and visual title as the main category page:</p>



<p>example.com/women-shoes → <em>Women Shoes</em></p>



<p>This leads to keyword cannibalization between the category and its filters, resulting in a complete SEO mess.</p>
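<p>The “filter + category” pattern described above is easy to template once your platform exposes both values; this is just a sketch (the category and URL structure are illustrative, not tied to any particular platform):</p>

```python
def filter_page_title(category: str, filter_value: str) -> str:
    """Build a unique '<Filter> <Category>' title so each filter page
    stops reusing the parent category's title and H1."""
    return f"{filter_value.title()} {category.title()}"

# Each filter URL gets its own title instead of the category's:
for color in ["red", "blue", "black"]:
    url = f"example.com/women-shoes/filter/{color}"
    print(url, "->", filter_page_title("women shoes", color))
```

<p>The same function can feed both the <code>&lt;title&gt;</code> tag and the visible H1, which removes the cannibalization between the category and its filters.</p>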



<h3 class="wp-block-heading"><strong>Indexing Multiple Filter Combinations</strong></h3>



<p>Another issue that can occur is the platform not offering an option to deindex combinations of two or more filters applied at the same time.</p>



<p>Ideally, you should have the option to index a maximum of two combined filters that make sense together.</p>



<p><strong>Valid example:</strong></p>



<ul class="wp-block-list">
<li>example.com/t-shirts/filter/blue/xxl/ → XXL Blue T-Shirts<br></li>
</ul>



<p>This is a common and correct scenario for indexing two combined filters.</p>



<p><strong>Invalid examples:</strong></p>



<ul class="wp-block-list">
<li>Size + size: t-shirts/filter/xs/xxl/<br></li>



<li>Size + color + size: t-shirts/filter/xs/blue/xxl/<br></li>
</ul>



<p>These combinations don’t make sense, have nearly zero search intent, and can generate millions (or even hundreds of millions) of useless pages.</p>



<p>A site that realistically has only a few thousand pages could end up with millions of indexable, irrelevant pages, leading to a potential SEO collapse.</p>



<h2 class="wp-block-heading"><strong>How to Choose the Right Filters to Index</strong></h2>



<p>Let’s assume all the above issues have been solved, you have a good developer who can customize everything, or your platform has all the necessary plugins and features to perfectly handle filter functionality.</p>



<p>For a new site, even with the entire infrastructure in place, the ideal approach is to start with all filters set to “noindex.” If your filters aren’t noindexed and you can’t start from that clean slate, that’s okay: from this point on, you’ll be able to decide which filters to index and apply noindex to the ones that shouldn’t be.</p>



<h3 class="wp-block-heading"><strong>The Simple Way – Keyword Research</strong></h3>



<p>You can use keyword research tools such as Ahrefs, SEO Monitor, Ubersuggest, and others.</p>



<p>These tools are helpful when you start from a single keyword (for example, the category name) and they provide relevant keywords associated with it.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="902" height="489" src="https://vladsand.com/wp-content/uploads/2025/10/examples-of-keyword-research.png.webp" alt="examples of keyword research for a category in Ahrefs" class="wp-image-1561" srcset="https://vladsand.com/wp-content/uploads/2025/10/examples-of-keyword-research.png.webp 902w, https://vladsand.com/wp-content/uploads/2025/10/examples-of-keyword-research.png-300x163.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/examples-of-keyword-research.png-768x416.webp 768w" sizes="auto, (max-width: 902px) 100vw, 902px" /></figure>



<p>However, most of the time, these tools focus on main category keywords. They can help you:</p>



<ul class="wp-block-list">
<li>Create new categories<br></li>



<li>Optimize existing categories<br></li>



<li>Occasionally discover good filter ideas with sufficient search volume<br></li>
</ul>



<h3 class="wp-block-heading"><strong>Identify Target Filters on Your Platform</strong></h3>



<p>This is the part we’ll focus on. Suppose you have an e-commerce site with a wide variety of products. You’ve set up all necessary filters in the database (color, sizes, etc.), and now you want to identify exactly which filters should be indexed.</p>



<p>This method is simple and can be applied using basic tools to help you find the right filters to index.</p>



<p><strong>Steps:</strong></p>



<p><strong>1.</strong> Open the <strong><a href="https://vladsand.com/keyword-combiner/">Keyword Combiner</a></strong> tool.</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="587" src="https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-tool.png-1024x587.webp" alt="keyword combiner tool" class="wp-image-1562" srcset="https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-tool.png-1024x587.webp 1024w, https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-tool.png-300x172.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-tool.png-768x440.webp 768w, https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-tool.png-1536x880.webp 1536w, https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-tool.png.webp 1730w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>2. </strong>Open the category you want to analyze. Example: “laptops.”</p>



<p><strong>3.</strong> In Keyword Combiner, add both the singular and plural forms: laptop / laptops.</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="800" height="216" src="https://vladsand.com/wp-content/uploads/2025/10/category-keyword-combiner.png.webp" alt="category in Keyword combiner" class="wp-image-1563" srcset="https://vladsand.com/wp-content/uploads/2025/10/category-keyword-combiner.png.webp 800w, https://vladsand.com/wp-content/uploads/2025/10/category-keyword-combiner.png-300x81.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/category-keyword-combiner.png-768x207.webp 768w" sizes="auto, (max-width: 800px) 100vw, 800px" /></figure>



<p><strong>4.</strong> Use a tool to extract anchor texts from URLs, such as <a href="https://chromewebstore.google.com/detail/magic-anchor-url-grabber/edgcafkcaiihjcmimgfimjicmkelkoab">Magic Anchor &amp; URL Grabber</a>, Linkclump (or other).</p>



<p><strong>5. </strong>On the category page, extract the filter anchors (brand, type, operating system, etc.)</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="466" src="https://vladsand.com/wp-content/uploads/2025/10/magic-url-grabber-print-screen-filters.png-1024x466.webp" alt="magic URL grabber print screen filters" class="wp-image-1564" srcset="https://vladsand.com/wp-content/uploads/2025/10/magic-url-grabber-print-screen-filters.png-1024x466.webp 1024w, https://vladsand.com/wp-content/uploads/2025/10/magic-url-grabber-print-screen-filters.png-300x137.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/magic-url-grabber-print-screen-filters.png-768x350.webp 768w, https://vladsand.com/wp-content/uploads/2025/10/magic-url-grabber-print-screen-filters.png.webp 1199w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p>and paste them into Keyword Combiner, in the filter section:</p>



<figure class="wp-block-image size-full"><img loading="lazy" decoding="async" width="883" height="390" src="https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-filters-one-per-line.png.webp" alt="Keyword combiner - filters one per line" class="wp-image-1565" srcset="https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-filters-one-per-line.png.webp 883w, https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-filters-one-per-line.png-300x133.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/keyword-combiner-filters-one-per-line.png-768x339.webp 768w" sizes="auto, (max-width: 883px) 100vw, 883px" /></figure>



<p>Then generate all the combinations of category (singular and plural) + filter (facet):</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="489" src="https://vladsand.com/wp-content/uploads/2025/10/generated-results-keyword-combiner.jpg-1024x489.webp" alt="generated keywords" class="wp-image-1575" srcset="https://vladsand.com/wp-content/uploads/2025/10/generated-results-keyword-combiner.jpg-1024x489.webp 1024w, https://vladsand.com/wp-content/uploads/2025/10/generated-results-keyword-combiner.jpg-300x143.webp 300w, https://vladsand.com/wp-content/uploads/2025/10/generated-results-keyword-combiner.jpg-768x367.webp 768w, https://vladsand.com/wp-content/uploads/2025/10/generated-results-keyword-combiner.jpg.webp 1174w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>



<p><strong>6.</strong> You can repeat this process for multiple categories until you’ve covered them all.<br></p>



<p>This method is useful <strong>when you don’t have backend database access</strong>. If you do have access, <strong>you can directly extract category and filter names from the database and combine them in Keyword Combiner</strong> without manually going through each category.</p>



<p>Once you’ve generated category + filter combinations, use a keyword research tool (Ahrefs, SEO Monitor, etc.) to <strong>check the monthly search volume</strong> for each combination. This will give you a clear picture of <strong>which existing filters on your site have meaningful search demand</strong> and therefore <strong>should be indexed</strong>.</p>
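<p>If you do have database access, the combination step itself is trivial to script. Here’s a minimal sketch of what Keyword Combiner does (the category forms and filters below are illustrative):</p>

```python
from itertools import product

def keyword_combinations(category_forms, filters):
    """Cross every category form (singular/plural) with every filter,
    in both 'filter category' and 'category filter' word orders."""
    combos = []
    for cat, f in product(category_forms, filters):
        combos.append(f"{f} {cat}")   # e.g. "gaming laptops"
        combos.append(f"{cat} {f}")   # e.g. "laptops gaming"
    return combos

kws = keyword_combinations(["laptop", "laptops"], ["gaming", "asus", "windows 11"])
print(len(kws))  # 2 category forms x 3 filters x 2 orders = 12 keywords
```

<p>Feed the resulting list into your keyword research tool to pull monthly search volumes in bulk.</p>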



<p><strong>P.S.:</strong></p>



<p>Even if you’ve identified the right filters and have the necessary infrastructure to index and customize them, you need to follow one key rule: <strong>if a filter URL lists only one product, that page should remain “noindex.”</strong></p>



<p>Once <strong>a second product is added</strong>, the “noindex” tag should be <strong>dynamically removed</strong>, and the filter URL should become indexable.</p>
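<p>The decision rule above boils down to a one-liner; how the meta tag is actually emitted depends on your platform or template engine, so this is only a sketch:</p>

```python
def robots_meta(product_count: int) -> str:
    """A filter page with fewer than two products stays noindex;
    from the second product on, it becomes indexable."""
    return "index, follow" if product_count >= 2 else "noindex, follow"

# e.g. a filter URL currently listing a single product:
print(robots_meta(1))  # noindex, follow
print(robots_meta(2))  # index, follow
```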



<p>For <strong>maximum SEO impact</strong>, it’s also recommended to add descriptive, targeted content (LSI &#8211; Latent Semantic Indexing) to the first page of each filter, but not to its paginated pages.</p>



<p>The fewer filters you index, the richer the descriptive text should be, to compensate for the lack of product content.</p>
<p>The post <a href="https://vladsand.com/how-to-decide-which-filters-facets-to-index-on-an-e-commerce-website/">How to Decide Which Filters (Facets) to Index on an E-Commerce Website</a> appeared first on <a href="https://vladsand.com">VladSand</a>.</p>
]]></content:encoded>
					
		
		
			</item>
		<item>
		<title>The Disavow File &#8211; Is it really worth creating it?</title>
		<link>https://vladsand.com/the-disavow-file-is-it-really-worth-creating-it/</link>
		
		<dc:creator><![CDATA[Vlad Sand]]></dc:creator>
		<pubDate>Sun, 21 Sep 2025 01:13:23 +0000</pubDate>
				<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[Link Building]]></category>
		<category><![CDATA[SEO Mastery]]></category>
		<category><![CDATA[Technical SEO]]></category>
		<guid isPermaLink="false">https://vladsand.com/?p=1500</guid>

					<description><![CDATA[<p>The disavow file has given headaches to all SEOs and webmasters ever since it was introduced, I think it was around 2012–2013, I don’t remember...</p>
<p>The post <a href="https://vladsand.com/the-disavow-file-is-it-really-worth-creating-it/">The Disavow File &#8211; Is it really worth creating it?</a> appeared first on <a href="https://vladsand.com">VladSand</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>The disavow file has given headaches to all SEOs and webmasters ever since it was introduced, around 2012–2013 if I remember correctly. What I do remember is that the first time I heard about this tool was from <a href="https://en.wikipedia.org/wiki/Matt_Cutts">Matt Cutts</a>, Google’s “spokesperson” for search engine optimization back in the good old days. He was like a John Mueller, but one who actually gave relevant answers.</p>



<h2 class="wp-block-heading">Questioning the real value of the Disavow File</h2>



<p>Let’s get back to the subject, because we strayed a bit with history.</p>



<p>The disavow file, at least for me, has never had any clear value. I mean, it only helped me and seemed to make sense once in its entire history of use, namely around 2012, after Google Penguin came into the SEO world.&nbsp;</p>



<p>(<em>A small side note for those who don’t know what Google Penguin is: it’s a Google update that basically acts as link police. Sites that use spammy backlink methods fall into its filter and get penalized, either manually, with a direct message in Google Webmaster Tools, or algorithmically, without any notice, though you can tell something negative is happening to your site from an instant drop in organic traffic.</em>)</p>



<h3 class="wp-block-heading">A single successful case</h3>



<p>Returning to the topic for the second time: the only time it seemed to make sense to create a disavow file was after Google Penguin was rolled out and I had a client, a gift-basket site, which until Penguin had only directory links, packages of hundreds of links bought all at once on a single anchor, etc. (basically, the kind of practices common back then).</p>



<p>This site was penalized (algorithmically, so without a message in GSC), and because it happened post-Penguin, it was clear the problem was in the area of spammy links, so<strong> I created a disavow file with hundreds of spammy sites</strong> linking to this site. The <strong>site recovered its lost traffic in about 2 months</strong> after uploading the file.</p>



<h2 class="wp-block-heading">Is it still worth creating a disavow file nowadays?</h2>



<p>Normally I would say no, because creating a disavow file can backfire on your site, even if it is done correctly.</p>



<p>You might <strong>add sites with a perfect profile for a disavow file</strong>: spammy, toxic, and so on. But if those sites are old, your backlink profile has only 20 clean domains, and the disavowed ones number 300, Google may well take your disavow file into account (as it should), yet the shock of losing so many referring domains at once can be so big that the site simply drops in rankings.</p>



<h2 class="wp-block-heading">So is a disavow file worth it, or is it too risky?</h2>



<p>If you have a site with traffic and it falls into the class described above: <strong>very few clean domains, and many spammy and toxic ones that are old and well-established</strong>, it’s <strong>better to leave it alone</strong> and not bother creating a disavow file, because most of the time Google knows how to figure out which links are toxic and which are not.</p>



<h2 class="wp-block-heading">Can Google simply ignore the disavow file?</h2>



<p>Yes, most of the time it doesn’t take it into account. As I said, out of more than 150 disavow files I have created, I believe only one clearly worked.</p>



<h2 class="wp-block-heading">When and how to create a disavow file if you really want to take this step?</h2>



<p>First of all, for beginners, here are the technical steps:</p>



<ol class="wp-block-list">
<li>Create a simple .txt file (UTF-8).</li>
<li>On each line, write <code>domain:site-name.com</code> to disavow an entire domain, or the full URL of a page if you only want to disavow a single link.</li>
<li>You can add comments with <code>#</code> (Google ignores them).</li>
<li>Save the file and upload it here:</li>
</ol>



<p><a href="https://search.google.com/search-console/disavow-links">https://search.google.com/search-console/disavow-links</a></p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="634" src="https://vladsand.com/wp-content/uploads/2025/09/disavow-page.jpg-1024x634.webp" alt="disavow page where you can upload the txt disavow files" class="wp-image-1502" srcset="https://vladsand.com/wp-content/uploads/2025/09/disavow-page.jpg-1024x634.webp 1024w, https://vladsand.com/wp-content/uploads/2025/09/disavow-page.jpg-300x186.webp 300w, https://vladsand.com/wp-content/uploads/2025/09/disavow-page.jpg-768x476.webp 768w, https://vladsand.com/wp-content/uploads/2025/09/disavow-page.jpg.webp 1032w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
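<p>Putting those steps together, a minimal disavow file looks like this (the domain names and the URL are placeholders, not real sites to disavow):</p>

```text
# Spammy domains found in the backlink profile, reviewed manually
domain:spammy-directory.xyz
domain:link-farm.top

# A single page, without disavowing the whole domain
https://old-blog.example.net/spun-article/
```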



<p>And so you don’t have to do everything by eye, use tools like Ahrefs, Semrush, Majestic, or even Google Search Console to build your list. For example, in Ahrefs go to Backlink profile and start filtering by DR, anchors, etc.</p>
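<p>As a rough illustration of that kind of filtering, here is a small Python sketch that triages a backlink export by anchor text. The CSV column names and the sample rows are hypothetical; adapt them to whatever your tool actually exports.</p>

```python
import csv
import io

# Hypothetical column names ("referring_domain", "anchor"); adapt them to
# the export format of the tool you actually use (Ahrefs, Semrush, etc.).
SPAM_ANCHOR_WORDS = {"casino", "viagra", "payday"}

def suspicious_domains(csv_text):
    """Flag referring domains whose anchor text looks like spam."""
    flagged = set()
    for row in csv.DictReader(io.StringIO(csv_text)):
        anchor = row["anchor"].lower()
        if any(word in anchor for word in SPAM_ANCHOR_WORDS):
            flagged.add(row["referring_domain"])
    return sorted(flagged)

sample = """referring_domain,anchor
books-lover.blog,great indie bookstore
win-big.click,best online casino bonus
"""
print(suspicious_domains(sample))  # ['win-big.click']
```

<p>Treat the output as a shortlist for manual review, not as a finished disavow file.</p>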



<h2 class="wp-block-heading">What should you add to the disavow file?</h2>



<p>I’ll start by telling you<strong> what not to add</strong>.</p>



<p>First of all, <strong>do not add domains simply because they have a Domain Rating of 0</strong>. This is a practice used by many amateurs who don’t understand SEO or how a site’s value works. Many people think that if a site with Domain Rating 0 links to your site, it’s bad.</p>



<p>That’s pure nonsense. Let me give you an example:</p>



<p>If you have an online bookstore, and a book enthusiast has a site where they talk about books, but the poor guy created his site just 2 months ago and links to you… should you put this site in the disavow file? No. Not at all. That link is probably worth more than one from a Domain Rating 30 news site that has nothing to do with your niche, so don’t make this mistake.</p>



<h2 class="wp-block-heading">So what kind of sites should you add to the disavow file?</h2>



<h3 class="wp-block-heading"><strong>First, look at the TLDs.</strong></h3>



<p>If you see <strong>TLDs such as</strong>: <em>.xyz, .top, .club, .online, .site, .space, .pw, .gq, .cf, .ml, .ga, .tk, .work, .icu, .bid, .men, .win, .click, .loan, .download, .party, .science, .cam, .monster, .fun, .in, .pk</em>, they are most often <strong>not natural links to your website</strong>, so consider adding them to your <strong>disavow list</strong>.</p>
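<p>If your backlink export is large, this check is easy to automate. The Python sketch below flags domains whose TLD is on the list above; the sample domains are made up, and a flagged domain is a candidate for manual review, not an automatic disavow entry.</p>

```python
# The TLD list mirrors the one in the text; the sample domains are made up.
SUSPICIOUS_TLDS = {
    "xyz", "top", "club", "online", "site", "space", "pw", "gq", "cf",
    "ml", "ga", "tk", "work", "icu", "bid", "men", "win", "click",
    "loan", "download", "party", "science", "cam", "monster", "fun",
    "in", "pk",
}

def flag_by_tld(domains):
    """Return domains whose last label is on the suspicious-TLD list."""
    return [d for d in domains if d.rsplit(".", 1)[-1] in SUSPICIOUS_TLDS]

backlinks = ["nice-books.com", "cheap-pills.icu", "seo-offers.xyz"]
print(flag_by_tld(backlinks))  # ['cheap-pills.icu', 'seo-offers.xyz']
```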



<h3 class="wp-block-heading">Manual inspection of unfamiliar sites</h3>



<p>Then go through the sites one by one, even if there are thousands (yes, you heard right, even if there are thousands). Open the sites that don’t look familiar, and if you see strange languages, Chinese, Russian, etc., while your site is in English, <strong>straight to the disavow file</strong>.</p>



<h3 class="wp-block-heading"><strong>Scraper and directory listings</strong></h3>



<p>Also, if the sites you open are the kind of site listings that scrape the entire internet and add your site to a list of hundreds or thousands of links, <strong>straight to the disavow file.</strong></p>



<h3 class="wp-block-heading"><strong>Sites with adult or shady ads</strong></h3>



<p>Next, if you find sites that seem normal but inside contain only links and strange banners pointing to adult sites or shady pharma products, those go <strong>straight to the disavow file as well.</strong></p>



<h3 class="wp-block-heading"><strong>Don’t blindly trust “toxic” flags from tools</strong></h3>



<p>Another side note, for those who add sites to the disavow list just because certain tools flag them as toxic: never rely on such tools alone. I have seen many sites flagged as toxic when that was simply not the case.</p>



<h3 class="wp-block-heading"><strong>Check anchor text for spam attacks</strong></h3>



<p>And last but not least, you need to check the anchors through which those sites link to your site.</p>



<p>There are bad actors, so-called “SEO agencies”, that intentionally spam the entire internet, placing links to your site on decent pages but with fake anchors that can harm it, especially through spammy content. For example, see the image below showing the kind of anchors pointing to this site:</p>



<figure class="wp-block-image size-large"><img loading="lazy" decoding="async" width="1024" height="455" src="https://vladsand.com/wp-content/uploads/2025/09/spammy-anchors-disavow-file.jpg-1024x455.webp" alt="spammy anchors of websites that should be added in the disavow file" class="wp-image-1503" srcset="https://vladsand.com/wp-content/uploads/2025/09/spammy-anchors-disavow-file.jpg-1024x455.webp 1024w, https://vladsand.com/wp-content/uploads/2025/09/spammy-anchors-disavow-file.jpg-300x133.webp 300w, https://vladsand.com/wp-content/uploads/2025/09/spammy-anchors-disavow-file.jpg-768x341.webp 768w, https://vladsand.com/wp-content/uploads/2025/09/spammy-anchors-disavow-file.jpg-1536x682.webp 1536w, https://vladsand.com/wp-content/uploads/2025/09/spammy-anchors-disavow-file.jpg.webp 1635w" sizes="auto, (max-width: 1024px) 100vw, 1024px" /></figure>
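<p>A quick way to spot this kind of attack is to count how often each anchor appears. A minimal sketch, with illustrative anchor texts (in practice you would export them from your backlink tool):</p>

```python
from collections import Counter

# Illustrative anchor texts; in practice, export them from your backlink tool.
anchors = [
    "vladsand.com", "technical seo guide", "buy cheap followers",
    "buy cheap followers", "buy cheap followers", "brand name",
]

# A sudden spike of a single exact-match commercial anchor across many new
# referring pages is the classic signature of an anchor spam attack.
counts = Counter(anchors)
print(counts.most_common(1))  # [('buy cheap followers', 3)]
```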



<h3 class="wp-block-heading"><strong>Monitor backlinks regularly</strong></h3>



<p>If you catch this type of site early, add it to the disavow file. That’s why<strong> I recommend checking your backlink profile weekly</strong>, for your own site and your clients’ sites. Catching issues early is much better because, as I said earlier, if you wait for years, even a toxic and spammy link can take root, and there may be consequences if you intervene too late with a disavow file (assuming Google will even take it into account).</p>



<p>That being said, I believe it’s still better to avoid creating any disavow file and to let old Google separate the good sites from the bad ones, because if we examine 100 successful organic sites right now, I assure you that 99 of them have spammy links and no one has paid them any attention.</p>
<p>The post <a href="https://vladsand.com/the-disavow-file-is-it-really-worth-creating-it/">The Disavow File &#8211; Is it really worth creating it?</a> appeared first on <a href="https://vladsand.com">VladSand</a>.</p>
]]></content:encoded>
					
		
		
			</item>
	</channel>
</rss>
