Analysis · 7 min read · 2026-05-21

Two Ways Your AI Visibility Fix Plan Fails Before It Starts

When a business's AI visibility score stays flat after implementing a fix plan, the instinct is to look for missed steps. Did they build out the city pages? Did they update the schema? Did they get listed on the recommended directories?

Sometimes none of that is the problem. We've found two structural failure modes that make AEO fix plans useless -- and in one case, actively damaging -- before a single recommendation is implemented. Both are invisible in a standard audit. Both are more common than published guidance suggests.

Failure One: The robots.txt structural ceiling

Our May 17, 2026 methodology review flagged a pre-check gap we hadn't fully accounted for in our audit pipeline.

Many businesses blocked AI crawlers -- GPTBot, PerplexityBot, ClaudeBot -- during the 2023-2024 controversy over AI companies scraping web content for training data. The decision made intuitive sense at the time: "I don't want my content used to train AI models." Developers added disallow rules to robots.txt. WordPress security plugins added them automatically. Some site templates shipped with them pre-configured.

The consequence nobody explained: these blocks also reach the retrieval infrastructure for live AI search. OpenAI's crawler documentation describes GPTBot as its training crawler and lists separate agents, OAI-SearchBot and ChatGPT-User, for search and user-initiated fetches; a blanket `User-agent: *` rule, or a plugin list that names every AI agent it knows, blocks those as well. PerplexityBot is the crawler Perplexity uses to index your content for its search results. A block added for training-data reasons can therefore also block live query retrieval.

The result is a structural ceiling. A business with PerplexityBot blocked can score 0.0 on every Perplexity query regardless of their directory presence, schema quality, or content freshness. The fix plan recommending three new city-specific service pages and a complete LocalBusiness schema rewrite will produce zero Perplexity movement. The crawler can't read any of it.

Our methodology recommendation describes it this way: "Recommending directory presence or schema updates to a client whose site is blocking AI crawlers is like recommending a better front door to a house with no street address -- the work is correct but the prerequisite is unmet."

We don't have published data on how often SMB sites have these blocks in place. Given the 2023-2024 controversy period and how many security plugins added the rules automatically without user notice, we expect the configuration is not rare. That expectation is why we now treat a robots.txt fetch as a mandatory pre-check before scoring begins on any audit.

The check takes thirty seconds: visit `yourdomain.com/robots.txt` in your browser and look for any `User-agent: PerplexityBot`, `User-agent: GPTBot`, or `User-agent: *` paired with `Disallow: /`. If any of those are present, that block is Fix 1 -- not because it's the most strategically interesting, but because it's the ceiling every other fix sits under.
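The same check can be scripted. The following is a minimal sketch, not our audit pipeline: it uses Python's standard `urllib.robotparser` and asks whether each crawler may fetch the site root. The function names, the `AI_CRAWLERS` list, and the `example.com` placeholder are assumptions made for illustration; the crawler names are the three mentioned above.

```python
from urllib.request import urlopen
from urllib.robotparser import RobotFileParser

# Crawler names taken from this article; extend the tuple as needed (assumption).
AI_CRAWLERS = ("GPTBot", "PerplexityBot", "ClaudeBot")


def blocked_ai_crawlers(robots_txt: str,
                        site_root: str = "https://example.com/") -> list[str]:
    """Return the crawlers in AI_CRAWLERS that may not fetch the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    # can_fetch() applies agent-specific groups first, then the `*` group.
    return [agent for agent in AI_CRAWLERS
            if not parser.can_fetch(agent, site_root)]


def fetch_robots_txt(domain: str, timeout: float = 10.0) -> str:
    """Download robots.txt for a domain (hypothetical helper).

    Raises urllib.error.HTTPError if the file is missing.
    """
    with urlopen(f"https://{domain}/robots.txt", timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")
```

Calling `blocked_ai_crawlers(fetch_robots_txt("yourdomain.com"))` returns the blocked crawler names, or an empty list if none of the three is disallowed at the root. The sketch only tests the root path, so a site that blocks specific subdirectories would need a per-URL check.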

Failure Two: The fix that encodes the wrong identity

This one is more subtle, and we found it by auditing one of our own audit outputs.

In our May 16, 2026 methodology review -- triggered by the Race Data audit -- we identified a specific failure mode in how AI-generated fix plans handle entity disambiguation recommendations.

Race Data (racedata.ca) is a Canadian database marketing firm serving clients in finance, media, retail, and tourism. In the May 2026 audit, AI platforms produced heavily misattributed responses: ChatGPT described a motorsport data website; Gemini described a hardware company called Race Data Systems. The misattributions were detailed and specific -- not vague hallucinations but confident, complete wrong descriptions.

The fix plan recommended creating a Wikidata entry describing Race Data as "a Canadian data management and analytics company specializing in motorsport intelligence."

That description is wrong. The motorsport framing came directly from the AI hallucinations the audit was supposed to be diagnosing. The fix plan, generated by reading both the classified query responses and the actual website data, gave the hallucination equal weight in constructing the Wikidata recommendation. A client who followed that fix would have anchored their entity identity to an invented industry that appears in no real description of their company.

The mechanism matters here. When an AI tool generates fix recommendations, it reads the misattributed query responses alongside the actual company website. A vivid, detailed misattribution -- "motorsport intelligence platform" described across multiple responses -- competes with the actual company description during the recommendation pass. For companies with ambiguous brand names and low pre-existing entity signal, the misattribution can outweigh the true description.

This failure mode is specifically a risk when:

- The brand name is generic or could refer to multiple things (Race Data could plausibly describe motorsport statistics)
- The company has low entity signal -- no Wikidata entry, no Crunchbase, no Wikipedia
- The misattribution count in the audit results is high

We documented the same upstream problem in our April 27, 2026 edge-case investigation of entity identity instability. When a brand has thin or inconsistent signals across the web, AI platforms produce incoherent output: Gemini returned three completely different invented identities for the same business name in the same audit session -- a different person, a different profession, a different industry each time. A fix plan reading those responses without anchoring to the client's actual website description would be absorbing three contradictory wrong identities as input.

The practical consequence: if you receive an entity disambiguation recommendation -- a suggested Wikidata entry, an Organization schema description, an About page rewrite -- check the language against your own website's home page meta description, H1, and About page. If the language in the fix matches what AI said about you rather than what you say about yourself, the fix needs revision before implementation.
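A rough way to automate the first pass of that comparison is to flag words in the proposed description that appear in the AI responses but nowhere in the site's own copy. The sketch below is an illustration under stated assumptions, not the method used in our audits: the function names and stopword list are invented for the example, the site copy is this article's summary of Race Data rather than text pulled from racedata.ca, and the AI responses are paraphrases of the misattributions described above.

```python
import re

# Small illustrative stopword list (assumption); tune for real use.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "to", "is", "as", "company"}


def content_words(text: str) -> set[str]:
    """Lowercased alphabetic words in the text, minus stopwords."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def terms_from_ai_only(proposed: str, site_copy: str,
                       ai_responses: list[str], brand: str = "") -> set[str]:
    """Words in `proposed` that occur in AI responses but not in the site copy.

    Brand-name words are ignored because they appear everywhere by definition.
    """
    ai_words = set().union(*(content_words(r) for r in ai_responses))
    candidates = (content_words(proposed)
                  - content_words(site_copy)
                  - content_words(brand))
    return candidates & ai_words


# Proposed Wikidata description quoted from the fix plan discussed above.
proposed = ("a Canadian data management and analytics company "
            "specializing in motorsport intelligence")
# Stand-in for the client's own copy (this article's summary, not the live site).
site_copy = ("Canadian database marketing firm serving clients in "
             "finance, media, retail, and tourism")
# Paraphrased misattributions, for illustration only.
ai_responses = ["Race Data is a motorsport data website",
                "Race Data Systems is a hardware company"]

flagged = terms_from_ai_only(proposed, site_copy, ai_responses, brand="Race Data")
print(flagged)  # {'motorsport'}
```

A non-empty result does not prove the fix is wrong; it marks the terms a person should trace back to the website before the description is published. Exact word matching is deliberately crude here and will miss paraphrased misattributions.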

Your own website copy is the ground truth. The AI's description of you is the thing being corrected. When a fix conflates the two, implementing it makes the misattribution permanent.

What both problems have in common

Both failures are downstream consequences of the same gap: no pre-check of structural prerequisites before recommendations begin.

The robots.txt ceiling goes unnoticed because no one checked whether the platform could even read the site. The misattribution gets absorbed into the fix plan because no one checked whether the AI's description of the client matched the client's actual business.

Neither shows up in a standard visibility score. A 4.2/10 Perplexity score with PerplexityBot blocked looks identical to a 4.2/10 score with PerplexityBot accessible. A fix plan with the right structure but the wrong industry framing reads as reasonable advice if you haven't independently verified the entity language.

These are pre-fix checks, not post-fix analysis. They run before the recommendations matter.

The order of operations for any fix plan that includes entity disambiguation or crawl-dependent recommendations: first, fetch robots.txt and confirm no AI crawler blocks are present. Second, read the fix plan's entity language against your own website. If the recommended Wikidata description doesn't match your home page's description of what you do, resolve the mismatch before you publish the entry.

Entity corrections are harder to reverse than they are to make. An incorrectly filed Wikidata entry anchoring a wrong industry can compound over time as other sources reference it. The motorsport framing for Race Data would have been wrong and would have propagated. Getting the underlying identity correct before you build external signals on top of it is not a stylistic consideration -- it is the fix.

A Signal Check at sourcepull.ca runs the pre-check automatically, including a robots.txt parse for AI crawler blocks. If the structural ceiling is present, it surfaces before any other recommendation. If your business has significant misattribution patterns, those appear in the citation data -- which is the input your team uses to verify any entity fix language against your actual business description.

The fixes work. The pre-checks make the fixes accurate.
