Deep Dive · 5 min read · 2026-04-11

How ChatGPT Decides Which Businesses to Recommend

ChatGPT doesn't rank websites — it synthesizes information from training data and real-time retrieval, and the businesses it recommends are the ones that left the clearest signal.

Understanding this mechanism is the difference between optimizing for the old internet and optimizing for the new one.

Training data vs. retrieval

ChatGPT and similar models can know about your business in two ways: it appeared in their training data, or they can retrieve information about it in real time through web browsing or retrieval-augmented generation.

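For most local businesses, real-time retrieval is the more actionable lever: you can't change what a model has already been trained on, but you can change what it finds when it looks you up today. This means your site needs to be crawlable, structured, and explicit about what you do and where you do it.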

What makes a citation

When ChatGPT recommends a business, it's pattern matching across the signals it can read. The strongest signals are: schema markup (LocalBusiness, Service, Product), structured entity profiles (Google Business Profile, LinkedIn, industry directories), and explicit content that directly answers the queries people ask.

A plumber whose site says "expert plumbing" is less citable than one whose site says "emergency plumbing services in Burlington, Ontario — licensed, insured, 24/7."
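
As a rough sketch of what schema markup for the plumber above could look like, the Python below builds a schema.org JSON-LD block and wraps it in a script tag for the page's HTML. The business name, phone number, and URL are made-up placeholders, the helper names are hypothetical, and the chosen fields are an assumption about what is useful, not a complete or required set.

```python
import json


def local_business_jsonld(name, service, city, region, telephone, url):
    """Build an illustrative schema.org JSON-LD dict for a local plumber."""
    return {
        "@context": "https://schema.org",
        "@type": "Plumber",  # schema.org subtype of LocalBusiness
        "name": name,
        "description": f"{service} in {city}, {region}",
        "url": url,
        "telephone": telephone,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": city,
            "addressRegion": region,
        },
        "areaServed": city,
        # Assumed way of expressing round-the-clock availability.
        "openingHours": "Mo-Su 00:00-23:59",
    }


def as_script_tag(data):
    """Wrap the JSON-LD dict in the script tag that goes in the page HTML."""
    body = json.dumps(data, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values for illustration only.
markup = local_business_jsonld(
    name="Example Plumbing Co.",
    service="Emergency plumbing services",
    city="Burlington",
    region="ON",
    telephone="+1-555-555-0100",
    url="https://example.com",
)
print(as_script_tag(markup))
```

The point is the same as in the prose: the service, the city, and the hours are stated as explicit fields instead of being left for a reader, human or machine, to infer.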

The GPTBot crawl problem

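By default, nothing stops GPTBot, OpenAI's crawler, from reading your site. But many sites block it by accident with overly broad robots.txt rules, such as a blanket "Disallow: /" that applies to every user agent. OpenAI also documents a separate user agent, OAI-SearchBot, for ChatGPT's search results, and a blanket rule shuts that out as well. If OpenAI's crawlers can't read your site, ChatGPT can't cite you.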

One of the first checks Sourcepull runs is GPTBot accessibility. It's a common, easy-to-fix issue that blocks AI citations entirely.
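
A minimal sketch of this kind of accessibility check, using Python's standard-library robots.txt parser, is shown below. It is not Sourcepull's implementation. The can_crawl helper, the two sample robots.txt files, and the example.com URL are all hypothetical, and the check only covers robots.txt rules, not firewalls or other blocking.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# An overly broad rule: blocks every crawler, GPTBot included.
BLANKET_BLOCK = """\
User-agent: *
Disallow: /
"""

# One possible fix: allow GPTBot explicitly, restrict others only where needed.
FIXED = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /admin/
"""

page = "https://example.com/services"  # placeholder URL
print(can_crawl(BLANKET_BLOCK, "GPTBot", page))  # False
print(can_crawl(FIXED, "GPTBot", page))          # True
```

To test a live site, the same parser can load the file directly with set_url() and read(). The inline strings here just keep the example runnable offline.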

Perplexity is different

Perplexity does live web search on every query. This means your domain authority and page structure matter more there. But the principle is the same: clear, structured, specific content wins over vague marketing language every time.
