The Content Fix That Peer-Reviewed Research Actually Backs
Most AEO advice cites numbers with no attribution: "sites with schema get cited 3.2x more," "sameAs links improve citations by 67%." We've traced most of these figures back to their original sources. They're usually vendor blog posts citing other vendor blog posts.
One intervention has actual peer-reviewed research behind it. It involves adding a sentence or two to your service page. Most businesses haven't done it.
The Princeton study on generative engine optimization
In 2023, researchers at Princeton University ran a controlled study on what page modifications actually move the needle for AI search visibility. The paper -- "GEO: Generative Engine Optimization" (arXiv:2311.09735) -- was presented at KDD 2024, one of the top-tier conferences in data mining and machine learning. Peer-reviewed. Findable. Not a vendor white paper.
The researchers tested nine types of content modifications and measured their effect using a metric called Position-Adjusted Word Count (PAWC): how prominently the modified content appeared in AI-generated responses compared to a control. Six interventions showed meaningful positive results, including adding statistics, citing sources inline, and simplifying language.
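To make the metric concrete, here is a minimal Python sketch of a PAWC-style score. The exponential position decay, the example sentences, and the source labels are illustrative assumptions, not the paper's exact formula -- the point is only that attributed words count more when they appear earlier in the answer.

```python
import math

def position_adjusted_word_count(sentences, source_id):
    """Toy PAWC-style score: word counts of sentences attributed to
    source_id, weighted so sentences earlier in the answer count more.
    The exponential decay is an assumed stand-in, not the paper's formula."""
    total = len(sentences)
    score = 0.0
    for pos, (text, cited) in enumerate(sentences):
        if cited == source_id:
            weight = math.exp(-pos / total)  # earlier position -> heavier weight
            score += weight * len(text.split())
    return score

# Hypothetical AI answer: (sentence, source it is attributed to).
answer = [
    ("Acme Plumbing repairs burst pipes within two hours.", "acme.example"),
    ("Other providers quote next-day service.", "other.example"),
    ("Acme also offers a written workmanship guarantee.", "acme.example"),
]
score = position_adjusted_word_count(answer, "acme.example")
```

Under a scheme like this, getting the same words surfaced in the first sentence of an answer is worth more than getting them surfaced in the last -- which is why "prominence," not just "presence," is what the 41% figure measures.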
The largest effect came from something the paper calls "Quotation Addition" -- adding a credible, attributed quote to the page content. The improvement in PAWC: approximately 41%.
The next most effective intervention was adding statistics or quantitative data: approximately 30%.
These findings held across multiple generative engines, including Bing Chat and Perplexity. The mechanism is consistent with how LLMs evaluate credibility: training data is dense with attributed quotes. Academic papers cite experts. Journalism quotes sources. Wikipedia attributes claims. A page that does this reads as more authoritative to a model than one that doesn't.
What PAWC means -- and what it doesn't
Our methodology team flagged an important distinction when we surfaced this finding in our internal rec (filed 2026-05-08, Scout session 12).
PAWC measures how prominently your content appears in AI-generated text -- how much of your page gets surfaced and used in the answer. It is not a direct measure of URL citation rates.
Some vendors have taken the "+41%" figure and called it "+41% citation probability." That's imprecise. The original paper says +41% PAWC improvement. Your content gets used more visibly in what the AI tells users; whether your URL appears as a numbered citation is a separate question.
That distinction doesn't make the finding less useful. Visibility in AI answers -- having your framing appear in what the model says -- is influence, whether or not a link appears. But we won't inflate the number. Use the direction: attributed quotes are the highest single-intervention effect in the study. The exact magnitude is from 2023/early 2024, and LLM behavior has evolved. The directionality is still valid. The decimal isn't.
Structure comes first
The expert quote fix belongs in sequence, not in isolation. In our audits, it's a content-level optimization that sits on top of a structural foundation. Without the foundation, the content signal underperforms.
Entity consistency is the foundation. A business whose Organization schema uses "Acme Plumbing Inc." in the JSON-LD, "Acme Plumbing" in their website header, and "Acme Plumbing Solutions" on their Yelp profile has fragmented its entity across three separate strings. LLMs may not reliably merge these into one entity -- which means citations scatter or don't consolidate.
The mechanism here isn't the schema property itself. As we've documented in our knowledge base (schema-markup-effects.md, last updated 2026-04-30 based on practitioner research and platform statements from Microsoft and Google): what actually matters is entity cluster density. Having verified profiles at Wikidata, Crunchbase, Product Hunt, or major directories -- using your entity name consistently across all of them -- gives AI models multiple traversal paths to the same entity. The sameAs JSON-LD field only helps when there's a real, consistent profile at the other end of the link.
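As a sketch of what entity-consistent markup looks like in practice, the following builds an Organization JSON-LD block in Python with one canonical name string reused everywhere. The business name, URLs, and Wikidata ID are hypothetical examples, not real profiles.

```python
import json

# One canonical name string, reused in schema, site header, and profiles.
ENTITY_NAME = "Acme Plumbing Inc."  # hypothetical business, for illustration

organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": ENTITY_NAME,
    "url": "https://www.acmeplumbing.example",
    "sameAs": [
        # Each entry should point to a live, verified profile that
        # presents the entity under exactly ENTITY_NAME.
        "https://www.wikidata.org/wiki/Q00000000",  # hypothetical ID
        "https://www.crunchbase.com/organization/acme-plumbing",
    ],
}

json_ld = json.dumps(organization, indent=2)
```

Defining the name once and referencing it, rather than retyping it per page or per profile, is the cheapest way to avoid the "three separate strings" fragmentation described above.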
We've documented a recurring problem across three consecutive audits (methodology rec, 2026-05-02): Organization JSON-LD deploying with unfilled placeholder text in the sameAs array -- literal strings like "REPLACE WITH YOUR ACTUAL PROFILE URLS." A business pushing that schema isn't building entity authority; it's shipping invalid JSON-LD that fails Google's Rich Results Test and may get flagged as spam by schema validators. The instructions said to replace the placeholders before deploying, but clients following the steps in order deployed first.
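A pre-deploy check can catch this class of mistake before the schema ships. Below is a minimal heuristic sketch in Python; the placeholder patterns are assumptions and not an exhaustive validator.

```python
import re

# Assumed heuristic patterns for unfilled template text -- not exhaustive.
PLACEHOLDER = re.compile(r"replace|your[ _-]?actual|example\.com|todo|xxx",
                         re.IGNORECASE)

def sameas_problems(same_as):
    """Return (entry, reason) pairs for sameAs values that are not
    absolute URLs or that look like leftover template placeholders."""
    problems = []
    for entry in same_as:
        if not entry.startswith(("http://", "https://")):
            problems.append((entry, "not an absolute URL"))
        elif PLACEHOLDER.search(entry):
            problems.append((entry, "looks like placeholder text"))
    return problems
```

Running this against the sameAs array as a deployment gate would have stopped the literal "REPLACE WITH YOUR ACTUAL PROFILE URLS" strings from going live.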
The fix sequence that actually works: entity-consistent schema (no placeholders), verified directory presence, expert quote on your core service page -- in that order.
What makes a quote count
You don't need a recognized industry figure. The requirement is attribution: who said it, in what context.
A quote from a named client -- "John D., general contractor in Hamilton" -- describing a specific outcome works. A quote from an industry association or a certified professional in your field works. A reference to published guidance with the source named inline works.
What doesn't work: unattributed testimonials ("A client told us..."), vague praise with no attribution, or self-referential quotes from your own staff without external standing.
The quote should appear in the body of your primary service page -- not in a reviews section, not in a footer carousel. It should be adjacent to the specific claim it's supporting, and the attribution should be visible inline, not buried in a caption.
One sentence from a named external source, properly attributed, placed in your service page body. That's the fix.
Why this matters more than most schema changes
Schema gets recommended constantly in AEO content, often accompanied by numbers that don't trace to any study. Research from Search/Atlas in December 2024 found no correlation between schema markup coverage and AI citation rates across the sites it examined. That finding conflicts with practitioner consensus -- but it illustrates the point. Schema coverage alone isn't the lever.
What the evidence does support -- both the Princeton study and the practitioner-level understanding of entity cluster density -- is a consistent underlying principle: AI models favor content that signals credibility through attribution. External verification. Named sources. Consistent entity presentation across the web. Expert quotes are the most direct expression of that principle at the page-content level.
Schema consistency creates the entity surface. Verified directory presence creates the cluster. An attributed expert quote on your service page creates the content authority signal. Run them in sequence and you've addressed the three layers the research actually supports.
Check your baseline before you start
If you want to see where you stand before making changes, our Signal Check gives you a cross-platform AI visibility score across ChatGPT, Perplexity, Gemini, and Claude in about 60 seconds. It's free, no account required. The score shows exactly which platforms are citing you and which aren't -- so you can prioritize whether you're dealing with an entity problem, a content problem, or both.
See how your business scores on AI platforms.
Check your score — free