Methodology

The math, the sources, what we don't do.

Most "AI visibility" tools grade their own homework with synthetic data and call it a "score". We sample real answer-engine responses across six engines with transparent statistics. Here's exactly how.

The funnel

Three signals per probe, parsed from each engine's actual response shape:

Retrieved
Your domain appeared in the engine's web-search / RAG retrieval results. This is the engine's "considered" set — pages it could have linked to.
Cited
The engine actually linked your domain in its answer or Sources list. The visible attribution most users will click.
Mentioned
Your brand name appears in the prose — with or without a link. Catches "memorized" mentions from training data.

The gaps between these three numbers are the actionable signals. Mentioned high, cited low = engines name your brand but link elsewhere, so they vouch for you via third-party pages or training-data recall rather than your own site. Retrieved low = your own pages aren't even being considered.
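As a rough illustration of the three signals, here is a minimal sketch of how one probe might be classified. It assumes the retrieved URLs, cited URLs, and answer text have already been parsed out of the engine response. The names `FunnelSignals`, `host_matches`, and `classify_probe` are hypothetical, and the word-boundary brand match is a simplification that would not cover every brand spelling.

```python
import re
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class FunnelSignals:
    retrieved: bool  # domain appeared in the engine's retrieval results
    cited: bool      # domain was linked in the answer or Sources list
    mentioned: bool  # brand name appeared in the prose


def host_matches(url: str, domain: str) -> bool:
    """True if the URL's host is the domain itself or one of its subdomains."""
    host = (urlparse(url).hostname or "").lower()
    domain = domain.lower()
    return host == domain or host.endswith("." + domain)


def classify_probe(domain, brand, retrieved_urls, cited_urls, answer_text):
    """Derive the three funnel signals for a single probe (illustrative only)."""
    # Case-insensitive whole-word match; an assumption, not the tool's actual rule.
    mention = re.search(r"\b" + re.escape(brand) + r"\b", answer_text, re.IGNORECASE)
    return FunnelSignals(
        retrieved=any(host_matches(u, domain) for u in retrieved_urls),
        cited=any(host_matches(u, domain) for u in cited_urls),
        mentioned=mention is not None,
    )


signals = classify_probe(
    domain="acme.com",
    brand="Acme",
    retrieved_urls=["https://www.acme.com/pricing", "https://example.org/review"],
    cited_urls=["https://example.org/review"],
    answer_text="Many teams use Acme for this.",
)
print(signals)
```

In this made-up probe the domain was retrieved and the brand was mentioned, but the only link went to a third-party page, which is the "mentioned high, cited low" pattern described above.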

Wilson 95% confidence intervals

Every rate we report is a sampled proportion (e.g. 6 of 24 probes cited your domain). We report the Wilson score interval at 95% — a small-sample-friendly binomial confidence interval that correctly handles edge cases at 0% and 100% (where the normal-approximation interval breaks).
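For readers who want to check the numbers, the following is a minimal sketch of the standard Wilson score interval, using z = 1.96 for the 95% level. The function name `wilson_interval` is our own for illustration and is not necessarily what the tool calls it internally.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 is ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")
    p = successes / trials
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    # Clamp to [0, 1] to guard against floating-point drift at the edges.
    return max(0.0, center - half), min(1.0, center + half)


low, high = wilson_interval(6, 24)
print(f"6/24 cited: {low:.3f} to {high:.3f}")

zero_low, zero_high = wilson_interval(0, 24)
print(f"0/24 cited: {zero_low:.3f} to {zero_high:.3f}")
```

For the 6-of-24 example the interval runs from roughly 12% to 45%. For 0 of 24 the upper bound stays well above zero, whereas the normal-approximation interval would collapse to a zero-width range at 0%.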

When two runs are compared, a delta is flagged significant only when the two Wilson CIs don't overlap. This is a deliberately conservative test: any smaller difference is treated as indistinguishable from sampling noise. Increase --runs to tighten the bounds.
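The non-overlap rule can be sketched in a few lines. This assumes each run's interval is already available as a `(low, high)` tuple; the helper names `intervals_overlap` and `is_significant_delta` are hypothetical, and the second "after" interval is an illustrative value rather than a measured result.

```python
def intervals_overlap(a: tuple[float, float], b: tuple[float, float]) -> bool:
    """True if two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


def is_significant_delta(ci_before: tuple[float, float], ci_after: tuple[float, float]) -> bool:
    """Flag a change only when the two confidence intervals do not overlap."""
    return not intervals_overlap(ci_before, ci_after)


before = (0.120, 0.449)        # approx. Wilson 95% interval for 6 of 24
after_big = (0.551, 0.880)     # approx. Wilson 95% interval for 18 of 24
after_small = (0.245, 0.612)   # illustrative interval that overlaps `before`

print(is_significant_delta(before, after_big))
print(is_significant_delta(before, after_small))
```

Requiring non-overlap is stricter than a direct two-proportion test at the same level, so some real but modest changes will go unflagged until more runs narrow the intervals.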

The proven GEO levers

The only peer-reviewed evidence base for what actually moves AI-answer-engine citation is the KDD'24 paper Generative Engine Optimization (Aggarwal et al., Princeton / Georgia Tech / Allen AI). It found measurable lifts from:

Our generated content briefs use exactly these levers and explicitly avoid the snake-oil ones: no llms.txt claims and no "schema-as-citation-lever" pitches, since neither has causal evidence behind it.

Engines, per-engine

Claude (Sonnet) + WebSearch
Routed through the authenticated claude CLI; stream-JSON output gives us the exact WebSearch tool-call queries and result URLs.
OpenAI (gpt-4o) + Responses web_search
POST to /v1/responses with the built-in web_search tool; citations come back as URL annotations on output_text content.
Gemini (2.0-flash) + google_search grounding
POST to generativelanguage.googleapis.com; grounding_metadata exposes the web URIs the model relied on.
Perplexity Sonar
POST to Sonar /chat/completions; the top-level citations array is the live retrieval/citation set.
Google AI Overviews · via SerpApi
Two-step flow: regular Google SERP, then engine=google_ai_overview with the page_token when needed. text_blocks + references map cleanly to our funnel. When no AIO is shown for a query we record the probe as an error, so rates are computed only over queries that actually triggered an AIO.
Bing / Copilot · via SerpApi
Defensive extraction across generative_search, copilot_answer, ai_answer, and instant_answer. Same honest framing — when no AI artifact is returned, no fake probe is recorded.
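To make "parsed from each engine's actual response shape" concrete, here is a hedged sketch of citation extraction for two of the engines above, working on already-decoded JSON. The field names (`output`, `content`, `annotations`, `url_citation`, `citations`) follow the descriptions in this section and the public API shapes as we understand them, so check them against the current API references. The function names and sample payloads are hand-written for illustration and are not real responses.

```python
def openai_cited_urls(response: dict) -> list[str]:
    """Collect URL annotations attached to output_text parts of a Responses payload."""
    urls = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue  # skip tool-call items and other non-message output
        for part in item.get("content", []):
            if part.get("type") != "output_text":
                continue
            for ann in part.get("annotations", []):
                if ann.get("type") == "url_citation" and ann.get("url"):
                    urls.append(ann["url"])
    return urls


def perplexity_cited_urls(response: dict) -> list[str]:
    """Read the top-level citations array of a Sonar chat-completions payload."""
    return [u for u in response.get("citations", []) if isinstance(u, str)]


# Hand-written sample payloads (assumed shapes, trimmed to the relevant fields).
openai_sample = {
    "output": [
        {"type": "web_search_call", "status": "completed"},
        {
            "type": "message",
            "content": [
                {
                    "type": "output_text",
                    "text": "Acme is a common choice.",
                    "annotations": [
                        {"type": "url_citation", "url": "https://acme.com/docs"}
                    ],
                }
            ],
        },
    ]
}
perplexity_sample = {
    "citations": ["https://example.org/review", "https://acme.com/pricing"],
    "choices": [{"message": {"content": "Acme is a common choice."}}],
}

print(openai_cited_urls(openai_sample))
print(perplexity_cited_urls(perplexity_sample))
```

Using `.get` with empty defaults keeps the extraction defensive: a response with no annotations or no citations array yields an empty list instead of raising.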

What we don't do