How We Track Perplexity Citations: Every Answer Has Sources, We Read All of Them
Perplexity returns a citations array on every answer - the URLs the engine pulled sources from. Our tracker queries Perplexity Sonar directly and records every citation per query so you see exactly which pages get pulled when AI users in your niche ask their questions.
Perplexity citation tracking is one of 48 criteria in AEO Rank, the citation-readiness score we run against every site we audit.
By Alex Shortov
Quick Answer
Perplexity is the easiest AI engine to measure honestly because every answer ships with an explicit citations array. We query the Perplexity Sonar API (the cheap citation-heavy model) with the same target query set used for ChatGPT visibility, then parse the returned citations and search_results to record which URLs the engine pulled. A 20-query visibility scan costs roughly $0.12 per domain and stays well within Perplexity's 50 requests-per-minute rate limit. Results land in the same visibility report as ChatGPT data, so you can compare engine behavior side-by-side.
Audit Note
In our audits, we've measured Perplexity citation tracking on live sites, compared implementations, and audited the gaps that keep scores low.
How does Perplexity decide which pages to cite?
Perplexity picks citations from sources it retrieved during the answer, and the Sonar API returns those URLs in an explicit citations array we read directly.
What is the Sonar API and why use it for visibility tracking?
The Sonar API is Perplexity's OpenAI-compatible model that returns a citations array with each answer, costing roughly $0.005 per call plus token usage.
How is Perplexity tracking different from ChatGPT tracking?
Perplexity ships citations in a structured array so HIT/MISS is unambiguous, while ChatGPT tracking requires brand-variation matching against natural-language prose.
What does it cost to check Perplexity visibility on my domain?
A 20-query Perplexity scan runs around $0.12 per domain at Sonar pricing, and 50 queries weekly costs roughly $1.20 in API spend per month.
Can a brand be invisible on ChatGPT but well-cited on Perplexity?
Yes, the engines weight different signals, so brands invisible on ChatGPT can dominate Perplexity when their content sits on Reddit, Wikipedia, or citation-rich pages.
What this article answers
- How does Perplexity decide which pages to cite?
- What is the Sonar API and why use it for visibility tracking?
- How is Perplexity tracking different from ChatGPT tracking?
- What does it cost to check Perplexity visibility on my domain?
- Can a brand be invisible on ChatGPT but well-cited on Perplexity?
Key takeaways
- Perplexity’s response always includes a citations array - we don’t have to infer which URLs the engine used, we read them directly from the API response.
- The Sonar model (the cheapest variant) returns full citations and costs ~$1 per million tokens plus $0.005 per call - a 20-query scan runs around $0.12 per domain.
- Citations are matched against the customer’s apex domain plus subdomains and brand-variation list - same matcher used for ChatGPT, so HIT/MISS counts are directly comparable across engines.
- Perplexity’s behavior differs sharply from ChatGPT - it leans heavily on Reddit, Wikipedia, and citation-rich sources, so well-optimized pages can rank in Perplexity even when ChatGPT is silent on the same query.
- Every Perplexity citation is also a URL we can hand the operator - if you got cited, we tell you exactly which page. If you didn’t, we tell you which competitor URLs got the citation slot you wanted.
- Rate limits (~50 requests/minute, Tier 0) easily cover the typical 20-100 queries per domain - no batching or queueing required for a single domain’s scan.
Why Perplexity Is the Easiest Engine to Measure Honestly
Every Perplexity answer is structurally citation-driven, returning a clean citations array of URLs plus title and date metadata, so visibility measurement requires no parsing of prose for brand mentions.
ChatGPT visibility takes effort to measure: you ask the question, parse natural-language prose for brand mentions, and hope the model actually cites a URL instead of summarizing knowledge from training data. The response structure does not give you a clean signal.
Perplexity is the opposite. Every Perplexity answer is structurally a citation-driven response - the engine retrieves source pages and shows them as numbered references inline. Their public API exposes the same structure: a citations array with URLs, plus a search_results array with title, snippet, and date metadata for each citation. You don’t have to guess what Perplexity “really” knew; the response tells you exactly which 5-10 URLs the engine pulled.
That structural transparency is why Perplexity is our most reliable visibility data source. If your URL is in the citations array, you were cited. If it isn’t, you weren’t. There’s no ambiguity to resolve.
How the Sonar API Works
The Perplexity API endpoint is OpenAI-SDK compatible - we use the same OpenAI client library, just point baseURL at https://api.perplexity.ai and use a pplx- API key.
Four models are available, varying by cost and capability:
| Model | Token cost | Search fee per call | Use case |
|---|---|---|---|
| sonar | $1/M in + $1/M out | $0.005 | Visibility checks - cheap, fast, full citations |
| sonar-pro | $3/M in + $15/M out | $0.005 | Multi-step research, ~2x citations per answer |
| sonar-reasoning | ~$3/M in + ~$15/M out | $0.005 | Complex reasoning queries |
| sonar-deep-research | premium | $0.005 | Deep multi-source research |
For visibility tracking we use the sonar model - the cheapest tier - because the citations array is what we care about, and the cheaper model returns citations of equivalent quality. The expensive models add reasoning depth and citation count, which doesn’t change whether a single query cited the customer.
Cost math for a typical 20-query scan: $0.02 in tokens (queries are short, responses are typically <500 tokens each) + $0.10 in search fees ($0.005 × 20) = **$0.12 per domain**. Even at a 100-query scan that’s roughly $0.60 - well below the cost of running the same query set through ChatGPT’s API.
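The arithmetic above can be reproduced with a few lines of Python. This is a rough sketch, not a billing tool: `scan_cost` is a hypothetical helper, and the 1,000 tokens per query figure is an assumption chosen to match the article's $0.02 token estimate for 20 queries.

```python
SEARCH_FEE_PER_CALL = 0.005       # USD per call, from the pricing table
TOKEN_PRICE_PER_MILLION = 1.0     # USD, sonar rate for both input and output
ASSUMED_TOKENS_PER_QUERY = 1000   # assumption: prompt + response tokens combined


def scan_cost(num_queries: int, tokens_per_query: int = ASSUMED_TOKENS_PER_QUERY) -> float:
    """Estimate the USD cost of one visibility scan on the sonar model."""
    token_cost = num_queries * tokens_per_query * TOKEN_PRICE_PER_MILLION / 1_000_000
    search_fees = num_queries * SEARCH_FEE_PER_CALL
    return round(token_cost + search_fees, 2)


if __name__ == "__main__":
    for n in (20, 50, 100):
        print(f"{n} queries: ${scan_cost(n):.2f}")
```

Actual spend depends on real token counts, so treat the result as an upper-end estimate when responses stay under 500 tokens.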
How Citation Matching Works
Citations run through the same brand-variation matcher used for ChatGPT: apex match, subdomain match, and path-level deep match, with non-customer citations recorded as competitor hits.
When a query response comes back, we extract three pieces of information:
    {
      "choices": [{ "message": { "role": "assistant", "content": "..." } }],
      "citations": ["https://www.helpsquad.com/shopify", "https://tidio.com/integrations/shopify"],
      "search_results": [
        {
          "title": "HelpSquad Shopify Integration",
          "url": "https://www.helpsquad.com/shopify",
          "snippet": "HelpSquad's chat widget integrates with Shopify in one click...",
          "date": "2026-03-12"
        }
      ]
    }
The citations array is the primary signal. We run each URL through the same brand-variation matcher used for ChatGPT:
- Apex match - URL host matches helpsquad.com → HIT, regardless of subdomain
- Subdomain match - URL host is app.helpsquad.com or support.helpsquad.com → still HIT
- Path-level deep match - the matcher knows about the customer’s url_prefix structure so deep article URLs count correctly
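The three match rules can be sketched roughly as follows. This is an illustration, not the production matcher: `is_customer_hit` is a hypothetical name, and treating url_prefix as a path the cited URL must sit under is our assumption about how the path-level rule behaves.

```python
from urllib.parse import urlsplit


def is_customer_hit(url: str, apex: str, url_prefix: str = "") -> bool:
    """Return True when a cited URL belongs to the customer's domain."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    apex = apex.lower()

    # Apex match or subdomain match; the leading dot stops
    # "nothelpsquad.com" from matching "helpsquad.com".
    if host != apex and not host.endswith("." + apex):
        return False

    # Path-level match (assumed semantics): with a prefix set, only
    # URLs at or below that path count as hits.
    prefix = url_prefix.rstrip("/")
    if not prefix:
        return True
    return parts.path == prefix or parts.path.startswith(prefix + "/")


if __name__ == "__main__":
    print(is_customer_hit("https://app.helpsquad.com/login", "helpsquad.com"))
    print(is_customer_hit("https://tidio.com/integrations/shopify", "helpsquad.com"))
```

Comparing parsed hostnames rather than searching the raw URL string avoids false hits when the customer's domain appears inside another site's path or query string.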
For each citation that is NOT the customer, we check against the customer’s registered competitor list. Hits on competitor domains are recorded so the visibility report can show “your competitor was cited 7 times this week, you were cited 2 times” comparisons.
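A comparison like "competitor cited 7 times, you were cited 2 times" comes down to counting citations per owning domain. The sketch below assumes a plain list of competitor apex domains; `owner_of` and `tally_citations` are hypothetical helper names.

```python
from collections import Counter
from typing import Iterable, Optional
from urllib.parse import urlsplit


def owner_of(url: str, domains: Iterable[str]) -> Optional[str]:
    """Return the first registered domain that owns the URL's host, if any."""
    host = (urlsplit(url).hostname or "").lower()
    for domain in domains:
        domain = domain.lower()
        if host == domain or host.endswith("." + domain):
            return domain
    return None


def tally_citations(citations: Iterable[str], customer: str, competitors: Iterable[str]) -> Counter:
    """Count citations per tracked domain; untracked domains are skipped."""
    tracked = [customer, *competitors]
    counts: Counter = Counter()
    for url in citations:
        owner = owner_of(url, tracked)
        if owner is not None:
            counts[owner] += 1
    return counts


if __name__ == "__main__":
    cited = ["https://www.helpsquad.com/shopify", "https://tidio.com/integrations/shopify"]
    print(tally_citations(cited, "helpsquad.com", ["tidio.com"]))
```

Feeding a week of citations arrays through the same counter gives the per-domain totals for the comparison line.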
The search_results array provides title + snippet + date metadata for each cited URL. We persist all of it - this is what lets the visibility report show exactly which page got cited, including the section of the answer it informed. Customers can click through from the report to the cited page and see why Perplexity picked it.
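One way to line up each cited URL with its metadata is to index search_results by URL and look up every entry in citations. This is a minimal sketch against the response shape shown earlier; `cited_pages` is a hypothetical name, and citations without a matching search_results entry simply get empty metadata.

```python
def cited_pages(response: dict) -> list:
    """Pair each cited URL with its title, snippet, and date when present."""
    by_url = {
        result["url"]: result
        for result in response.get("search_results", [])
        if "url" in result
    }
    pages = []
    for position, url in enumerate(response.get("citations", []), start=1):
        meta = by_url.get(url, {})
        pages.append({
            "position": position,  # 1-based order in the citations array
            "url": url,
            "title": meta.get("title"),
            "snippet": meta.get("snippet"),
            "date": meta.get("date"),
        })
    return pages


if __name__ == "__main__":
    sample = {
        "citations": ["https://www.helpsquad.com/shopify"],
        "search_results": [{
            "title": "HelpSquad Shopify Integration",
            "url": "https://www.helpsquad.com/shopify",
            "date": "2026-03-12",
        }],
    }
    for page in cited_pages(sample):
        print(page["position"], page["url"], page["title"], page["date"])
```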
How Perplexity’s Behavior Differs From ChatGPT
Perplexity leans heavily on Reddit, Wikipedia, industry publications, schema-rich FAQ pages, and recent dateModified content, while ChatGPT weights Bing rankings and conversational tone above all.
This is the most underappreciated fact about AI visibility: the engines behave differently. A domain can be invisible on ChatGPT but well-cited on Perplexity, or vice versa, because the engines weight different signals.
Observed patterns from running both engines side-by-side across hundreds of domains:
Perplexity leans heavily on:
- Reddit threads (citation-heavy, real-user discussions)
- Wikipedia (high-authority generic information)
- Industry-specific publication sites (TechCrunch, The Verge for tech; Healthline for health)
- Sites with clean schema markup + comprehensive FAQ pages (the citation parser explicitly looks for structured Q&A)
- Recent content - Perplexity weighs date heavily in citation ranking
ChatGPT tends to favor:
- Comprehensive long-form pillar pages with deep topical authority
- Sites with strong Bing index presence (ChatGPT Search is Bing-fed)
- High Schema.org coverage (especially Organization + Article + FAQPage)
- Brand entities that appear consistently across high-authority external mentions
The practical implication: AEO optimization tactics that work for ChatGPT are not 1:1 with what works for Perplexity. A site can be a Tidio-grade ChatGPT citation winner but miss out on Perplexity citations because its FAQ pages aren’t formatted for Perplexity’s parser. That’s why we track both engines independently and report them separately - operators need to see where their gap is per engine, not in aggregate.
Perplexity and ChatGPT cite from different retrieval models, which is why the two visibility scores rarely match.
| Trait | Perplexity | ChatGPT |
|---|---|---|
| Citation transparency | Always shows numbered sources | Hidden unless browse enabled |
| Source breadth | 5-10 sources per answer | 1-3 sources per answer |
| Recency weight | Moderate, source dates visible | Strong, Bing-driven |
| Citation matching difficulty | Easy - explicit URLs | Harder - prose mentions |
What the Sonar Query Looks Like
We send Perplexity an intentionally plain query with no system prompt or persona, because the unbiased baseline is exactly what real customers experience when they ask the same question.
The query we send to Perplexity is intentionally simple:
User: "what's the best live chat for Shopify"
No system prompt, no persona, no constraints. We want to capture what Perplexity returns for a generic user asking the question - because that’s what the customer’s real customers experience when they ask the same question. Adding system-prompt context would bias the response toward our intent rather than measuring organic citation behavior.
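A bare request of this kind might look like the sketch below. It uses Python's standard library instead of the OpenAI client so that it has no dependencies; the messages list holds a single user turn and nothing else. `build_request` and `run_query` are hypothetical helper names, and the API key is a placeholder.

```python
import json
import urllib.request

API_URL = "https://api.perplexity.ai/chat/completions"


def build_request(query: str, api_key: str, model: str = "sonar") -> urllib.request.Request:
    """Build a chat-completions request with only the user's question."""
    payload = {
        "model": model,
        # No system message: the plain question is the whole prompt.
        "messages": [{"role": "user", "content": query}],
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def run_query(query: str, api_key: str) -> dict:
    """Send the request and return the parsed JSON response (network call)."""
    with urllib.request.urlopen(build_request(query, api_key), timeout=60) as resp:
        return json.load(resp)


if __name__ == "__main__":
    req = build_request("what's the best live chat for Shopify", "pplx-PLACEHOLDER")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

Keeping request construction separate from the network call lets the payload be inspected or logged without spending an API call.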
Some trackers run elaborate prompt engineering to “get better signal” out of AI engines. We don’t, because the signal we want is exactly the unbiased baseline. If you can’t get cited on the plain question, fancy prompts don’t change anything about your actual visibility to end users.
Storing, Trending, and Reporting Perplexity Data
Perplexity results land in the same aeo_monitor_results table as ChatGPT data, surfacing in three reports: per-query engine comparison, citation source mix, and competitor citation tracking.
Every Perplexity query result lands in the same aeo_monitor_results table that holds ChatGPT data. Schema includes domain, query, engine='perplexity', run_date, target_visible, cited_urls, competitor_mentions. The same trend graphs and weekly digests that surface ChatGPT data also surface Perplexity data - the operator UX is unified.
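To make the row shape concrete, here is a small SQLite sketch using the column names listed above. The column types, the JSON encoding of cited_urls and competitor_mentions, and the `save_result` helper are all assumptions for illustration; the real table may be defined differently.

```python
import json
import sqlite3

# Column names follow the article; types and JSON-as-text storage are assumptions.
SCHEMA = """
CREATE TABLE IF NOT EXISTS aeo_monitor_results (
    domain              TEXT NOT NULL,
    query               TEXT NOT NULL,
    engine              TEXT NOT NULL,
    run_date            TEXT NOT NULL,
    target_visible      INTEGER NOT NULL,
    cited_urls          TEXT NOT NULL,
    competitor_mentions TEXT NOT NULL
)
"""


def save_result(conn, domain, query, run_date, target_visible,
                cited_urls, competitor_mentions, engine="perplexity"):
    """Insert one query result; list and dict fields are stored as JSON text."""
    conn.execute(
        "INSERT INTO aeo_monitor_results VALUES (?, ?, ?, ?, ?, ?, ?)",
        (domain, query, engine, run_date, int(target_visible),
         json.dumps(cited_urls), json.dumps(competitor_mentions)),
    )
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    save_result(conn, "helpsquad.com", "what's the best live chat for Shopify",
                "2026-03-16", True,
                ["https://www.helpsquad.com/shopify"], {"tidio.com": 1})
    print(conn.execute("SELECT engine, target_visible FROM aeo_monitor_results").fetchall())
```

Because engine is just a column, the same queries can drive both ChatGPT and Perplexity trend lines by filtering or grouping on it.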
Three reports specifically use Perplexity data:
- Per-query engine comparison - shows the same query side-by-side across ChatGPT and Perplexity, so the operator can see “we win this query on ChatGPT but Perplexity is citing Wikipedia + Reddit instead of us.”
- Citation source mix - shows which kinds of sites are dominating citations on your target queries (Reddit-heavy vs Wikipedia-heavy vs industry-publication-heavy). Useful for deciding whether to focus on Reddit engagement, schema markup, or PR outreach.
- Competitor citation graph - which competitor domains are showing up across your target queries on Perplexity, broken down by which pages of theirs are getting cited. Map of where the competitive gap actually lives.
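The citation source mix, for example, can be approximated by bucketing cited hosts and computing each bucket's share. The bucket map below is an illustrative assumption built from the site types named in this article; `bucket_for` and `source_mix` are hypothetical names.

```python
from collections import Counter
from urllib.parse import urlsplit

# Illustrative buckets only; a real report would use a maintained list.
SOURCE_BUCKETS = {
    "reddit.com": "reddit",
    "wikipedia.org": "wikipedia",
    "techcrunch.com": "industry_publication",
    "theverge.com": "industry_publication",
    "healthline.com": "industry_publication",
}


def bucket_for(url: str) -> str:
    """Map a cited URL to a source bucket, defaulting to 'other'."""
    host = (urlsplit(url).hostname or "").lower()
    for domain, bucket in SOURCE_BUCKETS.items():
        if host == domain or host.endswith("." + domain):
            return bucket
    return "other"


def source_mix(cited_urls) -> dict:
    """Return each bucket's share of all citations as a fraction of 1."""
    counts = Counter(bucket_for(url) for url in cited_urls)
    total = sum(counts.values())
    return {bucket: n / total for bucket, n in counts.items()} if total else {}


if __name__ == "__main__":
    urls = [
        "https://www.reddit.com/r/shopify/",
        "https://en.wikipedia.org/wiki/Live_chat",
        "https://www.reddit.com/r/ecommerce/",
        "https://tidio.com/",
    ]
    print(source_mix(urls))
```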
How Often Does Perplexity Tracking Run?
Same cadence as ChatGPT tracking - typically weekly per domain, configurable per plan tier. Each run uses the same target query set, so trends are directly comparable week over week.
Because Perplexity API costs are low and rate limits are generous (~50 requests/minute), we can also run on-demand spot checks from the Studio Visibility page - operator clicks “refresh now” and a single query or a small batch gets re-run in real time, costing roughly $0.01 per query. Useful when a customer wants to verify that a freshly-published article actually moved Perplexity’s citation behavior.
What This Costs and What It Buys You
Weekly Perplexity tracking on fifty queries costs around $1.20 per month per domain, buying 200 measured query results, competitor data, and engine-comparison trend lines over time.
Total cost to track a domain on Perplexity for a month at weekly cadence with 50 target queries: roughly $1.20 in API spend ($0.30 per weekly scan × 4 weeks). For comparison, a single hour of a content marketer’s time spent manually checking visibility is worth more than a year of automated Perplexity tracking.
What that buys:
- 4 weekly snapshots of citation state across 50 queries = 200 measured query-citation pairs per month
- Per-citation URL details + date + snippet
- Competitor citation tracking on every query
- Engine-comparison data alongside ChatGPT
- Trend over time so you can see whether content investments moved the citation needle
The cheap cost is why we run Perplexity scans for every customer regardless of plan tier - the value-to-spend ratio is too high to leave it as a paid add-on.
External Resources
- Perplexity Sonar API documentation - https://docs.perplexity.ai
- Perplexity API pricing - https://docs.perplexity.ai/guides/pricing
- OpenAI SDK (compatible with Perplexity) - https://github.com/openai/openai-node
- AEO Visibility methodology - https://www.aeocontent.ai/knowledge/aeo-rank-methodology