Engine Optimization Criterion #502

How AEO Articles Are Built: 5 Phases, 50+ Blocks, 80-Point Quality Gate

AEO Content AI generates articles in phases: intelligence collection, evidence hydration, batched section writing, deterministic scoring, and self-repair. Every article that ships has scored 80 or higher on the AEOPageRank rubric; anything that scores lower stays in the pipeline.

One of 48 criteria in AEO Rank, the citation-readiness score we run against every site we audit.

By Alex Shortov

low effort high impact

Quick Answer

Articles are produced by a phased pipeline that starts with web intelligence from 10 source types (NewsAPI, Hacker News, Reddit, YouTube, Substack, Medium, Quora, Podcasts, Google Scholar, and gov/edu domains), distills it into a normalized evidence library, then writes sections in parallel batches of 6 blocks. Every article is scored against the 5-pillar AEOPageRank rubric (Content Originality, Content Uniqueness, Extractability, Entity Richness, Structural Signals). Anything under 80 enters a self-repair loop with up to 3 rewrites before a human ever sees it.

Audit Note

In our audits, we've measured this criterion on live sites, compared implementations, and audited the gaps that keep scores low.

How does AEO Content AI generate an article from a single topic?

AEO Content AI runs a 5-phase pipeline: create article, hydrate evidence library, batched section writing, repair below 80, then finalize the published HTML.

What stops AI-generated articles from sounding generic?

Articles avoid generic output by pulling intelligence from 10 source types in two waves, then anchoring every claim to a stable evidence id in the citation registry.

How is article quality measured before publishing?

Quality runs through AEOPageRank's 5 pillars and 17 deterministic checks, with anything below 80 triggering deterministic fixes followed by up to 3 LLM rewrites.

Why phased generation instead of one big prompt?

Phased generation beats one-big-prompt because each phase owns one job, prevents continuity loss across sections, and isolates failures into recoverable database states.

What is the evidence library and how does it prevent fabrication?

The evidence library hydrates once from knowledge items, web intelligence, and the corpus, then stays in context so claims cannot fabricate beyond validated sources.

[Infographic: Article Generation Pipeline. Stages: Gather Intelligence, Hydrate Evidence, Batched Writing, Score & Repair, Finalize. Illustrates the AEO Rank criterion discussed in this article.]

What this article answers

  • How does AEO Content AI generate an article from a single topic?
  • What stops AI-generated articles from sounding generic?
  • How is article quality measured before publishing?
  • Why phased generation instead of one big prompt?
  • What is the evidence library and how does it prevent fabrication?

Key takeaways

  • Generation runs in 5 phases - create article, hydrate evidence library, batched section writing, repair below 80, finalize - so each phase has one clear job and one clean handoff.
  • Intelligence is collected from 10 web source types in two waves before a single sentence is written - articles cite real data sources by name, not training-data residue.
  • The evidence library is hydrated once and stays in context across every section, so every claim is anchored to a stable evidence id that the backend can validate against a citation registry.
  • Sections write in batches of 6 (default) with a continuity summary, so the writer always knows what was already said. This prevents the “same fact restated 8 times” failure mode of single-prompt generation.
  • AEOPageRank scores 5 pillars (Originality 25, Uniqueness 25, Extractability 25, Entity Richness 15, Structural 10) - anything below 80 triggers deterministic fixes, then up to 3 LLM rewrites before the article exits the pipeline.
  • All generation runs through Claude Code CLI inside a managed terminal-server worker, never the Anthropic SDK directly, so every article has a reproducible session log and our skill prompts ship via DB versioning.

Why Phased Generation Beats One-Big-Prompt

Single-prompt article generators restate facts, drift in structure, and fabricate citations, while five-phase generation gives each step a different failure mode and a different specialized fix.

Most AI-content tools work the same way under the hood: stuff the topic + a long instruction block + maybe some retrieval snippets into one prompt, hit Claude or GPT once, return the result. It looks magical for the first draft. By the third article you notice the failure modes: same facts restated five times because the model forgot what it said earlier in the same response, fabricated statistics with confident citations, paragraph structure that drifts halfway through, no way to enforce that the FAQ section actually pulls different questions than the body.

We split the work into 5 phases because each phase has a different shape of failure and a different shape of fix. Generating in one shot means a bad call at minute one corrupts every output downstream. Generating in phases means we can validate, score, and patch at every handoff - and if a phase fails we can re-run just that phase, not the whole article.

The phases:

# | Phase | What it produces | Why isolated
1 | aeo_create_article | Recipe-driven block sequence, evidence library, story plan, article brief | Heavy intelligence work happens once and the output is reusable
1.5 | Hydrate evidence library | Full corpus loaded into the writer’s working context | Every section sees the same evidence, no per-section retrieval drift
2 | Batch writing loop | Section contracts + written blocks, citation registry | Parallel writing with continuity, server-side citation validation
3 | Repair (if needed) | Re-written low-scoring blocks | Triggered only by failing AEOPageRank checks, not blanket regeneration
4 | Finalize | Article HTML + JSON-LD + review payload | Single atomic transition to review_status: 'in_review'

The writer is the same model across all phases - the difference is what context it sees and what action it is allowed to take at each step.

What Goes Into the Evidence Library?

The evidence library is the proprietary substrate that separates AEO articles from generic AI output. Before any section gets written, aeo_create_article assembles a normalized library from five sources:

  • Knowledge items - proprietary data, case studies, expert quotes already saved in aeo_knowledge_items for the client’s domain
  • Web corpus - the output of gatherIntelligence() - 10 web source types, condensed via Claude CLI, cached 24 hours
  • Visibility gaps - real questions where ChatGPT, Claude, or Perplexity failed to mention the client (pulled from aeo_monitor_results rows with target_visible = false)
  • Reddit questions - what real users in the niche are asking on Reddit, captured by the Reddit visibility tracker
  • Audited internal links - the verified set of same-domain link targets pulled from the live sitemap (so the writer cannot link to a 404)

Every item carries a stable id, library_type, source_url, citation_key, citation_mode, summary, condensed_text, and quotable_phrases. The writer composes sentences and supplies an evidence_manifest with the ids it used. The backend validates the manifest against the library + a citation registry that tracks which evidence and which internal links have already been used. LLM chooses, backend commits. A claim that cites an evidence id not in the library fails the manifest check and the block doesn’t save.

That contract is the difference between citation hallucination and verifiable sourcing. The model cannot just write “studies show 73% of marketers…” - the manifest would fail because there is no evidence id for that claim.
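
A minimal sketch of that manifest check, under stated assumptions: the field names id, source_url, citation_key, and evidence_manifest come from the description above, while the BlockDraft shape, the validateManifest function, and the sample ids are hypothetical and not the production code.

```typescript
// Hypothetical shapes: only id, source_url, citation_key and evidence_manifest
// are named in the article; everything else is an illustrative assumption.
interface EvidenceItem {
  id: string;
  source_url: string;
  citation_key: string;
}

interface BlockDraft {
  block_id: string;
  text: string;
  evidence_manifest: string[]; // evidence ids the writer says it used
}

type ManifestResult =
  | { ok: true }
  | { ok: false; unknownIds: string[] };

// The block is rejected if any manifest id is missing from the library.
function validateManifest(
  draft: BlockDraft,
  library: Map<string, EvidenceItem>,
): ManifestResult {
  const unknownIds = draft.evidence_manifest.filter((id) => !library.has(id));
  return unknownIds.length === 0 ? { ok: true } : { ok: false, unknownIds };
}

// Sample data (made up for the example).
const library = new Map<string, EvidenceItem>([
  ["ev-1", { id: "ev-1", source_url: "https://example.com/a", citation_key: "a" }],
]);

const good = validateManifest(
  { block_id: "B-07", text: "...", evidence_manifest: ["ev-1"] },
  library,
);
const bad = validateManifest(
  { block_id: "B-07", text: "...", evidence_manifest: ["ev-1", "ev-99"] },
  library,
);
```

In this sketch the second draft fails because "ev-99" has no library entry, which mirrors the "studies show 73%" case: a claim with no evidence id behind it cannot produce a valid manifest.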

How Does Web Intelligence Actually Get Collected?

The gatherIntelligence function runs two waves across ten source types (the first parallel, the second sequential) in roughly four minutes per article, then caches results for 24 hours so later articles on the same topic can reuse them.

gatherIntelligence() in packages/mcp/src/lib/artifact-collect.ts runs 2 waves of source collection, sized to the data each source actually returns. Total elapsed time is around 4 minutes per article and the entire output caches for 24 hours, so subsequent articles on the same topic skip the network roundtrip.

Wave 1 (parallel, no API key needed): NewsAPI for recent news mentions on the topic + Hacker News via the free Algolia search API for technical-audience discussions.

Wave 2 (sequential, ScrapingDog rate-limited): Reddit threads, YouTube transcripts, Substack newsletters, Medium articles, Quora answers, podcasts, Google Scholar academic sources, and .gov / .edu domains. All eight share one SCRAPINGDOG_API_KEY for SERP + page scraping.

After fetch, the corpus is deduplicated by URL + title fuzzy-match, then condensed in batches of 5 via Claude CLI (not the Anthropic SDK - we run through CLI for session reproducibility). The condensed corpus is what becomes evidence library entries.
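
The dedupe-then-batch step can be sketched as follows. The article does not say which fuzzy-match method is used, so this example assumes a Jaccard similarity over title words with a 0.8 threshold; the CorpusItem shape, the function names, and the sample URLs are likewise hypothetical.

```typescript
// Hypothetical item shape; the real corpus entries carry more fields.
interface CorpusItem {
  url: string;
  title: string;
}

// Normalize a title into a set of lowercase word tokens.
function titleTokens(title: string): Set<string> {
  const words: string[] = title.toLowerCase().match(/[a-z0-9]+/g) ?? [];
  return new Set(words);
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|. An assumed stand-in for "fuzzy match".
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 && b.size === 0) return 1;
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared++;
  });
  return shared / (a.size + b.size - shared);
}

// Drop exact URL repeats first, then near-duplicate titles.
function dedupe(items: CorpusItem[], threshold = 0.8): CorpusItem[] {
  const kept: CorpusItem[] = [];
  const seenUrls = new Set<string>();
  for (const item of items) {
    if (seenUrls.has(item.url)) continue;
    const tokens = titleTokens(item.title);
    const nearDuplicate = kept.some(
      (k) => jaccard(tokens, titleTokens(k.title)) >= threshold,
    );
    if (nearDuplicate) continue;
    seenUrls.add(item.url);
    kept.push(item);
  }
  return kept;
}

// Split a list into fixed-size batches (5 per condensation call in the article).
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const rawItems: CorpusItem[] = [
  { url: "https://a.example/1", title: "Answer Engine Optimization Guide" },
  { url: "https://a.example/1", title: "Answer Engine Optimization Guide" },
  { url: "https://b.example/2", title: "Answer engine optimization: guide" },
  { url: "https://c.example/3", title: "Schema markup basics" },
];
const unique = dedupe(rawItems);
const batches = chunk([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12], 5);
```

Here the second item is dropped as a URL repeat and the third as a near-identical title, leaving two entries; twelve items split into batches of 5, 5, and 2.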

The whole intelligence run logs to aeo_task_log_threads / aeo_task_actions via the TaskLogger, so every article ships with a queryable trace of what was fetched, condensed, and used.

What Are the 50+ Blocks and Why So Many?

Articles assemble from 50-plus block types across nine categories so each unit fails independently, gets re-generated independently, and re-uses evidence with a deterministic citation registry.

A block is one self-contained unit of an article - the lede, the direct-answer paragraph, the FAQ, the comparison table, the conclusion. We maintain 50+ blocks (ids run through B-54) in the aeo_blocks table, grouped into 9 categories:

Category | Examples
Research & Context | Topic research, competitor scan, audience profile
Structure | Outline, section briefs
Core Writing | Lede, body sections (B-07 variants), conclusion
CTA & Conversion | Inline CTA (B-23), section CTA (B-29)
Quality & Polish | Readability, fact-check, tone consistency
SEO & Optimization | Meta description, schema markup, keyword placement
Media | Image suggestions, alt text, hero image generation
Data & Evidence | Statistics, citations, original data
Enrichment | Expert quotes, case studies, FAQ (B-24)

Each block has its own sop_instruction (what it must contain) and delivery_notes (per-block structural rules - answer capsule format, contrarian-claim requirements, citation policy, etc.). Recipes assemble blocks into article shapes - a “comparison article” recipe pulls a different sequence than a “deep-dive guide” recipe. 12 recipes ship in aeo_recipes today, each defining block_ids[], default audience and goal, target word count, and hero style.

That modular shape is what lets the system pick the right structure per article type without re-prompting the LLM about format. The block library is the format library.
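
To make the recipe-to-block relationship concrete, here is a small sketch. The names sop_instruction, delivery_notes, and block_ids come from the text above; recipeName, target_word_count, the resolveRecipe helper, and the placeholder instruction strings are assumptions made for illustration and are not the actual aeo_blocks or aeo_recipes schema.

```typescript
// Hypothetical row shapes loosely modeled on the description above.
interface Block {
  id: string; // e.g. "B-07"
  category: string;
  sop_instruction: string; // what the block must contain
  delivery_notes: string; // per-block structural rules
}

interface Recipe {
  recipeName: string; // assumed field name
  block_ids: string[];
  target_word_count: number; // assumed field name
}

// Resolve a recipe into its ordered block sequence, failing loudly on
// an id that does not exist in the block table.
function resolveRecipe(recipe: Recipe, blocks: Map<string, Block>): Block[] {
  return recipe.block_ids.map((id) => {
    const block = blocks.get(id);
    if (!block) {
      throw new Error(`Recipe "${recipe.recipeName}" references unknown block ${id}`);
    }
    return block;
  });
}

// Placeholder content: the real instructions are not published in the article.
const blockTable = new Map<string, Block>([
  ["B-07", { id: "B-07", category: "Core Writing", sop_instruction: "(placeholder)", delivery_notes: "(placeholder)" }],
  ["B-23", { id: "B-23", category: "CTA & Conversion", sop_instruction: "(placeholder)", delivery_notes: "(placeholder)" }],
  ["B-24", { id: "B-24", category: "Enrichment", sop_instruction: "(placeholder)", delivery_notes: "(placeholder)" }],
]);

const comparisonRecipe: Recipe = {
  recipeName: "comparison article",
  block_ids: ["B-07", "B-23", "B-24"],
  target_word_count: 1800, // made-up number
};

const sequence = resolveRecipe(comparisonRecipe, blockTable);
```

The point of the design is that format lives in data: swapping the block_ids array changes the article shape without touching any prompt text.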

The phased article pipeline uses block categories that each fill a specific role in the citation-ready output.

Block Category | Examples | Purpose
Opening blocks | Bold lede, short answer | Answer-first capsule
Question H2s | Sub-question sections | Query-answer alignment
Evidence blocks | Tables, fact lists, quotes | Citation anchors
Engagement blocks | In-body CTAs | Conversion path
Closing blocks | FAQ, references, related articles | Trust and freshness

How Does the AEOPageRank Quality Gate Work?

AEOPageRank runs automatically after the last block, scoring rebuilt HTML across five pillars and 17 deterministic checks before any article ships to operator review.

The moment the last block in the recipe writes successfully, AEOPageRank scoring runs automatically against the rebuilt article HTML. The rubric is 5 pillars with 17 deterministic checks, each scored 0-10:

Pillar | Weight | What it measures
Content Originality | 25% | How much of the article reflects evidence-library content vs generic training-data restatement (corpus item reflection 40% or higher for 8 out of 10)
Content Uniqueness | 25% | Owned insight phrases (3+), named framework (acronym present), contrarian claims (2+), comparison table presence
Extractability | 25% | Answer capsules of 15-35 words (80%+ of body sections), no weak openers (90%+), atomic sentences (40%+), front-loading factor 1.4x
Entity & Data Richness | 15% | Entity density 15% or higher, fact density 6 or higher, evidence packaging (3+ data points with attribution), freshness signals
Structural Signals | 10% | Question-format H2s (70%+), definition tags (3+), JSON-LD schema present, subjectivity balance 30-50%
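
As a worked example of how the weights above could combine into one number, the sketch below assumes each pillar's score is the mean of its 0-10 checks, scaled by the pillar weight. The article gives the weights and the 0-10 check scale but not the aggregation rule, so the averaging step, the function name, and the sample check values are assumptions.

```typescript
type Pillar =
  | "originality"
  | "uniqueness"
  | "extractability"
  | "entityRichness"
  | "structural";

// Weights from the rubric table; they sum to 100.
const WEIGHTS: Record<Pillar, number> = {
  originality: 25,
  uniqueness: 25,
  extractability: 25,
  entityRichness: 15,
  structural: 10,
};

// Assumed aggregation: mean of a pillar's checks (each 0-10), scaled to its weight.
function pageScore(checks: Record<Pillar, number[]>): number {
  let total = 0;
  for (const pillar of Object.keys(WEIGHTS) as Pillar[]) {
    const scores = checks[pillar];
    const mean =
      scores.length === 0 ? 0 : scores.reduce((sum, x) => sum + x, 0) / scores.length;
    total += (mean / 10) * WEIGHTS[pillar];
  }
  return Math.round(total * 10) / 10; // one decimal place
}

// Made-up check results: 1 + 4 + 4 + 4 + 4 = 17 checks.
const articleScore = pageScore({
  originality: [8],
  uniqueness: [10, 8, 6, 8],
  extractability: [9, 9, 7, 7],
  entityRichness: [6, 8, 7, 7],
  structural: [10, 10, 5, 5],
});
// 20 + 20 + 20 + 10.5 + 7.5 = 78, which is below the 80-point gate.
```

Under these assumptions, an article averaging 8 out of 10 on the three heavy pillars still misses the gate if the two lighter pillars lag.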

Any article scoring 80 or higher passes and goes to review. Anything below 80 enters the improvement loop:

  1. Deterministic fixes - the server patches B-07 blocks programmatically (adds missing answer capsules, owned insight phrases, time tags, frameworks). No LLM call.
  2. Re-score - if still below 80, the article is flagged improvement_needed with the specific failing check ids.
  3. Agent rewrite - aeo_rewrite_block runs with corpus-aware instructions, maximum 3 iterations. Each iteration re-scores.

If after 3 iterations the article still sits below 80, it is held in the queue with improvement_needed = true and flagged for human review. We do not ship sub-80 articles silently.
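
The improvement loop above can be sketched as a small control-flow function. The threshold of 80, the cap of 3 rewrites, and the improvement_needed flag come from the text; the Article shape, the injected score, applyDeterministicFixes, and rewriteLowBlocks callbacks (the last standing in for aeo_rewrite_block), and the toy scorer are hypothetical.

```typescript
interface Article {
  html: string;
}

// Injected dependencies so the loop itself stays deterministic and testable.
interface RepairDeps {
  score: (a: Article) => number;
  applyDeterministicFixes: (a: Article) => Article; // no LLM call
  rewriteLowBlocks: (a: Article) => Article; // stand-in for aeo_rewrite_block
}

interface RepairOutcome {
  article: Article;
  score: number;
  improvement_needed: boolean;
  rewritesUsed: number;
}

const PASS_SCORE = 80;
const MAX_REWRITES = 3;

function runQualityGate(article: Article, deps: RepairDeps): RepairOutcome {
  let current = article;
  let score = deps.score(current);
  let rewritesUsed = 0;

  // Step 1: deterministic fixes, only when the first score fails.
  if (score < PASS_SCORE) {
    current = deps.applyDeterministicFixes(current);
    score = deps.score(current); // Step 2: re-score
  }

  // Step 3: up to three agent rewrites, re-scoring after each one.
  while (score < PASS_SCORE && rewritesUsed < MAX_REWRITES) {
    current = deps.rewriteLowBlocks(current);
    rewritesUsed++;
    score = deps.score(current);
  }

  return { article: current, score, improvement_needed: score < PASS_SCORE, rewritesUsed };
}

// Toy scorer: every "+" appended to the HTML is worth 4 points on top of 70.
const countFixes = (a: Article): number => (a.html.match(/\+/g) ?? []).length;
const improving: RepairDeps = {
  score: (a) => 70 + 4 * countFixes(a),
  applyDeterministicFixes: (a) => ({ html: a.html + "+" }),
  rewriteLowBlocks: (a) => ({ html: a.html + "+" }),
};

const recovered = runQualityGate({ html: "<article>" }, improving);
const stuck = runQualityGate({ html: "<article>" }, { ...improving, score: () => 60 });
```

With the toy scorer, the first article climbs 70, 74, 78, 82 and passes after two rewrites; the second never improves, exhausts all three rewrites, and comes back flagged for human review.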

How Does the Pipeline Handle Continuity Across Sections?

A continuity summary injected before each six-block batch plus a deterministic citation registry prevent same-fact restatement and duplicate definitions across long articles.

The single biggest weakness of one-big-prompt article generation is continuity loss within the same article. The model writes section 1, then section 2, then section 3 - and section 3 has no memory that section 1 already explained what the framework acronym means. Same facts restate. Same definitions repeat. Same statistics get cited 4 times.

We avoid this with two mechanics. First, aeo_prepare_section_batch (run before every batch of 6 blocks, by default) injects a continuity summary of what was already written into the contract for the next batch. Second, the writer carries the citation registry in context - a deterministic log of which evidence ids and internal links have already been used. The “ADJACENT CITATION HARD BAN” lives in the registry: the same evidence id cannot appear in two adjacent blocks.

Combined with the hydrated evidence library and per-block delivery notes, the result is an article where section 4 references “the contrarian claim we made in section 2” instead of restating it - and where the FAQ pulls different questions than the body H2s instead of duplicating them.
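
One way to picture the adjacent-citation rule is a registry that refuses to commit a block sharing an evidence id with the block written immediately before it. This is a sketch under assumptions: the WrittenBlock shape, the CitationRegistry class, and its method names are invented for illustration, and the real registry also tracks internal links, which are omitted here.

```typescript
interface WrittenBlock {
  block_id: string;
  evidence_ids: string[];
}

// Ids that appear both in the candidate and in the block written just before it.
function adjacentCitationViolations(
  previous: WrittenBlock | undefined,
  candidate: WrittenBlock,
): string[] {
  if (!previous) return [];
  const prior = new Set(previous.evidence_ids);
  return candidate.evidence_ids.filter((id) => prior.has(id));
}

// Hypothetical in-memory registry; commits only blocks with no violations.
class CitationRegistry {
  private readonly written: WrittenBlock[] = [];

  commit(block: WrittenBlock): string[] {
    const previous = this.written[this.written.length - 1];
    const violations = adjacentCitationViolations(previous, block);
    if (violations.length === 0) this.written.push(block);
    return violations;
  }

  // Every evidence id used so far, in first-use order, for the next batch's context.
  usedEvidenceIds(): string[] {
    const seen = new Set<string>();
    for (const block of this.written) {
      block.evidence_ids.forEach((id) => seen.add(id));
    }
    return Array.from(seen);
  }
}

const registry = new CitationRegistry();
const first = registry.commit({ block_id: "s1", evidence_ids: ["ev-1", "ev-2"] });
const rejected = registry.commit({ block_id: "s2", evidence_ids: ["ev-2", "ev-3"] });
const retried = registry.commit({ block_id: "s2", evidence_ids: ["ev-3"] });
const later = registry.commit({ block_id: "s3", evidence_ids: ["ev-1"] });
```

In this run the second block is rejected for reusing "ev-2" right after the first block, its retry with only "ev-3" is accepted, and "ev-1" is allowed again in the third block because it is no longer adjacent.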

What Happens If Something Fails Mid-Pipeline?

Every phase writes its state to the database before the next phase reads it. Failures are recoverable, not catastrophic:

  • Failure during aeo_create_article: the article job is marked failed with an error message, no partial article is exposed in Studio, and the credit is refunded.
  • Failure during section writing (Phase 2): the partial article persists and blocks already written stay in block_outputs. The agent can be resumed and picks up from blocks_remaining.
  • Failure during AEOPageRank scoring: the article is held with improvement_needed = true for operator inspection rather than auto-shipped.
  • Failure during finalize: review_status stays draft, the article shows up in Studio’s “needs attention” list.
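
The resume behavior for a section-writing failure can be sketched as a pure function over persisted state. The names block_outputs and blocks_remaining come from the list above; representing block_outputs as an id-to-HTML record, the helper names, and the placeholder block ids are assumptions.

```typescript
// Blocks still to write: everything in the recipe sequence with no saved output.
function blocksRemaining(
  blockSequence: string[],
  blockOutputs: Record<string, string>,
): string[] {
  return blockSequence.filter(
    (id) => !Object.prototype.hasOwnProperty.call(blockOutputs, id),
  );
}

// The next batch to hand to the writer (6 blocks by default, per the article).
function nextBatch(
  blockSequence: string[],
  blockOutputs: Record<string, string>,
  batchSize = 6,
): string[] {
  return blocksRemaining(blockSequence, blockOutputs).slice(0, batchSize);
}

// Placeholder ids: a 10-block recipe where the run died after two blocks.
const plannedSequence = [
  "B-01", "B-02", "B-03", "B-04", "B-05",
  "B-06", "B-07", "B-08", "B-09", "B-10",
];
const savedOutputs: Record<string, string> = {
  "B-01": "<p>lede</p>",
  "B-02": "<p>answer</p>",
};

const remaining = blocksRemaining(plannedSequence, savedOutputs);
const resumeBatch = nextBatch(plannedSequence, savedOutputs);
```

Because the remaining list is derived from what is already saved rather than from an in-memory cursor, a resumed agent cannot rewrite a block that was written before the failure.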

Every article job ties to an aeo_pipeline_executions row with full telemetry (timing, token spend, score trajectory). That row is what we trace when investigating an outlier.

How Customers Experience This

Customers see one button click and a finished article in 5-20 minutes scored above 80, with hero image attached and citations linking to evidence-library sources rather than fabricated references.

From the Studio side, every article looks like one button click. A customer queues a topic from Topic Ideas or the editorial calendar; the article appears 5 to 20 minutes later in their My Content tab, already scored 80+, with hero image attached, ready to review and publish. They never see the phase transitions, the batch boundaries, or the score re-runs. The system absorbs all of that complexity so the customer experience stays “topic in, finished article out.”

What they do see in their dashboard:

  • A consistent quality floor (80+ AEOPageRank) so they don’t need to grade every article themselves
  • Citation footnotes that link to real sources from the evidence library, not training-data fabrications
  • FAQ questions drawn from real Reddit threads in their niche, not invented
  • Internal links that resolve to pages on their own site (audited at evidence-library build time)
