technical overview
How Perathos Works
Perathos is a drop-in verification proxy. Your AI calls pass through it unchanged — but every response is examined by seven independent verifiers before delivery.
architecture
Seven verifiers run in parallel during the Judge phase. Total added latency ≈ slowest single verifier, not the sum.
In practice, the full seven-verifier pipeline takes 800ms–2s end-to-end, and the deterministic-only fast path (Symbolic Solver, Knowledge Graph, Schema Validator, Model Fingerprinter) takes under 100ms. The mode is configurable per request via the X-VRL-Mode header.
In full mode, every response passes through all seven verifiers, and their verdicts are aggregated using a weighted confidence model. Verifier 1 (Model Fingerprinter) establishes the identity of the source LLM before any claim is evaluated. Verifiers 2–7 evaluate whether the claims are true, current, and internally consistent.
The Verification Pipeline
Intercept & Fingerprint
Your SDK's base_url points to Perathos. The request is forwarded unchanged to your chosen LLM provider: same model, same parameters, same API format. When the response arrives, the Model Fingerprinter immediately identifies the source LLM from headers, response shape, and latency signature.
Extracts: provider, model_id, version, token_count, finish_reason, latency_ms — all bound to the VRL bundle
Claim Extraction
A fast, lightweight model reads the response and extracts every verifiable claim: factual assertions, mathematical expressions, structured data, and citations.
Output: typed claim list — [{id, type: 'mathematical|factual|structured|citation', text, context}]
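As a rough illustration of the typed claim list, the sketch below models one claim as a Python dataclass. It assumes the four type labels above are the complete set and that `context` is a plain string; the source specifies neither, and the sample claims are invented.

```python
from dataclasses import dataclass, asdict
from typing import Literal, get_args

# Assumption: these four labels are the complete set of claim types.
ClaimType = Literal["mathematical", "factual", "structured", "citation"]


@dataclass(frozen=True)
class Claim:
    id: str
    type: ClaimType
    text: str
    context: str  # assumed to be the sentence or passage around the claim

    def __post_init__(self) -> None:
        # Literal is not enforced at runtime, so check the label explicitly.
        if self.type not in get_args(ClaimType):
            raise ValueError(f"unknown claim type: {self.type!r}")


# Invented sample claims, for illustration only.
claims = [
    Claim("c1", "mathematical", "12 * 4 = 48", "Unit cost is 12, so 12 * 4 = 48."),
    Claim("c2", "factual", "The shipment has four units", "The shipment has four units."),
]
claim_list = [asdict(c) for c in claims]
```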
Parallel Verification
All seven verifiers run simultaneously. LLM-based verifiers use independent models; deterministic verifiers use SymPy and your knowledge base. No verifier sees another's output.
asyncio.gather() — full pipeline ≈ 800ms–2s; deterministic-only fast path < 100ms
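The latency claim (total ≈ slowest verifier, not the sum) follows from how `asyncio.gather()` schedules coroutines concurrently. The sketch below simulates this with `asyncio.sleep`; the verifier keys and delays are made up for illustration and are not Perathos measurements.

```python
import asyncio
import time

# Hypothetical verifier names mapped to simulated latencies in seconds.
SIMULATED_LATENCY = {
    "cross_examiner": 0.20,
    "hallucination_detector": 0.15,
    "symbolic_solver": 0.02,
    "schema_validator": 0.01,
}


async def run_verifier(name: str, delay: float) -> tuple[str, float]:
    await asyncio.sleep(delay)  # stands in for a model call or a lookup
    return name, 1.0  # placeholder score


async def verify_all() -> tuple[dict[str, float], float]:
    start = time.perf_counter()
    # gather() runs all coroutines concurrently and preserves input order.
    results = await asyncio.gather(
        *(run_verifier(name, delay) for name, delay in SIMULATED_LATENCY.items())
    )
    elapsed = time.perf_counter() - start
    return dict(results), elapsed


scores, elapsed = asyncio.run(verify_all())
print(f"elapsed {elapsed:.2f}s vs. sequential {sum(SIMULATED_LATENCY.values()):.2f}s")
```

Each simulated verifier receives only its own inputs, which mirrors the rule that no verifier sees another's output.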
Verdict Aggregation
A weighted confidence score is computed from all seven verifier signals. Deterministic failures (wrong math, model mismatch) trigger immediate BLOCK. LLM failures accumulate toward FLAG or BLOCK thresholds.
PASS ≥ 0.80 confidence | FLAG ≥ 0.50 and < 0.80 | BLOCK < 0.50 (configurable per deployment)
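One way to read the aggregation rule is sketched below. It assumes the weighted confidence is a weighted average of per-verifier scores and that a deterministic failure reports a confidence of 0.0; neither detail is stated in the source, and the `Signal` type and its fields are hypothetical.

```python
from dataclasses import dataclass

PASS_THRESHOLD = 0.80  # default thresholds quoted above
FLAG_THRESHOLD = 0.50


@dataclass
class Signal:
    verifier: str
    score: float         # 0.0 to 1.0
    weight: float        # hypothetical per-verifier weight
    deterministic: bool
    failed: bool = False


def aggregate(signals: list[Signal]) -> tuple[str, float]:
    # A deterministic failure blocks immediately, whatever the other scores are.
    if any(s.deterministic and s.failed for s in signals):
        return "BLOCK", 0.0  # assumption: confidence is reported as 0.0
    total_weight = sum(s.weight for s in signals)
    if total_weight <= 0:
        raise ValueError("at least one signal with positive weight is required")
    # Assumption: the weighted confidence model is a weighted average.
    confidence = sum(s.score * s.weight for s in signals) / total_weight
    if confidence >= PASS_THRESHOLD:
        return "PASS", confidence
    if confidence >= FLAG_THRESHOLD:
        return "FLAG", confidence
    return "BLOCK", confidence


verdict, confidence = aggregate([
    Signal("cross_examiner", 0.9, 2.0, deterministic=False),
    Signal("hallucination_detector", 0.6, 1.0, deterministic=False),
    Signal("symbolic_solver", 1.0, 1.0, deterministic=True),
])
```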
Proof Generation
The verdict is wrapped in a VRL Proof Bundle containing the evidence trail: claims, verifier scores, confidence weights, timestamps, and configured integrity fields where enabled.
verdict JSON -> VRL bundle -> retained according to the workspace evidence policy
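The actual VRL bundle schema is in the docs; the sketch below only illustrates the general idea of wrapping a verdict with a timestamp and an optional integrity field. The field names (`verdict`, `created_at`, `sha256`) and the choice of a SHA-256 digest over canonical JSON are assumptions, not the documented format.

```python
import hashlib
import json
from datetime import datetime, timezone


def _digest(body: dict) -> str:
    # Canonical form: sorted keys and fixed separators keep the digest stable.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_bundle(verdict: dict, include_integrity: bool = True) -> dict:
    bundle = {
        "verdict": verdict,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }
    if include_integrity:
        bundle["sha256"] = _digest(bundle)  # hypothetical integrity field
    return bundle


def digest_matches(bundle: dict) -> bool:
    # Recompute the digest over everything except the digest itself.
    body = {k: v for k, v in bundle.items() if k != "sha256"}
    return _digest(body) == bundle.get("sha256")


bundle = build_bundle({"status": "PASS", "confidence": 0.91})
```

A bare digest like this only detects accidental or naive modification; tamper resistance would need the digest to be signed or stored separately.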
Delivery
The response is returned in the exact format your SDK expects. PASS responses include the original content. BLOCK responses replace content with an explanation and verification context.
x-vrl-verdict header | x-vrl-bundle-id | x-vrl-confidence | x-vrl-model
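On the client side, the four headers above can be read from any HTTP response object that exposes a header mapping. The sketch below assumes the verdict is a plain string and the confidence a decimal number; the value formats are not specified in this overview, and the sample values are invented.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass
class VrlResult:
    verdict: str
    bundle_id: str
    confidence: float
    model: str


def parse_vrl_headers(headers: Mapping[str, str]) -> VrlResult:
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {key.lower(): value for key, value in headers.items()}
    return VrlResult(
        verdict=lowered["x-vrl-verdict"],
        bundle_id=lowered["x-vrl-bundle-id"],
        confidence=float(lowered["x-vrl-confidence"]),  # assumed decimal string
        model=lowered["x-vrl-model"],
    )


# Invented sample values, for illustration only.
result = parse_vrl_headers({
    "X-VRL-Verdict": "FLAG",
    "X-VRL-Bundle-Id": "bundle-123",
    "X-VRL-Confidence": "0.62",
    "X-VRL-Model": "example-model",
})
```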
Fast path vs. full pipeline
Verification is configurable per request. The full seven-verifier pipeline includes the Cross-Examiner and Hallucination Detector — LLM-based verifiers that catch prose-level confabulation but add 800ms–2s of latency. For agentic workflows that chain six or more LLM calls, that cost compounds. Perathos exposes a deterministic-only fast path for cases where latency matters more than open-ended hallucination detection.
The fast path runs four verifiers only: Symbolic Solver (mathematical), Knowledge Graph (structured factual lookups), Schema Validator (JSON/output structure), and Model Fingerprinter (source attribution). Total latency under 100ms. Use the fast path when responses are numerically or schema-constrained — landed-cost calculations, structured trade documents, tariff classifications, regulatory citation lookups against a configured knowledge base. Use the full pipeline when responses are open-ended natural-language — clinical summaries, legal opinions, research write-ups.
# Per-request mode selection
X-VRL-Mode: fast   # 4 deterministic verifiers, < 100ms
X-VRL-Mode: full   # all 7 verifiers, 800ms–2s (default)
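A minimal sketch of attaching the mode header from Python, using only the standard library. The proxy URL and request body are placeholders (the real endpoint comes from your deployment), and the request is constructed but never sent.

```python
import json
import urllib.request

VALID_MODES = {"fast", "full"}


def vrl_headers(mode: str = "full") -> dict[str, str]:
    # "full" is the default according to the text above.
    if mode not in VALID_MODES:
        raise ValueError(f"X-VRL-Mode must be one of {sorted(VALID_MODES)}")
    return {"X-VRL-Mode": mode, "Content-Type": "application/json"}


# Placeholder URL and payload: substitute your deployment's values.
payload = json.dumps({"model": "your-model", "messages": []}).encode("utf-8")
request = urllib.request.Request(
    "https://perathos.invalid/v1/chat/completions",  # hypothetical endpoint
    data=payload,
    headers=vrl_headers("fast"),
)
# The request object is only built here; nothing goes over the network.
```

Most provider SDKs also accept extra default headers at client construction, which is usually the simpler place to set the mode for a whole workflow.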
The Seven Verifiers
Weights shown are defaults. Enterprise deployments can adjust per-verifier weights to match their risk priorities.
The Model Fingerprinter identifies and records the source LLM (model family, version, and provider) before any claim is evaluated. It compares the declared model identity against a behavioral fingerprint and flags responses where the identity cannot be confirmed or where the declared model does not match observed behavior.
catches: Model substitution, version drift in production, undisclosed model changes by API providers.
The Cross-Examiner sends the response and extracted claims to a second model whose sole task is to find factual contradictions. It returns a per-claim PASS/FAIL/UNCERTAIN result with a confidence score.
catches: Confident confabulation, claims that don't survive independent restatement.
The Hallucination Detector is a fast model specifically prompted to identify hallucination signals: fabricated specifics, overconfident statistics, invented citations, implausible entity combinations.
catches: Fabricated citations, invented statistics, overconfident responses on ambiguous topics.
The Symbolic Solver uses SymPy to algebraically verify every mathematical equation and numerical assertion. If LHS ≠ RHS, the verdict is always BLOCK, with no threshold and no probability. This deterministic verdict cannot be overridden by other verifier scores.
catches: Arithmetic errors, incorrect formula application, invalid logical deductions.
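The LHS/RHS check can be sketched with SymPy as follows. This is a simplified illustration rather than the production verifier: it assumes each claim has already been split into two expression strings, and the helper names are hypothetical.

```python
import sympy as sp


def equation_holds(lhs: str, rhs: str) -> bool:
    # rational=True parses decimals as exact rationals, avoiding float rounding.
    # Note: sympify evaluates its input, so untrusted text needs sanitising first.
    left = sp.sympify(lhs, rational=True)
    right = sp.sympify(rhs, rational=True)
    # The two sides are equal if their difference simplifies to zero.
    return sp.simplify(left - right) == 0


def verdict_for(lhs: str, rhs: str) -> str:
    # Deterministic rule from the text: any mismatch is BLOCK, no threshold.
    return "PASS" if equation_holds(lhs, rhs) else "BLOCK"


print(verdict_for("2*(x + 1)", "2*x + 2"))
print(verdict_for("12*4", "46"))
```

For unusual expressions, `simplify` may fail to reduce a true identity to zero, so a stricter implementation would treat an inconclusive result separately from a proven mismatch.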
The Knowledge Graph verifier cross-references factual claims against structured knowledge graphs. It can be configured with domain-specific graphs: financial regulations, clinical drug databases, legal case law, trade tariff schedules.
catches: Fabricated regulatory citations, superseded regulations, incorrect entity relationships.
Detects claims that were true at training time but are no longer current. Compares dated assertions against a maintained recency index.
catches: Outdated regulatory citations, superseded guidance, stale market or clinical data.
The Schema Validator extracts structured output (JSON blocks, financial tables, API responses) and validates it against a JSON Schema or regex patterns you configure. AI extracts the structured blocks; deterministic code verifies each one.
catches: Malformed structured outputs, schema violations, proof tampering.
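The deterministic half of this check can be approximated with the standard library alone. The sketch below is a stand-in for real JSON Schema validation, which would normally use a dedicated validator. It assumes the JSON block has already been extracted, and the rule set and field names (`hs_code`, `duty_rate`) are hypothetical.

```python
import json
import re

# Hypothetical rules: field name -> (expected type, optional regex for strings).
RULES = {
    "hs_code": (str, re.compile(r"\d{4}\.\d{2}")),
    "duty_rate": (float, None),
}


def validate(raw: str, rules: dict = RULES) -> list[str]:
    """Return a list of violations; an empty list means the block is valid."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"malformed JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not an object"]
    errors = []
    for field, (expected, pattern) in rules.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            errors.append(f"wrong type for {field}")
        elif pattern is not None and not pattern.fullmatch(data[field]):
            errors.append(f"bad format for {field}")
    return errors


print(validate('{"hs_code": "8471.30", "duty_rate": 2.5}'))
print(validate('{"hs_code": "8471", "duty_rate": 2.5}'))
```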
Full API and schema documentation
VRL Proof Bundle schema reference, verdict threshold configuration, and API reference are in the docs.
See it in your environment
We build a proof of concept tailored to your use case — live, with your data.