API Reference

The Perathos API is OpenAI-compatible. Any SDK that supports a custom base_url will work without modification. All endpoints require a Perathos API key in the Authorization header.

Base URL & Authentication

Base URL:  https://api.perathos.com/v1

Authorization: Bearer pk_live_...   # Perathos API key in all requests

POST /v1/chat/completions

Drop-in replacement for the OpenAI Chat Completions endpoint. Request format is identical. Perathos routes the request through the provider configuration registered for your workspace, verifies the response, and returns it with additional headers.

Request headers

Authorization*

Bearer {PERATHOS_API_KEY}

Content-Type*

application/json

X-LLM-Provider-Key

Transitional header for onboarding only. Production workspaces should use server-side provider configuration rather than sending provider keys per request.

X-LLM-Model

Override the model identifier forwarded to your LLM provider. Defaults to the model field in the request body.

X-VRL-Flag-Below

Override the FLAG threshold for this request. Float 0.0–1.0. Default: 0.80.

X-VRL-Block-Below

Override the BLOCK threshold for this request. Float 0.0–1.0. Default: 0.50.
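As a rough illustration of how the two thresholds might interact, the sketch below maps a confidence score to a verdict. The function name verdict_for and the strict less-than comparisons are assumptions inferred from the header names and defaults; the reference does not specify boundary behaviour.

```python
def verdict_for(confidence: float,
                flag_below: float = 0.80,
                block_below: float = 0.50) -> str:
    """Map a confidence score to PASS / FLAG / BLOCK (assumed semantics)."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0.0, 1.0]")
    if block_below > flag_below:
        raise ValueError("block_below should not exceed flag_below")
    if confidence < block_below:
        return "BLOCK"  # below the BLOCK threshold
    if confidence < flag_below:
        return "FLAG"   # between the BLOCK and FLAG thresholds
    return "PASS"       # at or above the FLAG threshold
```

Under these assumptions, a score of 0.82 passes with the defaults but is flagged once the FLAG threshold is raised to 0.85.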

Request body

Standard OpenAI Chat Completions request body, with an optional perathos extension object.

{
  "model": "gpt-4o",
  "messages": [
    { "role": "system", "content": "You are a financial analyst." },
    { "role": "user", "content": "Summarise Basel III capital requirements." }
  ],
  "temperature": 0.3,
  "max_tokens": 500,

  // optional Perathos extension (comments are illustrative; JSON does not allow them, so strip them before sending)
  "perathos": {
    "verifiers": ["symbolic_solver", "knowledge_graph"],  // enable only specific verifiers
    "flag_below": 0.85,      // per-request threshold override
    "block_below": 0.60,
    "domain": "financial_regulations"  // hint to knowledge graph verifier
  }
}
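The request above could be assembled with nothing but the Python standard library, as in the sketch below. The helper name build_chat_request is hypothetical, and the key string is left elided as in the rest of this page. The request object is only constructed here, not sent.

```python
import json
import urllib.request

BASE_URL = "https://api.perathos.com/v1"


def build_chat_request(api_key: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a POST to the chat completions endpoint."""
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


body = {
    "model": "gpt-4o",
    "messages": [
        {"role": "system", "content": "You are a financial analyst."},
        {"role": "user", "content": "Summarise Basel III capital requirements."},
    ],
    "temperature": 0.3,
    "max_tokens": 500,
    # Optional Perathos extension object, as documented above.
    "perathos": {
        "verifiers": ["symbolic_solver", "knowledge_graph"],
        "flag_below": 0.85,
        "block_below": 0.60,
        "domain": "financial_regulations",
    },
}

req = build_chat_request("pk_live_...", body)
```

Passing req to urllib.request.urlopen would send it; any OpenAI-compatible SDK pointed at the same base URL should produce an equivalent request.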

Response headers

x-vrl-verdict

PASS | FLAG | BLOCK

x-vrl-model

Fingerprinted source model (provider/model-name)

x-vrl-bundle-id

VRL Proof Bundle ID. Use to retrieve the full bundle.

x-vrl-confidence

Aggregated confidence score, float 0.0–1.0

x-vrl-latency-ms

Verification latency added by Perathos, in milliseconds
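A client will usually want these headers in a typed form. The sketch below is one way to do that; VrlResult and parse_vrl_headers are hypothetical names, and treating every header except x-vrl-verdict as optional is an assumption rather than documented behaviour.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VrlResult:
    verdict: str
    confidence: Optional[float]
    bundle_id: Optional[str]
    model: Optional[str]
    latency_ms: Optional[float]


def parse_vrl_headers(headers: Mapping[str, str]) -> VrlResult:
    """Collect the x-vrl-* response headers into one object."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    h = {k.lower(): v for k, v in headers.items()}
    if "x-vrl-verdict" not in h:
        raise KeyError("x-vrl-verdict header missing")

    def as_float(name: str) -> Optional[float]:
        return float(h[name]) if name in h else None

    return VrlResult(
        verdict=h["x-vrl-verdict"],
        confidence=as_float("x-vrl-confidence"),
        bundle_id=h.get("x-vrl-bundle-id"),
        model=h.get("x-vrl-model"),
        latency_ms=as_float("x-vrl-latency-ms"),
    )
```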

Response body

Identical to the OpenAI Chat Completions response format. For BLOCK verdicts, choices[0].message.content is replaced with a block explanation and verification context.

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "model": "gpt-4o",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "Basel III requires banks to maintain..."  // BLOCK: explanation replaces this
    },
    "finish_reason": "stop"
  }],
  "usage": { "prompt_tokens": 28, "completion_tokens": 142, "total_tokens": 170 }
}

GET /v1/bundles/{bundle_id}

Retrieve the full VRL Proof Bundle by ID. Returns the bundle fields available for the deployment in scope, including verifier scores, extracted claims, timestamps, and configured proof or signature fields where enabled.

GET https://api.perathos.com/v1/bundles/bndl_01j9xk2m3n4p5q6r7s8t9u0v
Authorization: Bearer pk_live_...

# Response: 200 OK
# Body: VRL Proof Bundle object (see VRL Proof Bundle schema reference)

GET /v1/bundles

List bundles for your account. Supports pagination and filtering by verdict, date range, and model.

GET https://api.perathos.com/v1/bundles?verdict=BLOCK&limit=50&after=bndl_...
Authorization: Bearer pk_live_...

# Query parameters:
# verdict         string   Filter by verdict: PASS, FLAG, BLOCK
# after           string   Cursor-based pagination: bundle_id to start after
# before          string   Cursor-based pagination: bundle_id to end before
# limit           integer  Max results to return (default 20, max 100)
# created_after   string   ISO 8601 datetime
# created_before  string   ISO 8601 datetime
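The query string can be built from these parameters as in the sketch below. The helper name bundles_url is hypothetical, and the lower bound of 1 on limit is an assumption; only the default of 20 and the maximum of 100 come from the parameter list.

```python
from typing import Optional
from urllib.parse import urlencode

BUNDLES_URL = "https://api.perathos.com/v1/bundles"


def bundles_url(verdict: Optional[str] = None,
                after: Optional[str] = None,
                before: Optional[str] = None,
                limit: int = 20,
                created_after: Optional[str] = None,
                created_before: Optional[str] = None) -> str:
    """Build the list-bundles URL, omitting parameters that are not set."""
    if verdict is not None and verdict not in {"PASS", "FLAG", "BLOCK"}:
        raise ValueError("verdict must be PASS, FLAG, or BLOCK")
    if not 1 <= limit <= 100:
        raise ValueError("limit must be between 1 and 100")
    params = {
        "verdict": verdict,
        "after": after,
        "before": before,
        "limit": limit,
        "created_after": created_after,
        "created_before": created_before,
    }
    # urlencode percent-escapes reserved characters such as ':' in datetimes.
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BUNDLES_URL}?{query}"
```

To page through results, a caller would pass the last bundle ID of one page as after on the next call.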

Error Codes

400
invalid_request

Malformed request body or missing required fields.

401
authentication_error

Missing or invalid Authorization header.

402
quota_exceeded

Monthly call quota exceeded. Upgrade tier or contact sales.

422
unprocessable_response

LLM provider returned an unparseable response. Bundle not generated.

429
rate_limit_exceeded

Too many requests. Wait the number of seconds given in the Retry-After header before retrying.

502
llm_provider_error

Upstream LLM provider returned an error. Error details forwarded in response body.

503
verification_unavailable

Verification pipeline temporarily unavailable. Response not returned. Retry with exponential backoff.

Streaming

Streaming is enabled by setting stream: true in the request body. Because Perathos must observe the complete LLM response before it can issue a verdict, two modes are available. They trade off verdict-before-delivery against time-to-first-token.

Buffered — Perathos buffers the LLM stream, runs verification, then streams the response with the verdict header set. Full-pipeline latency (800ms–2s) before the first token. Use when a BLOCK must gate delivery.

Pass-through — Perathos streams in real time and verifies in parallel. The verdict is delivered as the HTTP trailing header x-vrl-verdict-deferred: <bundle_id> at stream end. Retrieve the verdict via the bundle endpoint. Suitable for audit, not gating — the response has already reached the user by the time verification completes.

# Per-request mode override
X-VRL-Stream-Mode: buffered | passthrough

# Account default is configurable via the dashboard.
# Default for new accounts: buffered.

# Pass-through trailing header (only present when X-VRL-Stream-Mode: passthrough)
x-vrl-verdict-deferred: bndl_01j9xk2m3n4p5q6r7s8t9u0v

Rate limits

Starter

60 RPM, 1,000 RPD

Platform

600 RPM, no daily cap (subject to monthly plan volume)

Enterprise

Negotiated per contract, no enforced ceiling outside the contract

Every response includes rate-limit headers:

X-RateLimit-Limit:      Current period limit
X-RateLimit-Remaining:  Calls remaining in the current window
X-RateLimit-Reset:      Unix epoch when the window resets

On 429 responses:

HTTP/1.1 429 Too Many Requests
Retry-After: 12

{
  "error": "rate_limit_exceeded",
  "retry_after_seconds": 12
}

Rate limits are scoped per Perathos API key, not per source IP. Separate keys (for example, one per service) each get independent limits.
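A client can use these headers to decide how long to pause before its next call. The sketch below is one possible approach; seconds_to_wait is a hypothetical helper, and it assumes Retry-After carries a number of seconds, as in the example above, rather than an HTTP date.

```python
import time
from typing import Mapping, Optional


def seconds_to_wait(status: int,
                    headers: Mapping[str, str],
                    now: Optional[float] = None) -> float:
    """Return how long to pause before the next call (0.0 if no pause is needed)."""
    h = {k.lower(): v for k, v in headers.items()}
    # On a 429, the server states the wait explicitly.
    if status == 429 and "retry-after" in h:
        return float(h["retry-after"])
    # Otherwise, pause only when the current window is exhausted.
    if h.get("x-ratelimit-remaining") == "0" and "x-ratelimit-reset" in h:
        now = time.time() if now is None else now
        return max(0.0, float(h["x-ratelimit-reset"]) - now)
    return 0.0
```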

Retries and idempotency

Clients should send an Idempotency-Key request header (UUIDv4) per logical request. When idempotency is enabled for the workspace, a repeated key with an identical payload returns the cached bundle ID within the configured retention window. A repeated key with a different payload returns a 409 conflict; treat that as a programming error.

Network-level retries are safe with an idempotency key; without one, retried calls run new verifications and are billed separately. Recommended retry strategy: exponential backoff, max 3 attempts, only on 5xx and 429.

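# Python: one idempotency key per logical request
import uuid

from openai import OpenAI

# Any OpenAI-compatible SDK with a custom base_url works; the key is elided here.
client = OpenAI(base_url="https://api.perathos.com/v1", api_key="pk_live_...")

# Generate the key once and reuse it on every retry of the same request.
idempotency_key = str(uuid.uuid4())

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[...],  # placeholder for your chat messages
    extra_headers={"Idempotency-Key": idempotency_key},
)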
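The recommended retry strategy could look roughly like the sketch below. send_with_retries and its send callback are hypothetical. Reading "max 3 attempts" as three calls in total and using a 0.5-second base delay are both assumptions. The same idempotency key is reused on every attempt so that retries are not treated as new verifications.

```python
import time
import uuid
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str], bytes]  # (status, headers, body)


def send_with_retries(send: Callable[[Mapping[str, str]], Response],
                      max_attempts: int = 3,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call send(headers) with exponential backoff on 429 and 5xx only."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    # One key per logical request, shared by every attempt.
    headers = {"Idempotency-Key": str(uuid.uuid4())}
    for attempt in range(1, max_attempts + 1):
        status, resp_headers, body = send(headers)
        retryable = status == 429 or 500 <= status < 600
        if not retryable or attempt == max_attempts:
            return status, resp_headers, body
        # Prefer the server's Retry-After value; otherwise back off exponentially.
        retry_after = resp_headers.get("Retry-After")
        if retry_after is not None:
            delay = float(retry_after)
        else:
            delay = base_delay * 2 ** (attempt - 1)
        sleep(delay)
    raise AssertionError("unreachable")
```

The send callback is injected so the policy stays independent of the HTTP client and can be exercised without network access.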

Degraded mode and fail-policy

Verifier-level degradation

If one or two of the seven verifiers are temporarily unavailable, the verdict is still issued from the remaining verifiers. The bundle is marked degraded: true and the response carries x-vrl-degraded: true. The confidence score is renormalised over the available verifiers.
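One plausible reading of "renormalised" is a weighted average whose weights are rescaled to sum to one over the verifiers that responded. The sketch below illustrates that reading only. The aggregation formula, the weights, and the function name are all assumptions; the actual Perathos aggregation is not described in this reference.

```python
from typing import Dict, Optional, Tuple


def renormalised_confidence(scores: Dict[str, Optional[float]],
                            weights: Dict[str, float]) -> Tuple[float, bool]:
    """Return (confidence, degraded); a score of None marks an unavailable verifier."""
    available = {name: s for name, s in scores.items() if s is not None}
    if not available:
        raise ValueError("no verifier scores available")
    total_weight = sum(weights[name] for name in available)
    if total_weight <= 0:
        raise ValueError("available verifiers must have positive total weight")
    # Dividing by the weight of the available verifiers rescales them to sum to 1.
    confidence = sum(weights[n] * s for n, s in available.items()) / total_weight
    degraded = len(available) < len(scores)
    return confidence, degraded
```

For instance, with weights of 0.5, 0.25, and 0.25 and the second verifier down, the remaining two are averaged with effective weights of 2/3 and 1/3.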

Pipeline-level failure

With four or more verifiers down (or an upstream verifier-inference outage), default behaviour is fail-closed: 503 response, no LLM call forwarded, no bundle generated. This is the safe default for regulated workflows: no unverified response leaves the proxy.

Fail-open option

Enterprise customers can opt in to fail-open behaviour per contract. The LLM call is forwarded and the response returned with x-vrl-verdict: UNVERIFIED and x-vrl-degraded: pipeline_unavailable. The customer explicitly accepts the audit gap during pipeline outages. Not available on Starter or Platform tiers.

Proxy availability is independent of verifier availability

If the pipeline is down but the proxy gateway is healthy, fail-open customers still get their LLM responses with an explicit unverified header; the gating decision is theirs.
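Since the gating decision falls to the client under fail-open, a caller might centralise it in one place, as in the sketch below. should_deliver is a hypothetical helper, and delivering FLAG responses by default is an assumption; a stricter workflow could hold them for review instead.

```python
def should_deliver(verdict: str, allow_unverified: bool = False) -> bool:
    """Decide whether a response may be shown to the end user."""
    if verdict == "BLOCK":
        return False
    if verdict == "UNVERIFIED":
        # Only reachable with fail-open; the caller accepts the audit gap explicitly.
        return allow_unverified
    if verdict in {"PASS", "FLAG"}:
        return True
    raise ValueError(f"unknown verdict: {verdict!r}")
```

Keeping allow_unverified off by default mirrors the fail-closed default described above.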