perathos
Technical · January 14, 2025 · 6 min read

Anatomy of a VRL Proof Bundle

A walkthrough of every field in a VRL Proof Bundle — what it contains, why each piece is there, and how to review the evidence record behind a verdict.

Every API call through Perathos can produce a VRL Proof Bundle: a structured JSON evidence record for the verification. This post walks through the structure field by field and separates implemented fields from integrity metadata that depends on the customer deployment.

Here is an example bundle (abbreviated for readability):

{
  "bundle_id": "bndl_01j9xk2m3n4p5q6r7s8t9u0v",
  "schema_version": "1.2.0",
  "created_at": "2025-01-14T09:17:32.881Z",
  "verdict": "FLAG",
  "confidence_score": 0.72,
  "model": {
    "declared": "openai/gpt-4o",
    "fingerprinted": "openai/gpt-4o",
    "match": true
  },
  "verifiers": [
    { "id": "model_fingerprinter",    "type": "deterministic", "verdict": "PASS", "score": 1.00, "weight": 0.10, "latency_ms": 11  },
    { "id": "cross_examiner",         "type": "llm",           "verdict": "FLAG", "score": 0.61, "weight": 0.20, "latency_ms": 1842 },
    { "id": "hallucination_detector", "type": "llm",           "verdict": "FLAG", "score": 0.58, "weight": 0.18, "latency_ms": 1105 },
    { "id": "symbolic_solver",        "type": "deterministic", "verdict": "SKIP", "score": null, "weight": 0.50, "latency_ms": 4    },
    { "id": "knowledge_graph",        "type": "deterministic", "verdict": "PASS", "score": 1.00, "weight": 0.22, "latency_ms": 88   },
    { "id": "temporal_consistency",   "type": "hybrid",        "verdict": "FLAG", "score": 0.55, "weight": 0.12, "latency_ms": 634  },
    { "id": "schema_validator",       "type": "hybrid",        "verdict": "PASS", "score": 1.00, "weight": 0.18, "latency_ms": 23   }
  ],
  "findings": [
    {
      "verifier_id": "cross_examiner",
      "claim_id": "clm_3",
      "severity": "warning",
      "description": "Claim 3 did not survive independent restatement. The cross-examiner could not reproduce the stated statistical relationship."
    },
    {
      "verifier_id": "temporal_consistency",
      "claim_id": "clm_1",
      "severity": "warning",
      "description": "Claim 1 references guidance that may have been superseded. Assertion dated to 2022; recency index flags an update in Q3 2024."
    }
  ],
  "integrity": {
    "hash": "sha256:9f2b...",
    "signature_status": "configured_when_enabled",
    "proof_status": "configured_when_enabled"
  }
}

Reading the verdict

The top-level verdict is the first thing to check. In this bundle it is FLAG, which means the confidence score (0.72) fell between the block threshold (0.50) and the flag threshold (0.80). The response was not blocked, but it is flagged for review.
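
The threshold logic can be sketched in a few lines of Python. The two threshold values come from the paragraph above; the function name, the top-level PASS label, and the handling of scores that land exactly on a threshold are assumptions made for illustration, not documented Perathos behavior.

```python
BLOCK_THRESHOLD = 0.50  # from the post: scores below this are blocked
FLAG_THRESHOLD = 0.80   # from the post: scores below this are flagged


def verdict_for(score: float) -> str:
    """Map a confidence score to a top-level verdict.

    Assumption: each threshold is inclusive on its upper band, so
    exactly 0.50 is FLAG and exactly 0.80 is PASS.
    """
    if score < BLOCK_THRESHOLD:
        return "BLOCK"
    if score < FLAG_THRESHOLD:
        return "FLAG"
    return "PASS"


print(verdict_for(0.72))  # FLAG
```

If the deployment treats boundary scores differently, only the two comparisons need to change.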

The confidence_score of 0.72 is the weighted average of all non-SKIP verifier scores. Notice that the Symbolic Solver returned SKIP — this response contained no mathematical expressions, so that verifier had nothing to evaluate and is excluded from the weighted average.
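
A minimal sketch of the SKIP exclusion rule, using the verifier scores and weights from the bundle above. The function name is hypothetical, and normalizing by the sum of the remaining weights is an assumption about what "weighted average" means here. On the abbreviated values shown, this yields roughly 0.79 rather than the reported 0.72, so treat it as an illustration of the exclusion rule rather than a reproduction of the reported score.

```python
# Verifier entries copied from the example bundle (id, verdict, score, weight only).
verifiers = [
    {"id": "model_fingerprinter",    "verdict": "PASS", "score": 1.00, "weight": 0.10},
    {"id": "cross_examiner",         "verdict": "FLAG", "score": 0.61, "weight": 0.20},
    {"id": "hallucination_detector", "verdict": "FLAG", "score": 0.58, "weight": 0.18},
    {"id": "symbolic_solver",        "verdict": "SKIP", "score": None, "weight": 0.50},
    {"id": "knowledge_graph",        "verdict": "PASS", "score": 1.00, "weight": 0.22},
    {"id": "temporal_consistency",   "verdict": "FLAG", "score": 0.55, "weight": 0.12},
    {"id": "schema_validator",       "verdict": "PASS", "score": 1.00, "weight": 0.18},
]


def weighted_confidence(entries):
    """Weighted average of scores, ignoring SKIP entries and null scores.

    Assumption: the result is normalized by the total weight of the
    verifiers that actually produced a score.
    """
    scored = [v for v in entries if v["verdict"] != "SKIP" and v["score"] is not None]
    total_weight = sum(v["weight"] for v in scored)
    if total_weight == 0:
        return None  # nothing was evaluated
    return sum(v["score"] * v["weight"] for v in scored) / total_weight


print(round(weighted_confidence(verifiers), 4))  # 0.7924
```

Note that the six non-SKIP weights in this bundle sum to 1.00, so the normalization step has no visible effect here; it matters only when a skipped verifier leaves the remaining weights short of 1.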

The model object

The model object is produced by Verifier 01, the Model Fingerprinter. It records the declared model identity (what the request claimed) and the fingerprinted identity (what behavioral analysis determined). Here they match — both are openai/gpt-4o. A mismatch here (e.g., a provider substituting a cheaper model without disclosure) would set match: false and contribute to a FLAG or BLOCK verdict.
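
When reviewing bundles in bulk, a quick sanity check is that the match flag agrees with the two identity strings. The sketch below assumes match is equivalent to plain string equality between declared and fingerprinted; the real fingerprinter may normalize model names differently, and the function name is hypothetical.

```python
def model_match_is_consistent(model: dict) -> bool:
    """Return True when `match` agrees with a plain string comparison.

    Assumption: `match` means declared == fingerprinted, with no
    normalization of provider or version strings.
    """
    identities_equal = model["declared"] == model["fingerprinted"]
    return model["match"] == identities_equal


model = {
    "declared": "openai/gpt-4o",
    "fingerprinted": "openai/gpt-4o",
    "match": True,
}
print(model_match_is_consistent(model))  # True
```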

Reading the verifiers array

Each entry shows the verifier's individual verdict, raw score, weight, and latency. A few things worth noting in this bundle:

  • The Cross-Examiner and Hallucination Detector both flagged, with scores of 0.61 and 0.58 respectively; these are the LLM-based verifiers raising soft warnings
  • The Knowledge Graph returned PASS (score 1.0): the factual claims it could check against its structured database were correct
  • Temporal Consistency returned FLAG (0.55): one claim appears to reference outdated guidance
  • Total latency was approximately 1,842 ms, which is the latency of the slowest single verifier (Cross-Examiner) rather than the sum of all verifiers, since they run in parallel
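
The latency point is easy to confirm from the latency_ms values in the bundle. This sketch compares the parallel estimate (the slowest verifier) with what a serial run would cost; it ignores any orchestration overhead, which the bundle does not report.

```python
# latency_ms per verifier, copied from the example bundle
latencies_ms = {
    "model_fingerprinter": 11,
    "cross_examiner": 1842,
    "hallucination_detector": 1105,
    "symbolic_solver": 4,
    "knowledge_graph": 88,
    "temporal_consistency": 634,
    "schema_validator": 23,
}

slowest = max(latencies_ms, key=latencies_ms.get)
parallel_ms = latencies_ms[slowest]      # verifiers run in parallel
serial_ms = sum(latencies_ms.values())   # hypothetical one-at-a-time cost

print(slowest, parallel_ms)  # cross_examiner 1842
print(serial_ms)             # 3707
```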

The findings array

The findings array is where specific issues are surfaced. Each finding references the verifier that raised it and the specific extracted claim ID. This is how you go from "the response was flagged" to "here is the specific claim that didn't hold up."

This bundle contains two findings: one from the Cross-Examiner (a statistical claim that didn't survive independent restatement) and one from Temporal Consistency (a claim dated to 2022 that may reference guidance superseded by an update in Q3 2024).
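
To go from a flagged response to the claims that need attention, a reviewer can group findings by claim_id. The sketch below uses the two findings from the bundle (descriptions shortened); the helper name is hypothetical.

```python
from collections import defaultdict

# Findings copied from the example bundle, with descriptions shortened.
findings = [
    {"verifier_id": "cross_examiner", "claim_id": "clm_3", "severity": "warning",
     "description": "Claim 3 did not survive independent restatement."},
    {"verifier_id": "temporal_consistency", "claim_id": "clm_1", "severity": "warning",
     "description": "Claim 1 references guidance that may have been superseded."},
]


def findings_by_claim(items):
    """Group findings by the extracted claim they refer to."""
    grouped = defaultdict(list)
    for finding in items:
        grouped[finding["claim_id"]].append(finding)
    return dict(grouped)


for claim_id, claim_findings in sorted(findings_by_claim(findings).items()):
    raised_by = ", ".join(f["verifier_id"] for f in claim_findings)
    print(f"{claim_id}: {len(claim_findings)} finding(s) from {raised_by}")
```

Grouping by claim rather than by verifier makes it obvious when several verifiers object to the same claim, which is usually the stronger signal.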

Integrity fields

The integrity object records the hash and any deployment-specific integrity metadata that has been enabled for the customer environment. A startup pilot should treat these fields as evidence controls that must be confirmed during onboarding, not as universal public guarantees.

The practical value is reviewability: a customer can compare the bundle, verdict, verifier outputs, and retention policy for the deployment in scope. Stronger integrity controls can be added as enterprise requirements become concrete.
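
For teams that want to check the hash themselves, the sketch below shows the general shape of such a check. The post does not specify how the hash is computed, so everything here is an assumption to confirm during onboarding: that the digest covers the bundle without its integrity object, that the bundle is serialized as JSON with sorted keys and compact separators, and that the value is prefixed with "sha256:". The function names are hypothetical.

```python
import hashlib
import json


def canonical_digest(bundle: dict) -> str:
    """Hash the bundle under ASSUMED canonicalization rules.

    Assumptions (not from the Perathos documentation): the integrity
    object is excluded, keys are sorted, and separators are compact.
    """
    payload = {k: v for k, v in bundle.items() if k != "integrity"}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def hash_matches(bundle: dict) -> bool:
    """Compare the recorded hash with a locally recomputed one."""
    recorded = bundle.get("integrity", {}).get("hash")
    return recorded == canonical_digest(bundle)


# Toy bundle for illustration; the hash is computed locally, not taken from the post.
toy = {"bundle_id": "example", "verdict": "FLAG", "confidence_score": 0.72}
toy["integrity"] = {"hash": canonical_digest(toy)}

print(hash_matches(toy))   # True
toy["verdict"] = "PASS"    # simulate tampering after the hash was recorded
print(hash_matches(toy))   # False
```

If the deployment uses a different canonicalization or signs the bundle instead, only canonical_digest needs to change; the comparison step stays the same.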

Reviewing integrity metadata

During procurement, ask which integrity controls are active for the workspace, where the records are retained, who can access them, and how long they are kept. The answer should match the customer contract, the deployment architecture, and the evidence shown during onboarding.

That framing keeps the bundle useful for enterprise review without claiming a public assurance control that has not been enabled for the specific customer environment.

Full schema reference

See the complete VRL Proof Bundle field reference for every field, type, and example value.