High Assurance
How did the AI reason about the task?
High Assurance captures the full chain-of-thought reasoning trace, token count, and reasoning model identity — linking every generated line of code to the AI's complete decision-making process.
Storage overhead: ~10–500 KB per annotation (with size management)
High Assurance adds reasoning context to Medium Assurance's prompt and environment tracking. It answers "what was the AI thinking when it wrote this code?" for every annotated line or function.
• reasoning_text — full chain-of-thought or reasoning trace produced by the model (e.g., "Let me analyze the existing validation patterns in auth.py...") — required
• reasoning_token_count — number of tokens in the reasoning output (e.g., 2847) — optional
• reasoning_model — model that produced the reasoning; set to the generation model if only one model is used (e.g., "claude-opus-4-5") — required
• reasoning_hash — SHA-256 hash linking annotation records to the reasoning context in manifest.json — required at High
• assurance_level — set to "high" on all annotation records — required

Reasoning traces can be substantial (10–500 KB per generation event). The standard defines two mechanisms to keep manifest files manageable:
• Inline — stored as plain text in the reasoning_text field in manifest.json
• Compressed — gzip + base64 encoded in reasoning_text_compressed with compressed: true
• External — stored in .ai-audit/blobs/<hash>.json.gz with external: true
Note: PRISM fields (risk_score + risk_factors) add ~200–500 bytes per annotation — always inline, no compression needed.
• prompt_text, prompt_type, prompt_context_files — prompt context — inherited
• tool_name, tool_version, model_name, model_version — environment context — inherited
• command_text, command_type — command context — inherited
• caused_by, depends_on, informed_by, delegated_to, supersedes, reviewed_by — context graph relationships — inherited
• parent_session_id, child_session_id, delegation_type — multi-agent orchestration — inherited
• decision_point, options, selected, rationale, confidence — structured decision records — inherited
• annotation_id — content-derived SHA-256 identifier for edge record cross-references — inherited
• decision_hash — optional link from annotations to decision context entries — inherited
• anchor_context, anchor_hash, file_content_hash — content anchoring fields inherited from Low/Medium; highest fidelity at this level — optional
• risk_score, risk_factors — PRISM score; all standard signals available at High (including model_capability_tier) — optional

Knowing how the AI reasoned is qualitatively different from knowing what it was asked. Chain-of-thought traces transform your audit trail from an evidence record into a forensic investigation tool.
Aerospace, medical devices, and autonomous vehicles require understanding of AI decision-making. When a flight controller function is AI-generated, auditors need to verify the model considered edge cases, overflow conditions, and failure modes. The reasoning trace provides that evidence.
When vulnerable code is discovered, the reasoning trace reveals what the model "thought" when generating it. Did the model consider input validation? Was it aware of the injection risk but chose a simpler approach? The chain-of-thought turns a vulnerability report into a root cause analysis.
Study how different models reason about the same programming tasks. Compare chain-of-thought quality across model families, analyze how reasoning depth correlates with code quality, and track how model reasoning improves across versions — all from structured, queryable audit data.
The EU AI Act and similar regulations may require explainability for AI-generated artifacts in regulated domains. High Assurance supports those transparency obligations with a complete record of the AI's reasoning process for every code change.
"Why did the AI suggest this approach?" — the reasoning trace answers this directly. When a subtle bug surfaces months later, you can read the model's chain-of-thought to understand its assumptions, tradeoffs, and the logic path that led to the implementation choice.
High Assurance stores reasoning context in the manifest and links it to annotations via SHA-256 hash — the same content-addressed pattern used for environment, command, and prompt context.
Stored in manifest.json — records the AI's chain-of-thought reasoning.
Stored in annotations.jsonl — note the reasoning_hash field linking code to the AI's reasoning trace.
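As a concrete illustration, the content-addressed linking can be sketched in a few lines of Python. The field names and example values come from this page; the exact canonical JSON form is defined by the RFC, so treat this as a sketch rather than a reference implementation:

```python
import hashlib
import json

def context_hash(entry: dict) -> str:
    """Content-derived ID: canonical JSON (sorted keys) to SHA-256 hex digest."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Reasoning context entry as it might appear in manifest.json, keyed by its own hash.
reasoning_entry = {
    "type": "reasoning",
    "reasoning_model": "claude-opus-4-5",
    "reasoning_text": "Let me analyze the existing validation patterns in auth.py...",
    "reasoning_token_count": 2847,
}
reasoning_hash = context_hash(reasoning_entry)

# Annotation record in annotations.jsonl pointing back at the trace.
# (Other inherited fields such as prompt_hash and file anchors omitted here.)
annotation = {
    "assurance_level": "high",
    "reasoning_hash": reasoning_hash,
}
```

Because the hash is derived from the entry's content, identical reasoning traces deduplicate automatically: the same trace always produces the same manifest key.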
High Assurance is designed for contexts where understanding the AI's reasoning process is as important as understanding its output. If "what was it asked?" isn't enough and you need "how did it think?", this is your level.
If you only need to know which tool and model were used, Low Assurance has minimal overhead. If you need to know what the AI was asked but not its reasoning, Medium Assurance is a lighter alternative.
Beyond source code: High Assurance reasoning traces are particularly valuable for autonomous agents in regulated domains — healthcare, legal, financial services — where explainability of AI decisions is a compliance requirement, not just a best practice.
High Assurance captures both prompt text and reasoning traces, which may contain sensitive information. The larger data volume requires additional precautions.
Chain-of-thought traces can include internal reasoning about security architecture, proprietary algorithms, or sensitive business logic. The model may reference or analyze API keys, credentials, or confidential code that appeared in the context window. Projects operating at High Assurance should:
• Review manifest.json and blobs/ directory before committing to public repositories
• Use .gitignore to exclude manifest.json or specific blob files containing sensitive reasoning
• Consider encrypting sensitive manifest entries and blob files at rest
• Establish data retention policies — reasoning traces are large and accumulate quickly
• Evaluate whether blob files should be version-controlled or stored externally
High Assurance builds on Medium with one key addition: capturing chain-of-thought output after each generation event. The primary complexity is size management for large reasoning traces.
Compute the environment context hash from tool name, version, model name, and version. If the hash doesn't exist in manifest.json, add it. Append a "session" / "start" record to annotations.jsonl.
Capture the full prompt text, classify the prompt_type, and record context files. Compute the prompt hash using canonical JSON → SHA-256. Write the prompt context entry to manifest.json if the hash is new.
After receiving the model's response, capture the chain-of-thought output. Compute the reasoning hash using canonical JSON (sorted keys: reasoning_model, reasoning_text, reasoning_token_count, type) → SHA-256. Apply size management: if the reasoning text exceeds compress_reasoning_threshold_bytes (default 10 KB), store as gzip+base64; if it exceeds external_blob_threshold_bytes (default 100 KB), write to .ai-audit/blobs/<hash>.json.gz. Include both prompt_hash and reasoning_hash in annotation records.
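Under those defaults, the size-management step might look roughly like the following sketch. Thresholds and field names come from this page; the blob layout and canonicalization details are simplified assumptions:

```python
import base64
import gzip
import hashlib
import json
from pathlib import Path

COMPRESS_THRESHOLD = 10 * 1024    # compress_reasoning_threshold_bytes default
EXTERNAL_THRESHOLD = 100 * 1024   # external_blob_threshold_bytes default

def store_reasoning(entry: dict, blob_dir: Path) -> tuple[str, dict]:
    """Hash a reasoning entry and pick a storage tier for the manifest."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    reasoning_hash = hashlib.sha256(canonical).hexdigest()
    size = len(entry["reasoning_text"].encode("utf-8"))
    if size > EXTERNAL_THRESHOLD:
        # Tier 3: full entry written to .ai-audit/blobs/<hash>.json.gz
        blob_dir.mkdir(parents=True, exist_ok=True)
        (blob_dir / f"{reasoning_hash}.json.gz").write_bytes(gzip.compress(canonical))
        stored = {"type": "reasoning", "external": True}
    elif size > COMPRESS_THRESHOLD:
        # Tier 2: inline, gzip + base64 in reasoning_text_compressed
        stored = {k: v for k, v in entry.items() if k != "reasoning_text"}
        stored["reasoning_text_compressed"] = base64.b64encode(
            gzip.compress(entry["reasoning_text"].encode("utf-8"))
        ).decode("ascii")
        stored["compressed"] = True
    else:
        # Tier 1: inline plain text
        stored = dict(entry)
    return reasoning_hash, stored
```

Note that the hash is computed over the uncompressed canonical entry before a tier is chosen, so the same trace yields the same reasoning_hash regardless of how it ends up stored.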
When the AI executes a shell command, file operation, or API call, compute a command context hash and add it to manifest.json if new. Include the command_hash in associated annotation records.
If PRISM is enabled, compute the risk score using all available signals. At High Assurance, all eight standard signals are available, including model_capability_tier from environment context. Include risk_score and risk_factors in the annotation record. PRISM is a standalone extension on top of VIBES.
Backfill the commit_hash field on all annotation records created since the last commit. This field is required — every annotation must be linked to its git commit.
Append a "session" / "end" record to annotations.jsonl with the session ID and timestamp.
The key addition is capturing and managing chain-of-thought output in step 3. Implementations MUST support reading both compressed and external blob entries. For detailed implementation guidance including size management algorithms, see the Implementors Guide or the full RFC specification.
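On the read side, resolving a reasoning entry regardless of storage tier might look like this sketch (passing reasoning_hash explicitly is an assumption about how manifest entries are keyed):

```python
import base64
import gzip
import json
from pathlib import Path

def load_reasoning_text(entry: dict, reasoning_hash: str, blob_dir: Path) -> str:
    """Resolve reasoning_text from any of the three storage tiers."""
    if entry.get("external"):
        # Tier 3: fetch the full entry from .ai-audit/blobs/<hash>.json.gz
        raw = gzip.decompress((blob_dir / f"{reasoning_hash}.json.gz").read_bytes())
        return json.loads(raw)["reasoning_text"]
    if entry.get("compressed"):
        # Tier 2: gzip + base64 inline
        return gzip.decompress(
            base64.b64decode(entry["reasoning_text_compressed"])
        ).decode("utf-8")
    # Tier 1: plain text inline
    return entry["reasoning_text"]
```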
Back to the VIBES Standard overview →
High assurance data enables the richest PRISM risk assessments (reasoning trace analysis) and the strongest VERIFY attestations (full chain-of-thought integrity verification).