High Assurance
How did the AI reason about the task?
High Assurance captures the full chain-of-thought reasoning trace, token count, and reasoning model identity — linking every generated line of code to the AI's complete decision-making process.
Storage overhead: ~10–500 KB per annotation (with size management)
High Assurance adds reasoning context to Medium Assurance's prompt and environment tracking. It answers "what was the AI thinking when it wrote this code?" for every annotated line or function.
• reasoning_text — full chain-of-thought or reasoning trace produced by the model (e.g., "Let me analyze the existing validation patterns in auth.py...") — required
• reasoning_token_count — number of tokens in the reasoning output (e.g., 2847) — optional
• reasoning_model — model that produced the reasoning; set to the generation model if only one model is used (e.g., "claude-opus-4-5") — required
• reasoning_hash — SHA-256 hash linking annotation records to the reasoning context in manifest.json — required at High
• assurance_level — set to "high" on all annotation records — required

Reasoning traces can be substantial (10–500 KB per generation event). The standard defines two mechanisms to keep manifest files manageable:
• Inline — stored as plain text in the reasoning_text field in manifest.json
• Compressed — gzip + base64 encoded in reasoning_text_compressed with compressed: true
• External — stored in .ai-audit/blobs/<hash>.json.gz with external: true
Note: PRISM fields (risk_score + risk_factors) add ~200–500 bytes per annotation — always inline, no compression needed.
• prompt_text, prompt_type, prompt_context_files — prompt context — inherited
• tool_name, tool_version, model_name, model_version — environment context — inherited
• command_text, command_type — command context — inherited
• caused_by, depends_on, informed_by, delegated_to, supersedes, reviewed_by — context graph relationships — inherited
• parent_session_id, child_session_id, delegation_type — multi-agent orchestration — inherited
• decision_point, options, selected, rationale, confidence — structured decision records — inherited
• annotation_id — content-derived SHA-256 identifier for edge record cross-references — inherited
• decision_hash — optional link from annotations to decision context entries — inherited
• anchor_context, anchor_hash, file_content_hash — content anchoring fields inherited from Low/Medium; highest fidelity at this level — optional
• risk_score, risk_factors — PRISM score; all standard signals available at High (including model_capability_tier) — optional

Knowing how the AI reasoned is qualitatively different from knowing what it was asked. Chain-of-thought traces transform your audit trail from an evidence record into a forensic investigation tool.
Aerospace, medical devices, and autonomous vehicles require understanding of AI decision-making. When a flight controller function is AI-generated, auditors need to verify the model considered edge cases, overflow conditions, and failure modes. The reasoning trace provides that evidence.
When vulnerable code is discovered, the reasoning trace reveals what the model "thought" when generating it. Did the model consider input validation? Was it aware of the injection risk but chose a simpler approach? The chain-of-thought turns a vulnerability report into a root cause analysis.
Study how different models reason about the same programming tasks. Compare chain-of-thought quality across model families, analyze how reasoning depth correlates with code quality, and track how model reasoning improves across versions — all from structured, queryable audit data.
The EU AI Act and similar regulations may require explainability for AI-generated artifacts in regulated domains. High Assurance supports those transparency obligations with a complete record of the AI's reasoning process for every code change.
"Why did the AI suggest this approach?" — the reasoning trace answers this directly. When a subtle bug surfaces months later, you can read the model's chain-of-thought to understand its assumptions, tradeoffs, and the logic path that led to the implementation choice.
High Assurance stores reasoning context in the manifest and links it to annotations via SHA-256 hash — the same content-addressed pattern used for environment, command, and prompt context.
Stored in manifest.json — records the AI's chain-of-thought reasoning.
Stored in annotations.jsonl — note the reasoning_hash field linking code to the AI's reasoning trace.
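As a concrete illustration, the content-addressed linking can be sketched in a few lines of Python. The field names and example values come from this page; the exact canonical JSON form is defined by the RFC, so treat this as a sketch rather than a reference implementation:

```python
import hashlib
import json

def context_hash(entry: dict) -> str:
    """Content-derived ID: canonical JSON (sorted keys) to SHA-256 hex digest."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Reasoning context entry as it might appear in manifest.json, keyed by its own hash.
reasoning_entry = {
    "type": "reasoning",
    "reasoning_model": "claude-opus-4-5",
    "reasoning_text": "Let me analyze the existing validation patterns in auth.py...",
    "reasoning_token_count": 2847,
}
reasoning_hash = context_hash(reasoning_entry)

# Annotation record in annotations.jsonl pointing back at the trace.
# (Other inherited fields such as prompt_hash and file anchors omitted here.)
annotation = {
    "assurance_level": "high",
    "reasoning_hash": reasoning_hash,
}
```

Because the hash is derived from the entry's content, identical reasoning traces deduplicate automatically: the same trace always produces the same manifest key.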
High Assurance is designed for contexts where understanding the AI's reasoning process is as important as understanding its output. If "what was it asked?" isn't enough and you need "how did it think?", this is your level.
If you only need to know which tool and model were used, Low Assurance has minimal overhead. If you need to know what the AI was asked but not its reasoning, Medium Assurance is a lighter alternative.
Beyond source code: High Assurance reasoning traces are particularly valuable for autonomous agents in regulated domains — healthcare, legal, financial services — where explainability of AI decisions is a compliance requirement, not just a best practice.
High Assurance captures both prompt text and reasoning traces, which may contain sensitive information. The larger data volume requires additional precautions.
Chain-of-thought traces can include internal reasoning about security architecture, proprietary algorithms, or sensitive business logic. The model may reference or analyze API keys, credentials, or confidential code that appeared in the context window. Projects operating at High Assurance should:
• Review manifest.json and blobs/ directory before committing to public repositories
• Use .gitignore to exclude manifest.json or specific blob files containing sensitive reasoning
• Consider encrypting sensitive manifest entries and blob files at rest
• Establish data retention policies — reasoning traces are large and accumulate quickly
• Evaluate whether blob files should be version-controlled or stored externally
High Assurance builds on Medium with one key addition: capturing chain-of-thought output after each generation event. The primary complexity is size management for large reasoning traces.
Compute the environment context hash from tool name, version, model name, and version. If the hash doesn't exist in manifest.json, add it. Append a "session" / "start" record to annotations.jsonl.
Capture the full prompt text, classify the prompt_type, and record context files. Compute the prompt hash using canonical JSON → SHA-256. Write the prompt context entry to manifest.json if the hash is new.
After receiving the model's response, capture the chain-of-thought output. Compute the reasoning hash using canonical JSON (sorted keys: reasoning_model, reasoning_text, reasoning_token_count, type) → SHA-256. Apply size management: if the reasoning text exceeds compress_reasoning_threshold_bytes (default 10 KB), store as gzip+base64; if it exceeds external_blob_threshold_bytes (default 100 KB), write to .ai-audit/blobs/<hash>.json.gz. Include both prompt_hash and reasoning_hash in annotation records.
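Under those defaults, the size-management step might look roughly like the following sketch. Thresholds and field names come from this page; the blob layout and canonicalization details are simplified assumptions:

```python
import base64
import gzip
import hashlib
import json
from pathlib import Path

COMPRESS_THRESHOLD = 10 * 1024    # compress_reasoning_threshold_bytes default
EXTERNAL_THRESHOLD = 100 * 1024   # external_blob_threshold_bytes default

def store_reasoning(entry: dict, blob_dir: Path) -> tuple[str, dict]:
    """Hash a reasoning entry and pick a storage tier for the manifest."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":")).encode("utf-8")
    reasoning_hash = hashlib.sha256(canonical).hexdigest()
    size = len(entry["reasoning_text"].encode("utf-8"))
    if size > EXTERNAL_THRESHOLD:
        # Tier 3: full entry written to .ai-audit/blobs/<hash>.json.gz
        blob_dir.mkdir(parents=True, exist_ok=True)
        (blob_dir / f"{reasoning_hash}.json.gz").write_bytes(gzip.compress(canonical))
        stored = {"type": "reasoning", "external": True}
    elif size > COMPRESS_THRESHOLD:
        # Tier 2: inline, gzip + base64 in reasoning_text_compressed
        stored = {k: v for k, v in entry.items() if k != "reasoning_text"}
        stored["reasoning_text_compressed"] = base64.b64encode(
            gzip.compress(entry["reasoning_text"].encode("utf-8"))
        ).decode("ascii")
        stored["compressed"] = True
    else:
        # Tier 1: inline plain text
        stored = dict(entry)
    return reasoning_hash, stored
```

Note that the hash is computed over the uncompressed canonical entry before a tier is chosen, so the same trace yields the same reasoning_hash regardless of how it ends up stored.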
When the AI executes a shell command, file operation, or API call, compute a command context hash and add it to manifest.json if new. Include the command_hash in associated annotation records.
If PRISM is enabled, compute the risk score using all available signals. At High Assurance, all eight standard signals are available, including model_capability_tier from environment context. Include risk_score and risk_factors in the annotation record. PRISM is a standalone extension on top of VIBES.
Backfill the commit_hash field on all annotation records created since the last commit. This field is required — every annotation must be linked to its git commit.
Append a "session" / "end" record to annotations.jsonl with the session ID and timestamp.
The key addition is capturing and managing chain-of-thought output in step 3. Implementations MUST support reading both compressed and external blob entries. For detailed implementation guidance including size management algorithms, see the Implementors Guide or the full RFC specification.
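On the read side, resolving a reasoning entry regardless of storage tier might look like this sketch (passing reasoning_hash explicitly is an assumption about how manifest entries are keyed):

```python
import base64
import gzip
import json
from pathlib import Path

def load_reasoning_text(entry: dict, reasoning_hash: str, blob_dir: Path) -> str:
    """Resolve reasoning_text from any of the three storage tiers."""
    if entry.get("external"):
        # Tier 3: fetch the full entry from .ai-audit/blobs/<hash>.json.gz
        raw = gzip.decompress((blob_dir / f"{reasoning_hash}.json.gz").read_bytes())
        return json.loads(raw)["reasoning_text"]
    if entry.get("compressed"):
        # Tier 2: gzip + base64 inline
        return gzip.decompress(
            base64.b64decode(entry["reasoning_text_compressed"])
        ).decode("utf-8")
    # Tier 1: plain text inline
    return entry["reasoning_text"]
```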
Back to the VIBES Standard overview →
High assurance data enables the richest PRISM risk assessments (reasoning trace analysis) and the strongest VERIFY attestations (full chain-of-thought integrity verification).