itsavibe.ai                                                  itsavibe.ai
Request for Comments: 0001                                 February 2026
Category: Standards Track
Status: DRAFT

       Verifiable Inventory of Bot-Engineered Signals (VIBES) v1.0

Status of This Memo

This document specifies a proposed standard for the Internet and AI development community, and requests discussion and suggestions for improvements. Distribution of this memo is unlimited.

Copyright Notice

Copyright (c) 2026 itsavibe.ai. All rights reserved.

Abstract

This document defines the Verifiable Inventory of Bot-Engineered Signals (VIBES) standard, the foundational data standard in a four-part ecosystem for AI code transparency. VIBES specifies a three-tier framework for recording AI involvement in software development at the line and function level: what metadata to capture, how to store it using content-addressable hashing, and where the audit data resides relative to the project. Companion specifications VERIFY (security attestation), PRISM (risk scoring), and EVOLVE (agent learning and governance) extend the data model defined here. VIBES is tool-agnostic and designed for adoption by any AI tool or agent, including code completion engines, autonomous agents, chat-driven development environments, and workflow automation systems.

Table of Contents

1.  Introduction
2.  Terminology
3.  Assurance Levels
    3.1.  Low Assurance
    3.2.  Medium Assurance
    3.3.  High Assurance
    3.4.  Level Comparison
4.  Architecture
    4.1.  File Layout
    4.2.  Git Tracking Recommendations
5.  Hash Specification
    5.1.  Algorithm
    5.2.  Canonicalization
    5.3.  Short References
    5.4.  Annotation ID Computation
6.  Manifest File Specification
    6.1.  Top-Level Structure
    6.2.  Environment Context Entry
    6.3.  Prompt Context Entry
    6.4.  Command Context Entry
    6.5.  Reasoning Context Entry
    6.6.  Decision Context Entry
    6.7.  Size Management
7.  Annotation Log
    7.1.  Format Selection Rationale
    7.2.  Record Types
    7.3.  Line Annotation Record
        7.3.1.  Line Number Stability
    7.4.  Function Annotation Record
    7.5.  Session Record
    7.6.  Edge Record
    7.7.  Delegation Record
    7.8.  Derived Query Database
    7.9.  Actions
    7.10. JSONL Formatting Rules
8.  Configuration
    8.1.  Configuration Schema
    8.2.  Storage Strategies for High Assurance
9.  Tool Integration
    9.1.  Hook Points
    9.2.  Minimum Implementation by Level
    9.3.  Concurrency
10. Querying and Reporting
    10.4. PRISM (Provenance & Risk Intelligence Scoring Model)
        10.4.1. Signal Vocabulary
        10.4.2. Reference Scoring Algorithm
        10.4.3. Severity Bands
        10.4.4. Storage Format
        10.4.5. CI/CD Integration
        10.4.6. PRISM Configuration
11. Attestation (Informative)
    11.1. Attestation Format
    11.2. Subject Binding
    11.3. Key Management
    11.4. Submission
    11.5. Timestamping
    11.6. Trust Model
12. Versioning and Migration
13. Security Considerations
14. References
    14.3. Companion Specifications
Appendix A.  Complete Example
Appendix B.  Relationship to Existing Standards
Authors' Addresses

1. Introduction

AI coding tools -- code completion engines, autonomous agents, and chat-driven development environments -- are generating an increasing share of production code. Yet the resulting codebases carry no standardized record of which tool produced which lines, what instructions were given, or how the model reasoned about the task. This makes auditing, reproducing, and attributing AI-generated code difficult or impossible.

The Verifiable Inventory of Bot-Engineered Signals (VIBES) standard addresses this gap by defining a three-tier framework for recording AI involvement in code at the line and function level. The standard specifies:

(a) What metadata to capture at each assurance level.
(b) How to store it efficiently using content-addressable hashing. (c) Where the data lives relative to the project. VIBES is tool-agnostic. Any AI coding tool -- Claude Code, Cursor, Windsurf, Copilot, Codex, CLINE, or others -- can implement it. The goal is a common audit format that enables transparency, reproducibility, and accountability across the ecosystem. 1.1. Design Principles 1. No duplication. Every unique combination of tool, model, prompt, or reasoning trace is stored once and referenced by hash. 2. Progressive disclosure. Low assurance is cheap to implement. Higher tiers add data, not complexity. 3. Tool-agnostic. The standard prescribes data formats, not tool behavior. 4. Git-aware. Audit data correlates with git commits but does not depend on git. 5. Queryable. The storage format supports structured queries out of the box. 1.2. Requirements Language The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119 [RFC2119]. 2. Terminology Environment Context: The combination of tool name, tool version, model name, model version, and model parameters used for a code generation or modification event. Prompt Context: The full text of the instruction, question, or prompt given to the AI tool that resulted in a code change. Command Context: A record of a tool invocation or shell command executed by the AI agent during a development session, including the command text, type, exit code, and optional output summary. Reasoning Context: The AI model's chain-of-thought, internal reasoning trace, or "thinking" output produced while generating code. Context Hash: A SHA-256 hash of the canonical JSON representation of a context object. Used as a compact, unique reference. Manifest: A JSON file (.ai-audit/manifest.json) that maps context hashes to their full content. The single source of truth for all context data. 
Annotation Log: A JSONL file (.ai-audit/annotations.jsonl) that records line-level and function-level annotations referencing context hashes. Each line is one self-contained JSON object.

Session: A logical grouping of related code generation events within a single tool invocation or development session.

Annotation: A JSON record in the annotation log that associates a file location (line range or function) with one or more context hashes.

Assurance Level: One of three tiers (Low, Medium, High) that determines the depth of metadata captured.

Action: The type of code event: "create", "modify", "delete", or "review", plus the bookkeeping actions "rebase_remap" and "rebase_orphan" defined in Section 7.9.

Edge: A directed relationship between two audit events or context entries capturing causal, dependency, or informational links within the annotation log.

Delegation: An annotation record capturing the act of one session (the parent) spawning or tasking another session (the child) for multi-agent orchestration.

Decision Context: A manifest entry recording a structured decision point where the AI agent considered multiple alternatives and selected one.

Annotation ID: A deterministic SHA-256 identifier for any annotation record, computed from its canonical JSON form (sorted keys, no whitespace, excluding the annotation_id field itself). Used as a stable reference for edge records.

3. Assurance Levels

The standard defines three assurance levels. Each tier is a strict superset of the one below it.

3.1. Low Assurance

Question answered: "What tool and model touched this code?"

Required data:

tool_name Name of the AI coding tool. Example: "Claude Code"

tool_version Semantic version of the tool. Example: "1.5.2"

model_name Name of the AI model. Example: "claude-opus-4-5"

model_version Model version or checkpoint identifier. Example: "20251101"

Optional data:

model_parameters Key parameters affecting output. Example: {"temperature": 0.7}

tool_extensions Plugins, MCP servers, or extensions active.
Example: ["filesystem", "git"] Storage: Environment context hashes in the manifest. Line and function annotations in the annotation log referencing those hashes. Use cases: Open source transparency, badge verification, basic provenance tracking. Implementation cost: Minimal. Tools only need to record their own identity and the model they invoked. 3.2. Medium Assurance Question answered: "What was the AI asked to do?" Includes everything in Low Assurance, plus: prompt_text Full prompt or instruction text. Example: "Add input validation to the signup form..." prompt_type Category of the prompt. One of: "user_instruction", "edit_command", "chat_message", "inline_completion", "review_request", "refactor_request", "other". prompt_context_files Files provided as context to the model. Example: ["src/auth.py", "src/models.py"] Storage: Prompt context hashes added to the manifest. Annotation records gain a prompt_hash field linking to the prompt that triggered the change. Use cases: Enterprise audit trails, regulated industry compliance, code review workflows, reproducibility testing. Implementation cost: Moderate. Tools must capture and persist the prompt text for each generation event. 3.3. High Assurance Question answered: "How did the AI reason about the task?" Includes everything in Medium Assurance, plus: reasoning_text Full chain-of-thought or reasoning trace. Example: "Let me analyze the existing validation patterns in auth.py..." reasoning_token_count Number of tokens in the reasoning output. Example: 2847 reasoning_model Model that produced the reasoning. Set to the generation model if only one model is used. Example: "claude-opus-4-5" Storage: Reasoning context hashes added to the manifest. Annotation records gain a reasoning_hash field. Use cases: Safety-critical systems, security-sensitive code, AI research, forensic analysis, compliance with future AI regulation. Implementation cost: High. Requires capturing extended thinking output, which can be voluminous. 
See Section 8.2 for storage and compression strategies.

3.4. Level Comparison

+------------------------------------+-------+---------+-----------+
| Capability                         | Low   | Medium  | High      |
+------------------------------------+-------+---------+-----------+
| Identify tool and model            | Yes   | Yes     | Yes       |
| Track AI-generated lines/functions | Yes   | Yes     | Yes       |
| Track commands/tool invocations    | Yes   | Yes     | Yes       |
| Correlate with git commits         | Yes   | Yes     | Yes       |
| Know what the AI was asked to do   | No    | Yes     | Yes       |
| Know context files AI had          | No    | Yes     | Yes       |
| Know how the AI reasoned           | No    | No      | Yes       |
| Reproduction likelihood            | Low   | Medium  | High      |
| Storage overhead per annotation    | ~200B | ~2-10KB | ~10-500KB |
+------------------------------------+-------+---------+-----------+

4. Architecture

4.1. File Layout

All audit data lives in a .ai-audit/ directory at the project root.

project/
+-- .ai-audit/
|   +-- manifest.json        Hash-to-context mappings (git-tracked)
|   +-- annotations.jsonl    Append-only annotation log (git-tracked)
|   +-- config.json          Project audit configuration (git-tracked)
|   +-- audit.db             Generated query database (gitignored)
|   +-- blobs/               External storage for large entries
|   |   +-- <hash>.json.gz
|   +-- .gitignore
+-- src/
|   +-- ...
+-- ...

The annotation log (annotations.jsonl) is the canonical record of all audit events. It is an append-only file where each line is a self-contained JSON object representing one annotation event.

The query database (audit.db) is a derived artifact generated from annotations.jsonl by compliant tooling. It provides indexed SQL queries for analysis and reporting. It is not a source of truth and MUST NOT be manually edited.

4.2. Git Tracking Recommendations

manifest.json SHOULD be tracked. Human-readable, produces meaningful diffs, serves as the authoritative hash registry.

annotations.jsonl SHOULD be tracked. Append-only text; git diffs show exactly which annotations were added. The canonical audit trail.

config.json SHOULD be tracked.
Small, human-readable, defines project audit policy. audit.db SHOULD NOT be tracked. Binary file; derived from annotations.jsonl by tooling. Include in .ai-audit/.gitignore. blobs/ MAY be tracked. Depends on whether High assurance reasoning data should be version- controlled. Projects SHOULD include a .ai-audit/.gitignore containing audit.db. 5. Hash Specification 5.1. Algorithm SHA-256 as defined in FIPS 180-4 [FIPS180]. 5.2. Canonicalization To ensure deterministic hashing regardless of key ordering or formatting: 1. Construct a hashable object by removing the "created_at" field from the context object. 2. Serialize to JSON with sorted keys and no whitespace (no spaces after colons or commas). 3. Encode the result as UTF-8. 4. Compute the SHA-256 digest. 5. Express the digest as a lowercase hexadecimal string (64 characters). The "created_at" field is excluded from hash computation so that identical contexts created at different times produce the same hash. Example canonical form for an environment context: {"model_name":"claude-opus-4-5","model_parameters":{"max_tokens": 4096,"temperature":0.7},"model_version":"20251101","tool_extensio ns":["filesystem","git"],"tool_name":"Claude Code","tool_version" :"1.5.2","type":"environment"} Pseudocode: function compute_context_hash(context): hashable = remove_key(context, "created_at") canonical = json_serialize(hashable, sorted_keys=true, separators=[",", ":"]) return sha256(utf8_encode(canonical)).hex_lower() 5.3. Short References For display and human-readable contexts, the first 16 characters of the hex digest MAY be used as a short reference. The full 64-character hash is always stored and is authoritative for lookups. Example: Short: a1b2c3d4e5f6a7b8 Full: a1b2c3d4e5f6a7b8091011121314151617181920212223242526... 5.4. Annotation ID Computation An annotation ID is a deterministic SHA-256 identifier for any annotation record (line, function, session, edge, or delegation). 
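Both this annotation-ID computation and the context-hash computation of Section 5.2 share the same canonical-serialization core. The following is a non-normative Python sketch; note that the handling of non-ASCII characters is an implementation detail this specification does not pin down, so the use of ensure_ascii=False below is an assumption:

```python
import hashlib
import json

def _canonical_hash(obj: dict, exclude_key: str) -> str:
    # Drop the excluded field, serialize with sorted keys and no
    # whitespace, encode as UTF-8, and return the lowercase hex digest.
    hashable = {k: v for k, v in obj.items() if k != exclude_key}
    canonical = json.dumps(hashable, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def compute_context_hash(context: dict) -> str:
    # Section 5.2: "created_at" is excluded so that identical contexts
    # created at different times produce the same hash.
    return _canonical_hash(context, "created_at")

def compute_annotation_id(record: dict) -> str:
    # Section 5.4: only the "annotation_id" field itself is excluded;
    # the timestamp stays in, so repeated events get distinct IDs.
    return _canonical_hash(record, "annotation_id")
```

Because serialization is key-sorted and whitespace-free, the same logical object always yields the same digest regardless of how a tool ordered its fields.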
It enables stable references from edge records and cross-record linkage.

Computation procedure:

1. Take the full annotation record as a JSON object.
2. Remove the "annotation_id" field if present.
3. Serialize with sorted keys and no whitespace (no spaces after colons or commas).
4. Encode the result as UTF-8.
5. Compute the SHA-256 digest.
6. Express the digest as a lowercase hexadecimal string (64 characters).

Pseudocode:

function compute_annotation_id(record):
    hashable = remove_key(record, "annotation_id")
    canonical = json_serialize(hashable, sorted_keys=true,
                               separators=[",", ":"])
    return sha256(utf8_encode(canonical)).hex_lower()

Note: Unlike context hashes (Section 5.2), the annotation ID excludes the annotation_id field itself rather than the "created_at" field. The timestamp remains part of the hashed content because two otherwise-identical annotations at different times represent distinct audit events.

6. Manifest File Specification

The manifest file (.ai-audit/manifest.json) maps context hashes to their full content. It is the single reference file for all tooling.

6.1. Top-Level Structure

{
  "standard": "VIBES",
  "version": "1.0",
  "entries": {
    "<hash>": { ... },
    "<hash>": { ... }
  }
}

Fields:

standard REQUIRED. String. Always "VIBES".

version REQUIRED. String. Standard version (semver).

entries REQUIRED. Object. Map of hash to context object.

6.2. Environment Context Entry

Present at all assurance levels.

{
  "e7a3f1b2c4d5e6f7...": {
    "type": "environment",
    "tool_name": "Claude Code",
    "tool_version": "1.5.2",
    "model_name": "claude-opus-4-5",
    "model_version": "20251101",
    "model_parameters": { "temperature": 1.0 },
    "tool_extensions": ["filesystem", "git"],
    "created_at": "2026-02-03T14:30:00Z"
  }
}

Fields:

type REQUIRED. Always "environment".

tool_name REQUIRED. String.

tool_version REQUIRED. String. Semantic version.

model_name REQUIRED. String.

model_version REQUIRED. String.

model_parameters OPTIONAL. Object. Key-value pairs.

tool_extensions OPTIONAL. Array of strings.
created_at REQUIRED. String. ISO-8601 timestamp. 6.3. Prompt Context Entry Present at Medium and High assurance. { "a1b2c3d4e5f6a7b8...": { "type": "prompt", "prompt_text": "Add input validation to the signup form.", "prompt_type": "user_instruction", "prompt_context_files": [ "src/routes/auth.py", "src/models/user.py" ], "created_at": "2026-02-03T10:05:00Z" } } Fields: type REQUIRED. Always "prompt". prompt_text REQUIRED. String. Full prompt text. prompt_type REQUIRED. String. One of: "user_instruction", "edit_command", "chat_message", "inline_completion", "review_request", "refactor_request", "other". prompt_context_files OPTIONAL. Array of strings. Relative paths from project root. created_at REQUIRED. String. ISO-8601. 6.4. Command Context Entry Present at all assurance levels. Records tool invocations and shell commands executed by the AI agent. { "d4e5f6a7b8c9d0e1...": { "type": "command", "command_text": "npm install express", "command_type": "shell", "command_exit_code": 0, "command_output_summary": "added 57 packages in 2.3s", "working_directory": "src/", "created_at": "2026-02-03T14:30:10Z" } } Fields: type REQUIRED. Always "command". command_text REQUIRED. String. The full command or tool invocation text. command_type REQUIRED. String. One of: "shell", "file_write", "file_read", "file_delete", "api_call", "tool_use", "other". command_exit_code OPTIONAL. Integer. Exit code for shell commands. NULL for non-shell types. command_output_summary OPTIONAL. String. Truncated or summarized output (max 1024 bytes RECOMMENDED). working_directory OPTIONAL. String. Relative path from project root where the command executed. created_at REQUIRED. String. ISO-8601. Security note: Command output may contain sensitive data. Implementations SHOULD truncate or redact output when operating in public repositories. 6.5. Reasoning Context Entry Present at High assurance only. 
{ "c3d4e5f6a7b8c9d0...": { "type": "reasoning", "reasoning_text": "Let me analyze the existing validation...", "reasoning_token_count": 2847, "reasoning_model": "claude-opus-4-5", "created_at": "2026-02-03T14:30:05Z" } } Fields: type REQUIRED. Always "reasoning". reasoning_text REQUIRED. String. Full CoT text. reasoning_token_count OPTIONAL. Integer. reasoning_model REQUIRED. String. Model that produced the reasoning. Set to the environment model if only one model is used. created_at REQUIRED. String. ISO-8601. 6.6. Decision Context Entry Present at all assurance levels when decision tracking is enabled. Records a structured decision point where the AI agent considered multiple alternatives and selected one. { "d5e6f7a8b9c0d1e2...": { "type": "decision", "decision_point": "Choose HTTP framework for new service", "options": [ { "id": "express", "description": "Express.js - minimal, flexible", "pros": ["mature ecosystem", "lightweight"], "cons": ["manual middleware setup"] }, { "id": "fastify", "description": "Fastify - performance-focused", "pros": ["faster throughput", "schema validation"], "cons": ["smaller ecosystem"] } ], "selected": "fastify", "rationale": "Performance requirements favor Fastify; built-in schema validation reduces middleware dependencies.", "confidence": "high", "created_at": "2026-02-03T10:02:00Z" } } Fields: type REQUIRED. Always "decision". decision_point REQUIRED. String. Human-readable description of the decision being made. options REQUIRED. Array of objects. Each object: id REQUIRED. String. Unique identifier for this option. description REQUIRED. String. Brief description of the option. pros OPTIONAL. Array of strings. cons OPTIONAL. Array of strings. selected REQUIRED. String. The id of the chosen option from the options array. rationale REQUIRED. String. Explanation of why this option was selected. confidence OPTIONAL. String. One of: "high", "medium", "low". created_at REQUIRED. String. ISO-8601 timestamp. 6.7. 
Size Management At High assurance, reasoning traces can be substantial. Implementations SHOULD use these mechanisms: Inline compression: When reasoning_text exceeds the configured threshold (default 10,240 bytes), store as gzip-compressed base64: { "c3d4e5f6a7b8c9d0...": { "type": "reasoning", "reasoning_text_compressed": "H4sIAAAAAAAAA8tIzc...", "compressed": true, "reasoning_token_count": 2847, "created_at": "2026-02-03T14:30:05Z" } } External blobs: When entries exceed the external blob threshold (default 102,400 bytes), store in a separate file: { "c3d4e5f6a7b8c9d0...": { "type": "reasoning", "external": true, "blob_path": "blobs/c3d4e5f6a7b8c9d0.json.gz", "reasoning_token_count": 2847, "created_at": "2026-02-03T14:30:05Z" } } 7. Annotation Log The annotation log (.ai-audit/annotations.jsonl) is an append-only file that records all audit events. Each line is a self-contained JSON object representing one annotation. This is the canonical audit trail -- all other representations (query databases, reports) are derived from it. 7.1. Format Selection Rationale The annotation log uses JSON Lines (JSONL) format. Rationale: - Git-native. Each annotation is one line. git-diff shows exactly which records were added. Audit data is reviewable in pull requests, producing meaningful, human-readable diffs. - Append-only. Writing an annotation is appending a line. No transactions, no connection management, no binary format. - Universal. Every programming language can read and write JSON. No library dependencies beyond a JSON parser. - Queryable via tooling. DuckDB can query JSONL files directly without import. Command-line tools like jq enable ad-hoc filtering. Compliant tooling MAY generate a SQLite or DuckDB database for indexed queries (see Section 7.8). - Inspectable. Any text editor, grep, or cat can read the audit trail. No specialized tooling required for basic inspection. 
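The universality claim above is directly demonstrable: a standard-library JSON parser is all that is needed to filter the log. The following non-normative Python sketch assumes illustrative file paths and filter values:

```python
import json

def read_annotations(path: str, **filters):
    """Yield records from a VIBES annotation log whose fields match the
    given filters, e.g. type="line", file_path="src/auth.py"."""
    with open(path, encoding="utf-8") as log:
        for raw in log:
            raw = raw.strip()
            if not raw:
                continue  # tolerate a trailing newline (Section 7.10)
            record = json.loads(raw)  # each line is one JSON object
            if all(record.get(k) == v for k, v in filters.items()):
                yield record

# Illustrative usage: all line annotations touching src/auth.py.
# lines = list(read_annotations(".ai-audit/annotations.jsonl",
#                               type="line", file_path="src/auth.py"))
```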
Alternatives considered: SQLite: Full SQL queryability and indexing, but produces opaque binary diffs in git. The audit trail itself cannot be audited in version control. Retained as a derived query database (Section 7.8), not the canonical store. Binary formats (BSON, MessagePack, CBOR, Protobuf): Offer 15-60% size reduction but sacrifice git diffability and human readability. CBOR (IETF RFC 8949) MAY be considered as a future wire format for cross-tool interchange. 7.2. Record Types Each line in annotations.jsonl is a JSON object with a "type" field that determines its schema. Five record types are defined: line Line-level annotation. Associates a line range with context hashes. function Function-level annotation. Associates a function, method, or class with context hashes. session Session lifecycle event. Records session start and end events. edge Context graph relationship. Captures a directed link between two audit events or context entries. delegation Multi-agent orchestration event. Records the act of one session spawning or tasking another session. 7.3. Line Annotation Record Example: {"type":"line","file_path":"src/auth.py","line_start":1, "line_end":45,"environment_hash":"e7a3f1b2...", "prompt_hash":"a1b2c3d4...","action":"create", "timestamp":"2026-02-03T10:05:00Z","commit_hash":"abc123", "session_id":"550e8400-...","assurance_level":"medium"} Fields: type REQUIRED. Always "line". file_path REQUIRED. Relative path from project root. Forward slashes on all platforms. line_start REQUIRED. First line of the affected range. 1-based, inclusive. line_end REQUIRED. Last line of the affected range. 1-based, inclusive. environment_hash REQUIRED. References an environment entry in the manifest. command_hash OPTIONAL. References a command entry in the manifest. Null when no command was involved. prompt_hash OPTIONAL. References a prompt entry in the manifest. Null at Low assurance. reasoning_hash OPTIONAL. References a reasoning entry in the manifest. 
Null at Low and Medium assurance. action REQUIRED. One of: "create", "modify", "delete", "review", "rebase_remap", "rebase_orphan". See Section 7.9. timestamp REQUIRED. ISO-8601 timestamp of the event (UTC). commit_hash REQUIRED. Git commit SHA linking this annotation to version control history. Backfilled at commit time for annotations created during the session. session_id OPTIONAL. Groups related annotations within a tool session. UUID RECOMMENDED. assurance_level REQUIRED. One of: "low", "medium", "high". annotation_id REQUIRED. String. Content-derived SHA-256 identifier for this annotation record, computed per Section 5.4. Enables edge records to reference any annotation. decision_hash OPTIONAL. String. References a decision context entry in the manifest (Section 6.6). risk_score OPTIONAL. Number. Aggregate Generation Risk Score, 0.0-1.0. Computed by the implementation's scoring function (reference algorithm or custom). See Section 10.4. risk_factors OPTIONAL. Array. Array of individual signal assessments. Each element is an object with "signal" (string, REQUIRED), "value" (number 0.0-1.0, REQUIRED), "weight" (number, REQUIRED), and "reason" (string, OPTIONAL). Enables auditability of the score. See Section 10.4.4. 7.3.1. Line Number Stability Line numbers in annotation records are best-effort temporal references. They reflect the state of the file at annotation time and are accurate relative to the commit_hash recorded in the same annotation. They may become stale after operations that rewrite git history (rebase, squash, amend) or after subsequent edits that shift line ranges. Consumers MUST NOT assume that line_start and line_end are accurate relative to the current HEAD of a file. Content Anchoring Implementations SHOULD include the following optional fields on all line and function annotation records to enable content-based relocation after line shifts: anchor_context OPTIONAL. String. 
The first three lines of the annotated range at annotation time, truncated to 256 bytes. Enables fuzzy re-matching after line number shifts. anchor_hash OPTIONAL. String. SHA-256 of the full content from line_start through line_end at annotation time. Enables exact-match detection for content drift. file_content_hash OPTIONAL. String. SHA-256 of the entire file at annotation time. If this matches the current file content, line numbers are still valid and no anchor search is needed. When consuming annotations, implementations SHOULD check file_content_hash first -- if it matches the current file content, line numbers are still valid. If not, use anchor_context as a substring search to relocate the annotated range. Use anchor_hash to verify the relocated content is an exact match. Pre-Commit Hook for Rebase Remapping Projects that frequently rebase SHOULD install a pre-commit hook that recalculates line numbers for annotations affected by history rewrites. The hook MUST: 1. Read annotations.jsonl and identify line and function annotations whose commit_hash matches commits being rewritten. 2. Fast-path: For each affected annotation, compare file_content_hash against the current file's SHA-256. If they match, the file is unchanged -- skip anchor search and preserve the annotation as-is. 3. For each affected annotation with anchor_context, search the rewritten file for the anchor substring. 4. If the anchor is found at new line numbers, append a new annotation record with updated line_start and line_end, the new commit_hash, action "rebase_remap", and a supersedes edge referencing the original annotation_id. The original record is preserved (append-only invariant). 5. If the anchor is NOT found, append a record with action "rebase_orphan" marking the annotation as unmatchable. 6. Original records are NEVER modified or deleted. This hook is RECOMMENDED, not REQUIRED. 
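Steps 2 through 5 of the hook can be sketched as follows. This is a non-normative illustration -- the function name is hypothetical, and the caller is assumed to handle appending the resulting rebase_remap record (or a rebase_orphan record on None) plus the supersedes edge:

```python
import hashlib

def remap_annotation(ann: dict, new_file_text: str, new_commit: str):
    """Return an updated copy of a line annotation after a history
    rewrite, the annotation unchanged if the file is untouched, or
    None if the content anchor cannot be found."""
    data = new_file_text.encode("utf-8")
    # Step 2 fast path: unchanged file => line numbers still valid.
    if ann.get("file_content_hash") == hashlib.sha256(data).hexdigest():
        return ann
    # Step 3: search for the anchor context in the rewritten file.
    anchor = ann.get("anchor_context")
    if not anchor or anchor not in new_file_text:
        # Step 5: unmatchable -- caller appends a "rebase_orphan" record.
        return None
    # Step 4: compute the new 1-based start line from the anchor offset.
    new_start = new_file_text[: new_file_text.index(anchor)].count("\n") + 1
    span = ann["line_end"] - ann["line_start"]
    remapped = dict(ann, line_start=new_start, line_end=new_start + span,
                    commit_hash=new_commit, action="rebase_remap")
    remapped.pop("annotation_id", None)  # recomputed per Section 5.4
    return remapped
```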
Without it, line annotations degrade gracefully -- they remain valid relative to their recorded commit_hash but may not match the current file state.

Supersedes Edge for Rebase Remapping

When a rebase_remap annotation is emitted, the implementation MUST also append a supersedes edge record linking the new annotation to the original:

{"type":"edge","edge_type":"supersedes",
 "source_ref":"<new annotation_id>",
 "source_type":"annotation",
 "target_ref":"<original annotation_id>",
 "target_type":"annotation",
 "timestamp":"2026-02-03T10:10:00Z",
 "session_id":"..."}

This preserves the audit chain -- consumers can trace a remapped annotation back to its original record.

7.4. Function Annotation Record

Same fields as a line annotation, except that type is "function" and line_start and line_end are replaced by:

function_name REQUIRED. Name of the function, method, or class.

function_signature OPTIONAL. Full signature including parameters and return type.

Function annotations also support the annotation_id, decision_hash, risk_score, and risk_factors fields described in Section 7.3 (Line Annotation Record), with the same semantics as line annotations. See Section 10.4.

7.5. Session Record

Example:

{"type":"session","event":"start",
 "session_id":"550e8400-e29b-41d4-a716-446655440000",
 "timestamp":"2026-02-03T10:00:00Z",
 "environment_hash":"e7a3f1b2...",
 "assurance_level":"medium",
 "description":"Adding signup validation"}

{"type":"session","event":"end",
 "session_id":"550e8400-e29b-41d4-a716-446655440000",
 "timestamp":"2026-02-03T11:30:00Z"}

Fields:

type REQUIRED. Always "session".

event REQUIRED. One of: "start", "end".

session_id REQUIRED. Unique identifier. UUID RECOMMENDED.

timestamp REQUIRED. ISO-8601 timestamp.

environment_hash REQUIRED on "start" events. The primary environment for this session.

assurance_level REQUIRED on "start" events. The tier configured for this session.

description OPTIONAL. Human-readable session description.

parent_session_id OPTIONAL. String.
UUID of the parent session that delegated work to this session. Null for top-level sessions. agent_name OPTIONAL. String. Human-readable name of the agent instance (e.g., "worker-1", "reviewer"). agent_type OPTIONAL. String. Type or family of the AI agent (e.g., "claude-code", "cursor", "maestro"). 7.6. Edge Record An edge record captures a directed relationship between two audit events or context entries. Edges enable provenance graph construction and causal reasoning across the annotation log. Example: {"type":"edge","edge_type":"caused_by", "source_ref":"f9a8b7c6d5e4f3a2...", "source_type":"annotation", "target_ref":"a1b2c3d4e5f6a7b8...", "target_type":"context", "timestamp":"2026-02-03T10:06:00Z", "session_id":"550e8400-..."} Convention: edges read as "source [edge_type] target". The source is the effect; the target is the cause. Fields: type REQUIRED. Always "edge". edge_type REQUIRED. One of: "caused_by", "depends_on", "informed_by", "delegated_to", "supersedes", "reviewed_by". source_ref REQUIRED. Hash reference identifying the source node (annotation_id, context hash, or session_id). source_type REQUIRED. One of: "annotation", "context", "session". target_ref REQUIRED. Hash reference identifying the target node. target_type REQUIRED. One of: "annotation", "context", "session". timestamp REQUIRED. ISO-8601 timestamp. session_id OPTIONAL. UUID of the session that created this edge. metadata OPTIONAL. Object. Arbitrary key-value pairs for tool-specific edge properties. 7.7. Delegation Record A delegation record captures the act of one session (the parent) spawning or tasking another session (the child) for multi-agent orchestration. Example: {"type":"delegation", "parent_session_id":"aa110000-...", "child_session_id":"bb220000-...", "timestamp":"2026-02-04T09:00:00Z", "task_description":"Implement rate-limit middleware", "delegated_files":["src/middleware/rate_limit.py"], "delegation_type":"task"} Fields: type REQUIRED. Always "delegation". 
parent_session_id REQUIRED. UUID of the orchestrator or parent session. child_session_id REQUIRED. UUID of the spawned child session. timestamp REQUIRED. ISO-8601 timestamp. task_description OPTIONAL. String. Human-readable description of the delegated task. delegated_files OPTIONAL. Array of strings. File paths relevant to the delegated task. delegation_type OPTIONAL. One of: "task", "review", "test", "refactor", "other". parent_environment_hash OPTIONAL. String. Environment context hash of the orchestrator session. child_environment_hash OPTIONAL. String. Environment context hash of the child agent session. 7.8. Derived Query Database Compliant tooling MAY generate a query database from annotations.jsonl for indexed SQL queries. Two approaches are supported: Direct query (no import step): DuckDB can query JSONL files directly: SELECT * FROM read_json_auto('.ai-audit/annotations.jsonl') WHERE type = 'line' AND file_path = 'src/auth.py'; Generated SQLite database: Tooling MAY generate .ai-audit/audit.db from annotations.jsonl for persistent indexed queries. This database is a derived artifact and MUST be listed in .ai-audit/.gitignore. 7.9. Actions create The AI tool generated new code at this location. modify The AI tool changed existing code at this location. delete The AI tool removed code at this location. review The AI tool reviewed but did not change code at this location. High assurance MAY record review reasoning. rebase_remap Line or function annotation remapped to new line numbers after a history rewrite (rebase, squash, amend). The original annotation is preserved. A supersedes edge MUST link the new annotation to the original. See Section 7.3.1. rebase_orphan Line or function annotation that could not be remapped after a history rewrite. The content anchor was not found in the rewritten file. The original annotation is preserved but marked as orphaned. See Section 7.3.1. 7.10. JSONL Formatting Rules 1. Each record MUST be a single line. 
No line breaks within a record. 2. Records MUST be separated by a single newline character. 3. The file SHOULD end with a trailing newline. 4. Field order within a record is not significant, but implementations SHOULD use a consistent order. 5. Null values for optional fields MAY be omitted or explicitly included. Implementations MUST handle both cases. 8. Configuration 8.1. Configuration Schema The file .ai-audit/config.json defines project-level audit policy. { "standard": "VIBES", "standard_version": "1.0", "assurance_level": "medium", "project_name": "my-project", "tracked_extensions": [ ".py", ".js", ".ts", ".rs", ".go", ".java", ".c", ".cpp", ".rb" ], "exclude_patterns": [ "**/node_modules/**", "**/vendor/**", "**/.venv/**", "**/dist/**" ], "compress_reasoning_threshold_bytes": 10240, "external_blob_threshold_bytes": 102400, "risk_scoring": { "enabled": false } } Fields: standard REQUIRED. Always "VIBES". standard_version REQUIRED. Version of this standard. assurance_level REQUIRED. "low", "medium", or "high". project_name REQUIRED. Human-readable project name. tracked_extensions OPTIONAL. File extensions to track. Default: all files. exclude_patterns OPTIONAL. Glob patterns to skip. Default: empty. compress_reasoning_threshold_bytes OPTIONAL. Integer. Default: 10240. external_blob_threshold_bytes OPTIONAL. Integer. Default: 102400. risk_scoring OPTIONAL. Object. PRISM configuration. Default: null. When present, enables PRISM score computation. See Section 10.4.6 for the full schema definition including weights, thresholds, and review rules. 8.2. Storage Strategies for High Assurance Chain-of-thought traces at High assurance can range from 10 KB to 500 KB per generation event. Two mechanisms are defined: Inline compression: When reasoning_text exceeds the configured compress_reasoning_threshold_bytes, the manifest entry stores a gzip-compressed, base64-encoded value in the field "reasoning_text_compressed" and sets "compressed": true. 
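The inline-compression rule above can be sketched in Python (non-normative; encode_reasoning and decode_reasoning are illustrative helper names, not part of this standard):

```python
import base64
import gzip

COMPRESS_THRESHOLD = 10240  # default compress_reasoning_threshold_bytes


def encode_reasoning(entry: dict, threshold: int = COMPRESS_THRESHOLD) -> dict:
    """Apply inline compression to a manifest reasoning entry.

    When reasoning_text exceeds the threshold, replace it with a
    gzip-compressed, base64-encoded reasoning_text_compressed field
    and set "compressed": true. Smaller entries pass through unchanged.
    """
    raw = entry.get("reasoning_text", "").encode("utf-8")
    if len(raw) <= threshold:
        return entry
    out = dict(entry)
    del out["reasoning_text"]
    out["reasoning_text_compressed"] = base64.b64encode(
        gzip.compress(raw)).decode("ascii")
    out["compressed"] = True
    return out


def decode_reasoning(entry: dict) -> str:
    """Recover the original reasoning_text from either representation."""
    if entry.get("compressed"):
        blob = base64.b64decode(entry["reasoning_text_compressed"])
        return gzip.decompress(blob).decode("utf-8")
    return entry.get("reasoning_text", "")
```

A consumer that implements decode_reasoning in this way satisfies the MUST-read requirement for inline-compressed entries regardless of whether it ever writes them.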
External blobs: When entries exceed external_blob_threshold_bytes, the full content is stored in .ai-audit/blobs/<hash>.json.gz and the manifest entry includes "external": true with a "blob_path" field.

Implementations MUST support reading both compressed and external entries. Implementations MAY support writing only inline entries.

9. Tool Integration

9.1. Hook Points

Tools interact with the audit system at five points:

Session start: When the tool begins an interactive session. Data available: environment context.
Command execution: When the tool executes a shell command, writes a file, calls an API, or invokes any tool. Data available: command context.
Pre-generation: Before sending a prompt to the model. Data available: environment + prompt context.
Post-generation: After receiving the model's response. Data available: environment + prompt + reasoning context, generated code with line numbers.
Commit-time: When changes are committed to git. Data available: all contexts + commit hash.

9.2. Minimum Implementation by Level

Low assurance:

1. On session start, compute environment hash and write to manifest if new. Append a session start record to annotations.jsonl.
2. On command execution, compute command hash and write command entry to manifest if new.
3. On post-generation, append line annotation records to annotations.jsonl with environment_hash and command_hash.
4. On commit-time, backfill commit_hash on all annotation records created since the last commit. This field is REQUIRED -- annotations MUST be linked to their git commit before the audit data is considered complete.

Medium assurance:

1. Everything in Low.
2. On pre-generation, compute prompt hash and write prompt entry to manifest if new.
3. On post-generation, include prompt_hash in annotation records.

High assurance:

1. Everything in Medium.
2.
On post-generation, capture chain-of-thought output, compute reasoning hash, write to manifest (with compression if needed), include reasoning_hash in annotation records.

Edge tracking (all levels): Implementations MUST append edge records to annotations.jsonl whenever a causal, dependency, or informational link between two audit events can be established. At minimum, tools MUST emit caused_by edges linking code annotations to the prompt or command that triggered them, and delegated_to edges when spawning sub-agents. Tools SHOULD also emit informed_by edges when reading files before generation.

Decision tracking (all levels): Implementations MUST write a decision context entry to the manifest when the AI agent evaluates multiple approaches and selects one, and MUST include the decision_hash in the corresponding annotation records. This records why specific code was generated, not only what was generated.

Multi-agent delegation (all levels): When an orchestrator session spawns or tasks a child session, the orchestrator MUST append a delegation record to annotations.jsonl. The child session's start record MUST include parent_session_id referencing the orchestrator. Implementations SHOULD record agent_name and agent_type on session start records in multi-agent topologies.

9.3. Concurrency

Multiple tool instances MAY write to the same audit files simultaneously (e.g., parallel AI-assisted development in a monorepo). Implementations MUST:

- Use atomic append operations for annotations.jsonl. On POSIX systems, writes up to PIPE_BUF (at least 512 bytes; typically 4096 on Linux) to a file opened with O_APPEND are atomic. Typical JSONL records fit within this limit, so concurrent appends are safe; records that exceed PIPE_BUF require file locking to prevent interleaving.
- Use atomic file writes (write to temp file, rename) when updating manifest.json.
- Use file locking or retry logic for manifest updates.
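The atomic-append requirement can be sketched as follows (non-normative; append_record is an illustrative name, and the sketch assumes each serialized record fits within PIPE_BUF):

```python
import json
import os


def append_record(log_path: str, record: dict) -> None:
    """Append one annotation record to annotations.jsonl atomically.

    A single write() to a file opened with O_APPEND is atomic on POSIX
    for payloads up to PIPE_BUF, so the entire serialized line (record
    plus trailing newline) is emitted in one call and cannot interleave
    with concurrent writers.
    """
    line = json.dumps(record, separators=(",", ":")) + "\n"
    fd = os.open(log_path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.write(fd, line.encode("utf-8"))
    finally:
        os.close(fd)
```

Note the single os.write call: splitting the record across multiple writes would forfeit the POSIX atomicity guarantee that the concurrency rules rely on.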
Known scaling characteristic (v1.0): The file-locking approach for manifest.json creates a write bottleneck in workflows with many concurrent agents (e.g., 10+ parallel workers in a monorepo). Each agent MUST acquire a lock, read, merge, and write the entire manifest file serially. This is appropriate for most single-agent and moderate multi-agent workflows but does not scale to high-concurrency orchestration.

Future direction: A subsequent version of this standard MAY introduce per-agent write-ahead logs (e.g., .ai-audit/wal/<agent-id>.jsonl) that are merged into the canonical manifest at commit time. This eliminates lock contention during active development while preserving the single-manifest invariant at rest. Implementations experimenting with write-ahead approaches SHOULD document their strategy and ensure the merge step is atomic.

Interim mitigation: As a simpler alternative, implementations MAY convert manifest.json to append-only JSONL format (one context entry per line), eliminating the read-modify-write cycle. Duplicate entries are resolved by hash at query time or during audit.db rebuild.

10. Querying and Reporting

The annotation log supports multiple query methods, from simple command-line tools to full SQL.

10.1. Command-Line Queries

Find all annotations for a file:

grep '"src/auth.py"' .ai-audit/annotations.jsonl | jq .

List unique environment hashes:

jq -s '[.[] | select(.type == "line") | .environment_hash] | unique' \
  .ai-audit/annotations.jsonl

10.2. SQL Queries (DuckDB -- direct, no import)

DuckDB can query annotations.jsonl directly without an import step.

How many lines of a file were AI-generated?

SELECT file_path, SUM(line_end - line_start + 1) AS ai_lines
FROM read_json_auto('.ai-audit/annotations.jsonl')
WHERE type = 'line' AND file_path = 'src/auth.py' AND action = 'create'
GROUP BY file_path;

(Divide ai_lines by the file's total line count to obtain a percentage.)

Which models contributed to a function?
SELECT DISTINCT function_name, environment_hash
FROM read_json_auto('.ai-audit/annotations.jsonl')
WHERE type = 'function' AND file_path = 'src/auth.py'
  AND function_name = 'validate_signup';

All code generated from a specific prompt:

SELECT file_path, line_start, line_end, action, timestamp
FROM read_json_auto('.ai-audit/annotations.jsonl')
WHERE type = 'line' AND prompt_hash LIKE 'b2c3d4e5f6a7b8c9%'
ORDER BY file_path, line_start;

Timeline of AI modifications to a file:

SELECT timestamp, line_start, line_end, action, environment_hash, commit_hash
FROM read_json_auto('.ai-audit/annotations.jsonl')
WHERE type = 'line' AND file_path = 'src/auth.py'
ORDER BY timestamp;

10.3. SQL Queries (Generated SQLite Database)

If a query database has been generated (see Section 7.8), the same queries work with standard SQLite syntax against audit.db:

SELECT file_path, SUM(line_end - line_start + 1) AS ai_lines
FROM line_annotations
WHERE file_path = 'src/auth.py' AND action = 'create'
GROUP BY file_path;

10.4. PRISM (Provenance & Risk Intelligence Scoring Model)

PRISM is an OPTIONAL, standalone extension on top of VIBES that integrates with VERIFY for attested risk scores and with EVOLVE for agent learning feedback. It quantifies the risk profile of an AI-generated code change based on metadata available at annotation time, transforming audit data from a passive log into an actionable signal for CI/CD pipelines and human reviewers.

PRISM is designed as a framework, not a fixed formula. This standard defines a signal vocabulary and a reference scoring algorithm. Implementations MAY use the reference algorithm as-is, customize its weights, or substitute an entirely different scoring function. The only normative requirement is the storage format -- if an implementation computes a risk score, it MUST store it using the fields defined in Section 10.4.4.

10.4.1. Signal Vocabulary

The following named signals MAY contribute to a risk score.
Each signal produces a normalized value between 0.0 (lowest risk) and 1.0 (highest risk). Implementations are not required to compute all signals -- unavailable signals are omitted from the risk_factors array. temperature Source: environment context. Model temperature parameter. Higher temperature means higher randomness and higher risk. Normalize: min(temperature / 2.0, 1.0). action_type Source: annotation record. Risk varies by action. Reference values: create = 0.6, modify = 0.4, delete = 0.8, review = 0.1. scope_lines Source: annotation record. Normalized line count: (line_end - line_start + 1) / total_file_lines. Larger scope means higher risk. Cap at 1.0. scope_files Source: session context. Number of files touched in the session, normalized against a configurable baseline (default: 20 files = 1.0). Cross-cutting changes indicate higher risk. assurance_gap Source: config + annotation. Less metadata captured means higher risk. low = 1.0, medium = 0.5, high = 0.0. human_review_present Source: edge records. 0.0 if a reviewed_by edge exists targeting this annotation, 1.0 if no review edge is found. prompt_token_count Source: prompt context. Normalized prompt length. Longer prompts indicate more complex instructions. Normalize against configurable baseline (default: 2000 tokens = 1.0). model_capability_tier Source: environment context or registry. Self-reported or registry-derived capability tier. Smaller or older models indicate higher risk. Implementation-defined normalization. Extensibility: Implementations MAY define additional custom signals. Custom signal names SHOULD use a namespace prefix (e.g., "custom:security_sensitive", "org:compliance_tier") to avoid collision with future standard signals. 10.4.2. Reference Scoring Algorithm This standard provides a reference algorithm as a default. Implementations MAY use this algorithm, customize its weights, or substitute their own scoring function entirely. 
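For illustration, the weighted-average computation over whatever signals are available can be sketched in Python (non-normative; prism_score is an illustrative name, and the defaults are the weights listed in this section):

```python
# Default weights from the reference algorithm (Section 10.4.2).
DEFAULT_WEIGHTS = {
    "temperature": 0.15,
    "action_type": 0.15,
    "scope_lines": 0.15,
    "scope_files": 0.10,
    "assurance_gap": 0.10,
    "human_review_present": 0.15,
    "prompt_token_count": 0.10,
    "model_capability_tier": 0.10,
}


def prism_score(signals: dict, weights: dict = DEFAULT_WEIGHTS) -> float:
    """Weighted average over the signals that are actually available.

    Unavailable signals are simply absent from `signals`; dividing by
    the sum of the remaining weights redistributes their weight
    proportionally, as the reference algorithm requires.
    """
    available = {name: v for name, v in signals.items() if name in weights}
    if not available:
        return 0.0
    total_weight = sum(weights[name] for name in available)
    weighted_sum = sum(weights[name] * value
                       for name, value in available.items())
    return weighted_sum / total_weight
```

Because the divisor is the sum of only the available weights, a run with five of the eight signals renormalizes automatically with no explicit redistribution step.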
The reference algorithm computes PRISM as a weighted average of available signals:

PRISM = sum(weight_i * value_i) / sum(weight_i) for all available signals

Signals that are unavailable (e.g., prompt_token_count at Low assurance) are excluded from both the numerator and the denominator; dividing by the sum of the remaining weights redistributes their weight proportionally across the available signals.

Default weights:

   Signal                     Default Weight
   -------------------------  --------------
   temperature                0.15
   action_type                0.15
   scope_lines                0.15
   scope_files                0.10
   assurance_gap              0.10
   human_review_present       0.15
   prompt_token_count         0.10
   model_capability_tier      0.10

Alternative: Rule-based gating. Organizations MAY skip the scoring algorithm entirely and define rule-based triggers instead. For example: "require review if temperature > 0.8 AND action_type == 'create' AND scope_lines > 100". Rule-based gating is more transparent and auditable than score thresholds. See Section 10.4.6 for the review_rules configuration.

10.4.3. Severity Bands

The following bands are RECOMMENDED defaults for the reference algorithm. Organizations SHOULD configure thresholds based on their risk appetite.

   Band      PRISM Range  Recommended Action
   --------  -----------  ----------------------------------------------
   Low       0.00 - 0.29  No special action required
   Medium    0.30 - 0.59  Flag for awareness in PR summary
   High      0.60 - 0.79  Recommend human review before merge
   Critical  0.80 - 1.00  Block merge; require senior or security review

10.4.4. Storage Format

If an implementation computes risk information, it MUST store it using these fields on line and function annotation records:

risk_score OPTIONAL. Number. Aggregate risk score, 0.0-1.0. Computed by the implementation's scoring function (reference algorithm or custom).
risk_factors OPTIONAL. Array. Array of individual signal assessments. Each element is an object with the following fields:
signal REQUIRED. String. Signal name from the vocabulary (Section 10.4.1) or a namespaced custom signal.
value REQUIRED. Number.
Normalized signal value, 0.0-1.0.
weight REQUIRED. Number. Weight applied to this signal in the scoring function.
reason OPTIONAL. String. Human-readable explanation of the signal value.

Example annotation with PRISM:

{"type":"line","file_path":"src/auth.py", "line_start":1,"line_end":45,..., "risk_score":0.79,"risk_factors":[ {"signal":"temperature","value":0.9, "weight":0.15, "reason":"Near-maximum randomness"}, {"signal":"action_type","value":0.6, "weight":0.15}, {"signal":"scope_lines","value":0.85, "weight":0.15, "reason":"45 of 53 lines (85%)"}, {"signal":"assurance_gap","value":0.5, "weight":0.10}, {"signal":"human_review_present","value":1.0, "weight":0.15, "reason":"No review edge found"}]}

(With these five available signals the reference algorithm yields 0.5525 / 0.70 = 0.79.)

The risk_factors array provides transparency -- reviewers and CI/CD systems can inspect which signals drove the score, not just the aggregate number.

10.4.5. CI/CD Integration

Pipelines can query PRISM scores to enforce review policies.

Score-based gating (bash):

# Fail if any annotation has PRISM >= 0.8
MAX_PRISM=$(jq -rs \
  '[.[] | select(.type == "line" and .risk_score != null) | .risk_score] | max // 0' \
  .ai-audit/annotations.jsonl)
if (( $(echo "$MAX_PRISM >= 0.8" | bc -l) )); then
  echo "BLOCKED: Critical risk ($MAX_PRISM)."
  exit 1
fi

Factor-based gating (bash):

# Fail if any annotation has temperature > 0.9
HIGH_TEMP=$(jq -rs \
  '[.[] | select(.risk_factors != null) | .risk_factors[] | select(.signal == "temperature" and .value > 0.9)] | length' \
  .ai-audit/annotations.jsonl)
if [ "$HIGH_TEMP" -gt 0 ]; then
  echo "BLOCKED: $HIGH_TEMP high-temp annotations."
  exit 1
fi

DuckDB query -- high-risk annotations in a commit:

SELECT file_path, line_start, line_end, risk_score, risk_factors
FROM read_json_auto('.ai-audit/annotations.jsonl')
WHERE type = 'line' AND risk_score >= 0.6 AND commit_hash = '<commit_hash>'
ORDER BY risk_score DESC;

PR summary generation (bash):

jq -rs '
  [.[] | select(.type == "line" and .risk_score != null and .risk_score >= 0.3)]
  | group_by(.file_path)
  | map({ file: .[0].file_path, max_risk: ([.[].risk_score] | max), count: length})
  | sort_by(-.max_risk)
  | .[]
  | "\(.file) - max PRISM: \(.max_risk) (\(.count) annotations)"
' .ai-audit/annotations.jsonl

10.4.6. PRISM Configuration

PRISM is configured via an optional risk_scoring object in config.json:

{ "standard": "VIBES", "standard_version": "1.0", "assurance_level": "medium", "risk_scoring": { "enabled": true, "algorithm": "weighted_average", "weights": { "temperature": 0.15, "action_type": 0.15, "scope_lines": 0.15, "scope_files": 0.10, "assurance_gap": 0.10, "human_review_present": 0.15, "prompt_token_count": 0.10, "model_capability_tier": 0.10 }, "thresholds": { "low": 0.00, "medium": 0.30, "high": 0.60, "critical": 0.80 }, "review_rules": [ { "condition": "temperature > 0.8 AND action_type == 'create' AND scope_lines > 100", "action": "require_review" }, { "condition": "assurance_gap == 1.0", "action": "flag" } ] } }

Fields:

risk_scoring OPTIONAL. Object. PRISM configuration. When absent or null, PRISM is disabled.
risk_scoring.enabled OPTIONAL. Boolean. Default: false. Enable PRISM computation.
risk_scoring.algorithm OPTIONAL. String. Default: "weighted_average". Scoring algorithm identifier. "weighted_average" selects the reference algorithm; implementations MAY define custom identifiers.
risk_scoring.weights OPTIONAL. Object. Signal weight overrides. Keys are signal names, values are numeric weights. Weights will be normalized by the scoring function. Default: see Section 10.4.2.
risk_scoring.thresholds OPTIONAL. Object. Severity band boundaries.
Keys are band names ("low", "medium", "high", "critical"), values are the lower bound of each band. Default: see Section 10.4.3.
risk_scoring.review_rules OPTIONAL. Array. Rule-based gating conditions as an alternative to score thresholds. Each element has:
condition REQUIRED. String. Boolean expression over signal names and values.
action REQUIRED. String. One of: "require_review", "block", "flag".
Default: empty array.

11. Attestation (Informative)

VIBES audit data MAY be accompanied by cryptographic attestations that provide:
(a) Integrity: SHA-256 digests of .ai-audit/ files bound to a signed envelope.
(b) Authenticity: Ed25519 digital signatures identifying the attesting entity.
(c) Timestamping: Optional Bitcoin-anchored timestamps via OpenTimestamps proving existence before a given time.

11.1. Attestation Format

Attestations use the DSSE (Dead Simple Signing Envelope) format wrapping in-toto v1 attestation statements.

Envelope payloadType: "application/vnd.in-toto+json"
Predicate type: "https://itsavibe.ai/vibes/attestation/v1"

The DSSE envelope structure:

{ "payloadType": "application/vnd.in-toto+json", "payload": "<base64-encoded in-toto statement>", "signatures": [{ "keyid": "<16-char hex key identifier>", "sig": "<base64-encoded Ed25519 signature>" }] }

The predicate contains validation results, project metadata, manifest statistics, and annotation statistics.

11.2. Subject Binding

The attestation subject binds to three files:
- .ai-audit/manifest.json (SHA-256)
- .ai-audit/annotations.jsonl (SHA-256)
- .ai-audit/config.json (SHA-256)
Any modification to these files invalidates the attestation.

11.3. Key Management

Attestation keys are Ed25519 keypairs stored at ~/.vibescheck/keys/ by default. Two files are maintained:
vibescheck.key Private key (PKCS8 PEM, mode 0600).
vibescheck.pub Public key (SPKI PEM, mode 0644).
The key identifier is the first 16 hexadecimal characters of the SHA-256 hash of the DER-encoded public key. The public key (vibescheck.pub) SHOULD be distributed to verifiers.
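The key-identifier derivation can be sketched in Python (non-normative; key_identifier is an illustrative name, and extracting the DER bytes from a PEM file would require additional parsing not shown here):

```python
import hashlib


def key_identifier(der_public_key: bytes) -> str:
    """Derive the attestation key identifier: the first 16 hexadecimal
    characters of the SHA-256 hash of the DER-encoded public key."""
    return hashlib.sha256(der_public_key).hexdigest()[:16]
```

For example, key_identifier(b"abc") returns "ba7816bf8f01cfea", the first 16 hex characters of the well-known SHA-256 digest of "abc"; a real verifier would apply the function to the DER bytes decoded from vibescheck.pub.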
The private key MUST be kept confidential and MUST NOT be committed to version control. 11.4. Submission Signed attestations MAY be submitted to a public registry for discoverability and third-party verification. The submission endpoint accepts the full DSSE envelope and stores it keyed by a content-derived identifier. 11.5. Timestamping Attestations MAY be anchored to the Bitcoin blockchain via OpenTimestamps. The OTS proof file is stored alongside the attestation envelope (e.g., attestation.intoto.jsonl.ots). Timestamping provides a trust-minimized proof that the attestation existed before a specific block height, independent of any certificate authority. 11.6. Trust Model An attestation proves that a specific entity (identified by their Ed25519 key) signed specific file hashes at a specific time. It does NOT prove: - That the audit data itself is correct. The accuracy of manifest entries and annotations is the auditor's responsibility. - That the code behaves as intended. Attestation covers the audit metadata, not the audited source code. - That the signer is trustworthy. Trust in the signing entity must be established out of band. 12. Versioning and Migration 12.1. Standard Versioning The VIBES standard follows semantic versioning: MAJOR: Breaking changes to file formats, hash algorithms, or record structure. MINOR: New optional fields, new record types, new query capabilities. PATCH: Clarifications, editorial corrections, examples. Tools MUST check manifest "version" and config "standard_version" before reading or writing audit data. If the major version is unsupported, the tool MUST refuse to modify audit data and SHOULD warn the user. 12.2. Annotation Log Compatibility The JSONL annotation log is append-only and inherently forward- compatible: - New optional fields added in minor versions are ignored by older tools. 
- New record types (beyond "line", "function", "session", "edge", "delegation") added in minor versions MUST be silently skipped by tools that do not recognize them. - Tools MUST NOT modify or remove existing records in the annotation log. - Tools MUST preserve unrecognized fields when copying or processing annotation records. 12.3. Manifest Compatibility New entry types or fields added in minor versions MUST be ignored by tools that do not understand them. Tools MUST NOT remove or modify manifest entries they do not recognize. 12.4. Derived Database Regeneration Since the query database (audit.db) is a derived artifact, schema changes to the generated database do not require migration. Tools simply regenerate the database from the annotation log using the current schema. 13. Security Considerations 13.1. Sensitive Data in Prompts Medium and High assurance levels capture prompt text and reasoning traces, which may contain sensitive information (API keys, internal documentation, proprietary logic). Projects operating at these levels SHOULD: - Review audit data before committing to public repositories. - Use .gitignore to exclude manifest.json or specific blob files if they contain sensitive prompts. - Consider encrypting sensitive manifest entries at rest. 13.2. Hash Integrity If audit data integrity is critical, projects MAY: - Sign manifest.json using GPG or sigstore. - Compute a root hash over all manifest entries (Merkle tree) and store it in a signed git tag. - Verify manifest integrity in CI pipelines. 13.3. Tamper Detection The content-addressable hash system provides basic tamper detection: if an entry's content is modified, its hash will no longer match. Tools SHOULD verify hash consistency when reading the manifest. 14. References 14.1. Normative References [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, March 1997. 
[FIPS180] National Institute of Standards and Technology, "Secure Hash Standard (SHS)", FIPS PUB 180-4, August 2015. [JSONL] JSON Lines format, https://jsonlines.org/. [SQLITE] Hipp, D. R., "SQLite", https://www.sqlite.org/. [DUCKDB] DuckDB Foundation, "DuckDB", https://duckdb.org/. 14.2. Informative References [SLSA] Supply-chain Levels for Software Artifacts, https://slsa.dev/. [SBOM] NTIA, "Software Bill of Materials", https://www.ntia.gov/sbom. [EUAIACT] European Parliament, "Regulation on Artificial Intelligence (AI Act)", 2024. [DSSE] Secure Systems Lab, "Dead Simple Signing Envelope", https://github.com/secure-systems-lab/dsse. [INTOTO] in-toto project, "in-toto Attestation Framework", https://github.com/in-toto/attestation. [OTS] OpenTimestamps, "OpenTimestamps: Scalable, Trustless, Distributed Timestamping with Bitcoin", https://opentimestamps.org/. 14.3. Companion Specifications VERIFY (Validated Evidence & Representation for Intelligent sYstems) extends VIBES with cryptographic attestation using DSSE envelopes, Ed25519 signatures, and a public attestation registry. PRISM (Provenance & Risk Intelligence Scoring Model) extends VIBES with risk scoring, severity bands, and CI/CD gating criteria. EVOLVE (Explainable Validated Optimization & Learning Via Execution) extends VIBES with agent learning, governance frameworks, and reinforcement feedback pipelines for any domain. All three extensions build on the VIBES data model defined in this document and are specified separately. Appendix A. Complete Example A small project at Medium assurance after two AI coding sessions. 
.ai-audit/config.json: { "standard": "VIBES", "standard_version": "1.0", "assurance_level": "medium", "project_name": "signup-service", "tracked_extensions": [".py"], "exclude_patterns": ["**/test/**"] } .ai-audit/manifest.json: { "standard": "VIBES", "version": "1.0", "entries": { "e7a3f1b2c4d5...": { "type": "environment", "tool_name": "Claude Code", "tool_version": "1.5.2", "model_name": "claude-opus-4-5", "model_version": "20251101", "model_parameters": {"temperature": 1.0}, "created_at": "2026-02-03T10:00:00Z" }, "a1b2c3d4e5f6...": { "type": "prompt", "prompt_text": "Create a signup endpoint that validates email format and password strength.", "prompt_type": "user_instruction", "prompt_context_files": ["src/models/user.py"], "created_at": "2026-02-03T10:05:00Z" }, "b2c3d4e5f6a7...": { "type": "prompt", "prompt_text": "Add rate limiting to the signup endpoint. Max 5 attempts per IP per hour.", "prompt_type": "user_instruction", "prompt_context_files": [ "src/routes/auth.py", "src/middleware/rate_limit.py" ], "created_at": "2026-02-03T14:20:00Z" } } } .ai-audit/annotations.jsonl: {"type":"session","event":"start","session_id":"550e8400-...", "timestamp":"2026-02-03T10:00:00Z", "environment_hash":"e7a3f1b2c4d5...", "assurance_level":"medium"} {"type":"line","file_path":"src/routes/auth.py","line_start":1, "line_end":45,"environment_hash":"e7a3f1b2c4d5...", "prompt_hash":"a1b2c3d4e5f6...","action":"create", "timestamp":"2026-02-03T10:05:00Z","commit_hash":"abc123", "session_id":"550e8400-...","assurance_level":"medium"} {"type":"session","event":"end","session_id":"550e8400-...", "timestamp":"2026-02-03T11:30:00Z"} {"type":"session","event":"start","session_id":"660f9511-...", "timestamp":"2026-02-03T14:00:00Z", "environment_hash":"e7a3f1b2c4d5...", "assurance_level":"medium", "description":"Adding rate limiting"} {"type":"line","file_path":"src/routes/auth.py","line_start":12, "line_end":28,"environment_hash":"e7a3f1b2c4d5...", 
"prompt_hash":"b2c3d4e5f6a7...","action":"modify", "timestamp":"2026-02-03T14:20:00Z","commit_hash":"def456", "session_id":"660f9511-...","assurance_level":"medium"} {"type":"line","file_path":"src/middleware/rate_limit.py","line_start":1, "line_end":32,"environment_hash":"e7a3f1b2c4d5...", "prompt_hash":"b2c3d4e5f6a7...","action":"create", "timestamp":"2026-02-03T14:22:00Z","commit_hash":"def456", "session_id":"660f9511-...","assurance_level":"medium"} {"type":"session","event":"end","session_id":"660f9511-...", "timestamp":"2026-02-03T15:00:00Z"} Multi-agent delegation example. A Maestro orchestrator session delegates a task to a worker agent. Delegation record (appended to annotations.jsonl by orchestrator): {"type":"delegation", "parent_session_id":"aa110000-0000-0000-0000-000000000001", "child_session_id":"bb220000-0000-0000-0000-000000000002", "timestamp":"2026-02-04T09:00:00Z", "task_description":"Implement rate-limit middleware", "delegated_files":["src/middleware/rate_limit.py"], "delegation_type":"task"} Child session start with parent_session_id: {"type":"session","event":"start", "session_id":"bb220000-0000-0000-0000-000000000002", "parent_session_id":"aa110000-0000-0000-0000-000000000001", "agent_name":"worker-1","agent_type":"claude-code", "timestamp":"2026-02-04T09:00:05Z", "environment_hash":"e7a3f1b2c4d5...", "assurance_level":"medium"} Edge record linking the child's annotation back to the delegation: {"type":"edge", "edge_type":"caused_by", "source_ref":"f9a8b7c6d5e4f3a2...", "source_type":"annotation", "target_ref":"bb220000-0000-0000-0000-000000000002", "target_type":"session", "timestamp":"2026-02-04T09:05:00Z", "session_id":"bb220000-0000-0000-0000-000000000002"} Appendix B. Relationship to Existing Standards SBOM (Software Bill of Materials): VIBES complements SBOMs by tracking the AI tools that generated code, similar to how SBOMs track dependencies. 
SLSA (Supply-chain Levels for Software Artifacts): VIBES assurance levels parallel SLSA build levels. High assurance provides provenance equivalent to SLSA Level 3+. Git blame: VIBES extends beyond git blame by distinguishing AI-generated lines from human-written lines and tracking the model/prompt context that produced them. EU AI Act: For AI systems that generate code used in regulated domains, VIBES High assurance provides the traceability required by the Act's transparency obligations. Context Graphs / Provenance DAGs: VIBES edge records (Section 7.6) enable directed acyclic graph (DAG) based provenance tracking. Each annotation, context entry, or session becomes a node; edge records capture typed relationships (caused_by, depends_on, informed_by, etc.) between those nodes. The resulting provenance DAG is similar in spirit to academic provenance graph systems such as the W3C PROV model and OPM (Open Provenance Model), but is purpose-built for AI-assisted software development workflows and multi-agent orchestration. Authors' Addresses itsavibe.ai https://itsavibe.ai https://github.com/openasocket/itsavibe.ai