The Conformance Contract

How multiple enforcement kernels prove they enforce the same Laws


The Codex defines what must be true. The conformance suite proves it is.


The Multi-Kernel Problem

AgentVector is a constitutional governance framework. The Codex defines eleven composable Laws. Enforcement kernels implement those Laws in language-specific runtimes. SwiftVector enforces them in Swift. RustVector will enforce them in Rust. TSVector will enforce them in TypeScript.

This creates an immediate and serious question: how do you know they agree?

A Law that rejects a filesystem write outside writable paths in SwiftVector but permits it in TSVector is not a governance framework — it is a governance contradiction. Two kernels enforcing different rules under the same Codex is worse than having no Codex at all, because it creates the illusion of safety where none exists.

The conformance contract is how AgentVector solves this. It is not a handshake agreement. It is not a specification document that implementers read and interpret. It is a machine-verifiable test suite that every kernel must pass to claim AgentVector compliance.


What the Contract Is

The conformance contract is a set of JSON fixtures. Each fixture is a complete test case: an input state, a proposed action, a governance configuration, and the expected verdict.

{
  "name": "Law 0: reject write outside writable paths",
  "law": 0,
  "version": 1,
  "input": {
    "state": {
      "writablePaths": ["/Users/operator/Documents"],
      "currentMode": "active"
    },
    "action": {
      "type": "fileWrite",
      "path": "/etc/passwd",
      "content": "..."
    },
    "config": {
      "bypassGate": false
    }
  },
  "expected": {
    "decision": "reject",
    "reason_contains": "outside writable"
  }
}

This fixture says: given this state and this action under this configuration, any AgentVector-compliant kernel must reject the action, and the rejection reason must indicate the path was outside writable boundaries.

The fixture does not specify how the kernel reaches this verdict. It does not require a particular data structure, algorithm, or error message format. It specifies the contract: these inputs must produce this class of output. The implementation is the kernel’s business. The verdict is the Codex’s.
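A minimal sketch of what "this class of output" could mean in code. The `Decision`, `Verdict`, and `Expected` types and the `satisfies` function are hypothetical names introduced for illustration, not any kernel's API. The reason is matched by plain, case-sensitive substring, which is an assumption about how `reason_contains` is interpreted.

```swift
import Foundation  // for String.contains(_:) with a String argument

// Hypothetical types for illustration; not taken from any kernel's API.
enum Decision: String { case accept, reject, escalate }

struct Verdict {
    let decision: Decision
    let reason: String
}

struct Expected {
    let decision: Decision
    let reasonContains: String?   // mirrors the fixture's "reason_contains"
}

/// True when the verdict falls in the expected decision class and,
/// if the fixture constrains the reason, the reason contains that text.
func satisfies(_ verdict: Verdict, _ expected: Expected) -> Bool {
    guard verdict.decision == expected.decision else { return false }
    guard let needle = expected.reasonContains else { return true }
    return verdict.reason.contains(needle)
}

let expected = Expected(decision: .reject, reasonContains: "outside writable")

// Two kernels may word the rejection differently and both satisfy the fixture.
let kernelA = Verdict(decision: .reject, reason: "Path /etc/passwd is outside writable paths")
let kernelB = Verdict(decision: .reject, reason: "write denied: target outside writable boundary")
// A kernel that permits the write does not, whatever its reason says.
let kernelC = Verdict(decision: .accept, reason: "outside writable")

print(satisfies(kernelA, expected), satisfies(kernelB, expected), satisfies(kernelC, expected))
// prints "true true false"
```

The decision is compared exactly and the wording is not, which is the split between governance decision and implementation detail that the fixture format relies on.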


Why Fixtures, Not Specifications

The obvious alternative is a prose specification: a document that describes what each Law must do, with sufficient precision that implementers can build compliant kernels from the description alone.

This is how most standards work. It is also how most interoperability failures happen.

Prose specifications are ambiguous by nature. “The kernel must reject actions that exceed resource budgets” leaves open questions that only surface at implementation time. What does “exceed” mean — strictly greater than, or greater than or equal to? Does a budget of zero mean the resource is disabled or that the agent has no remaining allowance? What happens when multiple budgets are exhausted simultaneously — does the kernel report the first violation, all violations, or the most severe?

Each ambiguity is a potential conformance gap. Two implementers can read the same specification, make different reasonable interpretations, and produce kernels that disagree on edge cases. The disagreement might not surface for months, until an agent operating under TSVector does something that SwiftVector would have blocked.
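To make the "exceed" ambiguity concrete, here is a small sketch of two readings that an implementer could reasonably take from the same sentence. Both function names and the sample numbers are hypothetical. Neither reading is claimed to be what the Codex specifies.

```swift
// Two hypothetical readings of "reject actions that exceed the budget".
func exceedsStrict(spent: Int, budget: Int) -> Bool { spent > budget }
func exceedsInclusive(spent: Int, budget: Int) -> Bool { spent >= budget }

// Sample inputs: below the limit, exactly at the limit, and a zero budget.
let cases: [(spent: Int, budget: Int)] = [
    (spent: 99, budget: 100),
    (spent: 100, budget: 100),
    (spent: 0, budget: 0),
]

// Inputs on which the two readings return different answers.
let disagreements = cases.filter {
    exceedsStrict(spent: $0.spent, budget: $0.budget)
        != exceedsInclusive(spent: $0.spent, budget: $0.budget)
}

for c in disagreements {
    print("readings disagree at spent=\(c.spent), budget=\(c.budget)")
}
```

The two readings agree below the limit and split exactly at the boundary and at zero, which are the inputs a prose specification is least likely to pin down.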

Fixtures eliminate this class of failure. The edge case is not described in prose — it is encoded as a test. The zero-budget case is not left to interpretation — there is a fixture that specifies exactly what must happen. The simultaneous-exhaustion case is not a footnote — it is a JSON file with an expected verdict.

The specification still exists. The Codex defines the Laws in human-readable language. But the fixtures are the contract. When the prose and the fixture disagree, the fixture wins — because the fixture is what the test runner executes.


The Reference Kernel

The kernels do not all play the same role. AgentVector designates one kernel as the reference implementation, currently SwiftVector. The reference kernel has a specific role in the conformance ecosystem:

It generates the fixtures. When a new Law is implemented or an edge case is discovered, the fixture is written alongside the SwiftVector test. The Swift test asserts the behavior. The fixture captures it. They are two expressions of the same truth.

It resolves ambiguity. When a conformance question arises that the fixtures don’t cover, the reference kernel’s behavior is authoritative. A new fixture is created to capture the decision, and all other kernels must match it.

It sets the assurance bar. SwiftVector’s compile-time type safety, actor isolation, and deterministic memory management provide the highest-assurance enforcement. Other kernels meet the same behavioral contract through different mechanisms — Rust through ownership semantics, TypeScript through runtime checks. The reference kernel proves what is possible; the conformance suite proves what is required.

This is not a claim that SwiftVector is better. It is a structural role. The reference kernel is the single source of truth for what the Codex means in practice. Without this, the multi-kernel architecture degenerates into multiple implementations that happen to share a name.


Fixture Anatomy

Every conformance fixture follows the same structure:

Identity: A name, the Law it tests, and a version number. The version tracks fixture evolution: when a fixture's expected behavior is updated because a Law's behavior is refined, the version increments and all kernels must pass the updated fixture.

Input: The complete context the kernel needs to render a verdict. This always includes state (the current governance state), action (the proposed agent action), and config (the governance configuration, including any bypass flags, trust levels, or jurisdiction-specific parameters).

Expected output: The verdict the kernel must produce. This is deliberately minimal — it specifies the decision class (accept, reject, escalate) and may include a constraint on the reason (using reason_contains rather than exact string matching, because error message formatting is an implementation detail, not a governance decision).

Metadata: Optional fields for categorization — which jurisdiction the fixture applies to, whether it tests a boundary condition, and cross-references to related fixtures.
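The structure above could be mirrored in Swift roughly as follows. This is a sketch under stated assumptions: the `Fixture` type and its nested types are hypothetical, the `State`, `Action`, and `Config` shapes cover only the fields that appear in the Law 0 example fixture, and the optional metadata fields are left out because their names are not given here.

```swift
import Foundation

// Hypothetical Swift mirror of the Law 0 example fixture.
enum Decision: String, Codable { case accept, reject, escalate }

struct Fixture: Codable {
    struct State: Codable { let writablePaths: [String]; let currentMode: String }
    struct Action: Codable { let type: String; let path: String; let content: String }
    struct Config: Codable { let bypassGate: Bool }
    struct Input: Codable { let state: State; let action: Action; let config: Config }
    struct Expected: Codable {
        let decision: Decision
        let reasonContains: String?
        // Map the JSON key "reason_contains" to a Swift-style name.
        enum CodingKeys: String, CodingKey {
            case decision
            case reasonContains = "reason_contains"
        }
    }

    // Identity
    let name: String
    let law: Int
    let version: Int
    // Input and expected output
    let input: Input
    let expected: Expected
}

let json = """
{
  "name": "Law 0: reject write outside writable paths",
  "law": 0,
  "version": 1,
  "input": {
    "state": { "writablePaths": ["/Users/operator/Documents"], "currentMode": "active" },
    "action": { "type": "fileWrite", "path": "/etc/passwd", "content": "..." },
    "config": { "bypassGate": false }
  },
  "expected": { "decision": "reject", "reason_contains": "outside writable" }
}
"""

// Decoding fails loudly if the fixture does not match the expected shape.
let fixture = try JSONDecoder().decode(Fixture.self, from: Data(json.utf8))
print(fixture.name, fixture.law, fixture.expected.decision)
```

Modeling `decision` as an enum rather than a string means a fixture with an unknown decision class fails at load time instead of silently never matching.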

fixtures/
  law0/
    law0-reject-outside-boundary.json
    law0-allow-within-boundary.json
    law0-reject-path-traversal.json
    law0-boundary-with-symlink.json
  law4/
    law4-budget-transition-degraded.json
    law4-budget-transition-halted.json
    law4-zero-budget-at-init.json
    law4-simultaneous-exhaustion.json
  law8/
    law8-require-approval-delete.json
    law8-low-risk-auto-approve.json
    law8-escalation-chain.json

The directory structure mirrors the Laws. A kernel implementer working on Law 4 support runs the law4/ fixtures. A jurisdiction author composing Laws 0, 4, and 8 runs the corresponding directories. The full suite runs during CI to verify complete compliance.
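Selecting the fixtures for a composition of Laws could look something like the sketch below. The function name `fixturePaths(forLaws:in:)` is hypothetical, and it assumes the `lawN/` directory naming shown in the tree above. It works on a list of relative paths so that it does not depend on any particular file-system API.

```swift
/// Returns the fixture paths whose top-level directory is `law<N>`
/// for some N in `laws`. Input order is preserved.
func fixturePaths(forLaws laws: Set<Int>, in paths: [String]) -> [String] {
    paths.filter { path in
        guard let dir = path.split(separator: "/").first,
              dir.hasPrefix("law"),
              let number = Int(dir.dropFirst(3))   // "law4" -> 4
        else { return false }
        return laws.contains(number)
    }
}

let allFixtures = [
    "law0/law0-reject-outside-boundary.json",
    "law0/law0-allow-within-boundary.json",
    "law4/law4-zero-budget-at-init.json",
    "law8/law8-escalation-chain.json",
]

// A jurisdiction composing Laws 0 and 8 would run only these fixtures.
let selected = fixturePaths(forLaws: [0, 8], in: allFixtures)
selected.forEach { print($0) }
```

Parsing the Law number, rather than matching on a string prefix, keeps a directory such as `law10/` from being picked up when only Law 1 is requested.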


How a Kernel Passes

Each kernel provides a test runner — a program that loads fixtures, feeds them through the kernel’s reducer, and compares the output against the expected verdict.

The SwiftVector test runner is a standard XCTest suite:

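import XCTest

final class ConformanceTests: XCTestCase {
    // `reducer`, `loadFixtures(from:)`, and `decode(_:from:)` are assumed to be
    // defined elsewhere in the test target and are not shown here.
    func testConformance() throws {
        let fixtures = try loadFixtures(from: "law0/")
        // An empty directory must not pass vacuously.
        XCTAssertFalse(fixtures.isEmpty, "No fixtures found in law0/")

        for fixture in fixtures {
            // Rebuild typed inputs from the fixture's JSON.
            let state = try decode(GovernanceState.self, from: fixture.input.state)
            let action = try decode(AgentAction.self, from: fixture.input.action)
            let config = try decode(GovernanceConfig.self, from: fixture.input.config)

            let verdict = reducer.evaluate(state: state, action: action, config: config)

            // The decision class must match exactly.
            XCTAssertEqual(verdict.decision, fixture.expected.decision,
                "Fixture '\(fixture.name)' expected \(fixture.expected.decision), got \(verdict.decision)")

            // The reason is matched by substring, and only when the fixture constrains it.
            if let reasonContains = fixture.expected.reason_contains {
                XCTAssertTrue(verdict.reason.contains(reasonContains),
                    "Fixture '\(fixture.name)' reason should contain '\(reasonContains)'")
            }
        }
    }
}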

A TypeScript test runner would load the same JSON files through a different deserializer and feed them through TSVector’s reducer. A Rust test runner would do the same through serde. The fixtures are identical. The runners are language-native. The verdicts must agree.

A kernel passes conformance when every fixture in the suite produces the expected verdict. Partial compliance is not compliance — a kernel that passes 99 of 100 fixtures has a governance gap on the one it fails.
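The all-or-nothing rule can be sketched as a small check over per-fixture results. The `FixtureResult` type and both function names are hypothetical. Treating an empty result set as non-compliant is an assumption made here, on the grounds that a kernel which ran no fixtures has demonstrated nothing.

```swift
// Hypothetical record of one fixture's outcome.
struct FixtureResult {
    let name: String
    let passed: Bool
}

/// Names of the fixtures that did not produce the expected verdict.
func failingFixtures(_ results: [FixtureResult]) -> [String] {
    results.filter { !$0.passed }.map { $0.name }
}

/// Compliance is binary: at least one fixture ran, and none failed.
func isCompliant(_ results: [FixtureResult]) -> Bool {
    !results.isEmpty && failingFixtures(results).isEmpty
}

// 100 fixtures, 99 of which pass.
let results = (1...100).map { i in
    FixtureResult(name: "fixture-\(i)", passed: i != 42)
}

print(isCompliant(results))        // prints "false"
print(failingFixtures(results))    // prints ["fixture-42"]
```

There is deliberately no pass-rate or threshold in this sketch. The only useful output besides the boolean is the list of failures, since each one names a specific governance gap.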


When Kernels Disagree

Disagreement between kernels is not a bug to be papered over — it is a governance event that requires resolution. The conformance suite is designed to surface disagreements early, make them precise, and force explicit decisions.

Case 1: A new kernel fails an existing fixture. This is the common case during kernel development. The kernel implementer examines the fixture, understands the expected behavior, and fixes their implementation. The fixture is the authority.

Case 2: The reference kernel’s behavior changes. When SwiftVector’s behavior is updated — a bug fix, a Law refinement, a new edge case — the corresponding fixture is updated or a new one is created. All other kernels receive the updated fixture through the shared repository and must pass it. This is how governance decisions propagate across runtimes.

Case 3: A fixture is disputed. An implementer believes a fixture specifies incorrect behavior — the expected verdict doesn’t match the Law’s intent. This triggers a review process: the dispute is filed against the Codex repository, the fixture is examined against the Law’s prose definition, and a decision is rendered. If the fixture is wrong, it is corrected and all kernels re-test. If the fixture is right, the disputing kernel must conform.

Case 4: Language semantics force a difference. Floating-point behavior, integer overflow handling, and string comparison rules vary across languages. When a fixture touches language-specific behavior, the fixture must be designed to avoid false disagreements. Fixtures test governance logic, not arithmetic — if a verdict depends on whether 0.1 + 0.2 equals 0.3, the fixture is poorly designed.
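One way to keep a fixture away from arithmetic differences is to express quantities as integers in a fixed minor unit. The sketch below is illustrative only: the choice of hundredths as the unit and the variable names are assumptions, not part of the fixture format described here.

```swift
// With IEEE 754 doubles, 0.1 + 0.2 is not exactly 0.3, so a verdict that
// compares these values would hinge on floating-point representation.
let spentAsDouble: Double = 0.1 + 0.2
let limitAsDouble: Double = 0.3
let doublesAgree = spentAsDouble == limitAsDouble

// The same quantities in integer hundredths compare exactly, and integer
// addition of small values behaves the same way in Swift, Rust, and TypeScript.
let spentInHundredths = 10 + 20
let limitInHundredths = 30
let integersAgree = spentInHundredths == limitInHundredths

print(doublesAgree)    // prints "false"
print(integersAgree)   // prints "true"
```

A fixture written with integer units tests where the boundary is. A fixture written with fractional values risks testing how each language rounds.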


Trust Profiles

The conformance contract guarantees behavioral equivalence — every kernel produces the same verdicts for the same inputs. But behavioral equivalence is not the only dimension of trust. Different kernels provide different levels of implementation assurance.

SwiftVector provides compile-time type safety, actor-isolated concurrency, and deterministic memory management through ARC. Illegal states can be made unrepresentable at the type level. Under Swift 6 strict concurrency checking, data races across isolation boundaries are compiler errors. There is no garbage collector to introduce temporal non-determinism. Trust basis: the compiler enforces what the conformance suite tests.

RustVector provides ownership-based memory safety, zero-cost abstractions, and no_std support for bare-metal environments. The borrow checker prevents data races at compile time. Resource-constrained environments (drones, embedded systems) can run the full governance kernel without a runtime or allocator. Trust basis: the ownership model prevents classes of bugs that other languages detect only at runtime.

TSVector provides native integration with Node.js agent pipelines: no IPC overhead, no sidecar process, no language bridge. Its reliance on runtime checks is offset by the conformance suite: TypeScript cannot make illegal states unrepresentable at compile time, but it can demonstrate through the fixture suite that its reducer produces the same verdicts as the reference kernel on every encoded case. Trust basis: conformance tests plus integration simplicity.

These are not rankings. They are trade-offs. A desktop agent framework might choose TSVector for integration simplicity. A drone system might choose RustVector for bare-metal deployment. An Apple platform agent chooses SwiftVector for native ecosystem integration. The conformance contract guarantees that the governance is identical regardless of which kernel enforces it.
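As an illustration of what "unrepresentable at the type level" can look like in Swift, here is a sketch of a verdict type. The `Verdict` enum, its cases, and `describe` are hypothetical and are not taken from SwiftVector.

```swift
// A verdict modeled as an enum with associated values. A rejection cannot
// exist without a reason, and an acceptance cannot carry one, because the
// type has no way to express either state.
enum Verdict: Equatable {
    case accept
    case reject(reason: String)
    case escalate(approver: String)
}

// The switch must be exhaustive. Adding a new case to Verdict turns every
// switch that omits it into a compile error rather than a runtime surprise.
func describe(_ verdict: Verdict) -> String {
    switch verdict {
    case .accept:
        return "accept"
    case .reject(let reason):
        return "reject: \(reason)"
    case .escalate(let approver):
        return "escalate to \(approver)"
    }
}

print(describe(.reject(reason: "path outside writable paths")))
// prints "reject: path outside writable paths"
```

A kernel without this kind of compile-time check can still enforce the same invariant, but it has to do so with runtime validation that the conformance fixtures then exercise.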


Precedent

AgentVector is not the first system to use shared test fixtures for cross-implementation conformance. The pattern is established and proven:

JSON Schema defines validation rules for JSON documents. Multiple implementations exist across dozens of languages. The JSON Schema Test Suite is a shared repository of JSON test cases that implementations run to check their conformance. This is the closest structural precedent to the AgentVector conformance suite.

Cedar (Amazon) is a policy language for authorization. Its production engine is written in Rust and is differentially tested against a separately written formal model: both are given the same policies and requests and must return identical authorization decisions.

Open Policy Agent (OPA) defines policies in Rego and evaluates them across Go, Wasm, and other runtimes. Conformance tests ensure policy evaluation is consistent regardless of the evaluation engine.

The pattern is: specification defines intent, fixtures define behavior, implementations prove compliance. AgentVector applies this proven pattern to agent governance — a domain where enforcement inconsistency is not an inconvenience but a safety failure.


The Fixture Lifecycle

Fixtures are not static. They evolve as the framework matures:

Discovery. A new edge case is found — through implementation, through a jurisdiction author’s question, or through a real-world agent behavior that exposes an ambiguity. The edge case is documented.

Encoding. The edge case is implemented in the reference kernel’s test suite and simultaneously captured as a JSON fixture. The Swift test and the fixture are committed together — they are two views of the same decision.

Propagation. The fixture is merged into the shared agentvector/codex repository. All kernel implementations receive the new fixture through their normal dependency update process.

Verification. Each kernel’s CI runs the updated fixture suite. Failures surface immediately. Kernel maintainers update their implementations to match the new fixture.

Versioning. The fixture’s version field increments when its expected behavior changes (as opposed to metadata updates). Version history provides an audit trail of how governance decisions have evolved.

This lifecycle ensures that governance decisions are not tribal knowledge locked inside one implementation. They are portable, versioned, machine-verifiable artifacts that any kernel can consume.
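The Versioning step suggests a simple check a kernel's CI could run: compare the fixture versions it last passed against the versions currently in the shared repository. The sketch below assumes a kernel keeps such a record. The function name and both dictionaries are hypothetical.

```swift
/// Fixtures that are new, or whose version is higher than the one this
/// kernel last passed. Returned in sorted order for stable output.
func fixturesNeedingRerun(current: [String: Int],
                          lastPassed: [String: Int]) -> [String] {
    current
        .filter { $0.value > (lastPassed[$0.key] ?? 0) }   // unseen fixtures count as version 0
        .keys
        .sorted()
}

// Versions recorded the last time this kernel passed the suite.
let lastPassed = [
    "law4-zero-budget-at-init": 1,
    "law0-reject-path-traversal": 2,
]

// Versions in the shared repository now: one bumped, one unchanged, one new.
let current = [
    "law4-zero-budget-at-init": 2,
    "law0-reject-path-traversal": 2,
    "law8-escalation-chain": 1,
]

print(fixturesNeedingRerun(current: current, lastPassed: lastPassed))
// prints ["law4-zero-budget-at-init", "law8-escalation-chain"]
```

Because the version increments only when expected behavior changes, a metadata-only edit leaves the number untouched and does not show up in this list.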


What the Contract Does Not Cover

The conformance contract is deliberately scoped. It verifies what a kernel decides, not how it decides:

Performance. A kernel that takes 10 seconds to render a verdict passes conformance but fails deployment. Performance requirements are kernel-specific and domain-specific — a drone system has latency budgets that a cloud pipeline does not.

Error message formatting. Fixtures use reason_contains rather than exact string matching. The governance decision is the verdict; the human-readable explanation is an implementation detail.

Internal architecture. A kernel may implement the reducer as a pure function, a state machine, an actor, or a pipeline. The conformance suite does not care. It cares about inputs and outputs.

Jurisdiction-specific configuration. Jurisdictions may define configuration parameters beyond what the base Laws require. These are tested by jurisdiction-specific fixtures, not the core conformance suite.

Deployment topology. Whether the kernel runs as a library, a sidecar, or a network service is an integration decision. The conformance suite tests the reducer logic in isolation.


Building the Suite

The conformance suite grows incrementally. It does not need to be complete at launch — it needs to be correct and growing.

Phase 1: Core Laws. Fixtures for Laws 0, 4, and 8 — the three Laws that ClawLaw composes. These are the first fixtures because they have a real jurisdiction exercising them. Every fixture corresponds to a behavior that ClawLaw depends on.

Phase 2: Boundary conditions. Edge cases within the core Laws: zero budgets, empty writable path lists, simultaneous constraint violations, bypass gate interactions. These are the fixtures that catch the ambiguities prose specifications miss.

Phase 3: Cross-Law interactions. Fixtures that exercise multiple Laws simultaneously — a resource-exhausted agent attempting a high-risk action (Law 4 + Law 8), a filesystem write that crosses both path boundaries and authority thresholds (Law 0 + Law 8). These test composition, not individual Laws.

Phase 4: Full Codex coverage. Fixtures for all eleven Laws, including Laws that currently have no active jurisdiction (Law 5, Law 9, Law 10). These establish the behavioral contract before any kernel implements them, preventing the “first implementer becomes the de facto specification” problem.

The suite lives in the agentvector/codex repository alongside the Codex prose, the Law definitions, and the JSON Schema definitions for the reducer interface. It is the single source of truth for what AgentVector governance means in practice.


The Guarantee

The conformance contract provides a specific, verifiable guarantee:

Any action evaluated by any AgentVector-compliant kernel, given the same state and configuration, will produce the same governance verdict.

This is not a promise. It is not a best practice. It is a property that is tested on every commit to every kernel, against every fixture in the suite. When the tests pass, the guarantee holds for every case the suite encodes. When they fail, the kernel is not compliant.

One Codex defines the Laws. Multiple kernels enforce them. The conformance contract proves they agree.

That is the contract.


"Trust, but verify — and automate the verification."