The four-layer evidence stack for AI agents

FOUR LAYERS · L1 · L2 · L3 · L4

Observability, runtime, evidence, attestation. Warrant lives in L3 and L4 and integrates with L1 and L2.

01 · THE FOUR LAYERS, DEFINED

The four layers, defined.

L1 observability · L2 runtime guards · L3 evidence · L4 attestation

Each layer answers a different question, in a different timeframe, for a different reader.

L1 · OBSERVABILITY

Telemetry

Internal logs, traces, metrics, dashboards. Reader is the on-call engineer. Timeframe is now, back to the last 30 days.

L2 · RUNTIME GUARDS

Policy at execution

Input filters, output classifiers, tool whitelists, authorisation gates. Reader is the agent. Timeframe is the next millisecond.

L3 · EVIDENCE

Reconstruction surface

The full record of an action, retained long enough that an auditor in 2027 can read what happened in 2026. Reader is the auditor. Timeframe is years.

L4 · ATTESTATION

Independently verifiable record

A record any outside party can verify is unchanged since the action, without contacting Warrant. Reader is anyone, including a court. Timeframe is forever.

Most products live in L1 with a thin shim into L2 and call the bundle "AI compliance". The bundle is fine for engineering. It is not what a regulator asks for. Warrant lives in L3 and L4.

02 · LAYER 1 · OBSERVABILITY

Layer 1 · observability.

Internal telemetry · OpenTelemetry, W3C Trace Context, distributed tracing semantics

Observability is the one layer every modern stack already has: logs, traces, metrics, dashboards, alerts. The W3C Trace Context recommendation defines the propagation primitives, OpenTelemetry standardises the wire format, and vendor agents emit OTLP by default. Your platform team has owned this layer for a decade.
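For a concrete sense of what a propagation primitive looks like, here is a minimal sketch of parsing the W3C Trace Context `traceparent` header in its version-00 layout (version, trace id, parent id, flags, all lowercase hex). The names `TraceParent` and `parse_traceparent` are illustrative and not part of any library; a production stack would normally rely on its OpenTelemetry SDK rather than hand-rolled parsing.

```python
import re
from typing import NamedTuple, Optional

# Version-00 layout only: version-traceid-parentid-flags, lowercase hex.
_TRACEPARENT = re.compile(r"([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})")


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT.fullmatch(header.strip())
    if match is None:
        return None
    version, trace_id, parent_id, flags = match.groups()
    # Version ff and all-zero trace or parent ids are invalid.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        return None
    # Bit 0 of the flags byte is the "sampled" flag.
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


parsed = parse_traceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
```

The trace id is what lets a downstream system join spans emitted by different services into one tree, which is the raw material the later layers retain.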

Layer 1 is good for debugging a latency spike at 03:14 last Tuesday, counting error-budget burn, watching p99 inference latency by model variant, paging the on-call engineer before the customer notices.

Layer 1 is not good for post-incident forensics under adversarial conditions or regulator submission. Not a feature gap, a structural property: observability platforms are built for live systems, not for posterity.

The mismatch shows up in retention. The default observability tier rotates at 7 to 30 days. The cold tier extends to 12 to 15 months. Both fall short of a clock that runs for the lifetime of the system. EU AI Act Article 12 sets the floor:

"High-risk AI systems shall technically allow for the automatic recording of events (logs) over the lifetime of the system." EU AI Act · Regulation (EU) 2024/1689 · Article 12(1)

Article 12(2) extends the same idea into specific use cases:

"The logging capabilities shall enable the recording of events relevant for: (a) identification of situations that may result in the AI system presenting a risk; (b) facilitation of the post-market monitoring; (c) monitoring of the operation of high-risk AI systems referred to in Article 26(5)." EU AI Act · Article 12(2)

Article 19(1), read with Article 12, fixes retention at six months minimum for providers of high-risk systems, and Article 26(6) sets the same floor for deployers. A 30-day-rotation telemetry stack fails both the six-month floor and the lifetime test on its own. The fix is not a longer retention plan. The fix is to recognise the layer as Layer 1 and put a Layer 3 system downstream of it.
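The retention gap is simple date arithmetic. The sketch below assumes six months can be approximated as 183 days (the Act does not define a day count) and uses hypothetical helper names; it checks a retention window against the floor and computes the oldest date a stack can still answer for.

```python
from datetime import date, timedelta

SIX_MONTH_FLOOR_DAYS = 183  # assumption: six months approximated as 183 days


def meets_retention_floor(retention_days: int, floor_days: int = SIX_MONTH_FLOOR_DAYS) -> bool:
    """True if the retention window is at least as long as the floor."""
    return retention_days >= floor_days


def oldest_retrievable(today: date, retention_days: int) -> date:
    """Earliest date whose records have not yet rotated out."""
    return today - timedelta(days=retention_days)


# A 30-day hot tier queried on 1 March 2027 about an action on 15 March 2026.
action_date = date(2026, 3, 15)
cutoff = oldest_retrievable(date(2027, 3, 1), retention_days=30)
record_still_available = action_date >= cutoff
```

With a 30-day window the cutoff lands in late January 2027, so the March 2026 action is long gone by the time anyone asks.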

03 · LAYER 2 · RUNTIME GUARDS

Layer 2 · runtime guards.

NIST AI RMF (Measure, Manage) · OWASP LLM Top 10 · runtime policy enforcement

Runtime guards are the policy layer. Input classifiers reject injection-shaped prompts. Output classifiers strip personally identifiable text. Tool whitelists prevent the agent from calling unauthorised endpoints. Authorisation gates require a human signature for trades above a threshold, wire transfers, any irreversible action.

Concrete shape, no vendor names. An order-routing agent gets a request to liquidate a position. The Layer 2 policy reads: order.notional > 50_000_USD ⟹ require_HITL_approval. The agent halts, surfaces the action to a human reviewer, waits for the reviewer's approval, then proceeds. NIST AI RMF describes the pattern under Measure and Manage; the OWASP LLM Top 10 catalogues the failure modes runtime guards address (LLM01, LLM06, LLM08).
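The policy line above can be sketched as a small pure function. This is an illustration under stated assumptions, not any vendor's policy engine: the `Order` fields, the policy id `order-routing-v1`, and the rule name are hypothetical, and the decision record borrows the allow / block / escalate vocabulary used elsewhere in this piece.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import List

HITL_NOTIONAL_LIMIT_USD = Decimal("50000")  # threshold from the policy line above


@dataclass(frozen=True)
class Order:
    symbol: str              # hypothetical field
    notional_usd: Decimal


@dataclass(frozen=True)
class RuntimeDecision:
    policy_id: str
    decision: str            # allow | block | escalate
    triggered_rules: List[str]
    requires_hitl: bool


def evaluate_order(order: Order) -> RuntimeDecision:
    """Escalate to a human reviewer when notional strictly exceeds the limit."""
    if order.notional_usd > HITL_NOTIONAL_LIMIT_USD:
        return RuntimeDecision("order-routing-v1", "escalate", ["notional_gt_50k_usd"], True)
    return RuntimeDecision("order-routing-v1", "allow", [], False)


small = evaluate_order(Order("ACME", Decimal("48000")))
large = evaluate_order(Order("ACME", Decimal("50000.01")))
```

`Decimal` is used rather than `float` so the comparison at the threshold is exact. The function only decides; halting the agent and waiting for the reviewer belongs to whatever orchestrates the call.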

Runtime guards prevent harm. That is the entire point. They tell the agent what not to do, in the millisecond before it would have done it. The decision becomes a log entry, hits Layer 1, and the on-call engineer sees a count of blocked attempts.

Runtime guards do not tell the regulator what happened in production six months ago. A "blocked: prompt-injection-pattern-04" entry is not the artefact an auditor reads. The auditor reads the trace, the prompt, the retrieved context, the rationale, the disposition. That artefact lives at Layer 3.

L2 ANSWERS

"Did the system stop a bad action before it happened?"

Yes or no, in real time, with a policy decision and a halt or escalate signal. This is the prevention question.

L2 DOES NOT ANSWER

"Show me what the agent did on June 12, 2026, at 14:23 IST."

The runtime did not retain the trace, the prompt version, the retrieval citations, or the model rationale. That is a Layer 3 question.

04 · LAYER 3 · EVIDENCE

Layer 3 · evidence.

EU AI Act Annex IV · NYDFS § 500.6 · the reconstruction surface

Evidence is the layer most products skip. Evidence is expensive and invisible until the moment it is needed. By that moment the cost of not having it is many multiples of the cost of having had it.

An auditor in 2027 investigates a customer complaint about a credit decision from March 2026. The question is not "show me uptime that day". The question is: show me the trace, the prompt version that was live, the retrieved policy clauses the model used, the customer-supplied inputs, the intermediate outputs, the final disposition, the rationale paragraph, and the named human who reviewed the action.

That bundle is the reconstruction surface. Annex IV of the EU AI Act calls it "technical documentation". NYDFS Part 500 calls it "audit trails". The NYDFS clause, verbatim:

"include audit trails designed to detect and respond to Cybersecurity Events that have a reasonable likelihood of materially harming any material part of the normal operations of the Covered Entity." 23 NYCRR § 500.6(a)(2) · NYDFS Cybersecurity Regulation

The model risk guidance has carried the same idea since 2011 and continues under SR 26-2, the Fed / OCC / FDIC Revised Guidance on Model Risk Management issued 17 April 2026 (OCC Bulletin 2026-13), which supersedes and replaces SR 11-7 and SR 21-8 for federally supervised institutions. The phrase "model risk management framework" is regulator language, used here in the sense the guidance gives it: it is the term of art SR 26-2 uses for the documented programme banks must run for any model whose output materially affects the institution.

L3, in record terms, holds:

The trace. Span tree, timestamps, parent-child relationships, the OpenTelemetry payload L1 emitted, captured losslessly.
Intermediate model outputs. Every model call, every response, including responses discarded in favour of a re-prompt.
Prompt versions. The exact template live on the day, identified by version hash.
Retrieved context. RAG citations, document IDs, chunk offsets, retrieval scores. What the model saw, not just what the model said.
Decision rationale. The model's stated reasoning where requested, and the structured fields the rationale produced.
Sub-clause citations. For each obligation triggered, the specific article, paragraph, sub-paragraph the system mapped against.

This is the evidence layer. It is mutable in the sense that storage is your storage; you can re-render the PDF, re-export the JSON. It is reconstructable in the sense that any honest re-render produces the same canonical bytes.
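One way to picture "the same canonical bytes" is a deterministic serialisation. The sketch below is an assumption for illustration only: it uses JSON with sorted keys and no insignificant whitespace as a stand-in for whatever canonical form the platform actually defines, and SHA-256 as a stand-in digest. A real scheme also needs fixed rules for numbers and Unicode, which is the problem RFC 8785 (JSON Canonicalization Scheme) addresses.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    """Serialise with sorted keys and no optional whitespace, as UTF-8."""
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def content_digest(record: dict) -> str:
    """Hex SHA-256 of the canonical bytes."""
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# The same record, exported twice with different key order.
export_a = {"prompt_version": "v3.2.1", "final_disposition": "approve $48,000"}
export_b = {"final_disposition": "approve $48,000", "prompt_version": "v3.2.1"}
same_bytes = canonical_bytes(export_a) == canonical_bytes(export_b)
```

Key order and whitespace are presentation; the canonical form removes them so that two honest exports of one record cannot disagree.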

05 · LAYER 4 · ATTESTATION

Layer 4 · attestation.

Independently verifiable record · a property an architect can falsify

Attestation is the property that makes the Layer 3 evidence stand on its own: the record is independently verifiable by any auditor, without contacting Warrant. Verification resolves against an external, append-only public reference that Warrant does not operate. Any party recomputes the binding from the canonical bytes of the record and reaches the same yes-or-no result. That is a claim an architect can reason about and falsify: take the record, recompute, compare against the public reference, and the answer is binary.

Attestation is its own layer because of admissibility. Layer 3 evidence on its own is "the vendor's database, as the vendor presents it today". A regulator asking who attests to integrity over time gets one answer, the vendor, and that answer has known failure modes: staff turnover, data migration, deliberate edit by an insider.

Layer 4 inverts the trust assumption. Verification runs against the record itself, not against Warrant's storage system, and it resolves against an external public reference rather than a Warrant service. An auditor checks the record on their own machine. None of it depends on Warrant being honest or even online.

Layer 3 says "here is the evidence". Layer 4 says "and here is why you can believe it without trusting us".
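The recompute-and-compare step can be sketched in a few lines. This is a hedged illustration, not Warrant's verification procedure: it assumes the binding is a SHA-256 digest of the canonical bytes and that the auditor has already fetched the published digest from the external public reference by their own means. The function name `verify_record` is hypothetical.

```python
import hashlib
import hmac


def verify_record(canonical: bytes, published_digest_hex: str) -> bool:
    """Recompute the digest locally and compare it with the published value."""
    recomputed = hashlib.sha256(canonical).hexdigest()
    return hmac.compare_digest(recomputed, published_digest_hex.lower())


# Stand-in values: in practice the digest comes from the external reference.
original = b'{"final_disposition":"approve $48,000","prompt_version":"v3.2.1"}'
published = hashlib.sha256(original).hexdigest()

untouched_ok = verify_record(original, published)
edited_ok = verify_record(original.replace(b"48,000", b"84,000"), published)
```

Nothing in the check calls a vendor service: the inputs are the record bytes and a value read from a public reference, and the output is a yes or a no.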

06 · THE LAYER-CAKE DIAGRAM

The layer-cake diagram.

L4 sits on L3 sits on L2 sits on L1 · each layer with its own job

Each layer runs on the layer beneath it; each carries its own job. The diagram, in mono:

                     ┌───────────────────────────┐
                     │  Layer 4 · Attestation    │  independently verifiable
                     ├───────────────────────────┤
                     │  Layer 3 · Evidence       │  trace · prompts · citations
                     ├───────────────────────────┤
                     │  Layer 2 · Runtime guards │  policy · tool whitelist
                     ├───────────────────────────┤
                     │  Layer 1 · Observability  │  metrics · logs · traces
                     └───────────────────────────┘

The same picture in code, the four layers as record types, makes the boundaries explicit:

python

from dataclasses import dataclass
from typing import List, Optional
from api.data.citation import ObligationCitation

@dataclass
class L1Telemetry:
    span_id: str
    parent_span_id: Optional[str]
    start_ns: int
    end_ns: int
    attributes: dict          # OpenTelemetry payload

@dataclass
class L2RuntimeDecision:
    policy_id: str
    decision: str             # allow | block | escalate
    triggered_rules: List[str]
    requires_hitl: bool

@dataclass
class L3Evidence:
    trace: List[L1Telemetry]
    runtime_decisions: List[L2RuntimeDecision]
    prompt_version: str
    retrieved_citations: List[ObligationCitation]
    model_rationale: str
    final_disposition: str
    obligation_map: List[ObligationCitation]   # canonical citation type, both id + display forms

@dataclass
class L4Attestation:
    evidence_ref: str            # binds to one L3Evidence by content
    author: str                  # named author on the public record
    external_timestamp: str      # pinned to an external public timeline
    independently_verifiable: bool   # checkable by any auditor, no Warrant call

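An L4Attestation binds to one L3Evidence by its content, an L3Evidence contains the lower layers, and the typing makes accidental layer-collapse a type-check concern rather than a runtime surprise. Python does not enforce annotations at runtime, so the guarantee holds under a static checker such as mypy: a function annotated to accept an L4Attestation is flagged if it is handed raw telemetry.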

07 · WHAT BREAKS

What breaks when a layer is missing.

Failure modes per layer · the negative case for the four-layer split

The clearest argument for the split is the four failure cases, each a real outage shape:

Without L1, you cannot debug

A latency regression lands in production in Q3. The platform team needs span-level traces to bisect to a model variant, a region, a routing decision. Without observability, the team is blind.

Without L2, bad inputs reach production agents

A prompt-injection attack, a tool-call to an unauthorised endpoint, an order above the desk limit. Without runtime guards, the agent does the harmful thing and the team learns from the post-mortem. Real damage, real regulatory blast radius.

Without L3, you cannot reconstruct

The regulator opens an inquiry and asks for the trace from June 12, 2026, 14:23 IST. The team has telemetry, including the right errors-per-second graph, but no record of behaviour: the prompt the model ran, the clauses it cited, the human who approved. The regulator cannot grade behaviour the team cannot show.

Without L4, you cannot defend

Twelve months after a pivotal action, the CTO leaves under contested terms. A deposition asks whether the action ran with the controls the firm claimed in its quarterly attestations. The firm has logs. It cannot prove the logs are unchanged since the date of the action. The defence collapses on chain of custody.

08 · L3 / L4 BOUNDARY

The boundary between L3 and L4 is the regulator's friend.

Mutable storage, independently verifiable record · why the split is load-bearing

The counter-intuitive choice in the Warrant stack is the split between L3 and L4. A simpler product fuses them: write the evidence into a write-once log, call the log an attestation. The fused design fails on a soft requirement that turns out to be hard.

The soft requirement is re-rendering. An evidence package rendered as a PDF has a cover page, header, footer, page numbering, and layout, all of which an auditor's preferences can change. Six months in, the regulator asks for the same evidence in a different language with annexes attached. The L3 record is the source of truth; the PDF is a render. A fused design that attests the rendered PDF cannot re-render without breaking the attestation.

Warrant attests the canonical L3 record, not the PDF. The canonical form is reproducible: any honest re-render that round-trips through the same canonicalization reproduces the same bytes, so the attestation stays valid across re-renders and languages. Any party recomputes the binding from those canonical bytes and reaches the same yes-or-no, with no Warrant call in the path.

The hard requirement is integrity. The L3 record is mutable in the sense that it lives in storage your team operates. An insider with access could edit it. The L4 record makes the edit detectable. Change a byte of the canonical record and verification fails, while the external public reference still points at the original. The edit is loud.

Splitting evidence (mutable, your storage) from attestation (independently verifiable, externally referenced) gives both flexibility and integrity. A regulator gets the L3 evidence in whatever shape they need; the L4 attestation binds the shape to the original bytes.
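The difference between attesting a render and attesting the record shows up directly in a toy example. Everything here is an assumption made for illustration: sorted-key JSON stands in for the canonical form, SHA-256 for the binding, and `render` is a hypothetical renderer that varies only the cover title by language.

```python
import hashlib
import json


def canonical_digest(record: dict) -> str:
    """Digest of the record in a deterministic serialisation (stand-in scheme)."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def render(record: dict, language: str) -> bytes:
    """Hypothetical renderer: the presentation changes, the record does not."""
    titles = {"en": "Evidence package", "de": "Nachweispaket"}
    body = "\n".join(f"{key}: {value}" for key, value in sorted(record.items()))
    return f"{titles[language]}\n{body}\n".encode("utf-8")


record = {"prompt_version": "v3.2.1", "final_disposition": "approve $48,000"}
attested = canonical_digest(record)  # bound once, at the time of the action

# Two renders of one record hash differently, so a render-bound attestation breaks.
render_digests = {lang: hashlib.sha256(render(record, lang)).hexdigest() for lang in ("en", "de")}
renders_differ = render_digests["en"] != render_digests["de"]

# The record-bound attestation survives re-rendering and still catches an edit.
still_valid = canonical_digest(record) == attested
tampered = dict(record, final_disposition="approve $84,000")
edit_detected = canonical_digest(tampered) != attested
```

The render digests diverge as soon as the language changes, while the record digest stays fixed until a field is edited, which is the behaviour the split is meant to buy.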

09 · WORKED EXAMPLE

A regulator query, four responses.

"Show me the trace from 2026-06-12 14:23 IST" · what each layer returns

Put the same regulator question to each layer and read the four responses side by side:

Layer	Response to "show me 2026-06-12 14:23 IST"	Sufficient?
L1	p99 latency for the loan-decision service was 1.2s, error rate 0.04%, no alerts fired in the 10-minute window.	No, the regulator did not ask about latency.
L2	Two policy checks ran on requests in the window. Both returned allow. No HITL escalations, no tool-whitelist denials.	No, the regulator did not ask what was prevented.
L3	Trace 7de8...4a47, prompt v3.2.1, retrieved 4 policy clauses from internal-policy-2026-Q2, model rationale "applicant qualifies under SBA 7(a) eligibility", final disposition "approve $48,000", reviewed by underwriter U-2031.	Yes, the regulator can read the action.
L4	Record 7de8ceaeac42a47b..., authored by Warrant Compliance, pinned to an external public timeline at 2026-06-12 18:14:02 UTC. The auditor verifies it on their own machine, without contacting Warrant, and every check passes.	Yes, the regulator can trust the L3 record.

L1 and L2 are honest answers to questions the regulator did not ask. L3 answers the question. L4 proves the answer has not been edited since the action. A submission with L3 and L4 closes the inquiry. A submission with only L1 and L2 invites a follow-up the team cannot answer.

10 · WHAT WARRANT SHIPS

Closing · what Warrant ships.

L3 + L4 product, with hooks into L1 + L2

Warrant's product is L3 plus L4 with hooks into L1 and L2. We do not replace your observability platform. We do not replace your input filter or your policy engine. We sit downstream of both.

The integration shape is small. The platform accepts a trace payload in OpenTelemetry's wire format, accepts a runtime-decisions payload from your policy layer, and produces an L3 evidence package per attestable action. The package carries the L4 attestation and is independently verifiable from any laptop, with no Warrant API call.

A two-year regulator review is the test we hold the artefact to. The 2027 auditor reading the 2026 record reads the trace, follows the citations, verifies the record independently on their own machine, and closes the file in under twenty minutes.

Pick the layer your problem actually lives at. If the answer is "all four", the four-layer stack is what you build, and the two layers downstream are the layers most teams have skipped.

11 · FAQ

Questions an architect asks first.

FAQ · sourced from inbound questions from platform and risk teams, Apr to May 2026

Do I need all four layers?

It depends on regulatory exposure. At a minimum, if you operate a high-risk AI system under EU AI Act Article 6 you need L3 and L4. Article 19(1), read with Article 12, sets a six-month minimum retention floor that default observability tiers typically miss. L1 and L2 remain in scope for engineering reasons but neither is sufficient for record-keeping or audit-trail clauses.

Can my Datadog dashboard count as Layer 3?

Not as currently configured. A general-purpose telemetry platform rotates logs out at 7 to 30 days under the default retention, samples high-volume traces, and produces no record an outside party can independently check. It sits in L1. Promoting it to L3 would require extended retention, lossless capture of prompts and retrievals, and a record an auditor can verify without contacting the vendor. That is a different product.

Where do guardrails libraries sit?

L2. Input classifiers, output filters, tool whitelists, and authorisation gates run at execution time and prevent harmful actions before they happen. They produce a runtime decision (allow, block, escalate) and emit a log entry, but the log entry is not the artefact a regulator reads two years later. Runtime guards prevent harm. They do not prove past behaviour.

Is Warrant a logging platform?

No. Warrant is L3 + L4. We integrate with your L1 platform; we do not replace it. The trace ingestion endpoint accepts the same OpenTelemetry-compatible payload your existing observability stack already produces. We sit downstream and emit an evidence package per attestable action that any auditor can verify independently, without contacting Warrant.

How does this map to NIST AI RMF?

The Measure function covers L1 and L2, telemetry plus runtime controls. The Manage function requires L3 records: documented evidence of how the system was operated, how risks were addressed, and how decisions were made. The Govern function implies L4 attestability: accountability that survives staff turnover and vendor change requires a record an outside party can verify independently, without contacting Warrant.

Can I retrofit Layer 4 onto historical Layer 1 logs?

Only for the period your logs are still intact and tamper-evident. If the underlying records have been rotated, sampled, or sit in a system without write-once storage, an attestation added today cannot prove the records existed in their current form on the original date. An independently verifiable record can only be made from bytes you can reproduce. Bytes you cannot reproduce cannot be attested.

12 · READ THE SOURCE

Read the source directly.

Authored by Warrant Engineering, the platform team at Warrant. Engineering commentary on the boundary between observability, runtime policy, evidence, and attestation. Not legal advice.
