ENTRY № 38 · STATUTORY READING · 23 NYCRR § 500.6(a)(2) + SR 11-7
PUBLISHED 2026-06-04 · ~12-MIN READ · WARRANT COMPLIANCE

NYDFS Part 500 + SR 11-7: the audit trail an AI agent must produce.

Standard inference logs and LLM API logs do not satisfy a NYDFS 23 NYCRR § 500.6(a)(2) audit trail or SR 11-7 ongoing monitoring. When a regulator examines an AI agent inside a US financial-services firm, it asks five questions about each consequential action: what the agent accessed, when, under what authorization, under what constraints, and what decision it influenced. An inference log answers none of them. This post maps each of the five questions to the per-action record that answers it and to the clause that demands it.

Warrant is regulator-grade evidence infrastructure for AI agents in regulated industries: drop an agent's execution trace, get a record mapped to a specific regulatory obligation, independently verifiable without contacting Warrant.

NYDFS · § 500.6(a)(2)
Audit trails designed to detect and respond to Cybersecurity Events. Applied to AI by the 16 Oct 2024 Industry Letter.

FEDERAL RESERVE · SR 11-7 · SR 26-2
Model risk management. Section V.A.5 ongoing monitoring. Carried forward with explicit AI/ML scope in 2026.

CROSS-OVER · Annex III 5(b)
EU AI Act creditworthiness. One credit agent can carry NYDFS, SR 11-7, and Article 12 obligations at once.

01 · WHY A LOG IS NOT A TRAIL

Why an inference log is not an audit trail.

"Each covered entity shall securely maintain systems that, to the extent applicable and based on its risk assessment ... (2) include audit trails designed to detect and respond to cybersecurity events that have a reasonable likelihood of materially harming any material part of the normal operations of the covered entity." 23 NYCRR § 500.6(a)(2) · Second Amendment · effective 1 November 2023

The 16 October 2024 NYDFS Industry Letter does not impose a new rule. It applies 23 NYCRR Part 500 to AI, including § 500.6(a)(2). Read against that clause, a standard AI deployment's logs come up short. The API gateway records timestamp, method, path, status, latency. The LLM inference log records prompt tokens, completion tokens, the model id, and a redacted prompt body. Both are traffic-shaped. A § 500.6(a)(2) audit trail is operation-shaped. The full statutory read of why standard logs fall short is in "Standard API call logs do not satisfy 23 NYCRR § 500.6."

SR 11-7 sets the same bar from the model-risk direction. The Federal Reserve / OCC / FDIC interagency guidance, originally 4 April 2011, was carried forward by SR 26-2 in 2026 with explicit AI/ML scope. Its comprehensive-documentation standard requires that a knowledgeable third party be able to reconstruct what the model did, when, and why, at the granularity of each tool call and each retrieval. The pillar-by-pillar reading is in SR 11-7 / SR 26-2, line by line.

Put the two regimes side by side and they converge on one shape: a record per consequential action that an examiner can read end-to-end. The question is what that record has to contain.

02 · THE FIVE QUESTIONS

The five questions a regulator asks.

An examiner reconstructing an AI agent's decision does not start from the architecture. The examiner starts from a single consequential action and asks five questions. Each question is anchored to a clause, and each clause is satisfied by a field in the per-action record.

Q1 · What did the agent access? The specific Nonpublic Information element, not a request hash.
NYDFS § 500.6(a)(1) reconstruction + § 500.1(k) NPI · SR 11-7 § III.B(4) documentation
→ trace.actions[*] (subject, inputs, outputs)

Q2 · When did it happen? A timestamp the Covered Entity cannot retroactively change.
NYDFS § 500.6(b) retention · § 500.17(a)(1) 72-hour clock
→ trace.actions[*].ts on the public record, fixed in time

Q3 · Under what authority? The policy, role, and purpose under which the action was permitted.
NYDFS § 500.7 access privileges · SR 11-7 § V.A governance
→ authorization_envelope.* + within_purpose

Q4 · Under what constraints? The preconditions, oversight, and reversibility that bounded the action.
SR 11-7 § V.A use-context · § V.B effective challenge
→ authorization_envelope.preconditions_met + alternatives considered

Q5 · With what result? The decision the action influenced, and the model that produced it.
NYDFS § 500.11 third-party · SR 11-7 § IV.A inventory
→ trace.actions[*].outputs + model inventory binding

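The mapping from question to field can be made concrete with a small sketch. The field names subject, inputs, outputs, ts, authorization_envelope, within_purpose, and preconditions_met come from the mapping above; everything else (policy_version, role, purpose, alternatives_considered, model_inventory_id, and where each field sits) is an assumption made for illustration, not Warrant's actual schema.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class AuthorizationEnvelope:
    policy_version: str = ""                  # hypothetical field name
    role: str = ""                            # hypothetical field name
    purpose: str = ""                         # hypothetical field name
    preconditions_met: Optional[bool] = None  # named in the text (Q4)


@dataclass
class Action:
    subject: str = ""                                      # Q1
    inputs: dict[str, Any] = field(default_factory=dict)   # Q1
    outputs: dict[str, Any] = field(default_factory=dict)  # Q1 and Q5
    ts: str = ""                                           # Q2, ISO 8601
    authorization_envelope: AuthorizationEnvelope = field(
        default_factory=AuthorizationEnvelope
    )                                                      # Q3 and Q4
    within_purpose: Optional[bool] = None                  # Q3
    alternatives_considered: Optional[list[str]] = None    # Q4, assumed placement
    model_inventory_id: str = ""                           # Q5, hypothetical name


def unanswered_questions(action: Action) -> list[str]:
    """Return the questions (Q1..Q5) that this record leaves unanswered."""
    env = action.authorization_envelope
    answered = {
        "Q1": bool(action.subject) and bool(action.inputs),
        "Q2": bool(action.ts),
        "Q3": bool(env.policy_version and env.role and env.purpose)
        and action.within_purpose is not None,
        "Q4": env.preconditions_met is not None
        and action.alternatives_considered is not None,
        "Q5": bool(action.outputs) and bool(action.model_inventory_id),
    }
    return [q for q, ok in answered.items() if not ok]
```

The check only tests that a field is present. It says nothing about whether the timestamp is tamper-resistant or the authorization was valid, which are separate questions for the record to answer.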
"An inference log answers none of the five. The per-action record answers all five." Warrant Compliance · 2026-06-04

03 · Q1 · WHAT WAS ACCESSED

What did the agent access.

The first question is the simplest to ask and the hardest for a standard log to answer. § 500.6(a)(1) requires systems designed to reconstruct material financial transactions. The reconstruction has to name the specific data element the agent touched, because Nonpublic Information under § 500.1(k) is defined broadly: business information whose tampering would materially impact operations, the classic combination-PII prong, and health information.

For an agent that fetched a customer's last twelve transactions, the trail records those twelve specific transaction IDs, not "GET /transactions returned 200 in 84ms." The record names the subject of each action and the inputs and outputs that flowed through it. In the per-action record this is trace.actions[*] carrying subject, inputs, and outputs. SR 11-7 § III.B(4) reads the same requirement as the documentation pillar: a knowledgeable third party must be able to reconstruct what the model consumed and produced.

This is also where the agentic shape diverges sharply from a 2011-era model. A credit-score regression issued one decision and the trail was a single database row. An AI agent issues a sequence of tool calls and retrievals before the customer-facing decision, and the audit trail has to reconstruct each one. Anything coarser fails the third-party replicability test on examination.
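A minimal sketch of the step-by-step reconstruction described above, assuming the trace is a plain dictionary whose actions carry subject, inputs, and outputs. The function name and the output wording are hypothetical.

```python
REQUIRED_FIELDS = ("subject", "inputs", "outputs")


def reconstruct(trace: dict) -> list[str]:
    """Walk the trace in order and report each step, flagging any step
    too coarse for a third party to reconstruct."""
    steps = []
    for n, action in enumerate(trace.get("actions", []), start=1):
        missing = [f for f in REQUIRED_FIELDS if not action.get(f)]
        if missing:
            steps.append(f"step {n}: cannot reconstruct, missing {', '.join(missing)}")
        else:
            steps.append(
                f"step {n}: {action['subject']} via {action['inputs']} "
                f"returned {action['outputs']}"
            )
    return steps
```

A trace in which any step comes back as "cannot reconstruct" is the coarser-than-required case: the sequence has a hole an examiner cannot fill from the record alone.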

04 · Q2 · WHEN

When did it happen.

Each covered entity shall maintain records ... for not fewer than five years [under (a)(1)] and ... for not fewer than three years [under (a)(2)], to reconstruct material financial transactions and to detect and respond to cybersecurity events. 23 NYCRR § 500.6(b) · retention · Second Amendment

The second question is about time, and time is where most logs quietly fail. § 500.6(b) sets a retention floor of five years for (a)(1) records and three years for (a)(2) records. An audit trail that lives inside a 30-day application-log rotation satisfies neither floor. More to the point, the retention clock is meaningless if the timestamps inside the trail can be edited after a Cybersecurity Event.
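The retention arithmetic is simple enough to sketch. The five-year and three-year floors are from § 500.6(b) as quoted above; the record-class labels "a1" and "a2" and both function names are hypothetical.

```python
from datetime import date

# Floors as stated in the text: five years for (a)(1), three for (a)(2).
RETENTION_FLOOR_YEARS = {"a1": 5, "a2": 3}


def earliest_purge_date(created: date, record_class: str) -> date:
    """First date on which a record could be purged without undercutting the floor."""
    years = RETENTION_FLOOR_YEARS[record_class]
    try:
        return created.replace(year=created.year + years)
    except ValueError:
        # Created on 29 February and the target year is not a leap year.
        return date(created.year + years, 3, 1)


def meets_floor(created: date, purge_on: date, record_class: str) -> bool:
    """True if the scheduled purge date respects the retention floor."""
    return purge_on >= earliest_purge_date(created, record_class)
```

A record created on 4 June 2026 and rotated out 30 days later fails both classes under this check.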

The § 500.17(a)(1) 72-hour notice clock starts at determination of a Cybersecurity Event, not at occurrence. To meet that clock, an investigator has to place the agent's actions at fixed points relative to the determination. A record whose timestamp is under the Covered Entity's own control cannot do that against an adversarial reading. The per-action trace.actions[*].ts is placed on the public record and fixed in time, so the placement can be verified independently, without contacting Warrant, rather than asserted from the Covered Entity's own clock.
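As a sketch of the clock arithmetic only: the 72-hour window runs from determination, and each action is placed relative to that moment. Function names are hypothetical, and timestamps are assumed to be ISO 8601 strings with an explicit UTC offset. The sketch does not show how the timestamps are fixed on the public record.

```python
from datetime import datetime, timedelta, timezone

NOTICE_WINDOW = timedelta(hours=72)


def notice_deadline(determined_at: datetime) -> datetime:
    """The clock starts at determination, not at occurrence."""
    return determined_at + NOTICE_WINDOW


def place_actions(action_timestamps: list[str], determined_at: datetime) -> list[timedelta]:
    """Offset of each action from the determination (negative means before it)."""
    return [datetime.fromisoformat(ts) - determined_at for ts in action_timestamps]


# Example values are illustrative, not from any real incident.
determined = datetime(2026, 6, 4, 9, 0, tzinfo=timezone.utc)
deadline = notice_deadline(determined)
offsets = place_actions(["2026-06-03T21:00:00+00:00"], determined)
```

The offsets are only as trustworthy as the timestamps fed in, which is the reason the text insists on a ts the Covered Entity cannot edit.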

05 · Q3 · UNDER WHAT AUTHORITY

Under what authorization.

The third question separates legitimate access from compromised access, and it is the one a § 500.6(a)(2) audit trail is built around. An agent that accessed an account number is in scope for § 500.7 access privileges, and the trail must record the authorization the access satisfied. Without it, an investigator cannot tell a permitted read from a breach.

In the per-action record this is the authorization_envelope together with the within_purpose determination: the policy, the role, and the purpose limitation under which the agent was permitted to take the action. Standard logs record that a request succeeded. They do not record whether the agent was allowed to make it.
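A minimal sketch of the check an investigator would run against the envelope, assuming a policy registry keyed by policy version. The registry, its contents, and the envelope keys policy_version, role, and purpose are all hypothetical.

```python
# Hypothetical policy registry: which roles and purposes each policy version permits.
POLICIES = {
    "underwriting-policy@v7": {
        "roles": {"underwriting-agent"},
        "purposes": {"credit-decision"},
    },
}


def check_authorization(envelope: dict, policies: dict = POLICIES) -> tuple[bool, str]:
    """Decide whether the recorded envelope shows a permitted action, and say why."""
    policy = policies.get(envelope.get("policy_version", ""))
    if policy is None:
        return False, "policy version missing or unknown"
    if envelope.get("role") not in policy["roles"]:
        return False, "role not permitted by policy"
    if envelope.get("purpose") not in policy["purposes"]:
        return False, "purpose outside policy"
    return True, "permitted"
```

An empty envelope, which is all a standard request log can supply, falls through at the first test: the request may have succeeded, but nothing on record says it was allowed.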

SR 11-7 reads the same field from the governance pillar. § V.A requires a named human officer accountable for the model's outputs; the agent does not displace that accountability, it inherits it. The record binds each action to the policy version current at decision time and to the accountable officer's role, so the authorization an examiner reads is the one that actually applied, not a reconstruction after the fact. The deployer-side accountability question, read against the EU regime, is in the Article 26 deployer obligations, line by line.

06 · Q4 · UNDER WHAT CONSTRAINTS

Under what constraints.

An effective validation framework should include ... ongoing monitoring ... Validation activities should continue on an ongoing basis after a model goes into use, to track known model limitations and identify any new ones. SR 11-7 · § V.A.5 and § V.B · ongoing monitoring and effective challenge

The fourth question is the one most logs never even attempt. SR 11-7 § V.B defines effective challenge as critical analysis by objective, qualified individuals who can identify model limitations and assumptions. Under SR 26-2 that expectation extended to runtime: the bank is expected to log, per decision, what alternatives the agent considered and why the chosen path was preferred. An agent that emits one path through one tool, with no record of the alternatives it weighed and discarded, presents the gap most likely to surface as a finding in the next examination cycle.

In the per-action record the constraints leg is authorization_envelope.preconditions_met together with the per-action capture of whether human oversight was appropriate, whether the action was reversible, and the alternatives considered. This is the same shape § 500.6(a)(2) reads from the cybersecurity direction: an action that stayed inside the preconditions its authorization required is legitimate; an action that did not is a Cybersecurity Event under § 500.1(f). The constraint record is what lets an examiner tell the two apart.
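The distinction can be sketched as a triage step. The preconditions_met field is named in the text; the three return labels are hypothetical, and the sketch flags records for human review rather than making the § 500.1(f) determination itself.

```python
from typing import Optional


def triage(action: dict) -> str:
    """Sort an action by what its constraint record shows."""
    envelope = action.get("authorization_envelope", {})
    met: Optional[bool] = envelope.get("preconditions_met")
    if met is None:
        # No constraint record: the examiner cannot tell the two cases apart.
        return "undetermined"
    if met:
        return "within-preconditions"
    return "review-as-possible-cybersecurity-event"
```

The "undetermined" branch is the one a standard log lands in every time, because the field it depends on is never written.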

SR 11-7 § V.A.5 also fixes when the constraints have to be re-established. A foundation-model swap, a prompt-template rewrite that broadens the use case, or a retrieval-corpus change that introduces new domains each read as triggers for re-validation. The per-action record carries the model and policy version in force at decision time, so an examiner can see whether the action ran under the constraints that were actually validated.
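The three triggers named above can be sketched as a comparison between the configuration that was validated and the configuration in force. The dictionary keys foundation_model, use_cases, and corpus_domains are hypothetical, as is the idea that a broadened prompt template shows up as a new entry in use_cases.

```python
def revalidation_triggers(validated: dict, current: dict) -> list[str]:
    """List which re-validation triggers fire between two configurations."""
    triggers = []
    if current["foundation_model"] != validated["foundation_model"]:
        triggers.append("foundation-model swap")
    if not set(current["use_cases"]) <= set(validated["use_cases"]):
        triggers.append("use case broadened")
    if not set(current["corpus_domains"]) <= set(validated["corpus_domains"]):
        triggers.append("new retrieval domains")
    return triggers
```

Narrowing a use case or removing a corpus domain does not fire under this sketch; whether it should is a policy choice the text does not settle.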

07 · Q5 · WITH WHAT RESULT

With what result.

The fifth question closes the loop: what decision did the action influence, and which model produced it. § 500.6(a)(2) requires the trail to reconstruct a Cybersecurity Event end-to-end, not merely log its detection. SR 11-7 § IV.A requires every material model in production to carry an inventory row with version, owner, last validation date, and residual risk. An LLM-driven decisioning agent without an inventory row is, on the operative SR 26-2 read, an unmanaged model, and "unmanaged model" is the most common phrasing in Matters Requiring Attention letters.

In the per-action record the result leg is trace.actions[*].outputs bound to the model inventory identifier of the version that produced it. An examiner pulling a single decision walks from the action, through the outputs, to the inventory row, to the model card and active validation. The walk takes seconds. The same walk for a firm that does not bind decisions to inventory rows takes weeks and often produces a partial answer, which is itself the gap finding.

§ 500.11 third-party service provider governance attaches here too. A foundation-model provider that processes NPI on the Covered Entity's behalf is a third-party service provider, and the audit trail must record the model identity and the model provider per action. The result field carries the model that produced the output, so the third-party chain is on the record alongside the decision it shaped.
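The walk from a decision to its inventory row and model provider can be sketched as a lookup. The inventory columns version, owner, last validation date, and residual risk are from the text; the identifier format, the provider column, the sample values, and the model_inventory_id key are hypothetical.

```python
from typing import Optional

# Hypothetical model inventory with one row per production model version.
INVENTORY = {
    "inv-0042": {
        "version": "3.1.0",
        "owner": "Head of Model Risk",
        "last_validation": "2026-03-15",
        "residual_risk": "moderate",
        "provider": "ExampleFoundationCo",  # third-party service provider
    },
}


def walk_to_inventory(action: dict, inventory: dict = INVENTORY) -> Optional[dict]:
    """Follow an action's outputs to the inventory row of the model that produced them.

    Returns None when no row is bound to the action.
    """
    row = inventory.get(action.get("model_inventory_id", ""))
    if row is None:
        return None
    return {"decision": action.get("outputs"), **row}
```

When the lookup returns None, the decision was produced by a model with no inventory row, which is the unmanaged-model case described above.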

Sample US evidence package · small-business underwriting agent
INDEPENDENTLY VERIFIABLE · MAPPED TO § 500.6(a)(2) + SR 11-7
→ us-fintech.pdf

08 · THE EU CROSS-OVER

Where the EU AI Act crosses over.

The five questions are not unique to the US. An AI agent that evaluates the creditworthiness of natural persons or establishes their credit score is high-risk under Annex III point 5(b) of Regulation (EU) 2024/1689, which brings the Article 12 record-keeping obligation. Application of Article 12 to Annex III standalone systems begins 2 August 2026, subject to a provisional deferral to 2 December 2027 under the May 2026 Digital Omnibus, pending publication in the Official Journal. Non-compliance is punishable under Article 99(4) by fines of up to EUR 15 million or 3 percent of worldwide annual turnover, whichever is higher.

So one credit-decisioning agent serving EU and US customers can carry three record obligations at once: NYDFS § 500.6(a)(2), SR 11-7 ongoing monitoring, and EU AI Act Article 12. The supervisors differ; the questions do not. Each asks what the agent accessed, when, under what authority, under what constraints, and with what result. That convergence is the point: one per-action record, mapped to a specific obligation under each regime. The classification reading for creditworthiness is in the high-risk classification Guidelines, read in full.
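The convergence can be sketched as a lookup from question to clause. The US clauses are the ones this post anchors each question to; attaching Article 12 to every question for an EU-facing agent is a simplification made here, and the function and flag names are hypothetical.

```python
# Clause anchors per question, as given in this post's five-question mapping.
CLAUSES = {
    "Q1": ["NYDFS § 500.6(a)(1)", "NYDFS § 500.1(k)", "SR 11-7 § III.B(4)"],
    "Q2": ["NYDFS § 500.6(b)", "NYDFS § 500.17(a)(1)"],
    "Q3": ["NYDFS § 500.7", "SR 11-7 § V.A"],
    "Q4": ["SR 11-7 § V.A", "SR 11-7 § V.B"],
    "Q5": ["NYDFS § 500.11", "SR 11-7 § IV.A"],
}
EU_CLAUSE = "EU AI Act Article 12"


def obligations(question: str, serves_eu_customers: bool) -> list[str]:
    """Clauses a single per-action field is mapped to for one question."""
    clauses = list(CLAUSES[question])
    if serves_eu_customers:
        # Simplifying assumption: the Annex III 5(b) agent carries Article 12 throughout.
        clauses.append(EU_CLAUSE)
    return clauses
```

The record itself does not change between regimes. Only the list of clauses it is read against grows.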

3 regimes · ONE CREDIT AGENT
NYDFS Part 500, SR 11-7 / SR 26-2, and EU AI Act Article 12 can all attach to a single credit-decisioning agent.

5 questions · ONE RECORD SHAPE
What, when, under what authority, under what constraints, with what result. The same per-action record answers all five across all three.

09 · FAQ

Questions a compliance officer asks first.

Do standard LLM inference logs satisfy a NYDFS 500.6(a)(2) audit trail?

No. § 500.6(a)(2) requires audit trails designed to detect and respond to Cybersecurity Events. An LLM inference log records prompt tokens, completion tokens, the model id, and a redacted prompt body. It does not record what specific Nonpublic Information the agent accessed, under what authorization, or what decision the output influenced. It is traffic-shaped, not operation-shaped, so it does not answer the questions a § 500.6(a)(2) audit trail must answer.

What are the five questions a regulator asks of an AI agent decision?

What the agent accessed, when, under what authorization, under what constraints, and what decision it influenced. Under NYDFS Part 500 these map to § 500.6(a)(2) audit trails, § 500.6(a)(1) reconstruction, § 500.7 access privileges, and § 500.11 third-party governance. Under SR 11-7 they map to the model inventory, ongoing monitoring, the use-context check, and the comprehensive-documentation replicability standard. The same per-action record answers all five.

Does SR 11-7 require evidence per decision or only at validation time?

Both. SR 11-7 § V.A.5 directs validation to continue on an ongoing basis after a model goes into use. SR 26-2 expanded effective challenge to include alternatives-considered logging at runtime, not only at validation. The § III.B(4) comprehensive-documentation standard, read with § III.A.5, requires a knowledgeable third party to reconstruct what the model did at the granularity of each tool call and retrieval. That is a per-decision evidence obligation.

How does the EU AI Act creditworthiness rule cross over with NYDFS and SR 11-7?

An AI agent that evaluates the creditworthiness of natural persons is high-risk under Annex III point 5(b) of Regulation (EU) 2024/1689, which brings the Article 12 record-keeping obligation. The same agent inside a US bank is a material model under SR 11-7 and SR 26-2 and touches Nonpublic Information under § 500.1(k). One agent can carry three record obligations at once, and the five regulator questions are common to all three.

What is the penalty exposure for an inadequate AI audit trail?

Under NYDFS Part 500, exposure runs through civil money penalties plus consent orders. Recent settlements include PayPal at USD 2 million (27 January 2025) and the combined Geico and Travelers settlement at USD 11.3 million (25 November 2024) citing § 500.6 audit-trail gaps. Under SR 11-7, exposure runs through Matters Requiring Attention and Matters Requiring Immediate Attention findings and civil money penalties; recent enforcement includes the Wells Fargo model-risk actions exceeding USD 3 billion and the Citigroup USD 400 million order.

Does the CISO certification under 500.17(b) cover the AI audit trail?

It covers the obligation, not the evidence. § 500.17(b)(1)(i) requires the certification to rest on data and documentation sufficient to accurately determine and demonstrate material compliance. A CISO who certifies § 500.6(a)(2) compliance for AI without an operation-level evidence trail is signing on faith. The Second Amendment certification under § 500.17(b)(2) is submitted by the Covered Entity's highest-ranking executive, which exposes that officer to the same risk.

10 · READ THE SOURCE

Read the source directly.

Authored by Warrant Compliance, the regulatory-analysis function at Warrant. Editorial commentary on regulatory text. Not legal advice. The five-question framing is Warrant's reading of 23 NYCRR § 500.6(a)(2) applied to AI per the 16 October 2024 Industry Letter and of SR 11-7 / SR 26-2 ongoing monitoring; the regulators did not write it as a numbered list. The verbatim quotations of § 500.6 and SR 11-7 § V are from the official texts cited above.