NIST AI RMF + Generative AI Profile for AI agents

01 · § 5 · THE FOUR FUNCTIONS IN ONE PARAGRAPH

The load-bearing claim.

The AI RMF Core is composed of four functions: GOVERN, MAP, MEASURE, and MANAGE. Each function is broken down into categories and subcategories. GOVERN is a cross-cutting function that is infused throughout AI risk management and enables the other functions of the process. The MAP function establishes the context to frame risks related to an AI system. The MEASURE function employs quantitative, qualitative, or mixed-method tools, techniques, and methodologies to analyze, assess, benchmark, and monitor AI risk and related impacts. The MANAGE function entails allocating risk resources to mapped and measured risks on a regular basis and as defined by the GOVERN function. NIST AI 100-1 · § 5 · the AI RMF Core

Four verbs, one paragraph, the entire methodology. NIST chose verbs, not nouns, deliberately. GOVERN is not governance; it is the act of cultivating a culture in which AI risk is named and owned. MAP is not a mapping document; it is the act of establishing context per system. MEASURE is not metrics; it is the act of running tests, evaluations, verifications, and validations against the system. MANAGE is not a risk register; it is the act of allocating treatment to a measured risk. The verbs specify continuing obligations, not artefacts.

The single most-misread sentence in the entire RMF is the one that follows the four definitions in § 5. The four functions are not sequential. They are concurrent and interdependent. A firm that treats GOVERN as a one-time policy publication, MAP as a system-launch checklist, MEASURE as a quarterly metrics review, and MANAGE as a ticket queue has not implemented the RMF. The RMF requires the four functions to operate together against every AI system the firm runs, every release the firm cuts, and every decision the firm's AI produces.

The 19 categories and 72 subcategories under the four functions are NIST's expansion of what each verb means in operational practice. GOVERN has 6 categories. MAP has 5. MEASURE has 4. MANAGE has 4. Each subcategory is a discrete practice that an organisation can self-attest against. NIST does not score the attestation; the document is constructed for self-assessment. But the absence of a subcategory from a firm's self-attestation reads as an absence to a federal procurement officer reviewing the firm's RMF posture.

For an AI agent operating in production, the four functions translate into per-decision evidence categories. GOVERN is the audit trail of who decided what was acceptable risk and when. MAP is the per-trace context (purpose, jurisdiction, affected parties) that frames the decision. MEASURE is the eval-suite run record plus the per-decision residual-risk record. MANAGE is the per-decision rationale and the treatment-of-risk evidence. The four functions read against an AI agent in production become four columns in the per-decision evidence package.
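A minimal sketch of that four-column shape, in Python. The class name, the field names, and the sample values are illustrative assumptions, not a Warrant schema; the only thing taken from the text above is that each decision carries one evidence column per RMF function.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DecisionEvidence:
    """One row per agent decision; one column per RMF function."""
    decision_id: str
    govern: dict   # who accepted the risk, and when
    map: dict      # purpose, jurisdiction, affected parties
    measure: dict  # eval-suite run reference plus residual risk
    manage: dict   # rationale plus treatment of risk


def missing_functions(record: DecisionEvidence) -> list[str]:
    """Return the RMF functions for which the record carries no evidence."""
    columns = asdict(record)
    return [
        name.upper()
        for name in ("govern", "map", "measure", "manage")
        if not columns[name]
    ]


# Hypothetical record with an empty MANAGE column.
record = DecisionEvidence(
    decision_id="dec-001",
    govern={"accepted_by": "Jane Doe", "accepted_on": "2026-01-15"},
    map={"purpose": "refund eligibility", "jurisdiction": ["UK"],
         "affected_parties": ["customer"]},
    measure={"eval_suite_run": "run-42", "residual_risk": "low"},
    manage={},
)
print(missing_functions(record))  # ['MANAGE']
```

A check of this kind is one way to make the "concurrent, not sequential" reading concrete: a record with any empty column is incomplete, whichever column it is.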

"GOVERN, MAP, MEASURE, MANAGE. Four verbs at the head of NIST AI 100-1 specify continuing obligations. Everything below that is the supervisor explaining what counts as evidence." Warrant Compliance · 2026-05-09

02 · § 5.1 · GOVERN

Culture, accountability, transparency.

GOVERN is the cross-cutting function. It is not the first stage of a pipeline; it is the soil in which the other three functions grow. NIST's framing in § 5.1 is that GOVERN cultivates a culture of AI risk management that is enabled by senior leadership, that is inclusive of diverse perspectives, and that is supported by clear policies, procedures, and practices. The function exists in the language of culture because the RMF authors observed that AI risk management without cultural reinforcement tends to collapse into a documentation exercise.

Policies, processes, procedures, and practices across the organization related to the mapping, measuring, and managing of AI risks are in place, transparent, and implemented effectively. NIST AI 100-1 · GOVERN 1 · category statement

GOVERN 1 has seven subcategories. The load-bearing one is GOVERN 1.1: legal and regulatory requirements involving AI are understood, managed, and documented. For a firm running an AI agent across multiple jurisdictions, GOVERN 1.1 alone encompasses the entire EU AI Act mapping, the FCA Consumer Duty mapping, the SR 11-7 model-risk mapping, the SEBI retail-algo mapping, and any other regime that engages the firm's AI use. The RMF reads through to those regimes; it does not displace them.

GOVERN 1.1

Legal and regulatory requirements involving AI are understood, managed, and documented. SCOPE · per-system mapping of every regime engaged. The supervisor reads through the RMF to the underlying regulator's expectations.

GOVERN 2.1

Roles and responsibilities and lines of communication related to mapping, measuring, and managing AI risks are documented. SCOPE · named accountable individual per AI system. Equivalent to the Senior Manager designation under SMCR or the Chief AI Officer designation under OMB M-24-10.

GOVERN 4

Organizational teams are committed to a culture that considers and communicates AI risk. SCOPE · an AI risk-management workforce, with training, escalation paths, and dedicated capacity. Not a side-of-desk responsibility.

GOVERN 5

Processes are in place for robust engagement with relevant AI actors. SCOPE · stakeholder engagement that is documented, repeatable, and inclusive of impacted parties. Reads across to the EU AI Act Article 27 fundamental rights impact assessment.

GOVERN 6

Policies and procedures are in place to address AI risks and benefits arising from third-party software and data. SCOPE · supplier risk for foundation models, vector stores, eval providers, and any other component the firm did not build. Reads through to the EU AI Act Article 25 value-chain provisions.

For an AI agent in production, the GOVERN function is the audit trail of who decided what was acceptable risk and when. A per-release attestation that names the accountable Senior Manager, the regulatory regimes engaged, the residual risk accepted, and the date of acceptance is the operational discharge of GOVERN 1.1, GOVERN 2.1, and GOVERN 4 simultaneously. A GOVERN function that exists only as a published policy document and a quarterly steering-committee minute is the configuration the supervisor expects to fail.
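The per-release attestation described above can be sketched as a small record with a gap check. This is an illustration under stated assumptions: the class, field names, and sample values are hypothetical and are not drawn from NIST or from a Warrant schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ReleaseAttestation:
    release: str
    accountable_signer: str            # GOVERN 2.1: a named individual, not a team
    regimes_engaged: tuple[str, ...]   # GOVERN 1.1: regimes the release engages
    residual_risk_accepted: str        # what was accepted, in plain language
    accepted_on: Optional[date]        # when the acceptance was signed


def attestation_gaps(a: ReleaseAttestation) -> list[str]:
    """Return the names of empty fields, in declaration order."""
    gaps = []
    if not a.accountable_signer.strip():
        gaps.append("accountable_signer")
    if not a.regimes_engaged:
        gaps.append("regimes_engaged")
    if not a.residual_risk_accepted.strip():
        gaps.append("residual_risk_accepted")
    if a.accepted_on is None:
        gaps.append("accepted_on")
    return gaps


complete = ReleaseAttestation(
    release="2026.02.1",
    accountable_signer="Jane Doe",
    regimes_engaged=("EU AI Act", "FCA Consumer Duty"),
    residual_risk_accepted="Low: refusals may over-trigger on edge cases",
    accepted_on=date(2026, 2, 3),
)
# A release backed only by a published policy carries none of the four facts.
policy_only = ReleaseAttestation("2026.02.2", "", (), "", None)
```

Calling `attestation_gaps(policy_only)` lists all four fields, which is the policy-document-only configuration the paragraph above describes.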

The absence of GOVERN evidence is the most common gap NIST cites in its consultation responses. Firms run MAP-style work in product launches and MEASURE-style work in evaluation suites; the gap is at GOVERN, where the cultural and accountability infrastructure should turn the technical work into a defensible record. The Warrant package addresses this gap: every per-decision record names the accountable signer and the release it belongs to, and the record is independently verifiable without contacting Warrant. That is the GOVERN evidence the technical pipelines do not by themselves produce.

03 · § 5.2 · MAP

Context and risks.

MAP is the function that establishes context per AI system. The function's five categories work together to answer five questions: what is this system for, what is it, what does it know, what does it depend on, and what does it affect. NIST's framing is that without an answer to all five questions the firm cannot meaningfully measure or manage the risks the system creates. The MAP function is where Warrant's classification stage produces its first evidence layer.

Context is established and understood. Categorization of the AI system is performed. AI capabilities, targeted usage, goals, and expected benefits and costs compared with appropriate benchmarks are understood. Risks and benefits are mapped for all components of the AI system including third-party software and data. Impacts to individuals, groups, communities, organizations, and society are characterized. NIST AI 100-1 · § 5.2 · MAP function summary

MAP 1

Context is established and understood. SCOPE · the purpose and the operating environment of the AI system, including the regulatory perimeter and the deployment scope.

MAP 2

Categorization of the AI system is performed. SCOPE · risk-tier classification, system-class identification, and the applicable category under any external standard (high-risk under the EU AI Act, safety-impacting under OMB M-24-10).

MAP 3

AI capabilities, targeted usage, goals, and expected benefits and costs compared with appropriate benchmarks are understood. SCOPE · capability inventory, knowledge-cutoff awareness, scientific-integrity claims for any modelled domain.

MAP 4

Risks and benefits are mapped for all components of the AI system including third-party software and data. SCOPE · supply-chain mapping. Foundation model, embedding model, retrieval store, eval harness, telemetry pipeline. Each with its own risk profile and licence terms.

MAP 5

Impacts to individuals, groups, communities, organizations, and society are characterized. SCOPE · per-decision affected-party check. Who could the action affect, who is the action's primary subject, who carries downstream consequences.

The MAP function is where a per-trace classification produces direct RMF evidence. Warrant maps each trace to a domain, a set of jurisdictions, the regulatory regimes engaged, and a risk tier; that output discharges MAP 1, MAP 2, and MAP 5 in a single record. The third-party-component answer for MAP 4 is encoded once at deployment time (the model identity, the embedding provider, the eval harness) and bound to every per-decision package by reference. The capability-and-benchmark answer for MAP 3 is the eval suite the firm runs against the system on a release cadence.
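One way to picture the split between per-trace MAP evidence and deployment-time MAP 4 evidence is the sketch below. The registry, the function names, the record keys, and every sample value are assumptions made for illustration; the text above only establishes that the component answer is recorded once and bound to each record by reference.

```python
# Hypothetical deployment registry: recorded once at deployment time (MAP 4).
DEPLOYMENTS = {
    "deploy-2026-02": {
        "model": "example-model-v1",
        "embedding_provider": "example-embedder",
        "eval_harness": "example-harness",
    },
}


def classify_trace(trace_id, domain, jurisdictions, regimes, risk_tier,
                   affected_parties, deployment_ref):
    """Build one MAP record for a trace, pointing at the deployment by id."""
    if deployment_ref not in DEPLOYMENTS:
        raise KeyError(f"unknown deployment: {deployment_ref}")
    return {
        "trace_id": trace_id,
        "map_1_context": {
            "domain": domain,
            "jurisdictions": list(jurisdictions),
            "regimes_engaged": list(regimes),
        },
        "map_2_risk_tier": risk_tier,
        "map_5_affected_parties": list(affected_parties),
        "map_4_deployment_ref": deployment_ref,  # a reference, not a copy
    }


def components_for(record):
    """Resolve the by-reference MAP 4 answer for a given record."""
    return DEPLOYMENTS[record["map_4_deployment_ref"]]


map_record = classify_trace(
    "trace-001", "consumer credit", ["UK"], ["FCA Consumer Duty"],
    "high", ["applicant"], "deploy-2026-02",
)
```

Storing the reference rather than a copy keeps each per-trace record small while still letting a reviewer resolve the component list for any single decision.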

For an AI agent making a customer-facing decision, MAP 5 is the operative category the supervisor will pull on examination. The supervisor's question is whether the firm characterised the impact of this action on this customer (or this protected group) before the action shipped. A characterisation that exists only at the system level, not the per-action level, does not survive the question. MAP 5 is per action, not per system, and the per-action record is the one the supervisor asks to see.

04 · § 5.3 · MEASURE

Analytics, metrics, validation.

MEASURE is the function that quantifies the risks the MAP function identified. Its four categories are the load-bearing instrumentation: appropriate methods and metrics, the test-evaluate-verify-validate (TEVV) regime, the mechanism for tracking risks over time, and the feedback loop that surfaces new risks as the system operates. MEASURE is also the function in which NIST writes most prescriptively about test methodology; the MEASURE 2 subcategories carry the explicit TEVV obligations.

TEVV processes for AI systems are in place, and AI systems' uncertainty quantification, robustness, accuracy, and reliability are documented, including TEVV outputs that inform decisions about the system's deployment and operations. NIST AI 100-1 · MEASURE 2 · TEVV obligations, paraphrased

MEASURE 1

Appropriate methods and metrics are identified and applied. SCOPE · the firm names which metrics matter for this AI system. Accuracy, calibration, fairness, latency, refusal rate, and any domain-specific test the regime requires.

MEASURE 2

AI systems are evaluated for trustworthy characteristics. SCOPE · TEVV. Documented evaluation suite covering uncertainty quantification, accuracy, reliability, and safety. Per-release evidence.

MEASURE 3

Mechanisms for tracking identified AI risks over time are in place. SCOPE · per-decision residual-risk record. Tracking is per cohort and per decision, not exclusively per cohort.

MEASURE 4

Feedback about efficacy of measurement is gathered and assessed. SCOPE · the loop that flags when a metric the firm chose has decayed or no longer captures the risk. Includes user-facing reporting channels and adversarial reporting.

The MEASURE 2 subcategories expand TEVV into operational practice. Uncertainty quantification means the model can distinguish high-confidence from low-confidence outputs, and the firm has evidence per output of which class applied. Robustness means the model's behaviour is stable under distribution shift, adversarial input, and edge-case input, and the firm has evidence the test cases were exercised. Accuracy means the model's outputs match the ground truth where ground truth exists, and the firm has evidence per release of the accuracy on the relevant test set. Reliability means the model degrades gracefully under failure modes, and the firm has evidence the failure modes were enumerated.

The Warrant 200-trace evaluation suite (the regulator-grade-evals work shipped February 2026) is a TEVV instantiation that addresses MEASURE 2 directly. The suite is run against every release; the suite's run record is bound by reference to every evidence package the system produces post-release. The supervisor's question on MEASURE 2 (show me the TEVV evidence for the release that produced this decision) reduces to a record that is independently verifiable without contacting Warrant rather than a discovery exercise across observability tooling. /blog/regulator-grade-evals documents the suite's construction in full.

MEASURE 3 is the category that closes the per-decision loop. The RMF reads tracking as continuous, not periodic. A tracking mechanism that operates only at the cohort level and only on a quarterly cadence does not satisfy MEASURE 3 for an AI system that produces decisions every minute. The per-decision residual-risk record is the load-bearing piece, and the per-cohort rollup reads off the per-decision records, not the other way around. MEASURE 3 is what fails when an AI agent's monitoring evidence is rotated out along with the underlying observability data.
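The direction of that dependency, cohort rollups derived from per-decision records, can be shown in a few lines. The record keys, the cohort names, and the numeric risk scores below are hypothetical; the RMF does not prescribe a scoring scale.

```python
from collections import defaultdict

# Hypothetical per-decision residual-risk records (the primary evidence).
decisions = [
    {"decision_id": "d1", "cohort": "retail", "residual_risk": 0.10},
    {"decision_id": "d2", "cohort": "retail", "residual_risk": 0.30},
    {"decision_id": "d3", "cohort": "sme", "residual_risk": 0.05},
]


def cohort_rollup(records):
    """Derive the per-cohort view from per-decision records."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["cohort"]].append(record["residual_risk"])
    return {
        cohort: {"decisions": len(scores), "max_residual_risk": max(scores)}
        for cohort, scores in grouped.items()
    }


rollup = cohort_rollup(decisions)
```

The rollup can always be recomputed from the decisions, but the decisions cannot be recovered from the rollup, which is why the per-decision record is the one that has to be retained.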

05 · § 5.4 · MANAGE

Prioritisation and treatment.

MANAGE is the function that takes the risks MAP identified and MEASURE quantified and decides what to do about them. Its four categories are: prioritise treatment based on assessment, develop strategies to maximise benefits, manage third-party risks, and document risk treatments and decisions. The MANAGE function produces the risk-treatment evidence record per decision.

AI risks based on assessments and other analytical output from the MAP and MEASURE functions are prioritized, responded to, and managed. Strategies to maximize AI benefits and minimize negative impacts are planned, prepared, implemented, documented, and informed by input from relevant AI actors. NIST AI 100-1 · § 5.4 · MANAGE function summary

MANAGE 1

AI risks based on assessments are prioritized, responded to, and managed. SCOPE · risk-treatment per identified risk. Accept, mitigate, transfer, or avoid, with the choice documented and justified.

MANAGE 2

Strategies to maximize AI benefits and minimize negative impacts are planned. SCOPE · the upside-and-downside framing. The RMF expects the firm to be intentional about the benefits the system creates, not exclusively defensive about the risks.

MANAGE 3

AI risks and benefits from third-party entities are managed. SCOPE · supplier risk in operation, not just at procurement. Foundation-model behaviour change, embedding-provider drift, eval-harness regression. Continuing obligation.

MANAGE 4

Risk treatments, including response and recovery, and communication plans for the identified and measured AI risks, are documented and monitored regularly. SCOPE · the per-decision rationale, the per-cohort treatment plan, the recovery procedure. Documented and reviewable on supervisor request.

MANAGE 4 is the category that lands hardest on per-decision evidence. The RMF expects the firm to document the risk treatment per identified risk and the decision basis on which the treatment was selected. For an AI agent producing decisions in real time, the per-decision rationale is the document the supervisor will pull. A rationale that exists only as a system-level policy document, not as a per-action record bound to the action's outputs, does not survive the question.

The Warrant per-action decision_rationale field is the direct answer to MANAGE 4. Per decision, Warrant produces an authorisation envelope: within_purpose, preconditions_met, human_oversight_appropriate, reversible, justification. The justification is the per-decision rationale that MANAGE 4 requires. Bound into a record that is independently verifiable without contacting Warrant, the field is retrievable per decision long after the action shipped, which is the survival property MANAGE 4 implicitly requires for a regime that the supervisor reads in years.
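The five envelope fields named above can be sketched as a typed record. The field names come from the text; the types of the first four (booleans), the class name, and the `has_rationale` helper are assumptions for illustration and may not match Warrant's actual schema.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class AuthorisationEnvelope:
    within_purpose: bool
    preconditions_met: bool
    human_oversight_appropriate: bool
    reversible: bool
    justification: str  # the per-decision rationale MANAGE 4 asks for

    def has_rationale(self) -> bool:
        """True when the justification is more than whitespace."""
        return bool(self.justification.strip())


# Hypothetical envelope for a single agent action.
envelope = AuthorisationEnvelope(
    within_purpose=True,
    preconditions_met=True,
    human_oversight_appropriate=True,
    reversible=False,
    justification="Refund within policy limit; customer identity verified.",
)
envelope_fields = list(asdict(envelope))
```

Freezing the dataclass is a small design choice that matches the intent of the record: once an action has shipped, its rationale should not be editable in place.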

06 · NIST AI 600-1 · THE GENERATIVE AI PROFILE

Twelve risk categories specific to generative AI.

NIST published the Generative AI Profile (NIST AI 600-1) on 26 July 2024. The profile is the companion document to AI RMF 1.0 that addresses risks specific to generative AI systems. The structure follows the four functions of the parent RMF; the additions are the twelve generative-specific risk categories and the suggested actions per category.

The Generative AI Profile is a companion resource to the AI RMF. It identifies risks that are novel to or exacerbated by generative AI use, including: CBRN information; confabulation; dangerous, violent, or hateful content; data privacy; environmental impacts; harmful bias and homogenization; human-AI configuration; information integrity; information security; intellectual property; obscene, degrading, and abusive content; and value chain and component integration. NIST AI 600-1 · § 2 · suite of risks unique to or exacerbated by generative AI

Twelve risk categories. Each category has GOVERN, MAP, MEASURE, and MANAGE-style suggested actions, totalling more than 200 individual practices a generative-AI deployment can self-attest against. The profile is not a checklist; NIST is explicit that not every category applies to every deployment. A code-completion model in an enterprise IDE has a different risk profile than a customer-facing chatbot in a regulated industry, and the profile expects the firm to map the categories to the deployment's actual surface.

The twelve categories partition into three rough groups by how directly they engage AI evidence-of-record obligations. The first group (CBRN, dangerous content, obscene content) reaches outputs that should not have been produced; the evidence is the refusal record and the upstream filtering record. The second group (confabulation, information integrity, harmful bias, intellectual property) reaches output quality and trustworthiness; the evidence is the grounding record, the citation record, and the residual-risk record. The third group (data privacy, information security, environmental impacts, human-AI configuration, value chain and component integration) reaches operational and architectural concerns; the evidence is the supply-chain record, the human-oversight record, and the deployment-scope record.
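The three-way partition reads naturally as a lookup table. The grouping and the evidence names below are taken from the paragraph above; the data layout and the `evidence_for` helper are illustrative assumptions.

```python
GROUPS = [
    {
        "concern": "outputs that should not have been produced",
        "risks": ("CBRN information",
                  "dangerous, violent, or hateful content",
                  "obscene, degrading, and abusive content"),
        "evidence": ("refusal record", "upstream filtering record"),
    },
    {
        "concern": "output quality and trustworthiness",
        "risks": ("confabulation", "information integrity",
                  "harmful bias and homogenization", "intellectual property"),
        "evidence": ("grounding record", "citation record",
                     "residual-risk record"),
    },
    {
        "concern": "operational and architectural concerns",
        "risks": ("data privacy", "information security",
                  "environmental impacts", "human-AI configuration",
                  "value chain and component integration"),
        "evidence": ("supply-chain record", "human-oversight record",
                     "deployment-scope record"),
    },
]


def evidence_for(risk: str) -> tuple:
    """Return the evidence records associated with a risk category's group."""
    for group in GROUPS:
        if risk in group["risks"]:
            return group["evidence"]
    raise KeyError(risk)


all_risks = [risk for group in GROUPS for risk in group["risks"]]
```

A quick sanity check on a table like this is that the three groups cover all twelve categories with no category listed twice.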

The middle group is where Warrant's evidence shape lands hardest. A per-decision package that records the inputs, the retrieval-grounded sources, the output, and the residual risk is direct evidence against confabulation, information integrity, and harmful bias risks simultaneously. The package does not eliminate the risks; the RMF does not expect elimination. It produces the evidence record that the firm identified the risk, treated it, and accepted the residual.

07 · THE LOAD-BEARING GENAI RISKS FOR EVIDENCE

Three of twelve that produce per-decision evidence.

Of the twelve generative-AI risk categories, three produce evidence in a direct per-decision shape that maps cleanly to the Warrant record. The remaining nine produce evidence at the system level (CBRN refusal policies), at the architecture level (information security controls), or at the supply-chain level (value chain and component integration). The three load-bearing categories for per-decision evidence are confabulation, information integrity, and human-AI configuration.

CONFABULATION

Model-generated false content presented as fact. EVIDENCE · per-decision retrieval-grounded response, source citations bound to outputs, refusal-when-uncertain pattern recorded.

INFORMATION INTEGRITY

Degradation of trust in information ecosystems. EVIDENCE · a record per decision that is independently verifiable without contacting Warrant, with canonical-source citation.

HUMAN-AI CONFIGURATION

Conditions under which humans must intervene in or override the AI system. EVIDENCE · per-decision human-oversight check, escalation log, named reviewer where the trigger fired.

Confabulation is the generative-AI risk that has produced the most regulator commentary across 2024 and 2025. The supervisor's framing is that a model that asserts false content with the same confidence as true content places the entire information chain at risk; the firm cannot defend the deployment without evidence that the chain is grounded. The mitigation evidence is per-decision: the retrieval sources the model consulted, the citations bound to the output, and the refusal pattern where the model lacked sufficient grounding. A grounding record that exists at the architecture level but not the per-decision level does not answer the supervisor's per-customer question.

Information integrity is the second-order risk that follows confabulation at scale. NIST's framing is that even where any single confabulated output is recoverable, a population of confabulated outputs degrades trust in the entire information ecosystem. The mitigation is structural: every output the AI agent produces should carry a verifiable provenance trail, and the trail should be independently inspectable. The Warrant evidence package is the structural answer; each record is independently verifiable without contacting Warrant, which is the property NIST asks for under information integrity.

Human-AI configuration is the risk that lives in the boundary between automation and oversight. NIST's framing in AI 600-1 is that a generative-AI system's human-oversight model must be specified per decision class, not per system; some decisions warrant pure automation, others warrant human review, others warrant human-in-the-loop coordination. The mitigation evidence is the per-decision oversight check: did the trigger conditions fire, did a reviewer engage, what was the reviewer's identity and time-on-task. The Warrant authorisation envelope's human_oversight_appropriate boolean is the field that carries this evidence into the per-decision package.
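A per-decision oversight check along those lines might look like the sketch below. The decision classes, the mode names, and the rule used to set `human_oversight_appropriate` are all assumptions made for illustration; neither NIST nor the text above defines them.

```python
from typing import Optional

# Hypothetical decision classes and oversight modes, specified per class.
OVERSIGHT_MODE = {
    "faq_answer": "automated",
    "refund_over_limit": "human_review",
    "account_closure": "human_in_the_loop",
}


def oversight_check(decision_class: str,
                    reviewer: Optional[str] = None,
                    seconds_on_task: Optional[int] = None) -> dict:
    """Record whether the trigger fired and whether a reviewer engaged."""
    mode = OVERSIGHT_MODE[decision_class]
    trigger_fired = mode != "automated"
    reviewer_engaged = reviewer is not None
    return {
        "decision_class": decision_class,
        "oversight_mode": mode,
        "trigger_fired": trigger_fired,
        "reviewer": reviewer,
        "seconds_on_task": seconds_on_task,
        # Assumed rule: appropriate when no trigger fired, or when a
        # trigger fired and a named reviewer engaged.
        "human_oversight_appropriate": (not trigger_fired) or reviewer_engaged,
    }


automated = oversight_check("faq_answer")
reviewed = oversight_check("refund_over_limit", "Jane Doe", 95)
unreviewed = oversight_check("account_closure")
```

Here `unreviewed` records a fired trigger with no reviewer, so its `human_oversight_appropriate` flag is false, and the gap is visible in the record itself.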

08 · CROSSWALKS

RMF, ISO 42001, EU AI Act.

NIST has published official crosswalks from the AI RMF to ISO/IEC 42001:2023, to the OECD AI Principles, and to other major standards. The crosswalks are not for show; they are the operational answer to the question every multi-jurisdiction firm asks: do I need to run RMF, 42001, and EU AI Act compliance separately? The honest answer is that the three regimes overlap by 60 to 70 percent on substance, and a single per-decision evidence shape can satisfy all three.

The RMF-to-42001 crosswalk is the cleanest. The four RMF functions map to ISO 42001 management-system clauses with high fidelity. GOVERN maps to ISO 42001 Clauses 5 (leadership) and 7 (support); the cultural and accountability infrastructure the RMF describes is the management-system commitment ISO codifies. MAP maps to ISO 42001 Clause 6.1 (planning and risk assessment); the contextualisation work is the same in both regimes. MEASURE maps to ISO 42001 Clause 9 (performance evaluation); both regimes treat measurement as a continuing obligation. MANAGE maps to ISO 42001 Clauses 8 (operation) and 10 (improvement); the treatment-of-risk and continuous-improvement work is shared.

GOVERN

↔ ISO 42001 Clauses 5 + 7 · leadership and support. The cultural and accountability infrastructure the RMF describes is the management-system commitment ISO codifies.

MAP

↔ ISO 42001 Clause 6.1 · planning + risk assessment. The contextualisation and risk-identification work is substantially the same in both regimes.

MEASURE

↔ ISO 42001 Clause 9 · performance evaluation. Both regimes treat measurement as a continuing obligation tied to releases and operational evidence.

MANAGE

↔ ISO 42001 Clauses 8 + 10 · operation + improvement. Treatment-of-risk plus continuous-improvement work is shared across the two regimes.

The RMF-to-EU-AI-Act crosswalk runs through Articles 9, 12, and 13 of the binding regulation. Article 9 (risk management system) maps to RMF MAP plus MANAGE; the EU AI Act's Annex IV documentation requirement reads directly into the RMF MAP and MANAGE outputs. Article 12 (logging) maps to MEASURE 3 (mechanisms for tracking risks over time); both regimes require per-event records that survive across the regulatory horizon. Article 13 (transparency) maps to MAP 5 (impacts characterised) plus GOVERN 5 (engagement with relevant AI actors); the transparency obligation reaches the affected-party characterisation and the stakeholder-engagement record together.

The meta-answer for a CTO running an AI agent across multiple regimes is that one evidence shape can satisfy three regimes. The shape is the per-decision package that captures the four-function evidence in a single record, mapped to a specific obligation under each regime. The regime-specific obligations are then a binding step at the end: same per-action record, different external citations. /blog/one-agent-many-jurisdictions walks the binding step in detail.
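The binding step can be sketched as a lookup that regroups one record's evidence under each regime's own citations. The crosswalk rows are the ones stated in this section (the function-level Article 9 row is left out for brevity); the function name, the record layout, and the sample evidence values are hypothetical.

```python
# Category-level rows from the EU AI Act crosswalk described above.
EU_AI_ACT = {
    "MEASURE 3": "Article 12",
    "MAP 5": "Article 13",
    "GOVERN 5": "Article 13",
}
# Function-level rows from the ISO/IEC 42001 crosswalk described above.
ISO_42001 = {
    "GOVERN": "Clauses 5 and 7",
    "MAP": "Clause 6.1",
    "MEASURE": "Clause 9",
    "MANAGE": "Clauses 8 and 10",
}


def bind(record: dict, regime: str) -> dict:
    """Group the record's RMF-keyed evidence under one regime's citations."""
    bound = {}
    for rmf_category in record:
        if regime == "NIST AI RMF":
            citation = rmf_category
        elif regime == "ISO/IEC 42001":
            citation = ISO_42001[rmf_category.split()[0]]
        elif regime == "EU AI Act":
            citation = EU_AI_ACT.get(rmf_category)  # None when no row applies
        else:
            raise ValueError(f"unknown regime: {regime}")
        if citation is not None:
            bound.setdefault(citation, []).append(rmf_category)
    return bound


# One hypothetical per-action record, keyed by RMF category.
action_record = {
    "GOVERN 5": "stakeholder engagement log",
    "MAP 5": "affected-party check",
    "MEASURE 3": "residual-risk record",
    "MANAGE 4": "decision rationale",
}
eu_view = bind(action_record, "EU AI Act")
iso_view = bind(action_record, "ISO/IEC 42001")
```

The record itself never changes between calls; only the citation labels do, which is the point of treating regime binding as a final step.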

09 · WHERE WARRANT MAPS NIST AI RMF + GENAI PROFILE

The category-to-field map.

The table below names the RMF function or category, the evidence the AI agent must produce per action, and the Warrant evidence field that carries the record into the per-decision package. The mapping is the shape an accountable Senior Manager or Chief AI Officer can hand to a federal procurement officer or an internal-audit team without further engineering.

RMF function · category	What evidence must show	Warrant evidence field
GOVERN 1.1 · legal req	Regulator-mapping per decision.	regulator_evidence.regimes_engaged
GOVERN 2.1 · roles	Named accountable signer per release.	trace.signed_off_by
MAP 2 · categorization	Risk-tier classification per trace.	classification.risk_tier
MAP 5 · impacts	Per-action affected-party check.	trace.actions[*].affected_parties
MEASURE 2 · TEVV	Eval-suite run record per release.	regulator_evidence.eval_suite_record
MEASURE 3 · track over time	Per-decision residual-risk record.	trace.actions[*].residual_risk_check
MANAGE 4 · decisions documented	Per-decision rationale.	trace.actions[*].decision_rationale
GenAI · confabulation	Retrieval-grounded responses + refusal.	trace.actions[*].retrieval_grounded_check
GenAI · information integrity	A record independently verifiable without contacting Warrant, with canonical citations.	trace.actions[*].citations
GenAI · human-AI config	Human-oversight trigger log.	trace.actions[*].oversight_trigger

The mapping is reversible. Given a procurement officer's question on a specific RMF subcategory, the firm reads the column, retrieves the field, and produces the per-decision record. Given a specific customer or internal-audit case, the firm reads the per-decision record and produces the bound RMF subcategories. Either direction is one query against the per-decision package.
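Both directions of that lookup can be sketched against the table's own rows. The field paths are copied from the table; the row labels are shortened, and the two helper functions and the sample action are illustrative assumptions rather than a Warrant API.

```python
# Rows of the category-to-field table above, keyed by RMF row label.
FIELD_MAP = {
    "GOVERN 1.1": "regulator_evidence.regimes_engaged",
    "GOVERN 2.1": "trace.signed_off_by",
    "MAP 2": "classification.risk_tier",
    "MAP 5": "trace.actions[*].affected_parties",
    "MEASURE 2": "regulator_evidence.eval_suite_record",
    "MEASURE 3": "trace.actions[*].residual_risk_check",
    "MANAGE 4": "trace.actions[*].decision_rationale",
    "GenAI confabulation": "trace.actions[*].retrieval_grounded_check",
    "GenAI information integrity": "trace.actions[*].citations",
    "GenAI human-AI configuration": "trace.actions[*].oversight_trigger",
}
ACTION_PREFIX = "trace.actions[*]."


def field_for(row: str) -> str:
    """Forward direction: RMF row -> evidence field path."""
    return FIELD_MAP[row]


def rows_bound_to(action: dict) -> list[str]:
    """Reverse direction: per-action record -> RMF rows it evidences."""
    return sorted(
        row
        for row, path in FIELD_MAP.items()
        if path.startswith(ACTION_PREFIX)
        and path[len(ACTION_PREFIX):] in action
    )


# Hypothetical per-action record carrying two of the per-action fields.
sample_action = {
    "affected_parties": ["customer"],
    "decision_rationale": "Within policy limit.",
}
```

Because each row maps to exactly one field path, the table inverts cleanly, and either question reduces to a dictionary lookup.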

Sample evidence package · NIST AI RMF subcategories bound per action · INDEPENDENTLY VERIFIABLE WITHOUT CONTACTING WARRANT

→ /v/7de85ceaeac42a47

10 · EO 14110 + OMB M-24-10

How a voluntary RMF became a federal procurement gate.

Executive Order 14110, signed 30 October 2023, is the document that turned the voluntary RMF into the de facto federal methodology. Section 4.1 of the order directed NIST to extend the AI RMF to address generative-AI risks (which produced AI 600-1 in July 2024) and tasked federal agencies with adopting RMF-aligned governance. The order is a policy instrument, not a statute, but it binds every executive-branch agency to the methodology by direction.

OMB Memorandum M-24-10, dated 28 March 2024, is where the order was operationalised. The memorandum directs federal agencies to inventory their AI uses, conduct impact assessments, designate a Chief AI Officer with named accountability, and apply specific safeguards to safety-impacting and rights-impacting AI uses. The methodological backbone for those safeguards is the AI RMF. Where an agency cannot describe a deployment against the four RMF functions, the agency cannot complete the M-24-10 governance obligation.

EO 14110

EXECUTIVE ORDER · 2023-10-30

Directed federal agencies to use the AI RMF. Tasked NIST with the GenAI Profile. Section 4.1(a)(i) is the bridge.

M-24-10

OMB MEMO · 2024-03-28

Made RMF operational for federal AI use. Inventory, impact assessments, named Chief AI Officer per agency.

CAIO

CHIEF AI OFFICER

Named accountable individual at every covered agency. The federal-side equivalent of the SMCR Senior Manager.

FY26

PROCUREMENT CYCLE

RMF alignment is now a contestable line item in federal AI acquisition language. Vendors that cannot answer face attrition at evaluation.

For a SaaS vendor selling into US federal procurement in 2026, the implication is direct. The federal acquisition cycle increasingly references the RMF as baseline; the GSA AI procurement guidance, the DoD CDAO acquisition language, and the civilian-agency AI buying patterns all read the RMF as the methodological vocabulary. A vendor that cannot describe its product against the four functions, the relevant subcategories, and the GenAI Profile risks is a vendor that introduces friction at procurement evaluation. RMF alignment is a procurement-gate property in 2026, regardless of the voluntary label on the document.

The supervisory parallel runs in regulated industry as well. The SR 11-7 model-risk regime in US banking has been reading through to RMF concepts in supervisory examination across 2024 and 2025; the FDA's evolving guidance on AI-based medical devices reads RMF MAP and MEASURE language directly into the device's algorithm change protocol. /blog/sr-11-7-model-risk walks the SR 11-7 reading in detail. The pattern is that voluntary federal methodology becomes mandatory regulator expectation through the sectoral pathway, not the procurement pathway alone.

11 · WHERE THE RMF STOPS SHORT

Framework, not artefact.

The RMF is honest about its limits. NIST writes the document as a methodology, not a binding standard. The RMF does not certify; there is no RMF-compliant badge. It does not pass-fail; the categories are self-attested. It does not produce an artefact; the firm running the methodology produces whatever artefacts its choice of tooling generates, and NIST stays out of that choice.

The honesty is also the gap. A federal procurement officer reading a vendor's RMF self-attestation has no independent way to verify the attestation's evidence base. An internal auditor reading the same attestation against the firm's actual operating posture has the same problem. The RMF asks the firm to produce evidence; it does not specify what the evidence looks like or how the firm should make the evidence portable across tooling, retention horizons, and personnel changes.

The Warrant evidence package is the evidence-of-decision instantiation that the RMF asks for but does not specify how to produce. The document is per-decision. It binds the four-function evidence (GOVERN signer, MAP classification, MEASURE eval record, MANAGE rationale) to a specific action a specific AI agent took at a specific time. It is independently verifiable without contacting Warrant, so the file's integrity survives any internal infrastructure decision the firm makes. The result is an artefact the firm controls and the supervisor can verify independently.
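The text does not say how a package is verified, so the following is only a generic sketch of what offline verification can look like: the verifier recomputes a SHA-256 digest over a canonical serialisation and compares it with a digest obtained out of band. The function names, the package keys, and the choice of a bare hash are assumptions; a production scheme would more likely rely on digital signatures, and nothing here describes Warrant's actual mechanism.

```python
import hashlib
import hmac
import json


def package_digest(package: dict) -> str:
    """SHA-256 over a canonical JSON form (sorted keys, no extra spaces)."""
    canonical = json.dumps(package, sort_keys=True, separators=(",", ":"),
                           ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(package: dict, expected_digest: str) -> bool:
    """Recompute the digest locally; no call to the issuer is needed."""
    return hmac.compare_digest(package_digest(package), expected_digest)


# Hypothetical per-decision package with one entry per RMF function.
package = {
    "govern_signer": "Jane Doe",
    "map_classification": {"risk_tier": "high"},
    "measure_eval_record": "run-42",
    "manage_rationale": "Within policy limit.",
}
published_digest = package_digest(package)

# Any later edit to the rationale changes the digest.
tampered = dict(package, manage_rationale="Edited after the fact.")
```

With these values, `verify(package, published_digest)` succeeds and `verify(tampered, published_digest)` fails, and the check needs nothing beyond the file and the digest.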

That gap closure is the load-bearing claim of Warrant against the RMF. The methodology asks for evidence. The per-decision package is the evidence the RMF does not by itself produce, and each record is independently verifiable without contacting Warrant. The four-function vocabulary becomes the evidence's organising principle; the per-decision package becomes the evidence's physical form. The RMF self-attestation rests on records the firm can produce on demand, at any horizon the procurement officer or the internal auditor names. The document tower the RMF asks for stands on a foundation the voluntary framework did not, by itself, supply.

For the firm, the operational consequence is that an RMF self-attestation can be made over a population of per-decision packages rather than a population of internal narratives. The four-function language describes what the firm does; the packages prove what the firm did. Read together, the language and the packages form a posture a procurement officer can verify without negotiating retention, a supervisor can read without site-visit cooperation, and a board can sign without trusting that the underlying observability stack will retain its data through the next decade.
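One way to picture an attestation made over a population of packages is a coverage count: for each of the four functions, what share of recorded decisions carries evidence. The sketch below assumes each package is a plain dictionary; the field names (`decision_id`, `govern_signer`, and so on) and both helper functions are hypothetical, chosen only to illustrate the idea.

```python
from collections import Counter

# Hypothetical mapping from RMF function to the package field that evidences it.
FUNCTION_FIELDS = {
    "GOVERN": "govern_signer",
    "MAP": "map_classification",
    "MEASURE": "measure_eval_id",
    "MANAGE": "manage_rationale",
}


def coverage(packages: list[dict]) -> dict[str, float]:
    """Fraction of packages with non-empty evidence for each RMF function."""
    if not packages:
        return {name: 0.0 for name in FUNCTION_FIELDS}
    counts = Counter()
    for pkg in packages:
        for name, field in FUNCTION_FIELDS.items():
            if pkg.get(field):
                counts[name] += 1
    return {name: counts[name] / len(packages) for name in FUNCTION_FIELDS}


def incomplete(packages: list[dict]) -> list[str]:
    """IDs of decisions missing evidence for at least one function."""
    return [
        pkg["decision_id"]
        for pkg in packages
        if not all(pkg.get(field) for field in FUNCTION_FIELDS.values())
    ]
```

A self-attestation backed by such a count can state, per function, how many decisions are evidenced and name the ones that are not, instead of asserting coverage in narrative form.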

12 · FAQ

Questions a CAIO and an internal auditor ask first.

Is the NIST AI RMF mandatory?

The AI RMF is voluntary on its face. It is not a binding standard and NIST does not certify against it. The RMF becomes operationally mandatory through three indirect channels: Executive Order 14110 directs federal agencies to use it; OMB Memorandum M-24-10 makes it the methodological backbone for federal AI governance; and federal procurement language increasingly requires RMF alignment for AI vendors. For a SaaS company selling into US federal procurement, RMF alignment is procurement-gate territory regardless of the voluntary label.

How does NIST AI RMF differ from ISO/IEC 42001?

The RMF is a risk-management methodology with four functions and nineteen categories. ISO/IEC 42001:2023 is a management-system standard that organisations can be audited and certified against. The two overlap by roughly 60 to 70 percent on substance. RMF GOVERN maps to ISO 42001 Clauses 5 and 7. RMF MAP maps to ISO 42001 Clause 6.1. RMF MEASURE maps to ISO 42001 Clause 9. RMF MANAGE maps to ISO 42001 Clauses 8 and 10. NIST published an official crosswalk. A firm running an ISO 42001 management system already produces most of the artefacts an RMF self-attestation requires.
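The clause mapping in this answer can be held as a small lookup table, so that an artefact filed under an ISO 42001 clause can be tagged with the RMF function it supports. The sketch below encodes only the coarse mapping stated above, not the full NIST crosswalk; the `RMF_TO_ISO_42001` table and the `rmf_functions_for_clause` helper are hypothetical names.

```python
# Coarse function-to-clause mapping as stated in this FAQ answer.
RMF_TO_ISO_42001 = {
    "GOVERN": ("5", "7"),
    "MAP": ("6.1",),
    "MEASURE": ("9",),
    "MANAGE": ("8", "10"),
}


def rmf_functions_for_clause(clause: str) -> list[str]:
    """RMF functions whose mapped clauses equal or contain the given clause."""
    return [
        function
        for function, clauses in RMF_TO_ISO_42001.items()
        if any(clause == c or clause.startswith(c + ".") for c in clauses)
    ]
```

Under this table a subclause such as 9.1 resolves to MEASURE, while a clause outside the mapping, such as 6.2, resolves to nothing and would need the full crosswalk.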

What is the Generative AI Profile?

NIST AI 600-1, published 26 July 2024, is the companion profile to AI RMF 1.0 that addresses risks specific to generative AI systems. The profile lists twelve risk categories: CBRN information or capabilities; confabulation; dangerous, violent, or hateful content; data privacy; environmental impacts; harmful bias or homogenization; human-AI configuration; information integrity; information security; intellectual property; obscene, degrading, and/or abusive content; and value chain and component integration. Each risk carries suggested actions organised under GOVERN, MAP, MEASURE, and MANAGE.

Do I need RMF for federal procurement?

Yes, for almost any AI procurement at the federal level. OMB Memorandum M-24-10 directed agencies to adopt RMF-aligned governance for safety-impacting and rights-impacting AI uses. Federal acquisition language across GSA, DoD, and civilian agencies now references the RMF as a baseline. A vendor that cannot describe its product against the four RMF functions will face procurement friction even where the contract does not name the RMF explicitly.

What is TEVV?

TEVV stands for test, evaluation, verification, and validation. The acronym originates in the systems-engineering and safety-critical-software domains and is carried into the RMF through the MEASURE function. MEASURE 2.7 of the RMF specifically calls for TEVV processes that quantify model uncertainty, robustness, accuracy, and reliability. For an AI agent in production, TEVV is the obligation to run a documented evaluation suite per release and to retain the per-evaluation evidence.
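A per-release evaluation obligation can be pictured as a gate over retained results: a release clears only when every required check has a recorded, passing result. The sketch below is illustrative only; the `EvalResult` record, the check names, the thresholds, and the `release_gate` helper are all hypothetical, and the RMF itself prescribes none of them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalResult:
    """One retained evaluation result; field names are illustrative."""
    release: str
    check: str        # e.g. "accuracy" or "robustness"
    score: float
    threshold: float  # minimum acceptable score for this check

    @property
    def passed(self) -> bool:
        return self.score >= self.threshold


def release_gate(results: list[EvalResult], release: str, required: list[str]) -> dict:
    """Summarise whether a release has passing evidence for every required check."""
    recorded = {r.check: r for r in results if r.release == release}
    missing = sorted(set(required) - set(recorded))
    failed = sorted(c for c in required if c in recorded and not recorded[c].passed)
    return {
        "release": release,
        "cleared": not missing and not failed,
        "missing": missing,
        "failed": failed,
    }
```

Keeping the individual `EvalResult` records, and not only the gate's verdict, is what makes the evaluation evidence retrievable per release at a later date.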

How does Executive Order 14110 relate to RMF?

Executive Order 14110, signed 30 October 2023, directed federal agencies to adopt the AI RMF as the baseline approach to AI risk management. Section 4.1(a)(i) of the order tasked NIST with extending the RMF for generative AI, which produced AI 600-1. The order is the bridge that turned the voluntary RMF into the de facto federal methodology.

Can RMF substitute for EU AI Act compliance?

No. The RMF is voluntary and does not certify; the EU AI Act is binding and prescribes specific obligations including Article 9 (risk management), Article 12 (logging), Article 13 (transparency), and Article 14 (human oversight). The two regimes share substance: RMF MAP plus MANAGE is the methodological cousin of EU AI Act Article 9, and RMF MEASURE 3 is the methodological cousin of Article 12. A single per-decision evidence record can serve as evidence under both regimes for a cross-border deployment, but RMF alignment alone does not discharge EU AI Act obligations.

What is a NIST AI 100-1 reference?

NIST AI 100-1 is the document identifier for the AI Risk Management Framework 1.0, published 26 January 2023. The naming convention follows NIST's series for AI publications. AI 600-1 is the Generative AI Profile, published 26 July 2024. Together with the AI RMF Playbook (an unnumbered companion) and the AI RMF Roadmap, the four documents form the operative reference set.

13 · READ THE SOURCE

Read the source directly.

Authored by Warrant Compliance, the regulatory-analysis function at Warrant. Editorial commentary on the AI RMF and the Generative AI Profile. Not legal advice. The verbatim quotations of NIST AI 100-1 § 5, GOVERN 1, MEASURE 2.7, the MANAGE function summary, and the AI 600-1 risk-category list reflect the published NIST text in force on 9 May 2026.

NIST AI RMF 1.0 + Generative AI Profile, line by line.