The Autonomy Assurance Maturity Model.
Five levels from ad hoc to continuous adaptive assurance. A framework for the Chief Audit Executive, the Chief Risk Officer, and the audit committee to read where the enterprise sits today on autonomous AI governance, and decide what comes next.
Most regulated enterprises sit between Level 1 and Level 2 today for autonomous AI, even when traditional model risk management is at Level 3 or Level 4. The gap is the agentic surface. This page names the levels and what each one looks like operationally, so internal audit and risk leadership can read the position without a vendor pitch in between.
Five levels. From ad hoc to continuous adaptive assurance.
Each level carries a descriptor, an operational read, the executive view from that position, and a short diagnostic. Read the descriptors in order. Most enterprises recognize themselves between two adjacent levels for the agentic surface, even when traditional model risk management sits further up.
Level 1: Ad hoc
No formal autonomous AI governance. Agents operate without runtime controls. Audit catches up after incidents.
Some AI is running in the business. No one holds a current inventory. Controls exist at the identity layer or the data layer, but not at the agent layer. When a regulator asks for evidence, it gets assembled manually over weeks. Most enterprises with material agentic AI exposure today sit here.
“We know we have AI somewhere. We are not certain where, what it is allowed to do, or what it is quietly doing that no one has yet inventoried.”
- No current inventory of agents and what they can do
- No runtime authorization gate before high-risk agent actions
- No continuous residual risk read
- Evidence assembled manually when a regulator or auditor asks
Level 2: Defined and reactive
AI policies documented. Some controls in place. Evidence assembled by hand when asked.
Policies and standards exist on paper. Agent inventory is partial and ages quickly. Risk owners exist, but their tooling treats AI as a manual review process. Audit testing is sample-based and periodic. Residual risk is unknown between testing cycles. The first two lines of defense have visibility into part of the surface; the third line is informed late.
“We have policies. We can produce some evidence on request. We do not yet know if the policies fire on every material action, or what the residual risk looks like between testing cycles.”
- Policies documented but not enforced at the agent layer
- Sample-based audit testing on a manual log
- Partial agent inventory, ages between testing cycles
- Risk reports lag reality by weeks
Level 3: Managed and operational
Controls fire at the agent layer. Inventory is current. Evidence is generated but not yet continuous.
Agent inventory is live. Identity and data controls are augmented by agent-action controls at the runtime layer. Audit can pull representative samples on demand against a real evidence chain, not a hand-compiled log. Risk teams have a dashboard, but it lags reality by days. The Three Lines of Defense are documented across the agentic surface: the first line operates the agents, the second line monitors risk and controls, and the third line reads evidence.
“We can answer the runtime question for most material actions, most of the time. The dashboard is a few days behind the live state.”
- Live agent inventory
- Runtime authorization gate on most high-risk actions
- Audit can pull samples from a real evidence chain
- Risk dashboard lags reality by days, not weeks
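A runtime authorization gate of the kind Level 3 describes can be pictured as a check that runs before an agent action executes. The sketch below is illustrative only: `INVENTORY`, `HIGH_RISK`, `AgentAction`, and `authorize` are hypothetical names, and the rules (in inventory, in scope, approved if high-risk) are assumptions rather than a prescribed control set.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical live inventory: agent id -> actions the agent is permitted to take.
INVENTORY = {
    "invoice-agent": {"read_ledger", "draft_payment"},
    "support-agent": {"read_ticket", "send_reply"},
}

# Hypothetical set of actions treated as high-risk (explicit approval required).
HIGH_RISK = {"draft_payment", "release_payment"}


@dataclass(frozen=True)
class AgentAction:
    agent_id: str
    action: str
    approved_by: Optional[str] = None  # approver identity, if any


def authorize(action: AgentAction) -> Tuple[bool, str]:
    """Return (allowed, reason) for a proposed action before it executes."""
    permitted = INVENTORY.get(action.agent_id)
    if permitted is None:
        return False, "agent not in inventory"
    if action.action not in permitted:
        return False, "action outside agent scope"
    if action.action in HIGH_RISK and action.approved_by is None:
        return False, "high-risk action requires approval"
    return True, "allowed"


if __name__ == "__main__":
    print(authorize(AgentAction("invoice-agent", "draft_payment")))
    print(authorize(AgentAction("invoice-agent", "draft_payment", approved_by="cro")))
```

The gate returns a reason alongside the decision so that each refusal can be recorded as evidence, not only enforced.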
Level 4: Quantitative and measured
Continuous monitoring of agent actions. Residual risk tracked against tolerance. Audit-grade evidence assembled in days.
Population-level testing replaces sample-based testing. Residual risk is computed continuously against management-approved tolerance, with drift signals when posture moves. Audit reads the evidence chain directly. Executives see role-appropriate views (Chief Risk Officer, Chief Audit Executive, CISO, CFO) from the same underlying record. Regulator queries are answered in days. The third line of defense audits population-level evidence instead of sample-based snapshots.
“We can prove the controls fired. We can prove the residual risk is inside tolerance. We can produce regulator-grade evidence in days.”
- Population-level testing replaces sample-based testing
- Residual risk computed continuously against approved tolerance
- Role-appropriate executive views from one underlying record
- Regulator queries answered in days, not weeks
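The gap between sample-based and population-level testing can be shown with a small synthetic example. Everything here is an assumption made for illustration: the record format, the reading of residual risk as the share of actions where the control did not fire, and the 1% tolerance figure. It is a sketch, not the model's prescribed metric.

```python
import random


def control_failure_rate(records):
    """Share of action records where the expected control did not fire."""
    if not records:
        return 0.0
    failures = sum(1 for r in records if not r["control_fired"])
    return failures / len(records)


def breaches_tolerance(rate, tolerance):
    """True when the measured rate sits outside approved tolerance."""
    return rate > tolerance


# Synthetic population: 1,000 agent actions; the first 12 bypassed the control.
population = [{"id": i, "control_fired": i >= 12} for i in range(1000)]

TOLERANCE = 0.01  # hypothetical approved tolerance: at most 1% uncontrolled actions

# Sample-based read: 25 records drawn at random (seeded so the run is repeatable).
sample = random.Random(7).sample(population, 25)
sample_rate = control_failure_rate(sample)

# Population-level read: every record is tested.
population_rate = control_failure_rate(population)

print(f"sample rate:     {sample_rate:.3f}")
print(f"population rate: {population_rate:.3f}")
print("breach" if breaches_tolerance(population_rate, TOLERANCE) else "within tolerance")
```

With 12 failures in 1,000 records, a 25-record sample will often contain none of them, while the population read always reports the 1.2% rate and flags it against tolerance.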
Level 5: Continuous adaptive assurance
Every agent action authorized before, verified after. Trust scoring drives the next decision. Audit-grade evidence as a byproduct of runtime.
Pre-action assurance fires on every material agent action. Outcome validation runs on every result. The agent’s trust posture and the enterprise’s residual risk update continuously, feeding the next pre-action decision. Audit-grade evidence is captured at the runtime layer by construction; it is not assembled after the fact. Each line of defense sees the same continuous record at its fidelity. Regulator queries are answered live, from the same record the third line reads.
“We can prove autonomous operations are inside risk appetite right now, by name, at action level, against the business objective each agent is pursuing.”
- Pre-action assurance on every material action
- Outcome validation on every result
- Continuous trust scoring and residual risk
- Audit-grade evidence as a byproduct of runtime
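The Level 5 loop (authorize before, validate after, feed the result into trust) can be sketched in a few lines. This is one possible reading under stated assumptions: the exponential moving average, the starting score of 0.5, the smoothing weight, and the names `TrustState`, `pre_action_check`, and `run_action` are all hypothetical choices, not a description of how any particular platform scores trust.

```python
from dataclasses import dataclass


@dataclass
class TrustState:
    score: float = 0.5  # hypothetical starting trust, in [0, 1]
    alpha: float = 0.2  # hypothetical weight given to the newest outcome

    def update(self, outcome_valid: bool) -> float:
        """Move the score toward 1.0 on a validated outcome, toward 0.0 otherwise."""
        target = 1.0 if outcome_valid else 0.0
        self.score = (1 - self.alpha) * self.score + self.alpha * target
        return self.score


def pre_action_check(state: TrustState, action_risk: float) -> bool:
    """Allow the action only if current trust covers the action's risk level."""
    return state.score >= action_risk


def run_action(state, action_risk, execute, validate):
    """Authorize before, validate after, and feed the result into the trust score."""
    if not pre_action_check(state, action_risk):
        return "blocked"
    result = execute()
    ok = validate(result)
    state.update(ok)
    return "validated" if ok else "failed_validation"


if __name__ == "__main__":
    state = TrustState()
    print(run_action(state, 0.4, lambda: 10, lambda r: r == 10), round(state.score, 2))
    print(run_action(state, 0.4, lambda: 10, lambda r: r == 99), round(state.score, 2))
    print(run_action(state, 0.5, lambda: 10, lambda r: r == 10), round(state.score, 2))
```

In this sketch a failed validation lowers the score, so the next pre-action decision reads the updated state and can block an action that would have been allowed a moment earlier.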
What it takes to move from Level 4 to Level 5.
The first three levels are organizational discipline: inventory, policy, controls, and a real evidence chain. Level 4 adds continuous measurement against approved tolerance. Level 5 is the runtime layer where every agent action gets two checks. One before, one after. Trust scoring continuously feeds the next decision. Audit-grade evidence is captured by construction, not assembled after the fact.
Every material agent action is authorized before it executes, against scope, objective, policy in force, controls available, and the agent’s current risk posture.
Every result is validated against the business objective and the controls in force, with the validation captured as signed evidence and fed into the trust score.
Residual risk and the agent’s trust posture update continuously, in line with outcomes, so the next pre-action decision reads the live state.
Evidence is captured at the runtime layer as a byproduct of operating the agents. Audit reads the chain directly. No after-the-fact assembly.
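Evidence that is "captured by construction" and read as a chain can be illustrated with a signed, linked log. The sketch below uses Python's standard `hmac`, `hashlib`, and `json` modules; the record layout, the `append_evidence` and `verify_chain` helpers, and the hard-coded demo key are assumptions for illustration, and a real deployment would keep keys in a managed key store.

```python
import hashlib
import hmac
import json

SECRET_KEY = b"demo-key-not-for-production"  # assumption: real keys live in a KMS/HSM


def _sign(event, prev_sig):
    """HMAC-SHA256 over the event plus the previous entry's signature."""
    payload = json.dumps({"event": event, "prev": prev_sig}, sort_keys=True)
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def append_evidence(chain, event):
    """Append an event, linking it to the previous entry and signing it."""
    prev_sig = chain[-1]["signature"] if chain else ""
    chain.append({"event": event, "prev": prev_sig, "signature": _sign(event, prev_sig)})
    return chain


def verify_chain(chain):
    """Recompute every link and signature; True only if all of them match."""
    prev_sig = ""
    for entry in chain:
        if entry["prev"] != prev_sig:
            return False
        expected = _sign(entry["event"], entry["prev"])
        if not hmac.compare_digest(expected, entry["signature"]):
            return False
        prev_sig = entry["signature"]
    return True


if __name__ == "__main__":
    chain = []
    append_evidence(chain, {"agent": "invoice-agent", "step": "authorized"})
    append_evidence(chain, {"agent": "invoice-agent", "step": "outcome_validated"})
    print(verify_chain(chain))           # intact chain
    chain[0]["event"]["step"] = "edited"
    print(verify_chain(chain))           # tampered chain
```

Because each entry's signature covers the previous signature, editing or removing an earlier record invalidates verification from that point on, which is what lets audit read the chain directly instead of reassembling it.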
Three Lines of Defense, at each level.
The model is audit-focused by design. Each level is read against the IIA Three Lines of Defense doctrine, so internal audit can place the enterprise on the framework that the audit committee, the regulator, and the external attestor already share.
- First line of defense — Business and operations running the agents
At Levels 1–2, the first line operates agents without runtime controls. At Level 3, the first line operates agents under agent-action controls at the runtime layer. At Levels 4–5, the first line operates inside continuous authorization and outcome validation.
- Second line of defense — Risk, model risk, and compliance
At Levels 1–2, the second line has limited visibility into the agentic surface. At Level 3, the second line monitors controls. At Levels 4–5, the second line reads continuous residual risk against approved tolerance and produces board-grade signals.
- Third line of defense — Internal audit
At Levels 1–2, internal audit reads a manual log on a sampling basis. At Level 3, internal audit pulls samples from a real evidence chain. At Levels 4–5, internal audit reads population-level evidence directly from the audit chain, in the formats the external attestor and the regulator already expect.
- Independent assurance — External attestor and regulator
The model is designed so that the same evidence chain serves all three lines of defense and the external attestor without renegotiation. At Level 5, the regulator queries the same record the third line reads, live.
Read your level. Decide what comes next.
The model is here so you can read where the enterprise sits without a vendor in the room. When you want to talk through it, the engagement funnel has four stages and Stage 1 is unpaid.
45 minutes. We walk the model with you, read where you sit today, and name the gap to Level 4 or Level 5. No commitment.
Two to four weeks. A written assessment pack mapped to the model, with a sequenced roadmap to the next level.
Ninety days. One agent, one workflow, one set of high-risk actions, instrumented against the Level 5 runtime pattern.
Common questions about the model.
Each of the five level descriptors above carries a short diagnostic. If you answer three or more of a level's diagnostic items in the negative, you are likely below that level. The cleanest read is to schedule an Executive Assurance Briefing or, if you want a formal scoping, a Runtime Risk and Governance Assessment.
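The three-or-more rule can be written down as a short scoring routine. This sketch assumes each diagnostic item has been rephrased as a yes/no capability question, where `True` means the capability is in place; the `estimate_level` name, the input shape, and the choice to stop at the first level that fails are hypothetical, and the result is a rough self-read, not a formal assessment.

```python
def estimate_level(answers, threshold=3):
    """Walk levels in ascending order and stop at the first level with
    `threshold` or more negative answers. Returns the last level cleared,
    or 1 if no higher level is cleared."""
    current = 1
    for level in sorted(answers):
        negatives = sum(1 for holds in answers[level] if not holds)
        if negatives >= threshold:
            break
        current = level
    return current


if __name__ == "__main__":
    # Hypothetical self-assessment: four yes/no answers per level, Levels 2 to 5.
    answers = {
        2: [True, True, True, False],
        3: [True, True, False, False],
        4: [False, False, False, True],
        5: [False, False, False, False],
    }
    print(estimate_level(answers))  # prints 3
```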
The Autonomy Assurance Maturity Model is not a regulator-issued document. It is iTmethods’s framing for internal audit and risk leadership. It is consistent with how regulators describe maturity (SR 26-2, OSFI E-23, EU AI Act, DORA), and it maps cleanly to the IIA risk-based audit plan.
Most regulated enterprises with material autonomous AI exposure sit between Level 1 and Level 2 today. Some banks and life-sciences operators with mature model risk practices sit at Level 3 for traditional models, but Level 2 or below for agentic AI.
Level 5 is not tied to a single platform. It describes the continuous adaptive assurance state, which is achievable by any platform that authorizes agent actions before execution, validates outcomes after, computes residual risk continuously, and produces audit-grade evidence as a byproduct of runtime. Reign is built to that state. Other platforms approach it from different angles.
The download is a PDF version of the model with worksheets for the audit committee, the Chief Audit Executive, and the Chief Risk Officer. It is still being prepared. The framework on this page is the same content that will appear in the PDF.
Schedule an Executive Assurance Briefing. The 45-minute conversation is shaped around the model. You walk away with a read on your current level and a sequenced path to Level 4 or Level 5, depending on where you start.
Most enterprises walk the model with us in 45 minutes.
Schedule an Executive Assurance Briefing. We bring the model. You bring the operating picture. Together we read where you sit and name the path forward.