AI Agent Identity and Least Privilege: The Three-Layer Model


The Identity Blind Spot That AI Agents Created

Every security team tracks human identities. Service accounts get annual audits. API keys rotate on a schedule. But somewhere between the shift to microservices and the explosion of AI agents, the identity attack surface quietly became unmanageable. By 2026, non-human identities (NHIs) outnumber human identities 50:1 in the average enterprise. That is not a typo. For every employee with a corporate badge, there are fifty service accounts, machine credentials, API tokens, and now AI agent identities operating across your infrastructure.

The consequences are already measurable. 68% of IT security incidents now involve non-human identities, according to Obsidian Security’s 2025 analysis. These are not theoretical risks. Attackers have learned that NHIs are the path of least resistance: they are over-provisioned, rarely monitored, and almost never subject to the same governance as human accounts. An AI agent with a long-lived API key to your CRM is a standing invitation.

The problem is not that security teams are negligent. The problem is that AI agents fundamentally break the assumptions embedded in every identity and access management (IAM) system deployed today. Agents act on behalf of users, but they are not users. They invoke tools and APIs, but they are not traditional service accounts. They create sub-tasks and spawn child processes, but they are not microservices. They exist in a category that existing IAM was never designed to handle.

This article is a practitioner’s guide to securing AI agent access. We will walk through the emerging identity frameworks — from the CSA’s agentic identity model to the IETF’s first agent authentication draft — and show you how to implement least privilege for agents using ephemeral credentials, per-request authorization at the MCP layer, and enterprise-grade patterns that actually work in production.

If you are building, deploying, or securing AI agents, this is the identity model you need. For the broader zero trust context that underpins all of this, see our deep dive on Zero Trust Architecture for AI Systems. And for the governance framework that ties identity to AI risk management, our Cybersecurity AI Framework (CAI) covers the organizational layer.

The Three Identity Layers for AI Agents

Before we can secure agent identities, we need to understand what they actually are. The Cloud Security Alliance (CSA) published the “Agentic AI Identity Management Approach” in March 2025, authored by Ken Huang, and it remains the most practical decomposition of this problem. The CSA model defines three distinct identity layers, each with different trust characteristics, lifecycle management requirements, and authorization scopes.

Thinking of agent identity as a single monolithic concept is the first mistake most teams make. An AI agent operates simultaneously across all three layers, and each layer requires a different credential type, a different trust model, and a different revocation mechanism.

Layer 1: Workload Identity

Workload identity answers the question: Is this agent process running where and how it is supposed to be running? This is the foundational layer — if you cannot attest to the agent’s runtime environment, nothing above it matters.

The dominant frameworks here are SPIFFE/SPIRE and WIMSE (Workload Identity in Multi-System Environments, an IETF working group). SPIFFE provides cryptographic identity documents (SVIDs) that are bound to workload metadata such as namespace, pod, and service mesh identity. WIMSE extends this model with web-native bindings, allowing workloads to prove their identity through a combination of TLS certificates, hardware attestations, and platform-level identity assertions.

In practice, a workload identity layer means your agent’s container or serverless function receives a short-lived X.509 SVID at startup. This SVID is cryptographically signed by your trust domain’s CA and encodes the agent’s allowed deployment context. If the agent is running in an unexpected namespace or on an unapproved host, the SVID will not be issued.
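The same deployment-context check can be repeated on the verifying side. The sketch below assumes the SVID's signature has already been verified by your TLS stack or SPIFFE library and only inspects the SPIFFE ID it carries. The trust domain and path prefixes are hypothetical values chosen for illustration.

```python
from urllib.parse import urlparse

# Hypothetical policy: the trust domain and path prefixes agents may run under.
ALLOWED_TRUST_DOMAIN = "hmmnm.com"
ALLOWED_PATH_PREFIXES = ("/agents/data-processor/", "/agents/support-agent/")


def parse_spiffe_id(spiffe_id: str) -> tuple[str, str]:
    """Split a SPIFFE ID into (trust_domain, path), rejecting malformed IDs."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe" or not parsed.netloc or not parsed.path:
        raise ValueError(f"not a valid SPIFFE ID: {spiffe_id!r}")
    if parsed.query or parsed.fragment:
        raise ValueError("SPIFFE IDs must not carry a query or fragment")
    return parsed.netloc, parsed.path


def workload_is_approved(spiffe_id: str) -> bool:
    """Return True only if the ID belongs to the approved deployment context."""
    try:
        trust_domain, path = parse_spiffe_id(spiffe_id)
    except ValueError:
        return False
    return (trust_domain == ALLOWED_TRUST_DOMAIN
            and path.startswith(ALLOWED_PATH_PREFIXES))
```

Anything that fails to parse is treated as unapproved, so a malformed identity can never pass by accident.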

Layer 2: Delegated User Authority

Workload identity tells you the agent is legitimate software. Delegated user authority tells you which human authorized this agent to act on their behalf, and what they are allowed to delegate.

This layer is built on OAuth 2.0 with specific extensions for machine-to-machine delegation. The key concept is delegation scope: a user authorizes an agent to perform a specific set of actions within their permission envelope. The agent receives an access token that represents the user’s delegated authority — not the user’s full permissions, and certainly not the agent’s own unbounded access.

Delegation scopes must be narrow and auditable. An agent tasked with summarizing emails should receive a read-only token scoped to the user’s inbox, not a blanket mail API key. An agent debugging a production issue should receive temporary access to specific log streams, not a persistent admin credential.
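One way to keep delegation narrow is to check every requested scope against two sets: what the user actually holds, and what the task type is allowed to ask for. The following is a minimal sketch of that idea. The scope names and the notion of a per-task allowlist are illustrative assumptions, not part of OAuth 2.0 itself.

```python
def delegate_scopes(user_permissions: set[str],
                    requested: set[str],
                    task_allowlist: set[str]) -> set[str]:
    """Grant only scopes the user holds AND the task type is allowed to request."""
    not_held = requested - user_permissions
    if not_held:
        raise PermissionError(f"user cannot delegate: {sorted(not_held)}")
    not_needed = requested - task_allowlist
    if not_needed:
        raise PermissionError(f"task type may not request: {sorted(not_needed)}")
    return set(requested)


# Hypothetical email-summarizer task: read-only inbox access is all it may ask for.
user = {"mail:read", "mail:send", "calendar:read"}
summarizer_allowlist = {"mail:read"}
granted = delegate_scopes(user, {"mail:read"}, summarizer_allowlist)
```

The function refuses outright instead of silently trimming the request, so an over-broad request shows up as an error the team can investigate.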

Layer 3: Task-Scoped Authorization

The third layer is the most dynamic and the most important for least privilege. Task-scoped authorization answers: Is this specific agent invocation — this exact request, right now — authorized to perform this specific action?

Unlike workload identity (which attests the process) and delegated authority (which attests the user’s permission envelope), task-scoped authorization is runtime-evaluated and expires with the task. It is the difference between giving someone a badge that opens all doors and giving them a temporary keycard that opens exactly one door, once, and self-destructs after use.

This layer is where fine-grained policy engines operate — evaluating the agent’s current task context, the data sensitivity of the target resource, and real-time risk signals before every individual operation. It is computationally more expensive but provides the strongest security guarantee.

How the Layers Work Together

A practical agent authentication flow combines all three layers. The agent proves its workload identity via SVID (Layer 1), presents a user-delegated OAuth token (Layer 2), and then receives task-scoped authorization from a policy engine for each specific operation (Layer 3). If any layer fails, the request is denied.

Identity Layer	What It Proves	Key Technologies	Lifecycle	Revocation
Workload Identity	Agent is legitimate software in approved context	SPIFFE/SPIRE, WIMSE	Process lifetime (~1h SVID rotation)	CA trust bundle rotation
Delegated Authority	User authorized agent to act on their behalf	OAuth 2.0, OIDC	Session/token lifetime (minutes to hours)	Token revocation, scope reduction
Task-Scoped AuthZ	This specific operation is authorized right now	Policy engines (OPA, Cedar), runtime PDP	Single request / task duration	Automatic — expires with task
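The "deny if any layer fails" rule can be sketched as a short chain of checks evaluated in order. The checks here are deliberately simplistic stand-ins (a prefix match, a scope lookup, an in-memory set of active tasks), and all names are hypothetical. In production each check would call the SVID verifier, the OAuth token validator, and the policy engine respectively.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRequest:
    spiffe_id: str                # Layer 1: workload identity
    delegated_scopes: frozenset   # Layer 2: what the user delegated
    token_expires_at: float       # Layer 2: delegation token expiry
    task_id: str                  # Layer 3: current task
    action: str


TRUST_PREFIX = "spiffe://hmmnm.com/agents/"   # hypothetical trust domain
ACTIVE_TASKS = {"task-20260407-001"}          # stand-in for policy engine state


def check_workload(req: AgentRequest, now: float) -> bool:
    return req.spiffe_id.startswith(TRUST_PREFIX)


def check_delegation(req: AgentRequest, now: float) -> bool:
    return now < req.token_expires_at and req.action in req.delegated_scopes


def check_task(req: AgentRequest, now: float) -> bool:
    return req.task_id in ACTIVE_TASKS


LAYERS = [
    ("workload identity", check_workload),
    ("delegated authority", check_delegation),
    ("task-scoped authorization", check_task),
]


def authorize(req: AgentRequest, now: float) -> tuple[bool, str]:
    """Evaluate all three layers in order; the first failure denies the request."""
    for name, check in LAYERS:
        if not check(req, now):
            return False, f"denied at {name}"
    return True, "allowed"
```

Returning the name of the failing layer gives the audit trail a reason for every denial.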

Ephemeral Authentication: Why Long-Lived Agent Credentials Are Dead

The single most impactful change you can make to agent security is eliminating long-lived credentials. Full stop. No exceptions.

A long-lived API key sitting in an agent’s configuration file is a time bomb. It cannot be scoped to a specific task. It cannot be revoked without breaking every instance that uses it. It cannot express the context of who authorized the agent or what it was supposed to be doing. It is a static secret in a world that demands dynamic trust.

Ephemeral authentication replaces static secrets with short-lived, context-bound credentials that are issued at runtime and automatically expire. The CSA’s agentic identity framework explicitly calls this out as a core requirement, and it is the foundation of every production-grade agent security architecture we have seen.

The IETF Weighs In: draft-klrc-aiagent-auth-00

In March 2026, draft-klrc-aiagent-auth-00 appeared as an IETF Internet-Draft, the first formal proposal for AI agent authentication. It is an individual submission and a work in progress, not a ratified standard. The draft is still significant because it synthesizes the three identity layers into a single coherent protocol by combining WIMSE workload attestation, SPIFFE SVIDs for cryptographic identity, and OAuth 2.0 for delegated authority.

The draft defines a token exchange flow specifically for agents. An agent presents its WIMSE attestation and receives an OAuth 2.0 access token in return. The token’s claims include the workload identity, the delegating user (if applicable), the task scope, and an expiration time measured in minutes, not days or months.

What makes this draft practically useful is that it builds on existing, widely-deployed protocols rather than inventing new cryptography. Your SPIRE infrastructure, your OAuth authorization server, and your WIMSE attestations all continue to work — the draft just defines how they compose for agent use cases.
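A resource server that receives such a token still has to check the claims described above. The sketch below assumes the token has already been signature-verified and decoded into a dictionary. The claim names (`workload_id`, `task_id`, `scope`, `iat`, `exp`) and the 30-minute cap are illustrative assumptions; the draft's actual claim names may differ.

```python
MAX_TTL_SECONDS = 30 * 60   # assumed upper bound for an agent token lifetime
REQUIRED_CLAIMS = ("workload_id", "task_id", "scope", "iat", "exp")


def validate_agent_claims(claims: dict, now: float) -> list[str]:
    """Return a list of problems; an empty list means the claims look acceptable."""
    problems = [f"missing claim: {name}"
                for name in REQUIRED_CLAIMS if name not in claims]
    if problems:
        return problems
    if claims["exp"] <= now:
        problems.append("token expired")
    if claims["exp"] - claims["iat"] > MAX_TTL_SECONDS:
        problems.append("token lifetime exceeds cap")
    if not str(claims["workload_id"]).startswith("spiffe://"):
        problems.append("workload_id is not a SPIFFE ID")
    return problems
```

Rejecting tokens whose total lifetime exceeds the cap catches an authorization server that has been misconfigured to issue day-long agent tokens, even when the token has not yet expired.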

Implementing Ephemeral Credentials

Here is a concrete implementation pattern for ephemeral agent authentication. This Python example shows an agent requesting a task-scoped token at the start of a task and using it for all subsequent operations. When the token expires, the agent must re-attest and re-authenticate — there is no refresh token, no silent renewal.

import time

import httpx
from cryptography.hazmat.primitives import hashes  # required by _sign_request
from cryptography.hazmat.primitives.asymmetric import ec

class AgentAuthClient:
    """
    Ephemeral agent authentication client.
    Requests task-scoped tokens that expire with the task.
    Implements IETF draft-klrc-aiagent-auth-00 token exchange.
    """
    
    def __init__(self, auth_endpoint: str, spire_socket: str):
        self.auth_endpoint = auth_endpoint
        self.spire_socket = spire_socket
        self._private_key = ec.generate_private_key(ec.SECP256R1())
        self._token: str | None = None
        self._token_expires: float = 0.0
    
    def fetch_svid(self) -> str:
        """Fetch workload identity SVID from SPIRE agent socket."""
        # A Unix domain socket cannot be read with open(). In production,
        # use a SPIFFE Workload API gRPC client against self.spire_socket
        # to fetch a short-lived X.509 SVID (typically 1h TTL).
        return self._read_svid_from_socket(self.spire_socket)
    
    def request_task_token(self, task_id: str, scopes: list[str],
                           delegating_user: str | None = None) -> str:
        """
        Exchange workload identity + delegation proof for a task-scoped token.
        Token TTL is tied to the task, typically 5-30 minutes.
        """
        svid = self.fetch_svid()
        
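        # Build the token request per draft-klrc-aiagent-auth-00.
        # NOTE: the grant type is the RFC 8693 token-exchange URN, but the
        # spiffe-svid token type below is not one of the RFC 8693 token type
        # identifiers, and RFC 8693 requests are form-encoded, not JSON.
        # Treat both as placeholders and confirm against your authorization server.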
        token_request = {
            "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
            "subject_token": svid,
            "subject_token_type": "urn:ietf:params:oauth:token-type:spiffe-svid",
            "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
            "scope": " ".join(scopes),
            "agent_claims": {
                "task_id": task_id,
                "delegating_user": delegating_user,
                "request_timestamp": int(time.time()),
                "task_expires_at": int(time.time()) + 900  # 15 min hard cap
            }
        }
        
        # Sign the request with the agent's ephemeral key pair
        signature = self._sign_request(token_request)
        token_request["agent_signature"] = signature
        
        response = httpx.post(self.auth_endpoint, json=token_request)
        response.raise_for_status()
        
        token_data = response.json()
        self._token = token_data["access_token"]
        self._token_expires = token_data["expires_at"]
        
        return self._token
    
    def get_valid_token(self) -> str:
        """Return current token or raise if expired."""
        if not self._token or time.time() >= self._token_expires:
            raise RuntimeError(
                "Agent token expired. Re-attestation required — "
                "no silent renewal allowed."
            )
        return self._token
    
    def _sign_request(self, request: dict) -> str:
        """Sign the canonical request with the ephemeral key."""
        canonical = str(sorted(request.items()))
        return self._private_key.sign(
            canonical.encode(), ec.ECDSA(hashes.SHA256())
        ).hex()
    
    def _read_svid_from_socket(self, sock) -> str:
        """Placeholder: returns a SPIFFE ID string, not a real SVID document.
        Use the SPIRE Workload API in production."""
        return "spiffe://hmmnm.com/agents/data-processor/prod"


# Usage: agent authenticates per task, not per deployment
auth = AgentAuthClient(
    auth_endpoint="https://auth.hmmnm.com/agent/token",
    spire_socket="/tmp/spire-agent.sock"
)

# Token is scoped to THIS task, expires in 15 minutes
token = auth.request_task_token(
    task_id="task-20260407-001",
    scopes=["logs:read", "metrics:read"],
    delegating_user="prabhu@hmmnm.com"
)

The critical design properties here: the agent has no static secrets, the token is scoped to a specific task with specific permissions, it expires in minutes, and there is no refresh mechanism. If the agent needs to continue working, it must re-attest its workload identity and receive a new token. This makes revocation trivial — simply stop issuing tokens for that task ID or workload identity.

Model Context Protocol as an Authorization Chokepoint

The Model Context Protocol (MCP) has emerged as the standard interface layer between AI agents and the tools, APIs, and data sources they access. What many teams have not yet realized is that MCP is not just a connectivity layer — it is a natural authorization chokepoint.

The default MCP security model authenticates and authorizes at connection time. When an agent connects to an MCP server, the server verifies the agent’s identity and grants access to a set of tools and resources. This is equivalent to giving someone a badge at the building entrance — it proves they belong inside, but it says nothing about which specific rooms they should access on any given trip.

Per-request authorization changes this model entirely. Instead of authorizing at connection time, every single tool invocation or resource access is individually evaluated against a policy engine. The agent connects once (with its workload identity and delegated authority), but each request carries its own authorization context that is evaluated in real time.

Why Connection-Time Authorization Is Insufficient

Consider a customer support agent that connects to your MCP server with access to the ticketing tool, the knowledge base tool, and the user lookup tool. Under connection-time authorization, the agent has blanket access to all three tools for the duration of the session. If the agent is later compromised or its task context changes, it retains full access until the session ends or is manually terminated.

With per-request authorization, the policy engine evaluates each tool call individually. The agent might be authorized to call the knowledge base tool at any time (it is read-only public data), but calling the user lookup tool requires that the current conversation involves a legitimate support ticket, that the user has consented to data access, and that the request is within the agent’s delegated scope. The ticketing tool might require that the agent’s task ID matches an open ticket.

MCP Gateway Patterns

Several production-grade MCP gateway implementations have emerged that enforce per-request authorization:

Kong MCP Gateway: Kong’s API gateway now includes an MCP plugin that intercepts every tool invocation, evaluates it against Kong’s ACL and rate-limiting policies, and can integrate with external policy engines such as Open Policy Agent (OPA).
AWS IAM SigV4: MCP servers running on AWS can require that every tool invocation is signed with SigV4, binding the request to a specific IAM role with specific permissions. The IAM policy becomes the per-request authorization policy.
Okta Integration: Okta’s fine-grained access management can be applied at the MCP layer, evaluating each request against the delegating user’s group memberships, the resource’s classification, and real-time risk signals from Okta’s threat engine.

Here is a practical example of implementing per-request authorization at the MCP gateway level using OPA as the policy decision point:

"""
MCP Gateway with per-request OPA authorization.
Every tool invocation is evaluated against policy before forwarding.
"""
import time  # used for request timestamps below

import httpx
from fastapi import FastAPI, Request, HTTPException
from fastapi.responses import JSONResponse

app = FastAPI()
OPA_URL = "http://opa:8181/v1/data/mcp/allow"
MCP_SERVER_URL = "http://mcp-server:8000"

@app.post("/mcp/tools/{tool_name}/invoke")
async def invoke_tool(tool_name: str, request: Request):
    """
    Intercept tool invocation, evaluate against OPA policy,
    then forward to MCP server if authorized.
    """
    body = await request.json()
    
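    # Extract agent identity from request headers (set by upstream auth)
    agent_id = request.headers.get("x-agent-id")
    task_id = request.headers.get("x-task-id")
    delegating_user = request.headers.get("x-delegating-user")

    # Fail closed if the upstream auth layer did not attach an identity
    if not agent_id or not task_id:
        raise HTTPException(status_code=401, detail="missing agent identity headers")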
    
    # Build authorization input for OPA
    opa_input = {
        "input": {
            "agent": {
                "id": agent_id,
                "task_id": task_id,
                "delegating_user": delegating_user
            },
            "action": {
                "tool": tool_name,
                "method": "invoke",
                "parameters": body.get("parameters", {})
            },
            "resource": body.get("resource_id"),
            "timestamp": int(time.time())
        }
    }
    
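    # Evaluate policy per-request. Use the async client so a slow OPA call
    # does not block the event loop; a transport error surfaces as a 500,
    # so the request is never forwarded without a decision.
    async with httpx.AsyncClient(timeout=2.0) as client:
        opa_response = (await client.post(OPA_URL, json=opa_input)).json()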
    
    if not opa_response.get("result", False):
        # Policy denied — return structured denial with reason
        raise HTTPException(
            status_code=403,
            detail={
                "error": "authorization_denied",
                "tool": tool_name,
                "agent": agent_id,
                "task": task_id,
                "reason": opa_response.get("reason", "Policy evaluation failed")
            }
        )
    
    # Forward authorized request to MCP server (async client keeps the event loop free)
    async with httpx.AsyncClient(timeout=10.0) as client:
        mcp_response = await client.post(
            f"{MCP_SERVER_URL}/tools/{tool_name}/invoke",
            json=body,
            headers={
                "x-agent-id": agent_id,
                "x-authorized-at": str(int(time.time())),
                "x-authorization-ttl": "300"  # 5 min max for downstream
            }
        )

    return JSONResponse(content=mcp_response.json())


# Example OPA policy (Rego) for per-request MCP authorization:
# package mcp
# 
# import future.keywords.if
# import future.keywords.in
#
# default allow := false
#
# allow if {
#     # Tool is in agent's approved set
#     input.action.tool in agent_approved_tools
#     # Task is still active (not expired)
#     input.agent.task_id != ""
#     task_is_active[input.agent.task_id]
#     # Parameters don't contain blocked patterns (e.g., SQL injection)
#     not contains_blocked_pattern(input.action.parameters)
#     # Resource sensitivity matches delegation scope
#     resource_sensitivity_ok(input.resource, input.agent.delegating_user)
# }

The power of this pattern is that the authorization decision is made for each individual request, not once at session start. If the agent’s task is cancelled upstream, the OPA policy immediately denies all subsequent requests because the task ID becomes invalid. If the delegating user revokes their consent, the policy engine reflects that change instantly. There is no stale session state to exploit.

Building Least-Privilege Agents: Practical Implementation

Least privilege for AI agents is not a configuration toggle — it is an architectural pattern that must be designed into every layer of the agent stack. Here is how to actually build it.

Principle 1: No Standing Access

Agents should never possess credentials that outlive their current task. Every token, every certificate, every authorization assertion should be scoped to the current task and should expire when the task completes. This is not just good practice — it is the only pattern that is defensible under zero trust principles.

In practice, this means agents request credentials at task start and release them at task end. There is no credential caching between tasks. There is no “warm standby” mode where an agent holds credentials waiting for the next assignment. The credential lifecycle is strictly bound to the task lifecycle.

Principle 2: Minimal Delegation Scope

When a user delegates authority to an agent, the delegation should be as narrow as technically feasible. An agent summarizing a document does not need write access to the document store. An agent analyzing logs does not need access to the configuration management API. Each delegated scope should be the minimum required for the specific task.

Principle 3: Every Access Decision Is Logged and Auditable

Per-request authorization produces a natural audit trail. Every tool invocation, every API call, every data access is recorded with the agent’s identity, the task ID, the delegating user, the resource accessed, and the policy decision. This audit trail is not optional — it is the evidence you need for incident response, compliance reporting, and policy refinement.

Putting It All Together: A Least-Privilege Agent Framework

This Python example shows a complete least-privilege agent framework that integrates ephemeral authentication, per-request authorization, and structured audit logging:

"""
Least-Privilege Agent Framework
Combines ephemeral auth, per-request authorization, and audit logging.
"""
import time
import json
import logging
from dataclasses import dataclass, field
from enum import Enum
from typing import Any
import httpx

logger = logging.getLogger("agent-framework")


class Permission(Enum):
    READ = "read"
    WRITE = "write"
    DELETE = "delete"
    EXECUTE = "execute"
    ADMIN = "admin"


@dataclass
class AgentIdentity:
    """Represents the agent's three-layer identity."""
    workload_id: str          # Layer 1: SPIFFE/SPIRE SVID
    delegating_user: str      # Layer 2: User who authorized this agent
    task_id: str              # Layer 3: Current task identifier
    delegated_scopes: list[str] = field(default_factory=list)
    task_expires_at: float = 0.0
    
    @property
    def is_expired(self) -> bool:
        return time.time() >= self.task_expires_at


@dataclass
class AccessDecision:
    """Result of a per-request authorization evaluation."""
    allowed: bool
    reason: str
    policy_id: str
    evaluated_at: float = field(default_factory=time.time)


class AuditLogger:
    """Structured audit logging for every agent access decision."""
    
    def __init__(self, sink_url: str | None = None):
        self.sink_url = sink_url
    
    def log(self, identity: AgentIdentity, action: str,
            resource: str, decision: AccessDecision, context: dict = None):
        event = {
            "timestamp": time.time_ns(),
            "event_type": "agent_access_decision",
            "agent": {
                "workload_id": identity.workload_id,
                "delegating_user": identity.delegating_user,
                "task_id": identity.task_id
            },
            "action": action,
            "resource": resource,
            "decision": {
                "allowed": decision.allowed,
                "reason": decision.reason,
                "policy_id": decision.policy_id
            },
            "context": context or {}
        }
        
        logger.info(json.dumps(event))
        
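        # Send to external audit sink (SIEM, compliance platform)
        if self.sink_url:
            try:
                httpx.post(self.sink_url, json=event, timeout=2.0)
            except httpx.HTTPError as e:
                # The event is already in the local log; record the sink failure
                logger.error(f"Audit sink unreachable: {e}")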


class LeastPrivilegeAgent:
    """
    Agent that enforces least privilege on every operation.
    No standing access. Per-request authorization. Full audit trail.
    """
    
    def __init__(self, identity: AgentIdentity,
                 authz_endpoint: str, audit: AuditLogger):
        self.identity = identity
        self.authz_endpoint = authz_endpoint
        self.audit = audit
    
    def _check_authorization(self, action: str, resource: str,
                              context: dict = None) -> AccessDecision:
        """Evaluate per-request authorization before every operation."""
        if self.identity.is_expired:
            return AccessDecision(
                allowed=False,
                reason="Task expired — re-attestation required",
                policy_id="task-lifecycle"
            )
        
        # Call external policy engine
        try:
            response = httpx.post(
                self.authz_endpoint,
                json={
                    "agent": {
                        "workload_id": self.identity.workload_id,
                        "delegating_user": self.identity.delegating_user,
                        "task_id": self.identity.task_id,
                        "scopes": self.identity.delegated_scopes
                    },
                    "action": action,
                    "resource": resource,
                    "context": context or {}
                },
                timeout=1.0
            )
            response.raise_for_status()
            result = response.json()
            return AccessDecision(
                allowed=result.get("allowed", False),
                reason=result.get("reason", "Policy evaluation failed"),
                policy_id=result.get("policy_id", "unknown")
            )
        except (httpx.HTTPError, ValueError) as e:
            # Fail closed: deny on transport errors, HTTP error statuses,
            # or a response body that is not valid JSON
            logger.error(f"Policy engine error: {e}")
            return AccessDecision(
                allowed=False,
                reason=f"Policy engine unavailable: {e}",
                policy_id="fail-closed"
            )
    
    def execute(self, tool_name: str, parameters: dict,
                resource: str = None) -> dict[str, Any]:
        """
        Execute a tool call with full least-privilege enforcement.
        Every call goes through authorization + audit.
        """
        # Step 1: Per-request authorization
        decision = self._check_authorization(
            action=f"tool:{tool_name}",
            resource=resource or "unknown",
            context={"parameters": parameters}
        )
        
        # Step 2: Audit the decision (allowed or denied)
        self.audit.log(
            identity=self.identity,
            action=f"tool:{tool_name}",
            resource=resource or "unknown",
            decision=decision,
            context={"parameters": parameters}
        )
        
        # Step 3: Deny if unauthorized
        if not decision.allowed:
            raise PermissionError(
                f"Access denied for tool '{tool_name}': {decision.reason}"
            )
        
        # Step 4: Execute the tool
        logger.info(f"Executing tool '{tool_name}' for task {self.identity.task_id}")
        # ... actual tool execution here ...
        
        return {
            "tool": tool_name,
            "status": "executed",
            "authorized_by": decision.policy_id,
            "task_id": self.identity.task_id
        }


# Usage example
identity = AgentIdentity(
    workload_id="spiffe://hmmnm.com/agents/support-agent/prod",
    delegating_user="prabhu@hmmnm.com",
    task_id="task-20260407-042",
    delegated_scopes=["tickets:read", "tickets:write", "kb:read"],
    task_expires_at=time.time() + 900  # 15 minutes
)

audit = AuditLogger(sink_url="https://audit.hmmnm.com/ingest")
agent = LeastPrivilegeAgent(
    identity=identity,
    authz_endpoint="https://policy.hmmnm.com/v1/evaluate",
    audit=audit
)

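# Expected to be authorized, assuming the policy engine maps this tool to tickets:read.
# The decision is made by the policy engine, not by this client; if the
# placeholder endpoint is unreachable, the call fails closed with PermissionError.
result = agent.execute("lookup_ticket", {"ticket_id": "TKT-1234"})

# Expected to be denied, assuming the policy engine requires a scope the user never delegated
try:
    agent.execute("delete_all_tickets", {})
except PermissionError as e:
    print(f"Blocked: {e}")  # "Access denied for tool 'delete_all_tickets'..."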

The framework enforces three non-negotiable properties: fail-closed (policy engine unavailability means deny), per-request evaluation (no connection-time blanket grants), and structured audit (every decision is recorded with full context). This is the minimum bar for production agent security.

Enterprise Patterns and Real-World Deployments

The theory is solid. But what does agent identity management actually look like in enterprise deployments? Several patterns have emerged that are worth studying.

HashiCorp A2A + Vault OIDC: Agent-to-Agent Token Exchange

When agents need to communicate with other agents — a common pattern in multi-agent systems — the identity problem compounds. Agent A needs to prove its identity to Agent B, and Agent B needs to verify that Agent A is authorized to request the specific operation. HashiCorp’s A2A (Agent-to-Agent) pattern uses Vault’s OIDC provider to solve this.

The flow works as follows: Agent A authenticates to Vault using its workload identity (SPIFFE SVID), receives a Vault-signed OIDC token, and presents that token to Agent B. Agent B validates the OIDC token against Vault’s public keys and extracts the agent’s identity, delegation scope, and task context. Vault acts as a trusted intermediary — neither agent needs to directly trust the other, they both trust Vault.

This pattern is particularly powerful because Vault already handles certificate rotation, secret leasing, and revocation. The OIDC token inherits Vault’s security properties: short TTL, scoped claims, and immediate revocation capability through Vault’s lease management.
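On Agent B's side, accepting a peer's token comes down to a handful of claim checks once the signature has been verified. The sketch below assumes signature verification against Vault's published keys has already been done by a JWT library and covers only the claims. The issuer URL, audience values, and space-separated `scope` claim are hypothetical choices for illustration.

```python
# Hypothetical issuer; use the issuer URL your Vault OIDC provider advertises.
EXPECTED_ISSUER = "https://vault.example.internal/oidc"


def accept_peer_token(claims: dict, my_audience: str,
                      required_scope: str, now: float) -> tuple[bool, str]:
    """Decide whether already-verified claims authorize the requested operation."""
    if claims.get("iss") != EXPECTED_ISSUER:
        return False, "untrusted issuer"
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if my_audience not in audiences:
        return False, "token not intended for this agent"
    if claims.get("exp", 0) <= now:
        return False, "token expired"
    if required_scope not in str(claims.get("scope", "")).split():
        return False, "scope not granted"
    return True, "accepted"
```

The audience check matters in multi-agent systems: without it, a token Agent A obtained for Agent B could be replayed against Agent C.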

Oasis AAM: The First Agentic Access Management Platform

Oasis AAM (Agentic Access Management) is the first platform specifically designed for managing AI agent access at enterprise scale. Rather than bolting agent identity onto existing IAM systems, Oasis AAM was built from the ground up with agent-specific concepts: task-scoped permissions, dynamic delegation chains, agent-to-agent trust relationships, and real-time policy evaluation.

Oasis AAM provides centralized policy management for agent access across heterogeneous environments — cloud services, internal APIs, databases, and MCP tool servers. It integrates with existing identity providers (Okta, Azure AD, Google Workspace) for the delegated user authority layer while managing the task-scoped authorization layer natively.

Pattern: Progressive Trust for Agent Actions

One enterprise pattern that deserves attention is progressive trust escalation. Not all agent actions carry the same risk. Reading a public knowledge base article is low-risk. Writing to a production database is high-risk. A progressive trust model adjusts the authorization requirements based on the sensitivity of the action.

Low-risk actions (read public data, compute on local context): Automated approval with standard per-request authorization.
Medium-risk actions (read private user data, write to non-production systems): Require explicit delegation scope from the user and enhanced audit logging.
High-risk actions (write to production, delete data, modify infrastructure): Require real-time user confirmation, multi-factor verification of the delegating user, and time-boxed authorization windows.

This pattern mirrors human access models — you do not need a manager’s signature to read a wiki page, but you do need one to push to production. Agents should follow the same principle.
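The three tiers can be captured as a lookup from risk level to the controls that must be satisfied before the action runs. In this sketch the verb and target vocabulary, the control names, and the choice to make each tier a superset of the one below are all assumptions for illustration. Unrecognized actions default to the highest tier.

```python
from enum import Enum


class Risk(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Controls required per tier (assumed cumulative for this sketch).
REQUIREMENTS = {
    Risk.LOW: {"policy_check"},
    Risk.MEDIUM: {"policy_check", "explicit_scope", "enhanced_audit"},
    Risk.HIGH: {"policy_check", "explicit_scope", "enhanced_audit",
                "user_confirmation", "mfa", "time_boxed"},
}


def classify(verb: str, target: str) -> Risk:
    """Map an action to a risk tier; anything unrecognized is treated as HIGH."""
    if verb in {"delete", "modify_infra"} or (verb == "write" and target == "production"):
        return Risk.HIGH
    if (verb == "read" and target == "private_user_data") or verb == "write":
        return Risk.MEDIUM
    if (verb == "read" and target == "public") or verb == "compute":
        return Risk.LOW
    return Risk.HIGH


def missing_requirements(verb: str, target: str, satisfied: set[str]) -> set[str]:
    """Return the controls still needed before this action may proceed."""
    return REQUIREMENTS[classify(verb, target)] - satisfied
```

An empty result from `missing_requirements` means the action may proceed; a non-empty one tells the orchestrator which step, such as user confirmation, to trigger next.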

Pattern: Agent Identity Federation

In large organizations, agents operate across multiple trust domains — different cloud providers, different business units, different security perimeters. Agent identity federation extends the familiar SSO/federation model to non-human identities.

An agent receives its primary identity from its home trust domain (via SPIFFE/SPIRE) and then federates that identity to external trust domains through token exchange. The external domain maps the federated identity to its own local permissions without ever seeing the agent’s raw credentials. This is the same pattern that allows you to log into a third-party service with your Google account, but applied to machine identities.
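From the receiving domain's point of view, federation ends in a mapping from a foreign identity to local permissions. The sketch below shows that mapping step only, assuming the federated identity has already been verified against the foreign trust bundle. The trust domains, paths, and permission names are hypothetical.

```python
from urllib.parse import urlparse

# Hypothetical mapping: foreign trust domain -> workload path -> local permissions.
FEDERATION_MAP: dict[str, dict[str, set[str]]] = {
    "partner.example": {
        "/agents/report-builder": {"reports:read"},
    },
}


def local_permissions(spiffe_id: str) -> set[str]:
    """Map a verified foreign SPIFFE ID to local permissions (empty if unmapped)."""
    parsed = urlparse(spiffe_id)
    if parsed.scheme != "spiffe":
        return set()
    by_path = FEDERATION_MAP.get(parsed.netloc, {})
    return set(by_path.get(parsed.path, set()))
```

Unmapped identities get an empty permission set, so a newly federated domain starts with no access until someone grants it explicitly.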

Pattern	Use Case	Key Technologies	Complexity
Ephemeral Task Tokens	Single-task agents with bounded lifetime	OAuth 2.0, SPIFFE/SPIRE	Low
MCP Gateway Authorization	Per-request tool/API access control	OPA, Kong, AWS SigV4	Medium
Agent-to-Agent (A2A)	Multi-agent systems with inter-agent trust	HashiCorp Vault OIDC	Medium
Progressive Trust Escalation	Risk-tiered action authorization	Custom policy engine + human-in-loop	High
Identity Federation	Cross-domain agent operations	SPIFFE federation, SAML/OIDC token exchange	High
Agentic Access Management	Enterprise-wide agent governance	Oasis AAM, Okta Integration	Medium-High

What Happens When You Get It Wrong

The cost of inadequate agent identity management is not theoretical. Here are the failure modes we see in practice:

Credential sprawl. Teams create a new API key for each agent deployment and never rotate any of them. Six months later, there are hundreds of standing credentials with no clear owner, no expiration, and no audit trail. Attackers love this. Each static credential is an independent attack vector.
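
A first step against sprawl is simply finding it. The sketch below flags credentials that have no owner, no expiration, or have gone unrotated past a threshold. The `Credential` record, the `audit` function, and the 90-day default are assumptions for illustration; in practice the inventory would come from your secrets manager or cloud IAM export.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Credential:
    key_id: str
    owner: Optional[str]            # team or person accountable for the key
    created_at: datetime            # last issue or rotation time
    expires_at: Optional[datetime]  # None means the key never expires


def audit(creds, now, max_age=timedelta(days=90)):
    """Return {key_id: [findings]} for credentials that look like sprawl."""
    findings = {}
    for c in creds:
        issues = []
        if not c.owner:
            issues.append("no owner")
        if c.expires_at is None:
            issues.append("no expiration")
        if now - c.created_at > max_age:
            issues.append("older than max_age")
        if issues:
            findings[c.key_id] = issues
    return findings


now = datetime.now(timezone.utc)
inventory = [
    Credential("agent-key-legacy", None, now - timedelta(days=200), None),
    Credential("agent-key-task", "team-a", now - timedelta(minutes=5),
               now + timedelta(minutes=10)),
]
print(audit(inventory, now))
```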

Over-delegation. An agent is given admin-level access “to make development easier” and that access is never narrowed for production. The agent only needs read access to three tables but has drop privileges on the entire database. When the agent is compromised through prompt injection or a supply chain attack, the attacker inherits the full permission envelope. Even without an attacker, a hallucinated tool call can exercise any permission in that envelope.

Identity confusion. When agents operate under shared service accounts, it becomes impossible to attribute actions to a specific agent, task, or delegating user. Audit logs show “svc-agent-account accessed customer database” but cannot tell you which agent, which task, or which user authorized it. This is the identity equivalent of logging every action as “root.”
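
The fix is to carry all three identifiers on every log line. Below is a minimal sketch of such a record; the field names and the `audit_record` helper are assumptions for illustration, not a standard schema.

```python
import json
from datetime import datetime, timezone


def audit_record(agent_id, task_id, delegating_user, action, resource,
                 allowed, reason):
    """Serialize one authorization decision as a single JSON log line."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "agent_id": agent_id,                # which agent (workload identity)
        "task_id": task_id,                  # which task the action belonged to
        "delegating_user": delegating_user,  # who authorized it
        "action": action,
        "resource": resource,
        "decision": "allow" if allowed else "deny",
        "reason": reason,
    }, sort_keys=True)


print(audit_record(
    agent_id="spiffe://corp.example/agent/support-bot",  # hypothetical ID
    task_id="task-42",
    delegating_user="alice@corp.example",
    action="read",
    resource="customer_db.accounts",
    allowed=True,
    reason="scope crm:read delegated for task-42",
))
```

With these fields present, "which agent, which task, which user" becomes a log query rather than a forensic reconstruction.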

Stale delegation. A user delegates authority to an agent for a specific task, the task completes, but the delegation token is never revoked. The token persists in the agent’s memory or token cache, creating a window where the agent could make unauthorized requests if compromised. Without task-scoped authorization with automatic expiration, stale delegation is almost guaranteed.
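
One way to make revocation the default is to tie the delegation's lifetime to the task itself. The sketch below uses an in-memory `DelegationStore` and a `delegated_task` context manager, both hypothetical stand-ins for a real token service: the grant is revoked when the task block exits, even on error, and it expires on its own if the task overruns its time-to-live.

```python
import secrets
import time
from contextlib import contextmanager


class DelegationStore:
    """In-memory stand-in for a token service (illustrative only)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._grants = {}  # token -> (user, scopes, expires_at)

    def issue(self, user, scopes, ttl_seconds):
        token = secrets.token_urlsafe(16)
        self._grants[token] = (user, frozenset(scopes),
                               self._clock() + ttl_seconds)
        return token

    def is_valid(self, token, scope):
        grant = self._grants.get(token)
        if grant is None:
            return False  # never issued, or already revoked
        _user, scopes, expires_at = grant
        if self._clock() >= expires_at:
            self._grants.pop(token, None)  # expired: drop it
            return False
        return scope in scopes

    def revoke(self, token):
        self._grants.pop(token, None)


@contextmanager
def delegated_task(store, user, scopes, ttl_seconds=300):
    """Issue a delegation for one task and always revoke it afterwards."""
    token = store.issue(user, scopes, ttl_seconds)
    try:
        yield token
    finally:
        store.revoke(token)


store = DelegationStore()
with delegated_task(store, "alice", {"crm:read"}) as token:
    print(store.is_valid(token, "crm:read"))   # valid while the task runs
print(store.is_valid(token, "crm:read"))       # revoked once the task ends
```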

These failure modes are not edge cases. They are the default outcome when you apply traditional IAM patterns to AI agents without adaptation. The three-layer model, ephemeral authentication, and per-request authorization exist specifically to prevent them.

Key Takeaways

Non-human identities are the new primary attack surface. With 50:1 NHI-to-human ratios and 68% of incidents involving NHIs, agent identity is not a niche concern — it is the center of the threat landscape.
Agent identity is not a single concept — it operates across three layers. Workload identity (is the software legitimate?), delegated authority (did a user authorize this?), and task-scoped authorization (is this specific action allowed right now?). Each layer needs its own credential type, trust model, and revocation mechanism.
Ephemeral credentials are non-negotiable. Long-lived API keys for agents are a liability. Every credential should be scoped to a task, issued at runtime, and set to expire automatically. The IETF’s draft-klrc-aiagent-auth-00 offers an early protocol blueprint.
MCP is your authorization chokepoint. Per-request authorization at the MCP layer, using OPA, Kong, or cloud-native IAM, evaluates every tool call against current policy. Connection-time authorization is insufficient because a decision made once at connect time cannot account for what the agent requests later in the session.
Fail closed. When the policy engine is unreachable, deny the request. When the token is expired, require re-attestation. When the task is complete, revoke the delegation. Every uncertainty should resolve in favor of denial.
Every access decision must be auditable. Structured logging of every authorization decision — allowed or denied — is essential for incident response, compliance, and policy refinement. If you cannot explain why an agent was allowed to perform an action, your authorization model is broken.
Start with ephemeral tokens and per-request MCP authorization. These two patterns provide the highest security improvement for the lowest implementation effort. Build on top of existing infrastructure (SPIFFE, OAuth, OPA) rather than inventing new systems.
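
The fail-closed takeaway is easy to state and easy to get wrong in code, so here is a minimal sketch. The `authorize_fail_closed` function and its `query_policy` callable are hypothetical; `query_policy` stands in for whatever client talks to your policy engine.

```python
import time


def authorize_fail_closed(query_policy, token_expires_at, request, now=None):
    """Return True only on an explicit allow; every uncertainty denies."""
    now = time.time() if now is None else now
    if now >= token_expires_at:
        return False  # expired token: the caller must re-attest
    try:
        decision = query_policy(request)
    except Exception:
        return False  # engine unreachable or erroring: deny
    return decision is True  # anything other than a literal True is a deny


def unreachable_engine(_request):
    raise ConnectionError("policy engine unreachable")


request = {"action": "prod_db.write"}
print(authorize_fail_closed(unreachable_engine, time.time() + 60, request))
print(authorize_fail_closed(lambda r: True, time.time() + 60, request))
```

Comparing against a literal `True` means a malformed or empty policy response is treated as a denial rather than being coerced into an allow.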

References

Cloud Security Alliance, “Agentic AI Identity Management Approach,” Ken Huang, March 2025. cloudsecurityalliance.org
IETF, “draft-klrc-aiagent-auth-00: Authentication and Authorization for AI Agents,” March 2026. datatracker.ietf.org
Obsidian Security, “Non-Human Identity Security Report,” 2025. obsidiansecurity.com
SPIFFE/SPIRE, “Secure Production Identity Framework for Everyone,” spiffe.io
WIMSE Working Group, “Workload Identity in Multi-System Environments,” IETF, datatracker.ietf.org/wg/wimse/
HashiCorp, “Agent-to-Agent Identity with Vault OIDC,” hashicorp.com
Oasis Labs, “Agentic Access Management (AAM),” oasislabs.com
Model Context Protocol, “MCP Specification,” Anthropic, modelcontextprotocol.io
Open Policy Agent, “OPA Policy Engine,” openpolicyagent.org
Prabhu Kalyan Samal, “Zero Trust Architecture for AI Systems,” hmmnm.com
Prabhu Kalyan Samal, “Cybersecurity AI Framework (CAI),” hmmnm.com