Every CISO and VP of Engineering we talk to is wrestling with the same three pressures right now:
Move faster with AI. Your business stakeholders want outcomes yesterday. Developers are shipping with Copilot, Claude Code, and a dozen other AI tools. Agents are running in production. The pressure to show ROI is real and escalating.
Stay secure. AI introduces attack surfaces that didn’t exist 18 months ago — prompt injection, data exfiltration through model outputs, PII flowing through third-party inference APIs, shadow agents nobody approved. Your security posture was built for a world where you knew what software was running and who was using it.
Control the cost. Token spend is consumption-based, unpredictable, and already showing up in quarterly reviews. Ramp recently reported that average monthly AI token spend across its enterprise customers has grown 13× since January 2025. That is thirteen times, not 13%. Uber gave 5,000 engineers access to Claude Code in late 2025. Within four months, they had burned through their entire annual AI budget.
The conventional wisdom says these three forces are in tension. Move fast, and you introduce security risk. Lock things down, and you slow outcomes. Govern tightly, and you add friction. Pick two.
That framing is wrong. And it’s leading organizations into a trap.
The Real Problem: You’re Managing Symptoms, Not the System
When token costs spike, the instinct is to add spending controls. When a security incident surfaces, the instinct is to restrict access. When teams complain AI is too slow or too constrained, leadership loosens the guardrails.
This is whack-a-mole governance. You’re reacting to whichever pressure is loudest this quarter, without addressing the underlying issue: you don’t have a unified picture of what your AI is doing, who is using it, and what policies are in effect across the stack.
Consider what’s actually happening inside a typical enterprise today:
- Developers are using AI coding assistants — some approved, some shadow-adopted — with no consistent policy on what codebases or credentials they can access.
- Security teams are using AI-powered tools for threat analysis and incident response, and they often rank high on token usage leaderboards despite not writing a single line of code.
- Agents are running autonomously: fetching data, writing files, calling APIs, triggering workflows — without a clear audit trail of what they did or why.
- Finance can’t attribute AI spend to a P&L line because nobody built the cost attribution layer. The bill arrives as a single invoice from OpenAI or Anthropic, and finance is left reverse-engineering who spent what.
Each of these is framed as a different problem — a cost problem, a security problem, a compliance problem. But strip them down and they’re all the same question: Who authorized this action, under what policy, with what constraints, and what happened?
That’s a governance question.
Why “SaaS v1” Mental Models Break Here
Enterprise software has operated on a simple premise for two decades: seat-based licensing, predictable per-unit costs, defined user roles. You bought 500 Salesforce seats. You knew roughly what you’d spend. You could audit who logged in.
AI inference breaks every assumption in that model.
Cost is consumption-based, not seat-based. A single agent running overnight can consume more compute than a team of developers in a month. Two employees doing nominally similar work can generate wildly different token volumes depending on which model they’re routing to and how their prompts are constructed.
Access control is identity + action, not just identity. Knowing that a user is authenticated tells you nothing about whether they should be allowed to ask an AI agent to pull customer PII and summarize it into an email. Traditional IAM stops at the door. AI governance has to operate inside the room.
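To make "identity + action" concrete, here is a minimal sketch of an authorization check that looks at what is being asked as well as who is asking. The roles, action names, data classifications, and the `is_allowed` helper are all hypothetical, chosen only for illustration; a real deployment would load policy from a policy engine rather than a hard-coded set.

```python
from dataclasses import dataclass

# Hypothetical policy table: (role, action, data classification) triples that are permitted.
ALLOWED = {
    ("support_agent", "summarize", "public"),
    ("support_agent", "summarize", "internal"),
    ("privacy_officer", "summarize", "customer_pii"),
}


@dataclass(frozen=True)
class ActionRequest:
    user: str
    role: str
    action: str
    data_class: str
    authenticated: bool


def is_allowed(req: ActionRequest) -> bool:
    """Authentication is necessary but not sufficient: the action itself must be permitted."""
    if not req.authenticated:
        return False
    return (req.role, req.action, req.data_class) in ALLOWED
```

In this sketch, an authenticated support agent asking for a summary of customer PII is denied, while the same request against internal data passes. Identity alone would have let both through.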
Security perimeters are porous by design. AI agents call external APIs, retrieve from vector databases, write to filesystems, and invoke tools — often across trust boundaries that your existing security stack wasn’t built to observe. The attack surface isn’t the model. It’s the action space the model can reach.
The CFO wants cost predictability. The CISO wants auditability and enforcement. The VP of Engineering wants speed and autonomy for their teams. These are different languages for the same underlying need: make AI behavior legible and governable.
What Governance Actually Looks Like in Practice
Governance in this context isn’t a dashboard. It’s an enforcement layer — something that sits in the path of AI activity and can observe, attribute, and act on policy in real time.
Concretely, this means:
Cost attribution at the identity level. Every token consumed maps back to an identity, a team, a use case, and a model. You can answer “which team spent $40K in tokens last month and on what?” before the CFO asks. You can set budgets by team or by use case and enforce them — not alert on them after the fact, enforce them.
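One way to picture attribution plus enforcement is a ledger that tags every charge with identity, team, use case, and model, and refuses a charge that would push a team over budget. This is a sketch under stated assumptions: the model names, the per-1K-token prices, and the `CostLedger` class are invented for illustration and do not reflect any provider's pricing or any product's API.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical USD prices per 1,000 tokens; real prices vary by provider and model.
PRICE_PER_1K = {"frontier-model": 0.03, "small-model": 0.002}


class BudgetExceeded(Exception):
    """Raised when a charge would push a team past its budget."""


@dataclass(frozen=True)
class Usage:
    identity: str
    team: str
    use_case: str
    model: str
    tokens: int


class CostLedger:
    def __init__(self, team_budgets):
        self.team_budgets = dict(team_budgets)
        # Spend keyed by (team, use_case, model, identity) so it can be sliced any way.
        self.spend = defaultdict(float)

    def team_spend(self, team):
        return sum(cost for key, cost in self.spend.items() if key[0] == team)

    def charge(self, usage):
        """Record the cost, or raise before recording if the team budget would be exceeded."""
        cost = usage.tokens / 1000 * PRICE_PER_1K[usage.model]
        budget = self.team_budgets.get(usage.team, 0.0)  # unknown teams get no budget
        if self.team_spend(usage.team) + cost > budget:
            raise BudgetExceeded(f"{usage.team} would exceed its ${budget:.2f} budget")
        self.spend[(usage.team, usage.use_case, usage.model, usage.identity)] += cost
        return cost
```

The design choice that matters is that `charge` raises instead of logging a warning. In practice the check would run before the model call, using an estimated token count, so the request is blocked rather than billed.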
Policy enforcement at the action level. Before a model output carrying PII leaves your environment, a policy check should intercept it. Before a model call goes to an expensive frontier model for a task that a cheaper model handles equally well, a routing policy should redirect it. These aren’t audits after the fact. They’re guardrails at execution time.
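Both guardrails described here can be sketched in a few lines. The example below assumes a deliberately simple setup: two regular expressions stand in for a real PII detector (which would need far broader coverage), and the task types and model names in the routing rule are hypothetical.

```python
import re

# Simplistic patterns for illustration only; production PII detection needs much more.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Hypothetical task types that a cheaper model is assumed to handle well.
SIMPLE_TASKS = {"classify", "extract", "summarize_short"}


def redact_output(text):
    """Return the redacted text and the number of redactions made."""
    text, n_email = EMAIL.subn("[REDACTED_EMAIL]", text)
    text, n_ssn = SSN.subn("[REDACTED_SSN]", text)
    return text, n_email + n_ssn


def route_model(task_type, requested_model):
    """Downgrade simple tasks from the frontier model; leave everything else alone."""
    if task_type in SIMPLE_TASKS and requested_model == "frontier-model":
        return "small-model"
    return requested_model
```

Both functions run in the request path: `route_model` before the call is made, `redact_output` before the response is returned to the caller or passed to another tool.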
Audit trails that actually reconstruct what happened. When a security incident involves an AI agent, you need to answer: what prompt was sent, what tools were invoked, what data was accessed, what was returned, and was there a human-in-the-loop decision point? Immutable audit logs tied to session and identity aren’t a compliance checkbox. They’re the difference between a recoverable incident and an uncontrollable one.
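A common way to make a log tamper-evident is to chain entries by hash, so that altering any past entry invalidates everything after it. The sketch below uses only the Python standard library; the `AuditLog` class and its field names are assumptions for illustration, and a real system would also add timestamps and write to append-only storage.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first entry in the chain


def _digest(body):
    # Canonical JSON (sorted keys) so the same content always hashes the same way.
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AuditLog:
    def __init__(self):
        self.entries = []

    def append(self, session_id, identity, event, detail):
        """Add an entry whose hash covers its content and the previous entry's hash."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        body = {
            "session_id": session_id,
            "identity": identity,
            "event": event,  # e.g. "prompt", "tool_call", "data_access", "human_approval"
            "detail": detail,
            "prev_hash": prev_hash,
        }
        entry = {**body, "hash": _digest(body)}
        self.entries.append(entry)
        return entry

    def verify(self):
        """Recompute the chain; any edited or reordered entry makes this return False."""
        prev_hash = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "hash"}
            if body["prev_hash"] != prev_hash or _digest(body) != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True
```

Filtering the entries by `session_id` then yields the ordered sequence of prompts, tool calls, and data accesses for one incident, and `verify` indicates whether that sequence can still be trusted.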
Visibility into shadow AI. The agents you know about aren’t your biggest risk. The agents your teams spun up without a security review — running on personal API keys, calling external services, processing customer data — are. An AI SBOM (bill of materials) approach, borrowed from software supply chain security, gives you a scannable inventory of what’s running across your environment.
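As a rough illustration of what a scannable inventory could look like, the sketch below models each agent as a record and flags the ones matching the shadow AI pattern described above. The record fields and the `shadow_findings` helper are hypothetical and do not follow any standard SBOM schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRecord:
    name: str
    owner_team: str
    model: str
    tools: tuple
    key_type: str  # "org" or "personal" (hypothetical classification)
    security_reviewed: bool
    handles_customer_data: bool


def shadow_findings(inventory):
    """Map agent name to the reasons it looks like unreviewed, shadow-adopted AI."""
    findings = {}
    for agent in inventory:
        reasons = []
        if not agent.security_reviewed:
            reasons.append("no security review")
        if agent.key_type == "personal":
            reasons.append("personal API key")
        # Customer data only escalates an agent that is already flagged.
        if reasons and agent.handles_customer_data:
            reasons.append("processes customer data")
        if reasons:
            findings[agent.name] = reasons
    return findings
```

The hard part in practice is populating the inventory, for example by scanning repositories, egress logs, and API key usage. Once the records exist, the triage logic can stay this simple.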
Speed Is a Governance Outcome, Not a Governance Casualty
Here’s the counterintuitive part: the organizations moving fastest with AI aren’t the ones with the fewest guardrails. They’re the ones whose developers and security teams trust the environment enough to move without fear.
When every AI action is observable and attributable, security teams don’t need to restrict access broadly — they can restrict precisely. When cost attribution is automatic, finance doesn’t need to intervene — teams self-govern because the signal is visible. When audit trails are immutable, compliance teams don’t need to slow down deployments for manual reviews — the evidence is already there.
Governance doesn’t create the trilemma. The absence of governance creates it.
The Question to Ask Your Team This Week
Not “how do we control AI costs” or “how do we secure our AI stack” or “how do we measure AI ROI” — those are the wrong entry points because they treat the symptoms independently.
The right question is: Can we see, attribute, and enforce policy across everything our AI is doing right now?
If the answer is no — or if the answer is “partially, through a custom-built internal tool we’re not sure scales” — you’re managing the trilemma one fire at a time.
The companies that will get the most value from AI over the next 24 months aren’t the ones that adopted the most tools. They’re the ones that built governance as infrastructure, not as an afterthought.
At Nirmata, we built AIControls to be exactly this enforcement layer — sitting in the path of AI agent activity to give security and engineering teams the cost attribution, policy enforcement, and audit trail they need to govern AI without slowing it down. If this framing resonates, we’d love to show you what it looks like in practice.

