Thesis Model & Context The AI-Ready Enterprise Stack

The AI-Ready Enterprise Stack

Most enterprise AI conversations start at the model layer. They should start several layers lower.

Mar 15, 2026 9 min

Core argument

The short version of the piece before you go deeper.

Most enterprise AI conversations start at the model layer. They should start several layers lower.

Models sit in the middle of the stack, not at the beginning of it.

Identity, governance, and integration are prerequisites to usable autonomy.

Agent commerce only works when the lower layers already hold.

Watch the video

Inside the AI-Ready Enterprise Stack

14 min

Enterprise AI is usually discussed as if the model were the product. In practice, the model is just one layer in a much larger operating stack.

I keep seeing teams rush to deploy GPT-4 or Claude without thinking through the infrastructure beneath it. They demo a chatbot that can answer questions brilliantly, then watch it fail spectacularly when it tries to update a CRM record or approve an expense report. The problem isn't the model. It's everything else.

Foundation before autonomy

Identity sits at the base because agents cannot act safely without a clear access model. Which role is the agent assuming? What systems can it touch? Which actions can it take on behalf of a user, a team, or a process?

I've watched three different Fortune 500 companies learn this lesson the hard way. One had their AI assistant accidentally expose salary data because they mapped it to a service account with admin privileges. Another had agents creating duplicate tickets in ServiceNow because they couldn't distinguish between user contexts. The third shut down their entire agent program after realizing their identity model couldn't handle delegation properly.

The right approach starts with extending your existing identity infrastructure. If you're using Okta or Azure AD, your agents need to exist within that same framework. Microsoft's Entra ID now supports workload identities specifically for this use case. AWS IAM has role assumption patterns that work well for agent delegation. The key insight: treat your AI agents like microservices, not like users.

Governance comes next. Policy, approval paths, audit trails, and usage controls are not cleanup work after the system ships. They are part of what makes the system deployable in the first place.

I think most teams underestimate how much governance infrastructure they need. You need policy engines that can evaluate agent actions in real time. Open Policy Agent (OPA) works well here, especially when paired with a proper observability stack. You need approval workflows that understand AI-specific risks. Temporal or Camunda can handle the orchestration, but you'll need custom logic for AI safety checks.

Integration is the bridge into reality. MCP servers, APIs, event streams, and application connectors determine whether the agent sees fresh context and can do useful work without becoming an uncontrolled operator.

Anthropic's Model Context Protocol (MCP) is interesting here because it standardizes how agents connect to external systems. But in practice, I'm seeing more teams build on established patterns like GraphQL federations or event-driven architectures. The winning pattern seems to be treating agents as specialized API consumers with strict schemas and rate limits.

The Identity Crisis Nobody Talks About

Here's what I think most architects miss: AI agents break traditional identity models in three fundamental ways.

First, agents need temporal identities. An agent working on a financial forecast might need elevated permissions for 30 minutes, then revert to read-only access. Traditional RBAC systems aren't built for this. You need something like CyberArk's Privileged Access Management or HashiCorp Boundary that can handle just-in-time access.

Second, agents need composite identities. When an agent acts on behalf of a user within a team context for a specific workflow, you're dealing with nested authorization scopes. I've seen teams try to flatten this into simple service accounts and it always fails. You need an identity model that can represent "Claude-as-Jay-in-Finance-for-Budgeting" as a distinct security principal.

Third, agents need auditable delegation. Every action an agent takes must trace back to a human decision to delegate that authority. This isn't just for compliance. It's for debugging when things go wrong. And things will go wrong.

Platform Layers That Actually Matter

Data platforms and AI platforms are often discussed together, but they solve different problems. The data layer provides context, memory, telemetry, and historical grounding. The AI platform provides model access, evals, prompt infrastructure, and guardrails.

Let me break down what I'm seeing in production:

Layer	Traditional Stack	AI-Ready Stack	Key Difference
Data Platform	Snowflake, Databricks	Snowflake + Cortex, Databricks + MLflow	Native vector search and ML operations
AI Platform	(Didn't exist)	Bedrock, Vertex AI, Azure OpenAI	Model routing, safety filters, usage tracking
Integration	REST APIs, ESB	GraphQL + MCP, Event streams	Semantic understanding of data contracts
Runtime	Kubernetes, Lambda	LangGraph, Temporal, Prefect	Stateful execution with memory
Observability	Datadog, Splunk	Langfuse, Phoenix, Honeycomb	Prompt/response tracking, semantic traces

The data platform evolution is particularly interesting. Snowflake Cortex and Databricks' acquisition of MosaicML show that traditional data platforms are racing to add AI-native capabilities. But I think the real innovation is happening in specialized vector databases like Pinecone and Weaviate. They understand that AI workloads need different indexing strategies than traditional analytics.

On the AI platform side, I'm bullish on AWS Bedrock's approach. They're not trying to be the best at any single model. They're trying to be the best at model operations: switching between Claude and Llama based on workload, implementing safety filters consistently, tracking token usage across teams. That's what enterprises actually need.

Runtime Complexity and State Management

Above that sits the runtime. This is where state, retries, branching, validation, and failure recovery live. Without a runtime, most agent systems are just prompt calls connected to hope.

I've evaluated most of the agent frameworks out there. LangGraph gets state management right with its checkpoint system. CrewAI makes multi-agent coordination accessible but struggles with complex error handling. AutoGen from Microsoft has impressive research credentials but feels academic in production settings.

The pattern I'm recommending to clients is to separate the agent logic from the execution runtime. Use LangGraph or similar for defining agent behaviors, but run them on battle-tested workflow engines like Temporal or Prefect. This gives you proper durability, retry logic, and observability without reinventing distributed systems.

Here's a concrete example. One of my clients built a procurement agent that needs to:

Parse incoming purchase requests
Check budget availability across three systems
Route for appropriate approvals
Create POs in SAP
Notify stakeholders

The LangGraph agent handles the intelligence: understanding requests, determining approval paths, formatting data for SAP. But Temporal handles the execution: retrying failed API calls, maintaining state across system boundaries, rolling back on failures. This separation of concerns is crucial for production reliability.

Agent Commerce and Transaction Authority

At the top is what I think of as agent commerce: the point where AI begins participating in workflows, decisions, approvals, and transactions. That layer only works when everything underneath it is already holding.

We're starting to see early examples of this. Klarna's AI assistant doesn't just answer questions; it processes refunds. Intercom's Fin doesn't just route tickets; it resolves them. But these are still narrow implementations with heavy guardrails.

The next wave will be agents with real transaction authority. I'm tracking several startups building "AI CFOs" that can approve expenses, manage vendor payments, and optimize cash flow. The technology is ready. The challenge is trust and liability.

Consider the legal implications. If an AI agent approves a $50,000 invoice that turns out to be fraudulent, who's liable? The software vendor? The company using it? The employee who delegated the authority? These questions don't have clear answers yet, which is why most enterprises limit agents to advisory roles.

Building Your Stack: A Practical Roadmap

The useful question for enterprise leaders is not whether they have an AI strategy. It is which layers of the stack are real, which are missing, and which are still being faked with manual intervention.

Based on what I'm seeing across dozens of enterprise AI initiatives, here's the build order that works:

Phase 1: Foundation (3-6 months)

Start with identity and governance. Extend your existing IAM to support AI workloads. Build approval workflows for agent actions. Set up comprehensive audit logging. This isn't exciting work, but it's what makes everything else possible.

Phase 2: Read-Only Integration (2-3 months)

Connect agents to your systems in read-only mode. Let them query databases, read from APIs, consume event streams. Build the semantic layer that helps agents understand your data. Tools like Cube.dev or dbt Semantic Layer work well here.

Phase 3: Controlled Writes (3-4 months)

Start with low-risk write operations. Let agents create draft emails, propose calendar invites, generate reports. Use human-in-the-loop patterns aggressively. Every write operation needs an approval flow initially.

Phase 4: Autonomous Operations (6+ months)

Gradually expand agent authority based on demonstrated reliability. Start with operations that have clear rollback paths. Monitor everything obsessively. Build kill switches that can disable agent actions instantly.

What This Means for Engineering Leaders

I think we're about 18 months away from AI agents being table stakes for enterprise software. Not having an agent strategy will be like not having a mobile strategy in 2012. The companies that build proper foundations now will have massive advantages.

Your immediate actions should be:

Audit your identity infrastructure. Can it handle non-human principals? Can it do temporal access control? If not, start planning upgrades.
Evaluate your observability stack. Traditional APM tools don't understand prompt/response patterns or semantic traces. You'll need specialized tools like Langfuse or Arize Phoenix.
Pick your AI platform bet. I'm seeing most enterprises standardize on either AWS Bedrock or Azure OpenAI Service. Google's Vertex AI is technically superior in some ways but lacks enterprise features. Avoid building your own model serving infrastructure unless you have Netflix-scale requirements.
Start small but think big. Your first agent should do something useful but not critical. But architect it as if you'll have hundreds of agents within two years. Because you will.

The companies that get this right will have AI agents handling 30-40% of routine operations by 2026. The companies that don't will be competing against organizations that move 10x faster. The stack I've outlined isn't just about deploying AI. It's about building the infrastructure for a fundamentally different way of working.

The model isn't the product. The stack is. Build accordingly.

The short version of the piece before you go deeper.

Foundation before autonomy

The Identity Crisis Nobody Talks About

Platform Layers That Actually Matter

Runtime Complexity and State Management

Agent Commerce and Transaction Authority

Building Your Stack: A Practical Roadmap

What This Means for Engineering Leaders

Related ideas from the same body of work.

What Every Agent Runtime Should Share

Why MCP Is a Trust-Boundary Problem

Decision Automation Needs Three Layers, Not One