Navigating the AI Infrastructure Landscape: Lessons from Kubernetes Creators

Earlier this week, Stacklok’s CEO, Craig McLuckie, and CTO, Joe Beda, joined the Founded and Funded podcast hosted by Tim Porter, Managing Director at Madrona Venture Group. You can watch the full interview here. The following is a summary of some of the key issues discussed during the session.

The enterprise AI landscape is evolving at breakneck speed. Companies are deploying agents at scale, extending use cases from developers to knowledge workers, and grappling with permissions, identity, and security challenges that took the cloud-native ecosystem years to solve. Meanwhile, vendors are pushing single-stack solutions while enterprises seek the interoperability they enjoyed with Kubernetes.

The Power of Ignoring Convention

“Don’t underestimate what can be done when the people doing it don’t know what they’re doing is impossible,” McLuckie observes. The spirit of a startup requires ignoring conventional wisdom to truly innovate. Yet certain lessons from the cloud-native era remain invaluable.

Open source continues to be an enormous enabler of communities. Things move both faster and slower than expected, especially in enterprise contexts. Beda points to SPIFFE, an open source identity project he started a decade ago, which is only now coming into its own in the AI era. The old adage holds true: it happens slow, then it happens fast.

Developers as the Canary in the Coal Mine

The future of knowledge work is becoming visible through the lens of developers. Modern development teams at companies like Stacklok are generating tremendous value through agents, but the workflow has fundamentally changed. Developers are no longer operating solo in an IDE. They’re taking responsibility for the work product of a broad cross-section of specialized agents, becoming the center of orchestration for these agentic systems.

They’re curating agents, understanding the work being produced, ensuring accountability for agent behavior, and directing productive engagement toward company goals. This pattern will naturally cascade into the knowledge worker space, but significant challenges remain.

Developers have flexibility in their thinking and a relatively high pain threshold for configuring their local environment. Knowledge workers cannot be expected to make those same investments. The starting point is enabling AI to be a natural part of knowledge workers’ work, providing access to necessary systems, and gradually increasing the richness of capabilities until agents become a natural part of the workforce.

MCP: The Docker Moment for AI

When McLuckie first saw Docker, he could see two things coexisting in the same space: a technology that unlocked containers for a new use case, enabling developer productivity and application portability, but also a glimpse of Kubernetes on the other side. He had a similar experience with the Model Context Protocol (MCP) specification.

MCP promises two things. It suggests what the future of AI-native applications might look like, with the LLM as the presentation layer and view model for modern apps. But it also represents an opportunity to control the aperture whereby agents access systems.

Having a formalized protocol creates a selectively permeable membrane around existing systems, ensuring value flows in both directions while preventing harm. This controlled access to resources is incredibly important for organizations.

Humans are poor at scrutinizing the work of agents. Asking developers to reason about every access request to a resource does not match how they actually work. The gateway to productivity is having the right guardrails in place so you can set a tool or agent to work knowing it can’t cause harm and can’t egress data inappropriately. By creating those guardrails, you actually enable velocity and productivity.

The Three Overlapping Concerns

Enterprises attempting to make AI work are discovering that simply “rubbing some AI on” isn’t sufficient. They need to connect to internal systems, which surfaces three overlapping concerns driving the ecosystem of tools:

Safety and governance: Ensuring bad things don’t happen when coupling stochastic systems with business-critical functions. AI can move faster than humans and access more information before you understand what’s happening, and it doesn’t always follow the handbook. It’s not afraid of going to jail.

Effectiveness: Making agents actually deliver on their promises.

Cost control: As deployment scales, managing expenses becomes critical.

MCP is one lens where you can exert control around safety and capability. The next piece is the LLM gateway, standing between workloads and upstream LLM providers. This becomes a strategic choke point for cost control, preventing sensitive data egress, and maintaining flexibility as new models emerge and self-hosting of open-weight models becomes more tenable.
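To make the data-egress part of that choke point concrete, here is a minimal sketch of a redaction step a gateway might run on a prompt before forwarding it upstream. It is not drawn from any Stacklok product. The pattern set and the `redact` function are hypothetical, and a real deployment would likely rely on its own, far more thorough detectors.

```python
import re

# Hypothetical detectors for illustration only; real gateways need broader coverage.
SENSITIVE_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(prompt: str) -> tuple[str, list[str]]:
    """Replace sensitive matches with placeholders and report which kinds were found."""
    found = []
    for name, pattern in SENSITIVE_PATTERNS.items():
        prompt, count = pattern.subn(f"[REDACTED_{name.upper()}]", prompt)
        if count:
            found.append(name)
    return prompt, found
```

Because every request passes through the same point, the list of detected kinds could also feed an audit log or a block-or-allow decision rather than silent redaction.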

Left of the LLM: Where Innovation Happens

When ChatGPT first appeared, focus centered on the LLM itself. But between a user’s intent and the LLM sits a bunch of stuff in the middle, and that’s where tremendous innovation is occurring.

With coding agents like Claude Code, the LLM is important, but what happens within the agent itself is equally significant. As knowledge workers use these systems more deeply, it won’t be a direct line to the LLM. Context management, memory management, document organization and sharing, skill building and peer-to-peer enablement, and agent definition all happen left of the LLM.

There’s enormous appetite to create something both open and aimed at enterprises to run behind their firewall, providing those powerful moments for the entire company.

The Platform Question: Mainframe or Kubernetes?

The big question facing the ecosystem is whether we’re moving back into a mainframe era of highly vertical integration, or whether a platform will emerge enabling organizations to create specialized value and maintain optionality.

The same question applies to AI infrastructure. Frontier models run at the absolute forefront, but open-weight models offer incredible price-performance advantages for some organizations. Holding option value on that is important, but it won’t work if too much value is coupled in what sits left of the LLM.

A platform architecture needs to emerge in open source. MCP is a seed crystal offering an opportunity to grow this ecosystem, but it’s insufficient alone. Kubernetes itself offers hints, not just in community building but literally as a principled control plane with nice properties around reconciliation-driven action. The community will naturally introduce first-class primitives that unlock this class of use case and value.
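For readers less familiar with the reconciliation-driven model mentioned here, the idea can be sketched in a few lines: declare a desired state, observe the actual state, and compute the actions that close the gap. The sketch below is a toy illustration, not Kubernetes code. The agent names and the `reconcile` function are hypothetical.

```python
def reconcile(desired: dict[str, int], observed: dict[str, int]) -> list[str]:
    """Return the actions needed to move the observed state toward the desired state."""
    actions = []
    for name in sorted(desired.keys() | observed.keys()):
        want = desired.get(name, 0)
        have = observed.get(name, 0)
        if want > have:
            actions.append(f"start {want - have} x {name}")
        elif want < have:
            actions.append(f"stop {have - want} x {name}")
    return actions


# Hypothetical example: declared agent replicas versus what is currently running.
desired = {"triage-agent": 2, "review-agent": 1}
observed = {"triage-agent": 1, "legacy-agent": 1}
print(reconcile(desired, observed))
```

The useful property is that the function can be run repeatedly: once the observed state matches the desired state, it returns no actions.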

The Lock-In Consideration

Developers are good at switching tools. They do it constantly. Knowledge workers, by contrast, won’t be as nimble with the tools delivered to them. Training an entire sales team on how to use Claude or OpenAI ties you not just to that LLM provider but to the entire experience delivered through those front-end tools.

This vertical integration leads to lock-in that doesn’t serve enterprises well. They want to own the destiny of how they train, inform, and build tools for knowledge workers beyond developers. They want that insulated from the upstream LLM provider.

Identity and Transaction Tokens: The Missing Layer

Traditional corporate controls rely on human-level constructs: company handbooks, legal consequences, the threat of arrest. These don’t apply to AI. An AI-empowered banker interacting with a customer needs proof that there’s a live customer on the other end when accessing sensitive data.

That transaction information must carry through the entire chain: from agent to interface to other agents to MCP server to database. At the database level, you may want to restrict results to only that single customer. This entire stack requires carrying information through in ways enterprises typically don’t do today.

A few places exist where enterprises can implement this type of transaction token, but it’s rare. Plumbing this through the entire stack is key to building trust that you can give knowledge workers access to sensitive information through agents in a way that prevents things from going wild and doing something unsafe.
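As a rough illustration of the idea, the sketch below issues a short-lived signed token naming one customer and has the data layer verify it before filtering rows to that customer. This is a toy under stated assumptions, not a standard token format or anything described in the interview. The shared `SECRET`, the `ACCOUNTS` rows, and all function names are hypothetical, and a production system would use a vetted token standard and proper key management.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"demo-only-secret"  # assumption: a key shared by the issuer and the data layer

# Hypothetical data standing in for a database table.
ACCOUNTS = [
    {"customer_id": "cust-42", "balance": 1200},
    {"customer_id": "cust-77", "balance": 88},
]


def issue_token(customer_id: str, ttl_seconds: int = 300, now: Optional[float] = None) -> str:
    """Sign a payload that binds the transaction to a single customer and an expiry."""
    now = time.time() if now is None else now
    payload = json.dumps({"customer_id": customer_id, "exp": now + ttl_seconds}).encode()
    signature = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + signature


def verify_token(token: str, now: Optional[float] = None) -> str:
    """Check the signature and expiry, then return the customer the token is scoped to."""
    encoded, _, signature = token.partition(".")
    payload = base64.urlsafe_b64decode(encoded.encode())
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        raise PermissionError("invalid signature")
    claims = json.loads(payload)
    if (time.time() if now is None else now) > claims["exp"]:
        raise PermissionError("token expired")
    return claims["customer_id"]


def fetch_accounts(token: str) -> list:
    """Data-layer check: only rows for the customer named in the token are returned."""
    customer_id = verify_token(token)
    return [row for row in ACCOUNTS if row["customer_id"] == customer_id]
```

The point of the design is that every hop forwards the same token unchanged, so the final hop can enforce the scope without trusting any agent in between.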

Where to Start: The Developer Path

The developer space is highly contested and notoriously hard to monetize, but it’s where organizations can understand and learn the full value of the system while creating leverage.

During the Kubernetes days, field engineers would partner with organizations to help them get over the learning curve. Now, forward-deployed engineers are outperforming expectations massively because they’re not just doing work, they’re bringing skills and educating teams on how to operate their own agents.

Getting the development posture right from the start is critical because it accelerates use of the technology itself, operationalization of platform pieces, and the organization’s path to value. For most organizations, it’s about getting the engineering flow right first, making the transition so engineers use agents to generate asynchronous value and feel like they’re flying.

Once there, you’ve established patterns you can extrapolate for knowledge workers. You can also identify what’s working for developers that won’t work for knowledge workers. Technologies like Claude do much of their work locally, and the developer’s desktop is the anchor point for integration, memory management, and session management. That probably won’t work for knowledge workers who need experiences on phones.

The Trust Optimization

When working with development teams, the key optimization is: how long can an agent do useful work before I have to intervene? The limiting factor is trust.

The first step is reaching the point where an agent can run for some time without constant watching and approval of every tool access. Start with coarse-grained controls: configure the agent to run in a relatively isolated environment, be judicious about tool access, ensure tools have read-only access, and limit the system’s ability to egress sensitive data.
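These controls can be pictured as a small policy check that runs before any tool call is executed. The sketch below is illustrative only and does not reflect the MCP specification or any particular gateway. `ToolPolicy`, `ToolCall`, `authorize`, and the tool and host names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToolPolicy:
    allowed_tools: frozenset            # tools the agent may call at all
    read_only: bool = True              # block any call that would write
    allowed_hosts: frozenset = frozenset()  # hosts the agent may send data to


@dataclass(frozen=True)
class ToolCall:
    tool: str
    writes: bool = False
    host: Optional[str] = None          # set when the call reaches an external host


def authorize(policy: ToolPolicy, call: ToolCall) -> tuple[bool, str]:
    """Apply the policy checks in order and return (allowed, reason)."""
    if call.tool not in policy.allowed_tools:
        return False, "tool not allowlisted"
    if policy.read_only and call.writes:
        return False, "write blocked by read-only policy"
    if call.host is not None and call.host not in policy.allowed_hosts:
        return False, "egress to host not permitted"
    return True, "ok"


# Hypothetical policy: two tools, read-only, one internal host.
policy = ToolPolicy(
    allowed_tools=frozenset({"search_docs", "read_ticket"}),
    allowed_hosts=frozenset({"tickets.internal.example"}),
)
print(authorize(policy, ToolCall("read_ticket", host="tickets.internal.example")))
print(authorize(policy, ToolCall("read_ticket", writes=True)))
```

Checks this coarse are what let an agent run unattended: the human approves the policy once instead of approving each call.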

Once at that point, you’re creating efficiencies. The gateway to that outcome is MCP as a natural starting point, providing guardrails that enable speed.

After that first transition, organizations often experience sticker shock with token consumption. An LLM gateway becomes the immediate next question, enabling intelligent routing so certain requests use cheaper models instead of the most expensive option. You build the more complete system over time.
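A simple way to picture that routing is a table of models with capability tiers and prices, where each request goes to the cheapest model that meets its needs. The sketch below assumes exactly that. The model names, prices, tiers, and task labels are invented for illustration and do not describe any real provider or gateway.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_1k_tokens: float  # illustrative prices, not real quotes
    tier: int                  # higher means more capable


# Hypothetical model catalog.
MODELS = [
    Model("small-open-weight", 0.0002, 1),
    Model("mid-tier", 0.003, 2),
    Model("frontier", 0.03, 3),
]

# Hypothetical mapping from task type to the minimum tier it needs.
TASK_TIERS = {"classify": 1, "summarize": 1, "draft": 2}


def route(task: str) -> Model:
    """Pick the cheapest model that meets the task's tier; unknown tasks get the top tier."""
    needed = TASK_TIERS.get(task, 3)
    eligible = [m for m in MODELS if m.tier >= needed]
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)
```

Defaulting unknown tasks to the most capable tier trades cost for safety. A gateway could equally default the other way and escalate only when a cheaper model fails.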

Diversity of Perspective

Every successful company brings together a diverse set of people and perspectives. Today’s engineering teams need multiple forms of diversity:

Depth and taste: People with sophistication to know what good looks like, to construct the ontology and views used to generate code. The code can always be regenerated from the ontology, but you need to know what good looks like.

Curiosity and generalist mindset: Willingness to be unfettered in thinking about value creation, to challenge the boundaries of roles and conventional wisdom, bringing innovative flair.

AI-native perspective: Some people were born in AI. It’s their first language. They bring different capabilities than those who learned to build software first and are now adapting.

This versatility isn’t constrained to any one function. What’s most surprising is how versatile people become when they have access to this class of tools. Recruiters are now effective in sales organizations because the work translates and they have access to new tools. They’re constantly building agents that create value in different domains, even self-identifying as developers despite not writing code.

The Path Forward

The challenge facing every organization is deploying AI agents to make work more efficient, productive, and enjoyable while doing it safely and securely.

The lessons from the cloud-native era provide valuable guideposts, but this moment is different. The pace is unprecedented. The technology is fundamentally different. The implications for how we work are more profound.

Success requires balancing the blank-sheet-of-paper thinking that enables solving problems the right way with first-principle historical perspective. It requires building the right guardrails not to slow things down but to enable velocity. It requires platforms that enable specialization and optionality rather than vertical integration and lock-in.

Most importantly, it requires closing out the arc with customers, not just delivering a platform and walking away, but getting development teams operating excellently, identifying initial use cases, and working them to completion before expanding the program. Patience in that process, combined with the right technical foundation, creates the conditions for enterprises to truly transform how they work.

The AI infrastructure landscape is chaotic, but for those who’ve navigated infrastructure transformations before, the patterns are familiar even as the technology is novel. The question isn’t whether this transformation will happen, but whether it will happen in a way that serves enterprises well or locks them into vertically integrated systems. The answer to that question will shape the next decade of enterprise technology.

June 13, 2026
