ToolHive grows up: What the CRD graduation to v1beta1 means for your cluster

Running MCP servers in Kubernetes used to mean accepting that the API you wrote manifests against could change under you at any time. As of ToolHive v0.23.0, that changes, and with it, seven months of capability improvements are ready for production.

Here’s what you’ll learn in this post:

  • What the graduation from v1alpha1 to v1beta1 actually means, and how to migrate with zero downtime
  • How ToolHive’s configuration model shifted from large inline blocks to reusable, referenced CRDs, and why that matters for operators at scale
  • What’s new across auth, discovery, observability, and scalability

The headline: v1beta1 is a stability commitment

Seven months ago, we wrote about wiring OpenTelemetry and Prometheus around a fleet of MCP servers running under the ToolHive operator. You could run an MCP server in a pod, give it an OIDC config, ship its traces somewhere useful, and that was the headline.

It worked. People used it. And then the questions started coming in.

“I have twenty MCP servers in this namespace and I’m copy-pasting the same telemetry block into every one of them. Can I just reference it?”

“Half of the servers I want my agents to talk to aren’t something I run — they’re hosted, behind someone else’s gateway. Do I really need a proxy pod for each one?”

“I want my LLM to see a curated set of tools across all my servers, not twenty separate connections. Is there a way to compose them?”

“Can I write authorization policies that read upstream Entra group claims?”

These weren’t bugs. They were the next layer of the product asking to be built. And every honest answer started with “yes, eventually, and the API shape will probably change while we figure it out.”

That’s exactly what v1alpha1 is for in Kubernetes — a contract that says “we reserve the right to break this.” We used it. A lot.

The story of the last seven months is the story of using up that runway, landing the features that those questions implied, and arriving at a CRD surface stable enough to put a stability stamp on.

Every ToolHive CRD now ships at toolhive.stacklok.dev/v1beta1.

This post is how we got there.

The first realization: configuration wants to be referenced

The first big shift was admitting that inline configuration didn’t scale.

The original MCPServer spec was a self-contained object. OIDC, telemetry, and tool filter configurations were all defined inline. For one or two servers, it was fine. For a real deployment with shared identity provider settings, a shared OTel collector, and a shared list of approved tools, it was a copy-paste exercise that turned every change into a twenty-file pull request.

So we broke the inline blocks out. MCPOIDCConfig landed as a standalone CRD that any MCPServer, MCPRemoteProxy, or VirtualMCPServer could point at via oidcConfigRef. MCPTelemetryConfig followed the same week, with proper CA-bundle support for OTLP-over-HTTPS collectors that the inline form had never really handled. MCPToolConfig and MCPExternalAuthConfig made tool filters and external auth strategies similarly reusable.
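
Here's a hedged sketch of the referenced shape. oidcConfigRef is the field named above, but the spec fields on both objects (issuer, audience, image) are illustrative assumptions, not the published schema:

apiVersion: toolhive.stacklok.dev/v1beta1
kind: MCPOIDCConfig
metadata:
  name: corp-idp
spec:
  # Illustrative fields: defined once, referenced by many servers
  issuer: https://login.example.com/realms/platform
  audience: mcp-servers
---
apiVersion: toolhive.stacklok.dev/v1beta1
kind: MCPServer
metadata:
  name: docs-search
spec:
  image: ghcr.io/example/docs-search-mcp:latest   # illustrative image
  transport: streamable-http
  oidcConfigRef:
    name: corp-idp

Changing the identity provider now means editing one MCPOIDCConfig instead of twenty server specs.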

The deprecation arc that followed was deliberate. Inline oidcConfig, telemetry, and tools blocks first started emitting warnings, then stopped being honored, then disappeared from the schema entirely. By the time we cut v1beta1, the only way to wire identity, observability, and tool gating was through a referenced CRD — which is exactly the shape we wanted to commit to.

The second realization: not every MCP server is something you run

The next pressure came from how people were actually using MCP. The operator was great if you wanted to run an MCP server inside your cluster in a hardened pod. It was awkward if the server you cared about was hosted elsewhere — your vendor’s gateway, a SaaS, an internal team’s already-fronted endpoint.

You could model that as an MCPServer with a remoteURL, but you ended up paying for a proxy pod you didn’t need, and the surface didn’t quite fit. The server wasn’t something the operator was managing. It was something the operator was cataloging.

So we added MCPServerEntry — a CRD explicitly for remote MCP servers with no infrastructure footprint.

apiVersion: toolhive.stacklok.dev/v1beta1
kind: MCPServerEntry
metadata:
  name: hosted-search
spec:
  remoteURL: https://search.example.com/mcp
  transport: streamable-http
  groupRef:
    name: default
  externalAuthConfigRef:
    name: hosted-search-auth
  caBundleRef:
    name: corp-ca

No pod. No proxy. Just a governable, discoverable record of a server that exists somewhere out there. Every remoteURL runs through SSRF validation that rejects loopback, RFC 1918, link-local, cloud metadata addresses, IPv6 ULA, and kubernetes.default.svc, so a malicious manifest can’t turn the operator into an internal-network probe.

MCPServerEntry mattered for its own sake, but it mattered more because it gave the next piece something to bind to.

The third realization: many servers want a front door

Once a cluster has fifteen MCP servers — some running locally, some hosted remotely, some shared across teams — the agent talking to them shouldn’t have to know about all fifteen. It should see one.

That’s what VirtualMCPServer (vMCP) had been gesturing at when it first shipped as a scaffolded primitive. Over the following months, it became the most substantial piece of new functionality in ToolHive.

vMCP works in two modes. Static mode lists explicit backends. Dynamic mode discovers them from the cluster via label selectors and the new BackendReconciler, which means you can add an MCPServer or MCPServerEntry with the right labels and have it picked up by the right vMCP without touching the vMCP manifest. The same backend reconciler made MCPServerEntry and VirtualMCPServer complementary by design — catalog a remote server, label it, and a vMCP picks it up.
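
A hedged sketch of the dynamic mode, assuming a label-selector block on the vMCP spec (the selector field names here are illustrative; only label-based discovery itself is described above):

apiVersion: toolhive.stacklok.dev/v1beta1
kind: VirtualMCPServer
metadata:
  name: platform-front-door
spec:
  # Illustrative discovery block: any MCPServer or MCPServerEntry
  # carrying this label is picked up by the BackendReconciler
  backendSelector:
    matchLabels:
      example.com/vmcp-group: platform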

But aggregation was just the start. The harder feature was composition. VirtualMCPCompositeToolDefinition describes a multi-step workflow that the client sees as a single tool. Inside that definition, forEach steps handle fan-out; prompts and resources relay between steps with their content arrays and content-level annotations preserved; and steps can be marked skippable with defaultResults so partial-success workflows have a sensible fallback. The composite tool isn’t a script the agent has to orchestrate — it’s a single tool call that the agent gets a clean answer from.
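
A hedged sketch of what such a definition might look like. forEach, skippable, and defaultResults are named above; the step layout and tool-addressing syntax are illustrative assumptions:

apiVersion: toolhive.stacklok.dev/v1beta1
kind: VirtualMCPCompositeToolDefinition
metadata:
  name: triage-incident
spec:
  description: Gather logs and ownership context in one tool call
  steps:
    - name: gather-logs
      tool: logs-backend/search_logs      # illustrative backend/tool addressing
      forEach: params.services            # fan-out over an input list
    - name: lookup-owner
      tool: cmdb-backend/owner_lookup
      skippable: true                     # partial success is acceptable here
      defaultResults:
        owner: unknown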

Then there’s the optimizer. With many backends and many tools, the LLM’s context window doesn’t want a dump of every tool from every server. Tier 1 shipped a lightweight matcher exposed as find_tool / call_tool meta-tools. Tier 2 is the production path: a vector-embedding plus hybrid-search optimizer backed by a HuggingFace TEI container, run as an opt-in EmbeddingServer workload that clusters only stand up when they want it. The optimizer is policy-enforceable through Cedar, which matters because the meta-tools become the surface that policy lives on.
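
A minimal sketch of opting in, assuming an EmbeddingServer spec that names the TEI image and model (both fields are illustrative; only the CRD kind and the TEI backing are from the text):

apiVersion: toolhive.stacklok.dev/v1beta1
kind: EmbeddingServer
metadata:
  name: tool-optimizer
spec:
  # Illustrative fields: a HuggingFace TEI container serving an embedding model
  image: ghcr.io/huggingface/text-embeddings-inference:latest
  model: BAAI/bge-small-en-v1.5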

Horizontal scaling came along the way. vMCP is now Redis-backed for cross-pod session restore, with session-aware backend routing and per-backend session-ID persistence.

The fourth realization: composition demands real authorization

A virtual MCP server that aggregates fifteen backends behind one endpoint is exactly the kind of object that needs serious authorization. The auth surface in the original blog post was a single OIDC config check on a single endpoint. By the time of graduation, it had roughly tripled.

The biggest piece is the embedded OAuth authorization server. ToolHive grew a first-class authorization-server implementation with Dynamic Client Registration, persisted client credentials, and well-known discovery — meaning an MCPServer, MCPRemoteProxy, or VirtualMCPServer can host its own AS. The new authServerRef cleanly separates “who issues tokens for this resource” from “what we do with upstream tokens we receive,” which used to be the same code path and was the source of most auth-config confusion.
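
A hedged sketch of that separation on a workload spec (authServerRef is the field named above; the referenced object names are illustrative):

apiVersion: toolhive.stacklok.dev/v1beta1
kind: VirtualMCPServer
metadata:
  name: platform-front-door
spec:
  # Who issues tokens for this resource: the embedded authorization server
  authServerRef:
    name: platform-as
  # What we do with upstream tokens we receive: validation against the IdP
  oidcConfigRef:
    name: corp-idp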

Cedar policy enforcement got significantly more powerful. The optimizer integrates with Cedar so that policy still applies end-to-end — call_tool invocations are authorized against the resolved backend tool, and find_tool results are filtered to only the tools the caller is permitted to see. Policies evaluate against upstream IdP token claims, with dual-claim and dot-notation group extraction that works across Entra, Keycloak, Auth0, and Okta without per-provider hacks. THVGroup parent entities let you write principal in THVGroup::"<group>"-style scoping for policies that span many MCP servers under a logical group.
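
As a hedged illustration, a Cedar policy using that scoping might read as follows. The THVGroup form is from the text; the action and entity names around it are illustrative:

// Members of the platform-engineers group may invoke tools
// on any MCP server under that logical group
permit(
  principal in THVGroup::"platform-engineers",
  action == Action::"call_tool",
  resource
);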

For cases where a backend MCP server needs to receive the caller’s identity in its native form, token exchange and upstream injection added reusable token refresh and an upstream_inject strategy that forwards upstream IdP tokens to backend servers — with the identity object enriched with upstream claims so downstream policy decisions have the full picture.
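
A hedged sketch of wiring that strategy up. Only the upstream_inject name is from the text; the spec layout is an assumption:

apiVersion: toolhive.stacklok.dev/v1beta1
kind: MCPExternalAuthConfig
metadata:
  name: backend-native-identity
spec:
  # Illustrative layout: forward the upstream IdP token to the backend
  strategy: upstream_inject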

For organizations that already run a policy engine of their own, webhook authorization middleware lets you delegate. A validating webhook lets you say yes/no. A mutating webhook with JSONPatch responses lets you say “yes, but rewrite the request this way.” Webhook URL, timeout, and failure policy are all configurable on the workload spec.
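
A hedged sketch of what that might look like on a workload spec. The post names the URL, timeout, and failure-policy knobs; the field names and nesting here are illustrative:

spec:
  # Illustrative block: delegate authorization to an external policy engine
  authzWebhook:
    url: https://policy.example.com/authorize
    timeout: 3s
    failurePolicy: deny      # what to do when the webhook is unreachable
    mode: mutating           # JSONPatch responses may rewrite the request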

A handful of OIDC and external-auth ergonomic improvements rounded it out: oidcConfigRef.resourceUrl for servers exposed through Ingress with a different external hostname, protectedResourceAllowPrivateIP and jwksAllowPrivateIP toggles on the OIDC config for clusters that need to reach a private IdP, and support for non-standard OAuth scope parameters (Slack’s user_scope, for example) configurable through MCPExternalAuthConfig.
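
Sketched together as two fragments, with the surrounding structure assumed and only the three field names taken from the text:

# On the workload, when the external hostname differs from the Service name
oidcConfigRef:
  name: corp-idp
  resourceUrl: https://tools.example.com/mcp

# On the MCPOIDCConfig, for identity providers on private addresses
spec:
  jwksAllowPrivateIP: true
  protectedResourceAllowPrivateIP: true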

The fifth realization: discovery needs a real registry

Once you have catalog records (MCPServerEntry), composition (VirtualMCPServer), and authorization (Cedar + embedded AS), the remaining question is where the catalog comes from in the first place.

Earlier versions of ToolHive shipped the Registry Server exclusively as a standalone Helm chart — a separate deployment you installed and configured independently of the operator. The MCPRegistry CRD adds a second option: declare a single MCPRegistry resource and let the operator handle the deployment, configuration, and upgrades alongside the rest of your ToolHive stack. The standalone Helm chart is still available if you prefer to keep the registry modular or manage it on a separate lifecycle.

The CRD spec is a deliberately thin wrapper: a raw configYAML passthrough plus volumes / volumeMounts / pgpassSecretRef for registries that need real credentials (Postgres-backed catalogs, mostly). The operator doesn’t parse or validate that config — the registry server owns its own schema. Legacy typed fields were retired along the way.
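
A hedged sketch of the shape, with the inner configYAML content invented for illustration (the operator passes it through untouched):

apiVersion: toolhive.stacklok.dev/v1beta1
kind: MCPRegistry
metadata:
  name: main
spec:
  pgpassSecretRef:
    name: registry-pgpass
  configYAML: |
    # Owned and validated by the registry server, not the operator
    database:
      host: postgres.registry.svc
      name: registry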

The registry also grew spec-aligned browse endpoints matching the upstream MCP registry specification — /registry/{name}/v0.1/servers and /versions/latest. That’s the moment ToolHive’s registry stopped being a ToolHive concept and started being an interoperable one.
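
Any spec-aware client can browse it the same way it would browse the upstream registry. For instance, with an assumed hostname:

curl https://registry.example.com/registry/main/v0.1/servers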

v1beta1: the graduation

Twelve CRDs moved together to v1beta1:

  • MCPServer
  • MCPRemoteProxy
  • MCPGroup
  • MCPRegistry
  • MCPServerEntry
  • MCPToolConfig
  • MCPExternalAuthConfig
  • MCPOIDCConfig
  • MCPTelemetryConfig
  • VirtualMCPServer
  • VirtualMCPCompositeToolDefinition
  • EmbeddingServer

The graduation changes at 0.23.0 were a deliberately boring cut. The schemas didn’t change. The hard work — removing the inline oidcConfig and telemetry blocks, retiring legacy typed registry fields, dropping enforceServers from MCPGroup — all happened in the months before graduation. By the time we cut v1beta1, the v1alpha1 and v1beta1 schemas were structurally identical.

That bought us two properties we actually wanted:

  • No conversion webhook. Because the v1alpha1 and v1beta1 schemas are byte-compatible, conversion: None is sufficient — the apiserver returns stored objects under whichever apiVersion the client requests without translating fields (see the CRD stanza after this list). No webhook pod, no extra TLS, no failure domain to operate.
  • Zero-downtime migration. Existing v1alpha1 objects keep being reconciled. Nothing breaks the moment the chart upgrades.
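
For the curious, this is what the versions stanza looks like on a CRD with no conversion webhook. This is standard apiextensions machinery, abridged to the relevant fields:

apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: mcpservers.toolhive.stacklok.dev
spec:
  conversion:
    strategy: None          # serve stored objects as-is, no webhook
  versions:
    - name: v1beta1
      served: true
      storage: true         # new and rewritten objects persist here
    - name: v1alpha1
      served: true
      storage: false
      deprecated: true      # reads still work, with a client warning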

Migrating in practice is three steps:

  1. Upgrade the toolhive-operator-crds chart. The new release publishes both versions, with v1beta1 as storage and v1alpha1 deprecated.
  2. Re-apply your manifests with apiVersion: toolhive.stacklok.dev/v1beta1. This rewrites the stored representation.
  3. Watch kubectl get crd <name> -o jsonpath='{.status.storedVersions}'. Once v1alpha1 falls off that list across your fleet, a future release can drop the old version entirely.
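
To run that check across every ToolHive CRD in one pass, a small loop is enough (plain kubectl, nothing ToolHive-specific):

for crd in $(kubectl get crd -o name | grep toolhive.stacklok.dev); do
  printf '%s: ' "$crd"
  kubectl get "$crd" -o jsonpath='{.status.storedVersions}{"\n"}'
done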

You’ll see deprecation warnings on every kubectl get against v1alpha1. That’s intentional. They’re hints, not errors.

What happens to v1alpha1?

v1alpha1 will be removed entirely in a future release. Today, it’s served-and-deprecated to give clusters time to migrate.

The wrinkle is etcd. Even after every consumer has stopped writing v1alpha1, objects that were originally stored under it remain in etcd at that version until something rewrites them, and a CRD can’t drop a version that still appears in status.storedVersions. The standard fix is to script a no-op kubectl get … -o yaml | kubectl apply -f - migration across every cluster, which is the kind of operational chore nobody enjoys.
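
Scripted against one resource type, the no-op rewrite from the text looks like this (repeat per CRD, per cluster):

# Reading returns objects at the storage version; applying re-persists them
kubectl get mcpservers --all-namespaces -o yaml | kubectl apply -f -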

Stacklok is tracking a StorageVersion controller (#4969) that does the migration in-cluster for you. It walks each CRD, rewrites objects at the current storage version, and prunes deprecated entries from status.storedVersions so the old API version can be safely removed in the next chart upgrade, meaning the eventual v1alpha1 removal will just be a simple chart bump rather than a runbook.

What’s next

v1beta1 is a stability milestone, not a freeze. The work continues. We’d expect a v1 graduation only once we have meaningful production deployment data telling us the v1beta1 (and possibly v1beta2) shape held up.

If you want to follow along, the stacklok/toolhive releases page is the source of truth, and per-release changelogs are detailed enough to drive your own upgrade plan. We also have a version-by-version migration guide.

Thanks for reading, and thanks for using ToolHive.


Want to see what Stacklok can do for your organization? Book a demo or get started right away with ToolHive, our open source project. Join the conversation and engage directly with our team on Discord.
