Tool annotations are becoming the risk vocabulary for agentic systems. That matters more than it might seem.

The MCP community dropped an update on the state of tool annotations a couple of weeks ago, and I have been thinking about it quite a bit: https://blog.modelcontextprotocol.io/posts/2026-03-16-tool-annotations/.

The MCP specification defines a formal set of tool annotations: readOnlyHint, destructiveHint, idempotentHint, and openWorldHint. These are metadata hints that let servers describe the behavioral properties of their tools: whether a tool modifies its environment, whether that modification is destructive, whether repeating a call with the same arguments has any additional effect, and whether the tool reaches into an open world of external entities.
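To make that concrete, here is a rough sketch of what an annotated tool definition can look like, written as a Python dictionary. The four hint names come from the spec; the delete_file tool, its description, and its input schema are invented for illustration.

```python
import json

# Hypothetical tool definition. Only the four hint names are taken from the
# MCP spec; the tool itself is made up for this example.
delete_file_tool = {
    "name": "delete_file",
    "description": "Delete a file from the project workspace.",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
    "annotations": {
        "readOnlyHint": False,    # the tool modifies its environment
        "destructiveHint": True,  # the modification is destructive
        "idempotentHint": True,   # repeating the same call changes nothing further
        "openWorldHint": False,   # it only touches the local workspace
    },
}

print(json.dumps(delete_file_tool["annotations"], indent=2))
```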

The defaults are deliberately conservative: a tool with no annotations is assumed to be potentially destructive, non-idempotent, and open-world. The spec assumes the worst until the server says otherwise.
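A client or gateway could apply those defaults with something like the sketch below. The default values mirror the ones described above; the function name resolve_annotations is my own, not part of any MCP SDK.

```python
# Conservative defaults for any hint the server leaves out.
SPEC_DEFAULTS = {
    "readOnlyHint": False,    # assume the tool may modify state
    "destructiveHint": True,  # assume the modification may be destructive
    "idempotentHint": False,  # assume repeated calls may have further effects
    "openWorldHint": True,    # assume the tool may reach external entities
}


def resolve_annotations(declared=None):
    """Return all four hints, filling undeclared ones with the defaults.

    resolve_annotations is a hypothetical helper name used for this sketch.
    """
    resolved = dict(SPEC_DEFAULTS)
    for key, value in (declared or {}).items():
        # Ignore unknown keys and non-boolean values instead of trusting them.
        if key in SPEC_DEFAULTS and isinstance(value, bool):
            resolved[key] = value
    return resolved


unannotated = resolve_annotations()
partially_annotated = resolve_annotations({"readOnlyHint": True})
```

An unannotated tool comes back with the full worst-case profile, while a tool that only declares readOnlyHint keeps the pessimistic values for everything it did not state.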

The trust problem with hints

The MCP blog post is refreshingly honest about the core tension. Annotations are hints, not contracts. An untrusted server can claim readOnlyHint: true and delete your files anyway. The spec explicitly says clients must treat annotations from untrusted servers as untrusted. That’s the right default, but it raises an obvious question: how do you get to a place where annotations are actually trustworthy enough to drive policy?

This is a supply chain problem. It is a class of problem that we’ve been working on at Stacklok since long before MCP existed.

The registry as trust broker

Stacklok already provides a curated registry of MCP servers with provenance verification built in. When you deploy a server from the registry, Stacklok examines the container image, extracts its cryptographic fingerprint, searches for signatures and attestations, verifies them against trusted certificate authorities, and validates build provenance to ensure it matches the expected source code. That gives you a verified chain from source to running artifact.

The natural next step is to extend that vetting process to include annotation accuracy. When we evaluate an MCP server for inclusion in the registry, we can scrutinize the source code to confirm that the declared annotations match the actual behavior of the tools. A tool that claims readOnlyHint: true should be verified not to write to any external state. A tool that claims openWorldHint: false should be verified not to make network calls outside its domain.

This turns the registry into a trust broker for annotations. Organizations that consume servers from a vetted registry can configure Stacklok to treat those annotations as policy inputs with confidence, because the provider has verified the claims against the code. Servers from unvetted sources still get the pessimistic defaults. The policy engine does not need to solve the trust problem on its own because the registry has already resolved it upstream.
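In code, the idea reduces to a small gate in front of the policy engine. The following is a minimal sketch under stated assumptions, not how Stacklok implements it: the Server class and its registry_verified flag are hypothetical stand-ins for whatever signal the registry actually exposes.

```python
from dataclasses import dataclass

# Worst-case profile used whenever a server's claims cannot be trusted.
PESSIMISTIC = {
    "readOnlyHint": False,
    "destructiveHint": True,
    "idempotentHint": False,
    "openWorldHint": True,
}


@dataclass(frozen=True)
class Server:
    name: str
    registry_verified: bool  # hypothetical flag set by the registry's vetting


def effective_annotations(server, declared):
    """Return the hints the policy engine should act on for one tool."""
    if not server.registry_verified:
        # Unvetted source: ignore every claim and assume the worst.
        return dict(PESSIMISTIC)
    # Vetted source: accept declared hints, defaulting any that are missing.
    known = {k: v for k, v in declared.items() if k in PESSIMISTIC}
    return {**PESSIMISTIC, **known}


claims = {"readOnlyHint": True, "openWorldHint": False}
vetted = effective_annotations(Server("docs-search", True), claims)
unvetted = effective_annotations(Server("random-download", False), claims)
```

The same claims produce different effective annotations depending on where the server came from, which is the whole point: the trust decision is made once, upstream, and the policy engine only sees the result.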

From hints to enforceable policy

Once you have trustworthy annotations flowing through a governance layer, the policy surface becomes significantly richer. “No destructive tools without human approval.” “Open-world tools are blocked in sessions that have accessed private data.” “Read-only tools from registry-verified servers can be auto-approved.” These are policies that platform teams need to express, and they require two things: structured metadata on the tools, and a reason to believe that metadata is accurate.
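The three example policies above can be sketched as a single decision function. Everything here is illustrative: the Tool class, the decision strings, and the choice to fall back to human approval when no rule matches are assumptions of mine, and the fields are assumed to hold annotation values after trust has already been resolved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    """Hypothetical view of a tool after its annotations have been resolved."""
    name: str
    read_only: bool
    destructive: bool
    open_world: bool
    registry_verified: bool


def decide(tool, session_accessed_private_data):
    """Return 'block', 'require_approval', or 'auto_approve' for one tool call."""
    # Open-world tools are blocked in sessions that have accessed private data.
    if tool.open_world and session_accessed_private_data:
        return "block"
    # No destructive tools without human approval.
    if not tool.read_only and tool.destructive:
        return "require_approval"
    # Read-only tools from registry-verified servers can be auto-approved.
    if tool.read_only and tool.registry_verified:
        return "auto_approve"
    # Assumed fallback: anything else goes to a human.
    return "require_approval"


list_files = Tool("list_files", True, False, False, True)
delete_file = Tool("delete_file", False, True, False, True)
web_fetch = Tool("web_fetch", True, False, True, True)
```

Note that the read-only rule only fires for registry-verified servers: a read-only claim from an unvetted source falls through to human approval.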

The more interesting design point from the original blog post is about combinations. A tool’s risk depends on what else is in the session, not just its own properties. A tool like search_emails is not dangerous on its own. Pair it with an open-world communication tool and you have the conditions for data exfiltration. Trustworthy annotations make combinatorial policy practical, because the governance layer can reason about the full set of tools in a session without having to assume the worst about every one of them.
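A session-level check for that pairing might look like the sketch below. One important assumption: the four MCP hints do not say whether a tool reads private data, so the reads_private_data flag here is a hypothetical label an organization would have to assign itself. The sketch also treats any open-world tool as a possible outbound channel, which is a deliberately cautious simplification.

```python
def exfiltration_pairs(tools):
    """Return (reader, sender) name pairs that together could leak data."""
    # reads_private_data is an organization-assigned label, not an MCP hint.
    readers = [t["name"] for t in tools if t.get("reads_private_data", False)]
    # A missing openWorldHint is treated as open-world, matching the defaults.
    senders = [
        t["name"]
        for t in tools
        if t.get("annotations", {}).get("openWorldHint", True)
    ]
    return [(r, s) for r in readers for s in senders if r != s]


# Hypothetical session; send_message and list_files are invented tool names.
session_tools = [
    {"name": "search_emails", "reads_private_data": True,
     "annotations": {"readOnlyHint": True, "openWorldHint": False}},
    {"name": "send_message", "reads_private_data": False,
     "annotations": {"readOnlyHint": False, "openWorldHint": True}},
    {"name": "list_files", "reads_private_data": False,
     "annotations": {"readOnlyHint": True, "openWorldHint": False}},
]

risky = exfiltration_pairs(session_tools)
```

A governance layer could run a check like this whenever a tool is added to a session, and require approval or block the addition when the list is non-empty.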

The adoption gap is the opportunity

Annotation coverage today is uneven. Many servers ship without them, and clients vary in how strictly they honor the pessimistic defaults. That unevenness is the gap that platform teams should be paying attention to. As the annotation vocabulary expands and server authors adopt it, the teams that already have a governance layer and a trust model in place will be positioned to consume that metadata immediately. Teams that don’t will be retrofitting.

If you are building or operating MCP servers, start annotating your tools now. If you are a platform team thinking about agentic governance, take a look at how Stacklok lets you build policy around structured, verified tool metadata. You can get started right now with our open source project, ToolHive: github.com/stacklok/toolhive.

Want to better understand tool annotations? Here are some additional questions to explore:

MCP tool annotations are metadata hints attached to individual tools in an MCP server. The MCP specification defines four: readOnlyHint, destructiveHint, idempotentHint, and openWorldHint. These describe behavioral properties: whether a tool modifies state, whether that modification is destructive, whether repeating a call has any additional effect, and whether the tool reaches external systems. Clients and governance layers use annotations to make risk-based decisions about tool approval and access policy.

The MCP spec defaults to the most pessimistic posture for unannotated tools. A tool with no annotations is assumed to be potentially destructive, non-idempotent, and open-world. This means clients that honor the spec will treat unannotated tools with maximum caution. Server authors who want their tools to receive lighter-touch handling must explicitly annotate them since the spec does not assume safety.

A curated registry that performs provenance verification can extend its vetting process to include annotation accuracy, inspecting server source code to confirm that declared annotations match actual behavior. A tool claiming readOnlyHint: true can be verified to write no external state. A tool claiming openWorldHint: false can be verified to make no out-of-domain network calls. This turns the registry into a trust broker: governance layers consuming servers from a vetted registry can treat those annotations as verified policy inputs rather than unverified claims.

Annotations are hints, not contracts. An untrusted server can declare readOnlyHint: true and still modify or delete data. The MCP specification explicitly requires that clients treat annotations from untrusted servers as untrusted. Annotations only become reliable policy inputs when the server that declares them has been independently vetted, either through a curated registry that verifies annotation accuracy against source code, or through organization-internal review.

A single tool’s risk profile changes depending on what other tools are active in the same session. A tool that reads emails is low-risk in isolation. Paired with an open-world communication tool (one that can send data to external endpoints) it creates the conditions for data exfiltration. Trustworthy annotations make combinatorial session-level policy practical: a governance layer can reason across the full tool set in a session rather than evaluating each tool independently.

Stacklok’s vMCP gateway can apply access policies using annotation metadata as inputs. For example, it can auto-approve read-only tools from registry-verified servers while requiring human approval for destructive or open-world tools. When servers are deployed from Stacklok’s curated registry, annotation accuracy has been verified against source code, making those annotations reliable policy inputs. You can start with ToolHive, an open source project licensed under Apache 2.0 and available at github.com/stacklok/toolhive.

April 02, 2026

Craig McLuckie

CEO