chroma-mcp

Official
Local
Signed

Overview

The chroma-mcp server is a Model Context Protocol (MCP) server that lets AI assistants and agents interact directly with Chroma, an open-source vector database designed for embeddings and similarity search. Through it, AI workflows can store, retrieve, and query vector embeddings alongside their metadata, so retrieval-augmented generation (RAG), semantic search, and memory-driven applications can be built without custom database integration code.

This server is especially useful for AI systems that need long-term memory, document retrieval, or semantic similarity capabilities as part of their reasoning loop.

Transport

stdio
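
With the stdio transport, the MCP client launches the server as a subprocess and exchanges JSON-RPC 2.0 messages over its standard input and output, one message per line. The following is a minimal sketch of that framing, not code from chroma-mcp; the helper names `encode_message` and `decode_message` are hypothetical.

```python
import json


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_message(line: bytes) -> dict:
    """Parse one line read from the peer back into a JSON-RPC message."""
    return json.loads(line.decode("utf-8"))


# A request asking the server which tools it exposes.
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
wire = encode_message(request)
```

In practice an MCP client library handles this framing; the sketch only shows what travels over the pipe.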

Tools

  • chroma_list_collections
  • chroma_create_collection
  • chroma_peek_collection
  • chroma_get_collection_info
  • chroma_get_collection_count
  • chroma_modify_collection
  • chroma_delete_collection
  • chroma_add_documents
  • chroma_query_documents
  • chroma_get_documents
  • chroma_update_documents
  • chroma_delete_documents
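
An MCP client invokes any of these tools with a JSON-RPC `tools/call` request that carries the tool name and an arguments object. The sketch below builds such a request for `chroma_query_documents`. The helper `build_tool_call` is hypothetical, and the argument names (`collection_name`, `query_texts`, `n_results`) are assumptions; the authoritative names come from the input schema the server reports via `tools/list`.

```python
import json

# Tool names exactly as listed above.
CHROMA_TOOLS = {
    "chroma_list_collections", "chroma_create_collection",
    "chroma_peek_collection", "chroma_get_collection_info",
    "chroma_get_collection_count", "chroma_modify_collection",
    "chroma_delete_collection", "chroma_add_documents",
    "chroma_query_documents", "chroma_get_documents",
    "chroma_update_documents", "chroma_delete_documents",
}


def build_tool_call(request_id: int, tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for one of the listed tools."""
    if tool not in CHROMA_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Argument names below are assumed for illustration.
call = build_tool_call(
    2,
    "chroma_query_documents",
    {"collection_name": "notes", "query_texts": ["deployment schedule"], "n_results": 3},
)
print(json.dumps(call, indent=2))
```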

Key Capabilities

  • Vector embedding storage — Persist embeddings and associated metadata for long-term use.
  • Semantic similarity search — Query collections using embeddings or text to find the most relevant results.
  • Retrieval-augmented generation (RAG) — Power RAG workflows by supplying relevant context to AI models at runtime.
  • Metadata-aware filtering — Combine vector similarity with structured metadata filters.
  • Local or embedded operation — Run Chroma locally or as part of an embedded application for fast iteration.
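
To make the combination of similarity search and metadata filtering concrete, here is a small pure-Python sketch. It is not how Chroma is implemented: it scans an in-memory list and ranks by cosine similarity, whereas Chroma uses an index and supports several distance metrics. The record layout and the `query` helper are hypothetical.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def query(records, query_embedding, where=None, n_results=2):
    """Return ids of the records most similar to the query, after filtering."""
    where = where or {}
    # Step 1: keep only records whose metadata matches every filter key.
    candidates = [
        r for r in records
        if all(r["metadata"].get(k) == v for k, v in where.items())
    ]
    # Step 2: rank the survivors by similarity, most similar first.
    ranked = sorted(
        candidates,
        key=lambda r: cosine_similarity(r["embedding"], query_embedding),
        reverse=True,
    )
    return [r["id"] for r in ranked[:n_results]]


records = [
    {"id": "a", "embedding": [1.0, 0.0], "metadata": {"source": "wiki"}},
    {"id": "b", "embedding": [0.9, 0.1], "metadata": {"source": "notes"}},
    {"id": "c", "embedding": [0.0, 1.0], "metadata": {"source": "notes"}},
]

unfiltered = query(records, [1.0, 0.0])
filtered = query(records, [1.0, 0.0], where={"source": "notes"})
```

Without a filter the closest records are `a` and `b`; restricting to `source == "notes"` excludes `a` even though it is the nearest vector.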

How It Works

The chroma-mcp server runs as a local or containerized MCP service and connects to a Chroma database instance (typically running locally or embedded in the same environment). Once configured, it exposes Chroma’s collection and query operations as MCP tools that AI clients can invoke.

By abstracting away Chroma’s client libraries and persistence details, the server lets AI assistants treat vector memory as a native capability. This enables workflows such as “retrieve relevant notes from prior conversations,” “search a document corpus semantically,” or “maintain long-term contextual memory” without bespoke integration code.
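
A workflow such as "retrieve relevant notes from prior conversations" usually ends with the retrieved text being placed into the model's prompt. The sketch below shows that last step under two assumptions: that the query result follows the shape returned by Chroma's Python client (one inner list of documents per query text), and that a single query was issued. The `build_prompt` helper and the sample data are hypothetical.

```python
def build_prompt(question: str, query_result: dict, max_docs: int = 3) -> str:
    """Assemble a prompt that places retrieved documents ahead of the question."""
    # Assumed shape: {"documents": [[doc, doc, ...]]}, one inner list per query.
    documents = query_result["documents"][0][:max_docs]
    context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    return (
        "Use the context below to answer.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical result for a single query against a notes collection.
sample_result = {
    "ids": [["n1", "n2"]],
    "documents": [["The deploy runs nightly.", "Rollbacks need approval."]],
}

prompt = build_prompt("When does the deploy run?", sample_result)
print(prompt)
```

Numbering the context entries lets the model cite which retrieved note supports its answer.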