Introduction to vMCP
Overview
Virtual MCP Server (vMCP) is a feature of the ToolHive Kubernetes Operator that acts as an aggregation proxy, consolidating multiple backend MCP servers into a single unified interface. Instead of configuring clients to connect to each MCP server individually, you connect once to vMCP and access all backend tools through a single endpoint.
vMCP supports two types of backends:
- MCPServer: Container-based MCP servers running in your cluster
- MCPRemoteProxy: Proxies to external remote MCP servers (such as Notion, analytics platforms, or other SaaS MCP endpoints)
vMCP can discover MCPRemoteProxy backends in a group, but authentication between vMCP and MCPRemoteProxy is not yet fully implemented. MCPServer backends work fully with vMCP. See Proxy remote MCP servers for details on the current limitations.
Core capabilities
- Multi-server aggregation: Connect to one endpoint instead of many, including both local container-based servers and remote MCP proxies
- Tool conflict resolution: Automatic namespacing when backend MCP servers have overlapping tool names
- Centralized authentication: Single sign-on with per-backend token exchange
- Composite workflows: Multi-step operations across backend MCP servers with parallel execution, approval gates, and error handling
- Tool optimization: Replace all individual tool definitions with two lightweight primitives (find_tool and call_tool) to reduce token usage and improve tool selection. See Optimize tool discovery and the underlying concepts.
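To make tool conflict resolution concrete, here is a minimal Python sketch of prefix-based namespacing. It is an illustration only, not vMCP's implementation: the `namespace_tools` helper, the `_` separator, and the choice to rename only colliding names are all assumptions made for this example.

```python
from collections import Counter


def namespace_tools(backends: dict[str, list[str]], sep: str = "_") -> dict[str, tuple[str, str]]:
    """Map each exposed tool name to its (backend, original tool) pair.

    Hypothetical policy: a tool name offered by more than one backend is
    prefixed with the backend name; unique names pass through unchanged.
    """
    # Count how many times each tool name appears across all backends.
    counts = Counter(tool for tools in backends.values() for tool in tools)
    exposed: dict[str, tuple[str, str]] = {}
    for backend, tools in backends.items():
        for tool in tools:
            name = f"{backend}{sep}{tool}" if counts[tool] > 1 else tool
            exposed[name] = (backend, tool)
    return exposed


# Two hypothetical backends that both expose a tool called "search".
backends = {
    "github": ["search", "create_issue"],
    "jira": ["search", "create_ticket"],
}
exposed = namespace_tools(backends)
print(sorted(exposed))
```

In this sketch the two `search` tools surface as `github_search` and `jira_search`, while the mapping back to the original backend and tool name is kept so calls can be routed.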
When to use vMCP
Good fit
- You manage 5+ MCP servers (local or remote)
- You need workflows that coordinate operations across multiple systems
- You have centralized authentication and authorization requirements
- You need reusable workflow definitions
- You want to aggregate external SaaS MCP servers with internal tools
- You want to reduce token usage and improve tool selection accuracy across many backends with the optimizer
Not needed
- You use a single MCP server
- You have simple, one-step operations
- You have no orchestration requirements
Architecture overview
How it works
- You define an MCPGroup (a resource that organizes related MCP servers)
- Backend resources reference the group using groupRef:
  - MCPServer resources for container-based MCP servers
  - MCPRemoteProxy resources for external remote MCP servers
- You create a VirtualMCPServer that references the group
- The operator discovers all MCPServer and MCPRemoteProxy backends in the group and aggregates their capabilities
- Clients connect to the VirtualMCPServer endpoint and see a unified view of all tools from both local and remote backends
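The flow above can be modeled in a few lines of Python. This is a conceptual sketch, not the operator's code or resource schema: the `Backend` class, the `discover` and `aggregate` helpers, and every backend and tool name are hypothetical. Only the resource kinds and the idea of a group reference come from the steps above.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """Simplified stand-in for an MCPServer or MCPRemoteProxy resource."""
    name: str
    kind: str        # "MCPServer" or "MCPRemoteProxy"
    group_ref: str   # name of the group this backend belongs to
    tools: list[str] = field(default_factory=list)


def discover(backends: list[Backend], group: str) -> list[Backend]:
    """Return every backend whose group reference matches the given group."""
    return [b for b in backends if b.group_ref == group]


def aggregate(members: list[Backend]) -> list[tuple[str, str]]:
    """Flatten the members' tools into one list of (backend, tool) pairs."""
    return [(b.name, tool) for b in members for tool in b.tools]


# Hypothetical cluster contents: two backends in "engineering", one elsewhere.
cluster = [
    Backend("fetch", "MCPServer", "engineering", ["fetch_url"]),
    Backend("notion", "MCPRemoteProxy", "engineering", ["search_pages"]),
    Backend("billing", "MCPServer", "finance", ["list_invoices"]),
]

members = discover(cluster, "engineering")
unified = aggregate(members)
print(unified)
```

The point of the model is that group membership, not the backend kind, decides what a client sees: the local server and the remote proxy both appear in the unified view, and the backend in the other group does not.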
Optimize tool discovery
As the number of aggregated backends grows, clients receive a large number of
tool definitions that consume tokens and can degrade tool selection accuracy.
The vMCP optimizer addresses this by replacing all individual tool definitions
with two lightweight primitives (find_tool and call_tool) and using hybrid
semantic and keyword search to surface only the most relevant tools per request.
To enable the optimizer, add an embeddingServerRef to your VirtualMCPServer
resource. See Optimize tool discovery for the full setup
guide.
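
As a rough illustration of the two-primitive pattern, the sketch below exposes only `find_tool` and `call_tool` over a small in-memory registry. It is not the vMCP optimizer: it uses plain keyword overlap in place of hybrid semantic and keyword search, needs no embedding server, and the `ToolIndex` class, its method signatures, and the registered tools are all hypothetical.

```python
from typing import Callable


class ToolIndex:
    """Toy registry that exposes two primitives instead of every tool."""

    def __init__(self) -> None:
        self._tools: dict[str, tuple[str, Callable[..., object]]] = {}

    def register(self, name: str, description: str, handler: Callable[..., object]) -> None:
        self._tools[name] = (description, handler)

    def find_tool(self, query: str, limit: int = 3) -> list[str]:
        """Rank tools by how many query words appear in their name or description."""
        terms = set(query.lower().split())
        scored = []
        for name, (description, _) in self._tools.items():
            words = set(f"{name} {description}".lower().replace("_", " ").split())
            score = len(terms & words)
            if score:
                scored.append((score, name))
        # Highest score first; ties broken alphabetically for stable output.
        scored.sort(key=lambda item: (-item[0], item[1]))
        return [name for _, name in scored[:limit]]

    def call_tool(self, name: str, **arguments: object) -> object:
        """Invoke a tool by name with keyword arguments."""
        _, handler = self._tools[name]
        return handler(**arguments)


index = ToolIndex()
index.register("create_issue", "Create a new issue in a repository",
               lambda title: f"issue: {title}")
index.register("search_pages", "Search pages in a workspace",
               lambda query: [f"page about {query}"])
index.register("list_invoices", "List invoices for a customer",
               lambda customer: [])

matches = index.find_tool("create an issue")
result = index.call_tool(matches[0], title="Fix login")
print(matches, result)
```

A client working this way never receives the full tool list. It searches for what it needs, then calls the match by name, which is where the token savings described above come from.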