Why mcpzip?
Every MCP server you add to Claude dumps all its tool schemas into the context window. That sounds harmless until you realize what happens at scale.
The Problem in Numbers
Say you use 5 MCP servers, each exposing 30 tools. That is 150 tool schemas loaded into Claude's context on every single message.
| Metric | Value |
|---|---|
| Average tool schema size | ~350 tokens |
| Tools across 5 servers | 150 |
| Total tool overhead | 52,500 tokens |
| Claude's context window | 200,000 tokens |
| Context consumed by tools alone | 26.3% |
Now add 5 more servers. You are at 300 tools, 105,000 tokens, and over half your context is gone before the conversation starts.
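The arithmetic behind these numbers can be reproduced with a short script, using the ~350-token average schema size and 200K context window quoted above:

```python
AVG_SCHEMA_TOKENS = 350      # average tool schema size, per the table above
CONTEXT_WINDOW = 200_000     # Claude's context window

def tool_overhead(servers: int, tools_per_server: int) -> tuple[int, float]:
    """Return (total schema tokens, fraction of the context window consumed)."""
    total_tokens = servers * tools_per_server * AVG_SCHEMA_TOKENS
    return total_tokens, total_tokens / CONTEXT_WINDOW

for servers in (5, 10):
    tokens, share = tool_overhead(servers, 30)
    print(f"{servers} servers: {tokens:,} tokens ({share * 100:.1f}% of context)")
```

At 5 servers the overhead is 52,500 tokens (26.25% of context); at 10 it is 105,000 tokens (52.5%).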
Context window tokens are not free. They increase latency, shrink the space left for your actual conversation, and degrade the model's tool selection accuracy: published evaluations suggest LLMs choose tools less reliably once they are presented with more than roughly 60 options.
The Analogy
Think of it this way:
Without mcpzip: Every employee in a 500-person company introduces themselves to every visitor, reciting their full job description and qualifications. The visitor forgets most of it, gets confused, and ends up talking to the wrong person.
With mcpzip: A receptionist greets the visitor. "Who are you looking for?" The visitor says "someone who can help with payroll." The receptionist directs them to exactly the right person.
mcpzip is the receptionist. It replaces hundreds of self-introductions with three simple interactions: search, describe, execute.
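Those three interactions can be sketched as request payloads. Note that only `describe_tool` is named elsewhere in these docs; `search_tools` and `execute_tool`, the tool name `payroll_server.run_report`, and all payload shapes are illustrative assumptions, not mcpzip's actual API:

```python
# Hypothetical payloads for the three meta-tool interactions.
# Only describe_tool is a confirmed name; the rest is illustrative.

# 1. Search: find candidate tools by intent instead of loading all schemas.
search = {"tool": "search_tools",
          "arguments": {"query": "someone who can help with payroll"}}

# 2. Describe: pull the full JSON Schema for one promising candidate.
describe = {"tool": "describe_tool",
            "arguments": {"name": "payroll_server.run_report"}}

# 3. Execute: call the underlying tool through the proxy.
execute = {"tool": "execute_tool",
           "arguments": {"name": "payroll_server.run_report",
                         "input": {"month": "2024-06"}}}

for step in (search, describe, execute):
    print(step["tool"])
```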
What mcpzip Does
mcpzip sits between Claude and your MCP servers as a single aggregating server. Claude sees only the three meta-tools; mcpzip forwards search, describe, and execute requests to the right upstream server and serves full tool schemas only on demand.
How does semantic search work?
When you configure a Gemini API key, mcpzip runs two search strategies in parallel:
- Keyword search -- fast, token-based matching against tool names, descriptions, and parameters. Great for direct queries like "slack send message".
- LLM semantic search -- sends the query and a compact tool catalog to Gemini, which understands natural language intent. Great for queries like "help me schedule a meeting" or "find something to track my tasks".
Results from both are merged, deduplicated, and cached. The semantic search adds ~200-500ms latency but dramatically improves result quality for natural language queries.
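A minimal sketch of that parallel merge-and-dedupe flow, assuming a catalog of `{name, description}` dicts and with the Gemini call stubbed out (mcpzip's real internals may differ):

```python
from concurrent.futures import ThreadPoolExecutor

def keyword_search(query: str, catalog: list[dict]) -> list[dict]:
    """Fast token-based match against tool names and descriptions."""
    terms = query.lower().split()
    return [t for t in catalog
            if any(term in f"{t['name']} {t['description']}".lower()
                   for term in terms)]

def semantic_search(query: str, catalog: list[dict]) -> list[dict]:
    """Stub: the real strategy sends the query plus a compact tool
    catalog to Gemini and parses ranked tool names from the response."""
    return []

def hybrid_search(query: str, catalog: list[dict]) -> list[dict]:
    # Run both strategies in parallel, then merge and deduplicate by name.
    with ThreadPoolExecutor(max_workers=2) as pool:
        kw = pool.submit(keyword_search, query, catalog)
        sem = pool.submit(semantic_search, query, catalog)
    seen: set[str] = set()
    merged = []
    for tool in kw.result() + sem.result():
        if tool["name"] not in seen:
            seen.add(tool["name"])
            merged.append(tool)
    return merged

catalog = [
    {"name": "slack.send_message", "description": "Send a Slack message"},
    {"name": "jira.create_issue", "description": "Create a Jira issue"},
]
print(hybrid_search("slack send message", catalog))
```

Running both strategies concurrently means the keyword results never wait on the slower Gemini round trip; the dedupe pass keeps the first occurrence of each tool name.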
What is context window compression?
Context window compression means reducing the number of tokens consumed by tool definitions in the AI's context window.
Without compression, every tool's full JSON Schema is sent to the model on every message. mcpzip compresses this by replacing all tool schemas with 3 meta-tools, and serving full schemas on demand via the describe_tool meta-tool.
The result: instead of 175,000 tokens for 500 tools, you use ~1,200 tokens. That is a 99.3% reduction.
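The on-demand half of this design can be sketched as a lazy cache in front of the upstream servers. The class name, callback, and schema shape here are illustrative, not mcpzip's implementation:

```python
class LazySchemaStore:
    """Sketch: only the 3 meta-tool definitions sit in the model's context;
    full tool schemas stay server-side and are fetched on first describe."""

    def __init__(self, fetch_schema):
        self._fetch_schema = fetch_schema   # callable: tool name -> full JSON Schema
        self._cache: dict[str, dict] = {}

    def describe_tool(self, name: str) -> dict:
        if name not in self._cache:         # fetch once, then serve from cache
            self._cache[name] = self._fetch_schema(name)
        return self._cache[name]

calls = []
def fake_fetch(name):
    calls.append(name)
    return {"name": name, "inputSchema": {"type": "object"}}

store = LazySchemaStore(fake_fetch)
store.describe_tool("slack.send_message")
store.describe_tool("slack.send_message")   # cached; no second upstream fetch
print(len(calls))   # → 1
```

The model pays the full ~350-token schema cost only for tools it actually decides to use, instead of for every tool on every message.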
Who Is mcpzip For?
| You should use mcpzip if... | You might not need mcpzip if... |
|---|---|
| You have 3+ MCP servers | You use only 1-2 servers |
| Your servers have 50+ tools total | Your servers have fewer than 20 tools total |
| You want faster response times | Context overhead is not a concern |
| You use Claude Code daily | You rarely use MCP tools |
| You value clean context windows | You prefer direct tool access |
Even with just 2 servers, mcpzip's search capability, connection pooling, and instant startup can be worthwhile. The main question is whether your total tool count is high enough to benefit from context compression.