Deployment Guide
9.1 Local Deployment (stdio)
Suitable for personal tools and internal team use.
npm Package Method
Publish your Server as an npm package; users install it globally and use it directly.
Direct Execution Method
Without publishing an npm package, specify the script path directly.
Python Server Local Deployment
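A Python Server is registered with the Host in the same way as the two Node-based methods; only the command and its arguments change. The sketch below builds the three entries side by side. It assumes a Host that reads the widely used `mcpServers` configuration shape, and that the globally installed package exposes a binary named after the package. `stdioEntry`, the server names, and all paths are hypothetical.

```typescript
// One stdio entry: the Host spawns `command` with `args`
// and talks to the process over stdin/stdout.
interface StdioServerEntry {
  command: string;
  args: string[];
  env?: Record<string, string>;
}

type LaunchMethod = "npm" | "script" | "python";

// `target` is a package name for "npm" and a script path otherwise.
function stdioEntry(method: LaunchMethod, target: string): StdioServerEntry {
  switch (method) {
    case "npm":
      // After `npm install -g <package>`, the package's bin is on PATH.
      // Assumption: the bin name equals the package name.
      return { command: target, args: [] };
    case "script":
      // No published package: point Node at the built script.
      return { command: "node", args: [target] };
    case "python":
      // Assumption: `python` on PATH is the interpreter the Server needs.
      return { command: "python", args: [target] };
  }
}

// Hypothetical names and paths, for illustration only.
const localHostConfig = {
  mcpServers: {
    "my-npm-server": stdioEntry("npm", "my-mcp-server"),
    "my-local-server": stdioEntry("script", "/path/to/server/dist/index.js"),
    "my-python-server": stdioEntry("python", "/path/to/server/server.py"),
  },
};

console.log(JSON.stringify(localHostConfig, null, 2));
```

The printed JSON is what would go into the Host's configuration file; check the Host's own documentation for the exact file location and any additional keys.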
9.2 Remote Deployment (Streamable HTTP)
Suitable for SaaS services serving external users.
Complete Express.js Streamable HTTP Server
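Whatever framework hosts the endpoint, the core of such a server is a per-request decision based on the HTTP method and the Mcp-Session-Id header. The sketch below isolates that decision as a pure function with no Express or SDK dependency, so it is not a complete server. It assumes a Server that requires sessions; `decide`, `IncomingRequest`, and `Decision` are hypothetical names.

```typescript
type Decision =
  | { action: "create-session" }
  | { action: "use-session"; sessionId: string }
  | { action: "end-session"; sessionId: string }
  | { action: "reject"; status: 400 | 404; reason: string };

interface IncomingRequest {
  method: "POST" | "GET" | "DELETE";
  sessionId?: string;    // value of the Mcp-Session-Id header, if present
  isInitialize: boolean; // true when the JSON-RPC body is an `initialize` request
}

function decide(req: IncomingRequest, knownSessions: ReadonlySet<string>): Decision {
  const { sessionId } = req;

  if (sessionId === undefined) {
    // Only an initialize request may arrive without a session ID.
    if (req.method === "POST" && req.isInitialize) {
      return { action: "create-session" };
    }
    return { action: "reject", status: 400, reason: "missing Mcp-Session-Id" };
  }

  // A 404 tells the Client that the session is gone and it must re-initialize.
  if (!knownSessions.has(sessionId)) {
    return { action: "reject", status: 404, reason: "unknown or expired session" };
  }

  // DELETE is the Client's explicit request to terminate the session.
  if (req.method === "DELETE") {
    return { action: "end-session", sessionId };
  }

  // POST carries JSON-RPC messages; GET opens the server-to-client SSE stream.
  return { action: "use-session", sessionId };
}

const sessions = new Set(["session-a"]);
console.log(decide({ method: "POST", isInitialize: true }, sessions));
console.log(decide({ method: "POST", sessionId: "session-a", isInitialize: false }, sessions));
```

In an Express route, the result of `decide` would select between creating a transport, reusing an existing one, or sending the error status.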
Docker Deployment
Cloud Platform Deployment
| Platform | Approach | Advantages | Considerations |
|---|---|---|---|
| Vercel | Edge Function | Global CDN, auto-scaling | Stateless; requires external session storage |
| Railway | Docker container | Simple deployment, persistent connections | Good for stateful Servers |
| AWS Lambda | Serverless | Pay-per-invocation | Cold-start latency; watch the execution timeout |
| Cloudflare Workers | Edge Worker | Ultra-low latency | Stateless; requires Durable Objects for session management |
| Self-hosted | PM2 + Nginx | Full control | Requires self-maintenance |
9.3 Session Management
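Streamable HTTP can maintain stateful sessions: the Server may assign a session ID during initialization, and the Client then returns it in the Mcp-Session-Id header on every subsequent request. Production deployments must address the following: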
Stateless Deployment (Multiple Instances)
If the Server is deployed across multiple instances (e.g., K8s Pods), external session storage is needed.
Session Cleanup
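Shared storage and expiry can be sketched together behind one small store. The version below keeps sessions in an in-memory Map purely as a stand-in: in a multi-instance deployment the same operations would be backed by an external store such as Redis, and would then be asynchronous. `InMemorySessionStore` and its method names are hypothetical, and the clock is passed in explicitly so the expiry logic is easy to follow.

```typescript
interface SessionRecord {
  data: Record<string, unknown>;
  expiresAt: number; // epoch milliseconds
}

class InMemorySessionStore {
  private sessions = new Map<string, SessionRecord>();
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  create(sessionId: string, data: Record<string, unknown>, now: number): void {
    this.sessions.set(sessionId, { data, expiresAt: now + this.ttlMs });
  }

  // Returns the session data and slides the expiry forward (idle timeout),
  // or undefined if the session is missing or already expired.
  get(sessionId: string, now: number): Record<string, unknown> | undefined {
    const record = this.sessions.get(sessionId);
    if (record === undefined) return undefined;
    if (record.expiresAt <= now) {
      this.sessions.delete(sessionId);
      return undefined;
    }
    record.expiresAt = now + this.ttlMs;
    return record.data;
  }

  delete(sessionId: string): boolean {
    return this.sessions.delete(sessionId);
  }

  // Periodic cleanup: removes every expired session and returns the count.
  sweep(now: number): number {
    let removed = 0;
    this.sessions.forEach((record, id) => {
      if (record.expiresAt <= now) {
        this.sessions.delete(id);
        removed++;
      }
    });
    return removed;
  }

  get size(): number {
    return this.sessions.size;
  }
}

// Usage: a 30-minute idle timeout, swept once a minute by a timer in a real Server.
const store = new InMemorySessionStore(30 * 60 * 1000);
store.create("session-a", { user: "alice" }, Date.now());
console.log(store.size);
```

With a store that supports per-key expiry natively, the `sweep` step usually becomes unnecessary because the backend drops expired keys itself.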
Set reasonable session timeouts and cleanup policies.
9.4 Client Connection Configuration
Remote Server (Streamable HTTP)
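For a remote Server, the Host entry points at a URL instead of a local command. The sketch below assumes a Host that accepts `url` and optional `headers` keys under `mcpServers`; key names vary between Hosts, so treat them as placeholders. `remoteEntry`, the server names, and the example URL are hypothetical.

```typescript
interface RemoteServerEntry {
  url: string;
  headers?: Record<string, string>;
}

// Builds one remote entry. A static bearer token is optional.
function remoteEntry(url: string, bearerToken?: string): RemoteServerEntry {
  if (!/^https?:\/\//.test(url)) {
    throw new Error(`not an HTTP(S) URL: ${url}`);
  }
  const entry: RemoteServerEntry = { url };
  if (bearerToken !== undefined) {
    entry.headers = { Authorization: `Bearer ${bearerToken}` };
  }
  return entry;
}

// Hypothetical names and URL, for illustration only.
const remoteHostConfig = {
  mcpServers: {
    "my-remote-server": remoteEntry("https://mcp.example.com/mcp", "YOUR_API_TOKEN"),
    "my-oauth-server": remoteEntry("https://mcp.example.com/mcp"),
  },
};

console.log(JSON.stringify(remoteHostConfig, null, 2));
```

When the Host handles OAuth itself, the token argument is simply omitted and the entry reduces to the Server URL.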
Remote Server with OAuth
When the Server implements the full OAuth 2.1 flow, the Host application handles authentication automatically; the user only needs to provide the Server URL.
9.5 Monitoring and Logging
Recommended for production environments:
Health Check
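A health endpoint lets a load balancer or orchestrator decide whether an instance should keep receiving traffic. The sketch below computes only the response, leaving the HTTP wiring to the framework. Reporting 503 once the session limit is reached is an assumption of this example and not a requirement; `healthCheck` and the field names are hypothetical.

```typescript
interface HealthSnapshot {
  startedAtMs: number;    // process start time, epoch milliseconds
  activeSessions: number;
  maxSessions: number;    // configured capacity of this instance
}

interface HealthResponse {
  status: 200 | 503;
  body: {
    status: "ok" | "degraded";
    uptimeSeconds: number;
    activeSessions: number;
  };
}

function healthCheck(snapshot: HealthSnapshot, nowMs: number): HealthResponse {
  // Assumption: an instance at capacity reports itself as unavailable
  // so new Clients are routed elsewhere.
  const saturated = snapshot.activeSessions >= snapshot.maxSessions;
  return {
    status: saturated ? 503 : 200,
    body: {
      status: saturated ? "degraded" : "ok",
      uptimeSeconds: Math.floor((nowMs - snapshot.startedAtMs) / 1000),
      activeSessions: snapshot.activeSessions,
    },
  };
}

const startedAtMs = Date.now();
console.log(healthCheck({ startedAtMs, activeSessions: 3, maxSessions: 100 }, Date.now()));
```

A route such as `GET /health` would send `status` as the HTTP status code and `body` as JSON.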
Request Logging
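Structured, one-line-per-request logs are the easiest to search and aggregate. The sketch below turns a JSON-RPC request and its timing into a single JSON log line. It records the tool name for `tools/call` requests but deliberately leaves out the arguments, on the assumption that they may contain sensitive data. `toLogLine` and the log field names are hypothetical.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id?: string | number;
  method: string;
  params?: { name?: string; [key: string]: unknown };
}

function toLogLine(
  request: JsonRpcRequest,
  sessionId: string | undefined,
  startedMs: number,
  finishedMs: number,
  ok: boolean
): string {
  // For tools/call, params.name carries the tool being invoked.
  const tool =
    request.method === "tools/call" ? (request.params?.name ?? null) : null;

  const entry = {
    ts: new Date(finishedMs).toISOString(),
    sessionId: sessionId ?? null,
    rpcId: request.id ?? null,
    method: request.method,
    tool,
    durationMs: finishedMs - startedMs,
    ok,
  };
  return JSON.stringify(entry);
}

// Usage: time the handler, then emit one line to stdout or a log shipper.
const started = Date.now();
const example: JsonRpcRequest = {
  jsonrpc: "2.0",
  id: 7,
  method: "tools/call",
  params: { name: "search", arguments: { query: "hello" } },
};
console.log(toLogLine(example, "session-a", started, Date.now(), true));
```

For a stdio Server, such lines should go to stderr or a file, since stdout carries the protocol messages.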
Monitoring Metrics
| Metric | Description | Alert Threshold |
|---|---|---|
| P95 response time | 95th-percentile latency of tool calls | Workload-dependent; typically under 5 s |
| Error rate | Proportion of failed tool calls | Alert above 5% |
| Active sessions | Number of currently connected Clients | Alert when approaching server capacity |
| Initialization success rate | MCP handshake success rate | Investigate below 99% |
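The thresholds in the table can be checked against a window of recent samples. The sketch below uses the nearest-rank method for P95 and treats "approaching capacity" as 90% of capacity. That 90% figure, the function names, and the input shape are assumptions of this example; a production setup would more likely export these values to a metrics system and alert there.

```typescript
interface CallSample {
  durationMs: number;
  ok: boolean;
}

interface MetricsWindow {
  calls: CallSample[];
  activeSessions: number;
  sessionCapacity: number;
  handshakesOk: number;
  handshakesTotal: number;
}

// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values: number[], p: number): number {
  if (values.length === 0) throw new Error("no samples");
  const sorted = values.slice().sort((a, b) => a - b);
  const rank = Math.max(1, Math.ceil((p * sorted.length) / 100));
  return sorted[rank - 1];
}

function evaluateAlerts(w: MetricsWindow): string[] {
  const alerts: string[] = [];

  const p95 = percentile(w.calls.map((c) => c.durationMs), 95);
  if (p95 > 5000) alerts.push(`P95 response time ${p95} ms exceeds 5 s`);

  const errorRate = w.calls.filter((c) => !c.ok).length / w.calls.length;
  if (errorRate > 0.05) alerts.push("error rate above 5%");

  // Assumption: "approaching capacity" means 90% or more.
  if (w.activeSessions >= 0.9 * w.sessionCapacity) {
    alerts.push("active sessions approaching capacity");
  }

  if (w.handshakesTotal > 0 && w.handshakesOk / w.handshakesTotal < 0.99) {
    alerts.push("initialization success rate below 99%");
  }
  return alerts;
}

const sampleWindow: MetricsWindow = {
  calls: [
    { durationMs: 120, ok: true },
    { durationMs: 340, ok: true },
    { durationMs: 6200, ok: false },
  ],
  activeSessions: 12,
  sessionCapacity: 100,
  handshakesOk: 50,
  handshakesTotal: 50,
};
console.log(evaluateAlerts(sampleWindow));
```

`percentile` throws on an empty window, so a caller would skip evaluation when no tool calls were recorded.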
Reconnection Recovery
Streamable HTTP supports reconnection recovery via SSE’s Last-Event-ID header. In production, the Server should maintain an event ID sequence so Clients can resume from the point of disconnection.
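One way to maintain such a sequence is a bounded per-stream event log that can replay everything after a given ID. The sketch below is a minimal in-memory version. `StreamEventLog`, `replayAfter`, and `toSseFrame` are hypothetical names; numeric IDs, a single stream, and single-line payloads (for example the output of `JSON.stringify`) are simplifying assumptions.

```typescript
interface StoredEvent {
  id: number;
  data: string; // assumed to contain no raw newlines
}

class StreamEventLog {
  private events: StoredEvent[] = [];
  private nextId = 1;
  private readonly maxEvents: number;

  constructor(maxEvents: number) {
    this.maxEvents = maxEvents;
  }

  // Assigns the next ID and keeps only the most recent `maxEvents` events.
  append(data: string): StoredEvent {
    const event: StoredEvent = { id: this.nextId++, data };
    this.events.push(event);
    if (this.events.length > this.maxEvents) this.events.shift();
    return event;
  }

  // Events the Client has not seen yet, or null when some of them were
  // already pruned and a gap-free resume is no longer possible.
  replayAfter(lastEventId: number): StoredEvent[] | null {
    const oldestId = this.events.length > 0 ? this.events[0].id : this.nextId;
    if (lastEventId < oldestId - 1) return null;
    return this.events.filter((e) => e.id > lastEventId);
  }
}

// Serializes one event in SSE wire format, including the `id:` field
// that the Client echoes back as Last-Event-ID when it reconnects.
function toSseFrame(event: StoredEvent): string {
  return `id: ${event.id}\ndata: ${event.data}\n\n`;
}

const log = new StreamEventLog(1000);
log.append(JSON.stringify({ jsonrpc: "2.0", method: "notifications/progress" }));
log.append(JSON.stringify({ jsonrpc: "2.0", id: 1, result: {} }));

// A Client reconnecting with Last-Event-ID: 1 receives only the second event.
const missed = log.replayAfter(1);
if (missed !== null) missed.forEach((e) => console.log(toSseFrame(e)));
```

When `replayAfter` returns null, the Server cannot honor the resume request; how to respond in that case (for example, by making the Client start a new session) is a design decision this sketch leaves open.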
Next Chapter: Case Studies — MCP Server implementations for different scenarios