Production Deployment Guide
Deploy Shannon to your infrastructure with confidence. This section covers deployment patterns, cloud platform integrations, and operational best practices.
Shannon is currently in active development. Production deployment guides are being finalized. For production use, we recommend:
- Thorough testing in staging environments
- Monitoring all services closely
- Joining our Discord for deployment support
Deployment Options
Docker Compose
Production-ready Docker Compose configuration
Status: 🚧 Phase 3
Kubernetes
Kubernetes manifests and Helm charts
Status: 🚧 Phase 3
AWS
Deploy to Amazon Web Services (ECS, RDS, ElastiCache)
Status: 🚧 Phase 3
Azure
Deploy to Microsoft Azure (AKS, PostgreSQL, Redis)
Status: 🚧 Phase 3
Operations
Monitoring
Prometheus metrics, Grafana dashboards, and alerting
Status: 🚧 Phase 3
Performance Tuning
Optimize throughput, latency, and resource usage
Status: 🚧 Phase 3
Security
Production security hardening and best practices
Status: 🚧 Phase 3
Quick Start: Local Development
For development and testing, use Docker Compose.
Architecture Overview
Shannon consists of multiple services that need to be deployed:
Core Services
| Service | Purpose | Scaling |
|---|---|---|
| Gateway | REST API, authentication | Horizontal (stateless) |
| Orchestrator | Task coordination, gRPC | Horizontal (stateful via Temporal) |
| Agent Core | Agent execution, Rust runtime | Horizontal |
| LLM Service | LLM provider gateway | Horizontal |
| Dashboard | Real-time monitoring UI | Horizontal (stateless) |
Data Stores
| Store | Purpose | Scaling |
|---|---|---|
| PostgreSQL | Task metadata, events, sessions | Vertical + read replicas |
| Redis | Caching, pub/sub, sessions | Cluster mode |
| Qdrant | Vector embeddings, semantic memory | Horizontal |
| Temporal | Workflow state, durable execution | Cluster mode |
Production Checklist
Before deploying to production:
Security
- Enable authentication (GATEWAY_SKIP_AUTH=0)
- Configure TLS/SSL for all services
- Rotate API keys regularly
- Set up OPA policies for access control
- Enable audit logging
- Configure network policies/firewalls
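The authentication item above can be turned into an automated preflight check that runs before a release. The following is a minimal sketch, not part of Shannon itself: `GATEWAY_SKIP_AUTH` comes from the checklist, while the function names `auth_enabled` and `preflight` are hypothetical. Because the default value of the variable is not stated here, the sketch assumes that an unset variable should be treated as a failure.

```python
import os


def auth_enabled(env=None):
    """Return True only when GATEWAY_SKIP_AUTH is explicitly set to "0".

    Assumption: an unset variable counts as a failure, because the
    default behaviour is not documented in this section.
    """
    env = os.environ if env is None else env
    return env.get("GATEWAY_SKIP_AUTH", "").strip() == "0"


def preflight(env=None):
    """Collect human-readable problems instead of stopping at the first one."""
    problems = []
    if not auth_enabled(env):
        problems.append("GATEWAY_SKIP_AUTH must be set to 0 in production")
    return problems


if __name__ == "__main__":
    issues = preflight()
    for issue in issues:
        print(f"FAIL: {issue}")
    # A non-zero exit code lets a CI pipeline block the deployment.
    raise SystemExit(1 if issues else 0)
```

Returning a list of problems rather than raising on the first one makes it easy to extend the check with further items from the checklist as they become verifiable.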
Reliability
- Set up health checks and readiness probes
- Configure auto-scaling policies
- Implement circuit breakers
- Set resource limits (CPU, memory)
- Configure backup and disaster recovery
- Test failover scenarios
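To illustrate the circuit breaker item above, here is a minimal, generic sketch of the pattern. It is not Shannon's implementation: the class name, the threshold, and the timeout values are assumptions chosen for illustration. After a configurable number of consecutive failures the breaker rejects calls immediately, then allows a single trial call once the reset timeout has elapsed.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self._clock = clock          # injectable so tests can control time
        self._failures = 0           # consecutive failures seen so far
        self._opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open state, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            # Open on reaching the threshold, or re-open after a failed trial.
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                self._opened_at = self._clock()
            raise
        # Any success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

A caller would wrap each outbound request, for example `breaker.call(send_request, payload)` with a hypothetical `send_request` function, and treat `CircuitOpenError` as a signal to fall back or fail fast instead of waiting on a dependency that is known to be down.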
Observability
- Deploy Prometheus and Grafana
- Configure alerting rules
- Set up log aggregation (ELK/Loki)
- Enable distributed tracing (OpenTelemetry)
- Create runbooks for common issues
Performance
- Tune Temporal worker concurrency
- Optimize database connections
- Configure Redis caching
- Set appropriate resource limits
- Load test before production launch
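For the load-testing item above, a dedicated tool is usually the better choice, but the idea can be sketched with the standard library alone. The sketch below is an assumption-laden illustration: `load_test` and `timed` are hypothetical helpers, and `request` stands for any zero-argument callable that performs one call against a staging deployment. It runs the callable concurrently and reports latency percentiles.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor


def timed(request):
    """Run one request and return its wall-clock duration in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def load_test(request, total=100, concurrency=10):
    """Issue `total` requests using `concurrency` worker threads.

    `total` must be at least 2 so that percentiles can be computed.
    """
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed(request), range(total)))
    # n=100 yields 99 cut points; index 49 is p50 and index 94 is p95.
    cuts = statistics.quantiles(latencies, n=100, method="inclusive")
    return {
        "count": len(latencies),
        "p50": cuts[49],
        "p95": cuts[94],
        "max": max(latencies),
    }


if __name__ == "__main__":
    # Placeholder workload; replace with a real call against staging.
    print(load_test(lambda: time.sleep(0.01), total=50, concurrency=5))
```

Tracking p95 alongside the median matters because tail latency tends to degrade first when worker concurrency or database connection limits are set too low.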
Resource Requirements
Minimum (Development)
- CPU: 4 cores
- RAM: 8GB
- Storage: 20GB SSD
Recommended (Production - Small)
- CPU: 16 cores total (distributed across services)
- RAM: 32GB total
- Storage: 100GB SSD
- Network: 1Gbps
Recommended (Production - Large)
- CPU: 64+ cores
- RAM: 128GB+
- Storage: 500GB+ SSD
- Network: 10Gbps
- Load Balancer: Required
- Multi-AZ: Recommended
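The tiers above can be encoded as data so that a provisioning script can check which one a given environment satisfies. This is a minimal sketch: the CPU, RAM, and storage figures are taken from the lists above, while the tier labels and the `highest_tier` function are hypothetical names. Network, load balancer, and multi-AZ requirements are left out because they cannot be reduced to a single number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    cpu_cores: int
    ram_gb: int
    storage_gb: int


# Ordered from smallest to largest; values mirror the lists above.
TIERS = [
    Tier("development", 4, 8, 20),
    Tier("production-small", 16, 32, 100),
    Tier("production-large", 64, 128, 500),
]


def highest_tier(cpu_cores, ram_gb, storage_gb):
    """Return the name of the largest tier fully satisfied, or None."""
    best = None
    for tier in TIERS:
        if (
            cpu_cores >= tier.cpu_cores
            and ram_gb >= tier.ram_gb
            and storage_gb >= tier.storage_gb
        ):
            best = tier.name
    return best
```

For example, an environment with 64 cores and 500GB of storage but only 64GB of RAM qualifies as `production-small`, since every dimension must meet the tier's threshold.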
What’s Next?
Quick Start
Install Shannon locally first
Configuration
Understand environment variables
Architecture
Learn system architecture
Monitoring
Set up monitoring
Get Help
- Discord: Join our community for deployment help
- GitHub: File deployment issues or questions
- Docs: Check Troubleshooting for common problems