Overview
This guide covers common configuration issues, how to diagnose them, and proven solutions.
Quick Diagnostics
Check Environment Variables
Verify Configuration Files
Check Service Health
Common Issues
1. Services Won’t Start
Missing Environment Variables
Symptoms:
- Service crashes immediately
- Logs show “variable not set” errors
- Container exits with code 1
Required variables:
- At least one LLM provider key (OPENAI_API_KEY, ANTHROPIC_API_KEY, etc.)
- Database credentials (POSTGRES_*)
- Redis connection (REDIS_*)
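A check for missing variables can be sketched as a few small shell functions that read a `.env` file and report which names have no value. This is an illustrative sketch, not part of the project's tooling; the function names are hypothetical, and any specific `POSTGRES_*` or `REDIS_*` names you pass in are assumptions to be replaced with the names your deployment expects.

```shell
# Succeed if NAME is assigned a non-empty value in the env file.
# Usage: env_has_value FILE NAME
env_has_value() {
  grep -Eq "^[[:space:]]*(export[[:space:]]+)?$2=[\"']?[^\"'[:space:]]" "$1" 2>/dev/null
}

# Print each required variable that is missing or empty in the env file.
# Returns 0 if every name has a value, 1 otherwise.
# Usage: missing_env_vars FILE NAME [NAME...]
missing_env_vars() {
  env_file=$1
  shift
  rc=0
  for name in "$@"; do
    if ! env_has_value "$env_file" "$name"; then
      echo "$name"
      rc=1
    fi
  done
  return "$rc"
}

# Succeed if at least one of the given names has a value in the env file.
# Usage: any_env_var FILE NAME [NAME...]
any_env_var() {
  env_file=$1
  shift
  for name in "$@"; do
    env_has_value "$env_file" "$name" && return 0
  done
  return 1
}
```

For example, `any_env_var .env OPENAI_API_KEY ANTHROPIC_API_KEY` covers the "at least one LLM provider key" rule, while `missing_env_vars .env NAME...` lists every database or Redis variable that still needs a value.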
Invalid Configuration Syntax
Symptoms:
- “Failed to parse config” errors
- YAML syntax errors
- Service fails to start
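One frequent cause of YAML syntax errors is a tab character used for indentation, which YAML does not allow. The sketch below flags such lines. It is a narrow check rather than a full YAML validator, and the function name is hypothetical.

```shell
# Print "LINE_NUMBER:content" for every line whose indentation contains a tab.
# Returns 0 when the file is clean, 1 when offending lines were found,
# and 2 when the file cannot be read.
# Usage: find_yaml_tabs FILE
find_yaml_tabs() {
  if [ ! -r "$1" ]; then
    echo "cannot read $1" >&2
    return 2
  fi
  tab=$(printf '\t')
  if grep -n "^ *${tab}" "$1"; then
    return 1
  fi
  return 0
}
```

For the Compose file itself, `docker compose config --quiet` validates the file and its variable interpolation without printing it. Neither check covers every possible application-level configuration error.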
2. Authentication Failures
Gateway Returns 401 Unauthorized
Symptoms:
- All requests return 401
- “Unauthorized” error
- API key rejected
JWT Secret Not Set
Symptoms:
- “JWT secret not configured” error
- Authentication middleware fails
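If no secret has been generated yet, a random value can be produced with standard tools. The sketch below assumes a 32-byte hex string is an acceptable secret; the function name is hypothetical, and the variable the gateway actually reads should be confirmed in the project's configuration reference.

```shell
# Print a random 64-character hex string (32 bytes read from /dev/urandom).
generate_secret() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}
```

The output can then be assigned to the JWT secret variable in `.env` (for instance `JWT_SECRET=...`, where the name `JWT_SECRET` is an assumption), followed by a service restart so the new value is picked up.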
3. Database Connection Issues
Cannot Connect to PostgreSQL
Symptoms:
- “connection refused” errors
- “dial tcp: connect: connection refused”
- Services crash on startup
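Connection-refused errors at startup often mean the database is not yet accepting connections. A minimal sketch that polls `pg_isready` inside the `postgres` Compose service until it succeeds; the function name, attempt count, and delay are illustrative choices.

```shell
# Poll pg_isready inside the "postgres" service until it accepts connections.
# Usage: wait_for_postgres [MAX_ATTEMPTS] [DELAY_SECONDS]
wait_for_postgres() {
  max_attempts=${1:-30}
  delay=${2:-2}
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    # -T disables TTY allocation so the command also works in scripts and CI.
    if docker compose exec -T postgres pg_isready >/dev/null 2>&1; then
      echo "postgres ready after $attempt attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "postgres not ready after $max_attempts attempts" >&2
  return 1
}
```

If the function keeps failing, `docker compose logs --tail=50 postgres` is the next place to look, since the container itself may be crashing.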
Database Schema Not Initialized
Symptoms:
- “table does not exist” errors
- “column not found” errors
- SQL errors in logs
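A quick way to confirm whether the schema exists is to list the tables with `psql`. The sketch below assumes the database user and name are available as `POSTGRES_USER` and `POSTGRES_DB` (the names used by the official PostgreSQL image; your deployment may differ) and falls back to `postgres` for both.

```shell
# List tables in the configured database inside the "postgres" service.
# POSTGRES_USER and POSTGRES_DB are assumed variable names.
list_tables() {
  docker compose exec -T postgres \
    psql -U "${POSTGRES_USER:-postgres}" -d "${POSTGRES_DB:-postgres}" -c '\dt'
}
```

If `psql` reports that it did not find any relations, the schema has not been created and the project's migration or initialization step still needs to run.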
4. Redis Connection Issues
Cannot Connect to Redis
Symptoms:
- “connection refused” to Redis
- Session state not persisting
- Cache misses
Redis Authentication Failed
Symptoms:
- “NOAUTH Authentication required”
- Connection works but commands fail
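To tell an unreachable server apart from an authentication problem, the reply to `PING` can be classified. In this sketch the variable name `REDIS_PASSWORD` is an assumption; when it is set, the password is handed to `redis-cli` through its `REDISCLI_AUTH` environment variable rather than on the command line.

```shell
# Ping Redis inside the "redis" service and classify the reply.
# REDIS_PASSWORD is an assumed variable name; adjust it to your setup.
check_redis() {
  if [ -n "${REDIS_PASSWORD:-}" ]; then
    reply=$(docker compose exec -T -e REDISCLI_AUTH="$REDIS_PASSWORD" \
      redis redis-cli ping 2>&1) || true
  else
    reply=$(docker compose exec -T redis redis-cli ping 2>&1) || true
  fi
  case "$reply" in
    PONG*)       echo "ok" ;;
    *WRONGPASS*) echo "wrong password"; return 1 ;;
    *NOAUTH*)    echo "password required"; return 1 ;;
    *)           echo "unreachable: $reply"; return 1 ;;
  esac
}
```

A result of "password required" means the server enforces authentication but the client sent none, which matches the symptom of a connection that works while commands fail.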
5. LLM Provider Issues
API Key Invalid or Expired
Symptoms:
- “Invalid API key” errors
- 401 from LLM provider
- Tasks fail immediately
Rate Limit Exceeded
Symptoms:
- 429 errors from LLM provider
- “Rate limit exceeded” in logs
- Tasks time out or fail
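When a script of your own triggers the 429 responses, retrying with exponential backoff is a common client-side mitigation. The sketch below is a generic retry wrapper, not a description of how the platform handles rate limits internally; the function name and delays are illustrative.

```shell
# Run a command, retrying with exponential backoff while it fails.
# Usage: retry_with_backoff MAX_ATTEMPTS BASE_DELAY_SECONDS COMMAND [ARGS...]
retry_with_backoff() {
  max_attempts=$1
  delay=$2
  shift 2
  attempt=1
  while :; do
    "$@" && return 0
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempt(s)" >&2
      return 1
    fi
    sleep "$delay"
    delay=$((delay * 2))        # double the wait: 1s, 2s, 4s, ...
    attempt=$((attempt + 1))
  done
}
```

For example, `retry_with_backoff 5 1 curl -sf "$URL"` makes up to five attempts with waits of 1, 2, 4, and 8 seconds, where `$URL` stands for whichever endpoint you are calling. Backoff only helps with short bursts; sustained 429 errors usually call for lower concurrency or a higher provider limit.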
Quota Exceeded
Symptoms:
- “insufficient_quota” errors
- “You exceeded your current quota”
- All LLM calls fail
6. Model Configuration Issues
Model Not Found
Symptoms:
- “model not found” errors
- “invalid model” errors
- Tasks fail with model errors
7. Budget and Cost Issues
Tasks Exceed Budget
Symptoms:
- “Budget exceeded” errors
- Tasks fail with cost errors
- MAX_COST_PER_REQUEST exceeded
Budget Enforcement Not Working
Symptoms:
- Costs exceed limits
- No budget errors
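One plausible cause worth ruling out is a limit that is unset or malformed in the environment the service sees. The sketch below assumes `MAX_COST_PER_REQUEST` is meant to hold a positive decimal number; that assumption, and both function names, are illustrative rather than taken from the project.

```shell
# Succeed only if the argument is a positive decimal number such as 0.5 or 2.
is_positive_number() {
  case "$1" in
    ''|*[!0-9.]*|*.*.*|.) return 1 ;;   # empty, non-numeric, two dots, lone dot
  esac
  # Reject values that are numerically zero (for example "0" or "0.00").
  case "$1" in
    *[1-9]*) return 0 ;;
    *)       return 1 ;;
  esac
}

# Report the current limit, or fail if it is unset or not a positive number.
check_budget_limit() {
  if is_positive_number "${MAX_COST_PER_REQUEST:-}"; then
    echo "MAX_COST_PER_REQUEST=$MAX_COST_PER_REQUEST"
  else
    echo "MAX_COST_PER_REQUEST is unset or not a positive number" >&2
    return 1
  fi
}
```

Running the check inside the service container, rather than on the host, shows the value the service actually receives.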
8. Performance Issues
Slow Task Execution
Symptoms:
- Tasks take 2-3x expected time
- High latency
- Timeouts
High Memory Usage
Symptoms:
- OOM errors
- Container restarts
- High swap usage
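Two standard Docker commands help confirm memory pressure: a one-shot `docker stats` snapshot and the `OOMKilled` flag from `docker inspect`. The wrapper function names below are hypothetical; the container name is whatever `docker ps` reports for the affected service.

```shell
# Print a single memory snapshot for all running containers.
memory_snapshot() {
  docker stats --no-stream --format 'table {{.Name}}\t{{.MemUsage}}\t{{.MemPerc}}'
}

# Print "true" if the given container was last stopped by the OOM killer.
# Usage: was_oom_killed CONTAINER
was_oom_killed() {
  docker inspect --format '{{.State.OOMKilled}}' "$1"
}
```

A `true` result from `was_oom_killed` points to the container's memory limit, or the host's available memory, as the reason for the restarts.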
9. Streaming Issues
SSE Connection Drops
Symptoms:
- SSE stream disconnects
- Events stop mid-task
- “Connection closed” errors
Events Not Received
Symptoms:
- No events in stream
- Empty SSE response
- Stream connects but no data
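To see whether any payload arrives at all, the raw stream can be filtered down to its `data:` fields, which is where Server-Sent Events carry their content. A minimal sketch; the function name is hypothetical.

```shell
# Read an SSE stream on stdin and print the payload of each "data:" field.
# Carriage returns are stripped so CRLF-terminated streams work as well.
sse_data_lines() {
  tr -d '\r' | sed -n 's/^data: \{0,1\}//p'
}
```

It can be fed from `curl -N -s -H 'Accept: text/event-stream' "$STREAM_URL"`, where `$STREAM_URL` is a placeholder for your task's stream endpoint. If the raw stream shows only comment lines (those starting with `:`) and this filter prints nothing, the connection is alive but no events are being published.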
10. Tool Execution Issues
Python Code Execution Fails
Symptoms:
- “WASI interpreter not found”
- Python code tools fail
- Sandbox errors
Tool Timeout
Symptoms:
- “Tool execution timeout” errors
- Tools hang indefinitely
- WASI timeout errors
Configuration Validation
Validate All Settings
Best Practices
1. Use Environment-Specific Configs
2. Document Custom Settings
3. Version Control
4. Regular Validation
5. Monitor Configuration
Quick Fixes Checklist
When things go wrong, try these in order:
- Restart all services: docker compose restart
- Check logs: docker compose logs --tail=50
- Verify .env file exists and has required variables
- Test database connection: docker compose exec postgres pg_isready
- Test Redis: docker compose exec redis redis-cli ping
- Verify at least one LLM API key is set
- Check disk space: df -h
- Check memory: docker stats
- Full reset (last resort; the -v flag also removes the project's Docker volumes and any data stored in them): docker compose down -v && docker compose up -d
Getting Help
If issues persist:
- Collect logs
- Export configuration
- Check GitHub issues: https://github.com/Kocoro-lab/Shannon/issues
- Join Discord: https://discord.gg/NB7C2fMcQR
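Collecting logs and exporting the configuration can be sketched with two standard Compose commands. The function name, output directory, file names, and the 500-line tail are illustrative choices.

```shell
# Write recent logs and the resolved Compose configuration to a directory.
# Usage: collect_diagnostics [OUTPUT_DIR]
collect_diagnostics() {
  out_dir=${1:-diagnostics}
  mkdir -p "$out_dir" || return 1
  docker compose logs --no-color --tail=500 > "$out_dir/logs.txt" 2>&1
  # "config" prints the Compose file with variables substituted, so the
  # output can contain secrets such as API keys. Review it before sharing.
  docker compose config > "$out_dir/compose-config.yaml" 2>&1
  echo "$out_dir"
}
```

Redact API keys and passwords from both files before attaching them to a GitHub issue or posting them on Discord.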