Authentication

Overview

For HTTP header details and request examples, see /en/api/rest/authentication.

Shannon supports API key authentication to secure access to the orchestration platform. Authentication is disabled by default for easy local development and can be enabled for production deployments.

Authentication Modes

Development Mode (Default)

# In .env file
GATEWAY_SKIP_AUTH=1

In development mode:

No API key required
All requests are accepted
Useful for local testing and development

Production Mode

# In .env file
GATEWAY_SKIP_AUTH=0

In production mode:

API key required for all requests
Invalid keys return 401 Unauthorized
Rate limiting enforced per key
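The two modes can be summarized as a small decision function. This is a simplified model for illustration, not the gateway's actual code: `gateway_status`, `skip_auth`, and `valid_keys` are hypothetical names, and rate limiting is left out.

```python
from typing import Optional, Set


def gateway_status(skip_auth: bool, api_key: Optional[str], valid_keys: Set[str]) -> int:
    """Return the HTTP status a request would receive under the rules above."""
    if skip_auth:
        # Development mode (GATEWAY_SKIP_AUTH=1): every request is accepted.
        return 200
    if api_key is None or api_key not in valid_keys:
        # Production mode: a missing or invalid key yields 401 Unauthorized.
        return 401
    return 200
```

With `skip_auth=True` the key is never inspected, which is why development mode should not be exposed outside a local environment.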

API Key Format

Shannon uses prefixed API keys:

sk_test_1234567890abcdef    # Test keys
sk_live_1234567890abcdef    # Production keys

Never commit API keys to version control or share them publicly.
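Because the prefix identifies the environment, a client can check it before sending a request and mask the secret part before logging. The helpers below are a minimal sketch: `key_environment` and `mask_key` are hypothetical names, and only the `sk_test_` and `sk_live_` prefixes shown above are assumed.

```python
def key_environment(api_key: str) -> str:
    """Classify a key as 'test' or 'live' from its prefix."""
    if api_key.startswith("sk_test_"):
        return "test"
    if api_key.startswith("sk_live_"):
        return "live"
    raise ValueError("unrecognized API key prefix")


def mask_key(api_key: str, visible: int = 4) -> str:
    """Hide all but the prefix and the last few characters, for safer logging."""
    prefix, _, secret = api_key.rpartition("_")
    hidden = "*" * max(len(secret) - visible, 0)
    return f"{prefix}_{hidden}{secret[-visible:]}"
```

For example, `mask_key("sk_test_1234567890abcdef")` keeps `sk_test_` and the final `cdef` and replaces the rest with asterisks.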

Creating API Keys

Via Command Line

# Create a test API key (run from repo root)
make seed-api-key

# Output
✅ Test API key created. Use 'sk_test_123456' for testing.
Note: Authentication is disabled by default (GATEWAY_SKIP_AUTH=1)

Via API (Future)

curl -X POST http://localhost:8080/api/v1/keys \
  -H "X-Admin-Token: admin-secret" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "My Application",
    "user_id": "user-123"
  }'

Rate Limiting

Shannon enforces rate limits per API key using a fixed-window counter:

Default Limits

Limit Type	Default Value
Requests per minute	60
Burst allowance	10

Token budgets and concurrent task limits are enforced at the workflow level by the orchestrator, not at the gateway layer.
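To make the fixed-window idea concrete, here is a minimal in-memory sketch. It is not the gateway's implementation: the class name, the injected `clock`, and the per-key state layout are assumptions, and the burst allowance is not modeled because its exact semantics are not described here.

```python
import time


class FixedWindowLimiter:
    """Per-key fixed-window counter: at most `limit` requests per window."""

    def __init__(self, limit=60, window_seconds=60, clock=time.time):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock
        self.state = {}  # api_key -> (window index, request count)

    def allow(self, api_key):
        """Count one request; return (allowed, seconds until the window resets)."""
        now = self.clock()
        window = int(now // self.window_seconds)
        seen_window, count = self.state.get(api_key, (window, 0))
        if seen_window != window:
            count = 0  # a new window began, so the counter starts over
        reset_in = (window + 1) * self.window_seconds - now
        if count >= self.limit:
            self.state[api_key] = (window, count)
            return False, reset_in
        self.state[api_key] = (window, count + 1)
        return True, reset_in
```

A fixed window is simple and cheap, but it permits up to twice the limit across a window boundary (a full quota at the end of one window and another at the start of the next).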

Rate Limit Headers

Responses include current window limits. On 429, Retry-After is set:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1730001234
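A client can read these headers to slow down before it hits the limit. The sketch below assumes that `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but this page does not state; the function names and the `min_remaining` threshold are hypothetical.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long until the current window resets, in whole seconds."""
    now = time.time() if now is None else now
    reset_at = int(headers["X-RateLimit-Reset"])  # assumed Unix timestamp (seconds)
    return max(0, reset_at - int(now))


def should_throttle(headers, min_remaining=5):
    """True when the remaining quota in this window is at or below the threshold."""
    return int(headers["X-RateLimit-Remaining"]) <= min_remaining
```

When `should_throttle` returns true, sleeping for `seconds_until_reset(...)` avoids a 429 altogether.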

Rate Limit Exceeded

When you exceed the rate limit:

HTTP/1.1 429 Too Many Requests
Retry-After: 30

{
  "error": "Rate limit exceeded",
  "message": "Too many requests. Please retry after the rate limit window resets."
}

Handling Rate Limits

Retry on 429, waiting for the Retry-After interval when the server provides one and falling back to exponential backoff otherwise:

import time
import requests
from requests.exceptions import HTTPError

def submit_with_retry(query, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.post(
                "http://localhost:8080/api/v1/tasks",
                headers={"X-API-Key": "sk_test_123456"},
                json={"query": query}
            )
            response.raise_for_status()
            return response.json()

        except HTTPError as e:
            # Only 429 is retried; every other HTTP error propagates.
            if e.response.status_code != 429:
                raise
            # Prefer the server's Retry-After; otherwise back off 1s, 2s, 4s, ...
            retry_after = int(e.response.headers.get("Retry-After", 2 ** attempt))
            print(f"Rate limited. Retrying after {retry_after}s...")
            time.sleep(retry_after)

    raise RuntimeError("Max retries exceeded")

Multi-Tenancy

Shannon supports multi-tenant deployments with tenant isolation:

Tenant ID

Include tenant ID in requests:

curl -X POST http://localhost:8080/api/v1/tasks \
  -H "X-API-Key: sk_test_123456" \
  -H "X-Tenant-ID: org-acme" \
  -H "Content-Type: application/json" \
  -d '{"query": "Process data"}'

Tenant Isolation

Each tenant has:

Isolated session storage
Per-tenant isolation via payload filters (tenant_id) within shared Qdrant collections
Independent budget tracking
Dedicated metrics
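The payload-filter approach means every vector query carries a condition on `tenant_id`, so tenants can share a collection without seeing each other's points. The sketch below builds such a filter in Qdrant's JSON filter format; how Shannon constructs it internally is not shown on this page, so `tenant_filter` and `extra_conditions` are hypothetical.

```python
def tenant_filter(tenant_id, extra_conditions=None):
    """Build a Qdrant-style filter that only matches points owned by one tenant."""
    if not tenant_id:
        # Refusing to build an unscoped filter prevents cross-tenant reads.
        raise ValueError("tenant_id is required for every query")
    must = [{"key": "tenant_id", "match": {"value": tenant_id}}]
    must.extend(extra_conditions or [])
    return {"must": must}
```

Calling `tenant_filter("org-acme")` yields a filter with a single `must` clause on `tenant_id`, which can be combined with any other conditions a query needs.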

OPA Policy Enforcement

Shannon uses Open Policy Agent for fine-grained access control:

Policy Structure

package shannon.auth

import future.keywords.if
import future.keywords.in

# Default deny
default allow = false

# Allow if user has valid API key and appropriate permissions
allow if {
    input.api_key_valid
    input.user.team in ["engineering", "data-science"]
    input.task.mode in allowed_modes[input.user.team]
}

# Team-specific allowed modes
allowed_modes := {
    "engineering": ["simple", "standard", "complex"],
    "data-science": ["standard", "complex"]
}

# Token budget limits by team
max_tokens := {
    "data-science": 50000,
    "engineering": 10000
}
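For readers less familiar with Rego, the `allow` rule can be mirrored in Python to reason about which inputs pass. This is only an illustration of the rule's logic, not how OPA evaluates it, and it assumes `api_key_valid` is a boolean.

```python
ALLOWED_MODES = {
    "engineering": ["simple", "standard", "complex"],
    "data-science": ["standard", "complex"],
}


def allow(request):
    """Mirror of the `allow` rule: deny by default unless every condition holds."""
    team = request.get("user", {}).get("team")
    mode = request.get("task", {}).get("mode")
    return (
        request.get("api_key_valid") is True
        and team in ALLOWED_MODES
        and mode in ALLOWED_MODES[team]
    )
```

A data-science user requesting `simple` mode is denied here, as is any team that does not appear in the mode table.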

Policy Modes

Configure in config/shannon.yaml:

policy:
  enabled: true
  mode: "enforce"  # dry-run, enforce, off
  path: "/app/config/opa/policies"
  fail_closed: false  # Allow on policy errors

Modes:

enforce: Deny requests that violate policies
dry-run: Log violations but allow requests
off: Disable policy enforcement
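One way to picture how `mode` and `fail_closed` interact is the decision function below. It is a sketch under assumptions: the function name and return shape are hypothetical, and the treatment of evaluation errors in dry-run mode (logged but allowed) is a guess rather than documented behavior.

```python
def apply_policy_mode(mode, policy_allows, policy_error=False, fail_closed=False):
    """Return (allow_request, log_violation) for one policy evaluation."""
    if mode not in ("enforce", "dry-run", "off"):
        raise ValueError(f"unknown policy mode: {mode}")
    if mode == "off":
        return True, False  # enforcement disabled, nothing is evaluated
    if policy_error:
        # fail_closed decides whether an evaluation error counts as a denial.
        passed = not fail_closed
    else:
        passed = policy_allows
    if passed:
        return True, False
    # Violation: enforce denies the request, dry-run only logs it.
    return mode == "dry-run", True
```

With the configuration shown above (`mode: "enforce"`, `fail_closed: false`), a policy violation is denied, while a policy evaluation error lets the request through.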

Example: Restricting Models

package shannon.models

import future.keywords.contains
import future.keywords.if
import future.keywords.in

# Only allow specific models per team
allowed_models[team] contains model if {
    team := "cost-sensitive"
    some model in ["gpt-5-mini", "claude-haiku"]
}

allowed_models[team] contains model if {
    team := "premium"
    some model in ["gpt-5-thinking", "claude-opus", "gpt-5", "claude-sonnet"]
}

Security Best Practices

1. Rotate Keys Regularly

# Create new key
make seed-api-key

# Update applications with new key
# Revoke old key after migration

2. Use Environment Variables

import os
from shannon import ShannonClient

client = ShannonClient(
    base_url=os.environ["SHANNON_BASE_URL"],
    api_key=os.environ["SHANNON_API_KEY"],
)

3. Enable HTTPS in Production

# config/shannon.yaml
gateway:
  tls:
    enabled: true
    cert_file: "/path/to/cert.pem"
    key_file: "/path/to/key.pem"

4. Monitor API Key Usage

Track usage in Prometheus metrics:

# Requests per API key
sum by (api_key_id) (rate(shannon_gateway_requests_total[5m]))

# Errors by API key
sum by (api_key_id) (rate(shannon_gateway_errors_total[5m]))

5. Implement IP Whitelisting

# config/shannon.yaml
gateway:
  ip_whitelist:
    enabled: true
    allowed_ips:
      - "10.0.0.0/8"
      - "192.168.1.100"
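The allowlist mixes a CIDR range with a single address. The check below shows how such entries can be matched using Python's standard `ipaddress` module; it illustrates the matching rule only and is not the gateway's code, and `is_ip_allowed` is a hypothetical name.

```python
import ipaddress

ALLOWED_IPS = ["10.0.0.0/8", "192.168.1.100"]


def is_ip_allowed(client_ip, allowed=ALLOWED_IPS):
    """True if client_ip is inside any allowed network or equals an allowed address."""
    address = ipaddress.ip_address(client_ip)
    # A bare address such as 192.168.1.100 parses as a single-host (/32) network.
    return any(address in ipaddress.ip_network(entry) for entry in allowed)
```

Behind a load balancer or reverse proxy, the address to check is the original client IP, not the proxy's.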

Troubleshooting

401 Unauthorized

Cause: Missing or invalid API key
Solution: Check whether authentication is enabled, then test your API key:

# Verify auth is enabled
grep GATEWAY_SKIP_AUTH .env

# If enabled, check your API key
curl -H "X-API-Key: sk_test_123456" \
  http://localhost:8080/api/v1/tasks

403 Forbidden

Cause: Valid API key but insufficient permissions (OPA policy)
Solution: Check OPA policy logs:

docker compose logs orchestrator | grep "policy"

429 Rate Limited

Cause: Exceeded rate limits
Solution: Wait for the number of seconds given in the Retry-After header before retrying:

retry_after = response.headers.get('Retry-After', 60)
time.sleep(int(retry_after))

API key not working

Cause: Key may be expired or revoked
Solution: Create a new test key:

make seed-api-key

Next Steps

Submit Tasks

Learn how to submit tasks with authentication

Rate Limiting

Understand rate limits and quotas

Python SDK

Use SDK for automatic authentication

REST API Reference

Complete REST API documentation
