
Introduction

An AI agent is an autonomous system that can perceive its environment, make decisions, and take actions to achieve goals. In Shannon, agents are the fundamental execution units that process tasks using Large Language Models (LLMs) and tools.

Agent Capabilities

Shannon agents can:
  • Reason: use LLMs to analyze tasks, plan solutions, and make decisions
  • Execute Tools: call functions, run code, search the web, and interact with APIs
  • Collaborate: work with other agents to solve complex multi-step problems
  • Learn: improve over time through pattern learning and cached results

Agent Lifecycle

Here’s how an agent processes a task:
1. Task Analysis: the agent receives the task and analyzes what needs to be done
2. Planning: the LLM creates a plan, potentially breaking the task into steps
3. Tool Selection: the agent identifies which tools are needed
4. Execution: the agent iteratively executes tools and processes results
5. Synthesis: the agent combines results into the final answer
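The five phases can be sketched as a plain-Python loop. This is a minimal illustration only, not Shannon's actual implementation; the keyword-matching "analysis" and the `tools` dictionary are invented for the example:

```python
# Minimal sketch of the agent lifecycle (illustrative only; not Shannon's API).
def run_agent(task, tools):
    # 1. Task analysis: decide what the task requires.
    requirements = [word for word in task.lower().split() if word in tools]
    # 2. Planning: order the required steps.
    plan = sorted(requirements)
    # 3. Tool selection + 4. Execution: run each planned tool, collect results.
    results = [tools[name](task) for name in plan]
    # 5. Synthesis: combine results into a final answer.
    return " | ".join(results)

# Hypothetical tools keyed by name.
tools = {
    "search": lambda t: "search results",
    "calculate": lambda t: "42",
}
answer = run_agent("search then calculate the answer", tools)
```

In Shannon, the analysis and planning phases are driven by an LLM rather than keyword matching, but the overall shape of the loop is the same.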

Agent Types in Shannon

Single Agent (Simple Mode)

A single agent handles the entire task without decomposition. Best for:
  • Simple queries
  • Fact retrieval
  • Basic calculations
  • Quick responses
Example:
client.submit_task(
    query="What is the capital of France?"
)
Shannon automatically selects single-agent mode for simple queries.

Multi-Agent (Standard/Complex Mode)

Multiple specialized agents work together, coordinated by Shannon’s orchestrator. Best for:
  • Complex research
  • Multi-step workflows
  • Tasks requiring different expertise
  • Tasks needing parallel processing
Example:
client.submit_task(
    query="Research top 5 AI trends, analyze market impact, and create summary"
)
Shannon automatically decomposes complex queries into multi-agent workflows based on query complexity.

Agent Components

1. LLM Brain

The decision-making core powered by language models:
  • Model Selection: Shannon auto-selects models based on task complexity
    • Small tasks → gpt-5-mini, claude-haiku
    • Complex tasks → gpt-5, claude-opus
  • Context Management: Automatically manages context windows
  • Caching: Reuses previous LLM responses when applicable
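A tier-based router of this kind can be sketched as follows. The tier-to-model mapping matches the table later in this page, but the word-count scoring rule is a toy heuristic invented for illustration (Shannon's actual router uses LLM-based complexity analysis):

```python
# Toy model router: pick a model from a crude complexity score (illustrative only).
def select_model(query: str) -> str:
    score = len(query.split())  # naive proxy for task complexity
    if score < 10:
        return "gpt-5-mini"      # small tier: simple queries, high volume
    elif score < 50:
        return "gpt-5"           # medium tier: general purpose
    return "gpt-5-thinking"      # large tier: complex reasoning
```

For example, `select_model("What is the capital of France?")` routes to the small tier.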

2. Tool System

Agents can execute various tools.

Built-in Tools:
  • Python code execution (WASI sandboxed)
  • Web search (Google/Serper/Bing/Exa/Firecrawl)
  • Document retrieval
  • Mathematical calculations

MCP Tools: Shannon supports the Model Context Protocol for external tool integration.

Custom Tools: Add your own tools via OpenAPI specs or Python implementations.
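A Python custom tool often amounts to registering a plain function under a name the agent can call. The decorator-based registry below is a hypothetical sketch, not Shannon's real custom-tool API:

```python
# Hypothetical tool registry sketch; Shannon's real custom-tool API may differ.
TOOLS = {}

def tool(name):
    """Register a function as an agent tool under the given name."""
    def decorator(fn):
        TOOLS[name] = fn
        return fn
    return decorator

@tool("word_count")
def word_count(text: str) -> int:
    """Count whitespace-separated words in the input text."""
    return len(text.split())

# The agent would look tools up by name at execution time.
result = TOOLS["word_count"]("hello agent world")
```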

3. Memory System

Agents maintain two types of memory.

Session Memory:
  • Short-term context within a conversation
  • Stored in Redis with configurable TTL (default 30 days)
  • Enables multi-turn dialogues
Vector Memory:
  • Long-term semantic memory in Qdrant
  • Cross-session retrieval
  • MMR diversity for relevant context
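MMR (maximal marginal relevance) balances relevance to the query against similarity to context already selected, so retrieval doesn't return near-duplicates. A minimal sketch over toy similarity scores (not Shannon's implementation; the scores and weights are made up):

```python
# Minimal MMR sketch: greedily pick items that are relevant to the query
# but dissimilar to items already selected (illustrative only).
def mmr(relevance, pairwise_sim, k, lam=0.5):
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i):
            diversity = max((pairwise_sim[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * diversity
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

relevance = [0.9, 0.85, 0.3]
# Items 0 and 1 are near-duplicates; item 2 is different.
sim = [[1.0, 0.95, 0.1],
       [0.95, 1.0, 0.1],
       [0.1, 0.1, 1.0]]
picked = mmr(relevance, sim, k=2)
```

Even though item 1 scores higher on raw relevance than item 2, MMR picks items 0 and 2, because item 1 is nearly identical to item 0.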

Platform Configuration

Shannon behavior is configured via environment variables. Common examples:
# In .env (examples)
DEFAULT_MODEL_TIER=small           # small | medium | large
MAX_TOKENS_PER_REQUEST=10000       # Per-request token budget (LLM service)
MAX_COST_PER_REQUEST=0.50          # Per-request cost limit (USD, LLM service)
AGENT_TIMEOUT_SECONDS=600          # Agent execution timeout (orchestrator)

# Apply changes
docker compose restart

Model Tiers

Shannon automatically selects models based on the configured tier:
Tier     Models                        Use Case                            Cost
SMALL    gpt-5-mini, claude-haiku      Simple queries, high volume         $
MEDIUM   gpt-5, claude-sonnet          General purpose                     $$
LARGE    gpt-5-thinking, claude-opus   Complex reasoning, critical tasks   $$$
Shannon's intelligent router selects the most cost-effective model for each task, often yielding 60-90% savings compared to always using premium models (workload-dependent).
See Configuration Guide for all available environment variables.

Agent Coordination Patterns

Shannon uses proven cognitive patterns for multi-agent coordination:

Chain-of-Thought (CoT)

Sequential reasoning where each step builds on the previous:
Task: "Calculate compound interest"
Step 1: Identify formula
Step 2: Gather inputs
Step 3: Calculate result
Step 4: Format output
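Worked through in code, those four steps look like the following. The principal, rate, and term values are example inputs chosen for illustration:

```python
# Chain-of-thought steps for the compound-interest example, made concrete.
# Step 1: identify the formula  A = P * (1 + r/n) ** (n * t)
# Step 2: gather inputs (example values)
principal, rate, periods_per_year, years = 1000.0, 0.05, 12, 10
# Step 3: calculate the result
amount = principal * (1 + rate / periods_per_year) ** (periods_per_year * years)
# Step 4: format the output
summary = f"${amount:.2f}"
```

With these inputs, $1000 at 5% compounded monthly for 10 years grows to about $1647.01.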

Tree-of-Thoughts (ToT)

Exploration with backtracking for complex problem-solving:
Task: "Design system architecture"
Branch 1: Microservices approach
  ├─ Evaluate pros/cons
  └─ Estimate complexity
Branch 2: Monolithic approach
  ├─ Evaluate pros/cons
  └─ Estimate complexity
Select: Best option based on criteria
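The branch-and-select structure above can be sketched as follows. The pros/cons/complexity numbers and the scoring weights are invented for illustration; in practice each branch would be evaluated by an LLM:

```python
# Toy tree-of-thoughts sketch: score each branch, keep the best (illustrative only).
branches = {
    "microservices": {"pros": 3, "cons": 2, "complexity": 4},
    "monolith": {"pros": 2, "cons": 1, "complexity": 2},
}

def evaluate(branch):
    # Example criterion: net pros minus a complexity penalty (invented weights).
    return branch["pros"] - branch["cons"] - 0.5 * branch["complexity"]

best = max(branches, key=lambda name: evaluate(branches[name]))
```

Real ToT execution also supports backtracking: a branch whose evaluation fails partway can be abandoned and another explored.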

ReAct (Reasoning + Acting)

Interleaved reasoning and action for dynamic tasks:
Think: "I need to find the latest stock price"
Act: Search web for "AAPL stock price"
Observe: "$150.25"
Think: "Now calculate 10% gain"
Act: Calculate 150.25 * 1.10
Result: "$165.28"
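The same trace can be written as a minimal loop body with stubbed tools. The `search_web` stub returns a canned price; a real agent would call the web-search tool and parse its observation:

```python
# Minimal ReAct-style trace with stubbed tools (illustrative only).
def search_web(query: str) -> float:
    # Stub observation: a real agent would call a search tool here.
    return 150.25

def calculate(value: float, factor: float) -> float:
    return value * factor

# Think: "I need the latest stock price" -> Act: search -> Observe: 150.25
price = search_web("AAPL stock price")
# Think: "Now calculate a 10% gain" -> Act: calculate
gained = calculate(price, 1.10)
```

The key property of ReAct is that each Act is chosen after observing the previous result, so the agent can adapt mid-task.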

Security and Isolation

Shannon agents run in secure environments:

WASI Sandbox

All code execution happens in WebAssembly System Interface sandboxes with:
  • No network access
  • Read-only filesystem
  • Memory limits
  • Execution timeouts

OPA Policy Enforcement

Control what agents can do:
# Example policy: Restrict models by team
package shannon.teams.datascience

import rego.v1

allow if {
    input.team == "data-science"
    input.model in ["gpt-5", "claude-sonnet"]
}

max_tokens := 50000 if {
    input.team == "data-science"
}

Best Practices

1. Choose the Right Mode

  • Simple: Single-step tasks, fast responses
  • Standard: Multi-step tasks, moderate complexity
  • Complex: Research, analysis, advanced reasoning

2. Set Budget Limits

Configure token and cost limits at the platform level to prevent unexpected charges:
# In .env file
MAX_TOKENS_PER_REQUEST=5000
MAX_COST_PER_REQUEST=1.0
Monitor costs via task status:
status = client.get_status(task_id, include_details=True)
if status.metrics:
    print(f"Cost: ${status.metrics.cost_usd:.4f}")
    print(f"Tokens: {status.metrics.tokens_used}")

3. Use Sessions for Context

For multi-turn conversations, use consistent session_id:
session_id = "user-123-conversation"
client.submit_task(query="...", session_id=session_id)

4. Monitor Performance

Check metrics to optimize:
# Wait for task to complete
status = client.wait(handle.task_id, timeout=300)
if status.metrics:
    print(f"Tokens used: {status.metrics.tokens_used}")
    print(f"Cost: ${status.metrics.cost_usd:.4f}")
    print(f"Duration: {status.metrics.duration_seconds:.2f}s")

Next Steps