
Introduction

An AI agent is an autonomous system that can perceive its environment, make decisions, and take actions to achieve goals. In Shannon, agents are the fundamental execution units that process tasks using Large Language Models (LLMs) and tools.

Agent Capabilities

Shannon agents can:
  • Reason: use LLMs to analyze tasks, plan solutions, and make decisions
  • Execute Tools: call functions, run code, search the web, and interact with APIs
  • Collaborate: work with other agents to solve complex multi-step problems
  • Learn: improve over time through pattern learning and cached results

Agent Lifecycle

Here’s how an agent processes a task:
1. Task Analysis: the agent receives the task and analyzes what needs to be done
2. Planning: the LLM creates a plan, potentially breaking the task into steps
3. Tool Selection: the agent identifies which tools are needed
4. Execution: the agent iteratively executes tools and processes results
5. Synthesis: the agent combines results into the final answer
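The five phases can be sketched as a plain-Python loop. This is a minimal illustration only, not Shannon's actual implementation; the keyword-matching "analysis" and the `tools` dictionary are invented for the example:

```python
# Minimal sketch of the agent lifecycle (illustrative only; not Shannon's API).
def run_agent(task, tools):
    # 1. Task analysis: decide what the task requires.
    requirements = [word for word in task.lower().split() if word in tools]
    # 2. Planning: order the required steps.
    plan = sorted(requirements)
    # 3. Tool selection + 4. Execution: run each planned tool, collect results.
    results = [tools[name](task) for name in plan]
    # 5. Synthesis: combine results into a final answer.
    return " | ".join(results)

# Hypothetical tools keyed by name.
tools = {
    "search": lambda t: "search results",
    "calculate": lambda t: "42",
}
answer = run_agent("search then calculate the answer", tools)
```

In Shannon, the analysis and planning phases are driven by an LLM rather than keyword matching, but the overall shape of the loop is the same.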

Agent Types in Shannon

Single Agent (Simple Mode)

A single agent handles the entire task without decomposition. Best for:
  • Simple queries
  • Fact retrieval
  • Basic calculations
  • Quick responses
Example:
client.submit_task(
    query="What is the capital of France?"
)
Shannon automatically selects single-agent mode for simple queries.

Multi-Agent (Standard/Complex Mode)

Multiple specialized agents work together, coordinated by Shannon’s orchestrator. Best for:
  • Complex research
  • Multi-step workflows
  • Tasks requiring different expertise
  • Tasks needing parallel processing
Example:
client.submit_task(
    query="Research top 5 AI trends, analyze market impact, and create summary"
)
Shannon automatically decomposes complex queries into multi-agent workflows based on query complexity.

Agent Components

1. LLM Brain

The decision-making core powered by language models:
  • Model Selection: Shannon auto-selects models based on task complexity
    • Small tasks → gpt-5-mini, claude-haiku
    • Complex tasks → gpt-5, claude-opus
  • Context Management: Automatically manages context windows
  • Caching: Reuses previous LLM responses when applicable
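A tier-based router of this kind can be sketched as follows. The tier-to-model mapping matches the table later in this page, but the word-count scoring rule is a toy heuristic invented for illustration (Shannon's actual router uses LLM-based complexity analysis):

```python
# Toy model router: pick a model from a crude complexity score (illustrative only).
def select_model(query: str) -> str:
    score = len(query.split())  # naive proxy for task complexity
    if score < 10:
        return "gpt-5-mini"      # small tier: simple queries, high volume
    elif score < 50:
        return "gpt-5"           # medium tier: general purpose
    return "gpt-5-thinking"      # large tier: complex reasoning
```

For example, `select_model("What is the capital of France?")` routes to the small tier.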

2. Tool System

Agents can execute various tools.

Built-in Tools:
  • Python code execution (WASI sandboxed)
  • Web search (Google/Serper/Bing/Exa/Firecrawl)
  • Document retrieval
  • Mathematical calculations

MCP Tools: Shannon supports the Model Context Protocol for external tool integration.

Custom Tools: Add your own tools via OpenAPI specs or Python implementations.
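A Python custom tool often amounts to registering a plain function under a name the agent can call. The decorator-based registry below is a hypothetical sketch, not Shannon's real custom-tool API:

```python
# Hypothetical tool registry sketch; Shannon's real custom-tool API may differ.
TOOLS = {}

def tool(name):
    """Register a function as an agent tool under the given name."""
    def decorator(fn):
        TOOLS[name] = fn
        return fn
    return decorator

@tool("word_count")
def word_count(text: str) -> int:
    """Count whitespace-separated words in the input text."""
    return len(text.split())

# The agent would look tools up by name at execution time.
result = TOOLS["word_count"]("hello agent world")
```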

3. Memory System

Agents maintain two types of memory.

Session Memory:
  • Short-term context within a conversation
  • Stored in Redis with configurable TTL (default 30 days)
  • Enables multi-turn dialogues
Vector Memory:
  • Long-term semantic memory in Qdrant
  • Cross-session retrieval
  • MMR diversity for relevant context
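MMR (maximal marginal relevance) balances relevance to the query against similarity to context already selected, so retrieval doesn't return near-duplicates. A minimal sketch over toy similarity scores (not Shannon's implementation; the scores and weights are made up):

```python
# Minimal MMR sketch: greedily pick items that are relevant to the query
# but dissimilar to items already selected (illustrative only).
def mmr(relevance, pairwise_sim, k, lam=0.5):
    selected = []
    candidates = list(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i):
            diversity = max((pairwise_sim[i][j] for j in selected), default=0.0)
            return lam * relevance[i] - (1 - lam) * diversity
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

relevance = [0.9, 0.85, 0.3]
# Items 0 and 1 are near-duplicates; item 2 is different.
sim = [[1.0, 0.95, 0.1],
       [0.95, 1.0, 0.1],
       [0.1, 0.1, 1.0]]
picked = mmr(relevance, sim, k=2)
```

Even though item 1 scores higher on raw relevance than item 2, MMR picks items 0 and 2, because item 1 is nearly identical to item 0.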

Platform Configuration

Shannon behavior is configured via environment variables. Common examples:
# In .env (examples)
DEFAULT_MODEL_TIER=small           # small | medium | large
MAX_TOKENS_PER_REQUEST=10000       # Per-request token budget (LLM service)
MAX_COST_PER_REQUEST=0.50          # Per-request cost limit (USD, LLM service)
AGENT_TIMEOUT_SECONDS=600          # Agent execution timeout (orchestrator)

# Apply changes
docker compose restart

Model Tiers

Shannon automatically selects models based on the configured tier:
Tier     Models                        Use Case                            Cost
SMALL    gpt-5-mini, claude-haiku      Simple queries, high volume         $
MEDIUM   gpt-5, claude-sonnet          General purpose                     $$
LARGE    gpt-5-thinking, claude-opus   Complex reasoning, critical tasks   $$$
Shannon's intelligent router selects the most cost-effective model for each task, often yielding 60-90% savings compared to always using premium models (workload-dependent).
See Configuration Guide for all available environment variables.

Agent Coordination Patterns

Shannon uses proven cognitive patterns for multi-agent coordination:

Chain-of-Thought (CoT)

Sequential reasoning where each step builds on the previous:
Task: "Calculate compound interest"
Step 1: Identify formula
Step 2: Gather inputs
Step 3: Calculate result
Step 4: Format output
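Worked through in code, those four steps look like the following. The principal, rate, and term values are example inputs chosen for illustration:

```python
# Chain-of-thought steps for the compound-interest example, made concrete.
# Step 1: identify the formula  A = P * (1 + r/n) ** (n * t)
# Step 2: gather inputs (example values)
principal, rate, periods_per_year, years = 1000.0, 0.05, 12, 10
# Step 3: calculate the result
amount = principal * (1 + rate / periods_per_year) ** (periods_per_year * years)
# Step 4: format the output
summary = f"${amount:.2f}"
```

With these inputs, $1000 at 5% compounded monthly for 10 years grows to about $1647.01.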

Tree-of-Thoughts (ToT)

Exploration with backtracking for complex problem-solving:
Task: "Design system architecture"
Branch 1: Microservices approach
  ├─ Evaluate pros/cons
  └─ Estimate complexity
Branch 2: Monolithic approach
  ├─ Evaluate pros/cons
  └─ Estimate complexity
Select: Best option based on criteria
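The branch-and-select structure above can be sketched as follows. The pros/cons/complexity numbers and the scoring weights are invented for illustration; in practice each branch would be evaluated by an LLM:

```python
# Toy tree-of-thoughts sketch: score each branch, keep the best (illustrative only).
branches = {
    "microservices": {"pros": 3, "cons": 2, "complexity": 4},
    "monolith": {"pros": 2, "cons": 1, "complexity": 2},
}

def evaluate(branch):
    # Example criterion: net pros minus a complexity penalty (invented weights).
    return branch["pros"] - branch["cons"] - 0.5 * branch["complexity"]

best = max(branches, key=lambda name: evaluate(branches[name]))
```

Real ToT execution also supports backtracking: a branch whose evaluation fails partway can be abandoned and another explored.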

ReAct (Reasoning + Acting)

Interleaved reasoning and action for dynamic tasks:
Think: "I need to find the latest stock price"
Act: Search web for "AAPL stock price"
Observe: "$150.25"
Think: "Now calculate 10% gain"
Act: Calculate 150.25 * 1.10
Result: "$165.28"
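The same trace can be written as a minimal loop body with stubbed tools. The `search_web` stub returns a canned price; a real agent would call the web-search tool and parse its observation:

```python
# Minimal ReAct-style trace with stubbed tools (illustrative only).
def search_web(query: str) -> float:
    # Stub observation: a real agent would call a search tool here.
    return 150.25

def calculate(value: float, factor: float) -> float:
    return value * factor

# Think: "I need the latest stock price" -> Act: search -> Observe: 150.25
price = search_web("AAPL stock price")
# Think: "Now calculate a 10% gain" -> Act: calculate
gained = calculate(price, 1.10)
```

The key property of ReAct is that each Act is chosen after observing the previous result, so the agent can adapt mid-task.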

Security and Isolation

Shannon agents run in secure environments:

WASI Sandbox

All code execution happens in WebAssembly System Interface sandboxes with:
  • No network access
  • Read-only filesystem
  • Memory limits
  • Execution timeouts

OPA Policy Enforcement

Control what agents can do:
# Example policy: Restrict models by team
package shannon.teams.datascience

import rego.v1

allow if {
    input.team == "data-science"
    input.model in ["gpt-5", "claude-sonnet"]
}

max_tokens := 50000 if {
    input.team == "data-science"
}

Best Practices

1. Choose the Right Mode

  • Simple: Single-step tasks, fast responses
  • Standard: Multi-step tasks, moderate complexity
  • Complex: Research, analysis, advanced reasoning

2. Set Budget Limits

Configure token and cost limits at the platform level to prevent unexpected charges:
# In .env file
MAX_TOKENS_PER_REQUEST=5000
MAX_COST_PER_REQUEST=1.0
Monitor costs via task status:
status = client.get_status(task_id, include_details=True)
if status.metrics:
    print(f"Cost: ${status.metrics.cost_usd:.4f}")
    print(f"Tokens: {status.metrics.tokens_used}")

3. Use Sessions for Context

For multi-turn conversations, use consistent session_id:
session_id = "user-123-conversation"
client.submit_task(query="...", session_id=session_id)

4. Monitor Performance

Check metrics to optimize:
# Wait for task to complete
status = client.wait(handle.task_id, timeout=300)
if status.metrics:
    print(f"Tokens used: {status.metrics.tokens_used}")
    print(f"Cost: ${status.metrics.cost_usd:.4f}")
    print(f"Duration: {status.metrics.duration_seconds:.2f}s")

Next Steps