
What are Workflows?

Workflows in Shannon are durable, stateful processes that orchestrate AI agents to complete complex tasks. Built on Temporal, they provide:
  • Durability: Workflows survive service restarts and failures
  • Determinism: Can be replayed for debugging
  • Visibility: Full execution history and state inspection
  • Reliability: Automatic retries and error handling
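Because workflow state lives in Temporal rather than in the client process, you can submit a task, disconnect, and reattach later by workflow ID. Below is a minimal sketch using only the SDK calls shown on this page (submit_task, get_status, stream); how the client object is constructed, and whether get_status accepts the workflow ID directly, are assumptions here.

# Assumption: `client` is an already-initialized Shannon SDK client (see Quickstart),
# and the workflow ID on the handle is accepted by get_status as the task identifier.
handle = client.submit_task(query="Summarize this quarter's incident reports")
workflow_id = handle.workflow_id

# Even if this process or the orchestrator restarts, the workflow keeps running
# in Temporal. Reattach later using the workflow ID alone:
status = client.get_status(workflow_id)
print(f"Status: {status.status}, Progress: {status.progress}")

# Or resume streaming events from the durable execution history:
for event in client.stream(workflow_id):
    print(f"[{event.type}] {event.message}")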

Workflow Architecture

Cognitive Patterns

Shannon implements several proven cognitive patterns for different task types:

Chain-of-Thought (CoT)

Sequential reasoning where each step builds logically on the previous one. Best for:
  • Mathematical problems
  • Step-by-step analysis
  • Linear workflows
Example:
client.submit_task(
    query="Calculate the ROI of a $100k investment with 8% annual return over 5 years"
)
Shannon automatically applies Chain-of-Thought reasoning for sequential mathematical and analytical tasks.
Execution:
Step 1: Identify formula (ROI = (Final Value - Initial Value) / Initial Value)
Step 2: Calculate year-by-year growth
Step 3: Compute final value ($146,933)
Step 4: Calculate ROI (46.93%)
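The figures in the trace can be checked directly. A short Python snippet reproducing the year-by-year growth:

# Verify the numbers from the Chain-of-Thought trace above
initial = 100_000
rate = 0.08
years = 5

value = initial
for year in range(1, years + 1):
    value *= 1 + rate
    print(f"Year {year}: ${value:,.2f}")

roi = (value - initial) / initial
print(f"Final value: ${value:,.2f}")   # ~$146,932.81
print(f"ROI: {roi:.2%}")               # ~46.93%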

Tree-of-Thoughts (ToT)

Explores multiple solution paths simultaneously, evaluates them, and selects the best. Best for:
  • Design decisions
  • Strategic planning
  • Problems with multiple approaches
Example:
client.submit_task(
    query="Design a scalable message queue system. Evaluate multiple architectures and recommend the best approach."
)
Shannon automatically applies Tree-of-Thoughts exploration for design and planning tasks with multiple solution paths.
Execution:
Branch 1: Redis-based solution
  ├─ Pros: Fast, simple, familiar
  ├─ Cons: Limited persistence, memory-bound
  └─ Score: 7/10

Branch 2: Kafka-based solution
  ├─ Pros: High throughput, durable, battle-tested
  ├─ Cons: Operational complexity, resource-heavy
  └─ Score: 9/10

Branch 3: RabbitMQ-based solution
  ├─ Pros: Feature-rich, good balance
  ├─ Cons: Lower throughput than Kafka
  └─ Score: 8/10

Selected: Kafka (highest score for requirements)
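Conceptually, the final selection step reduces to scoring each branch against the requirements and keeping the best. The snippet below is illustrative only; Shannon generates and evaluates branches internally.

# Illustrative only: Shannon performs branch generation and scoring inside the workflow.
branches = [
    {"name": "Redis",    "score": 7, "notes": "fast, simple, memory-bound"},
    {"name": "Kafka",    "score": 9, "notes": "durable, high throughput, complex ops"},
    {"name": "RabbitMQ", "score": 8, "notes": "feature-rich, lower throughput"},
]

best = max(branches, key=lambda b: b["score"])
print(f"Selected: {best['name']} ({best['score']}/10) - {best['notes']}")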

ReAct (Reasoning + Acting)

Interleaves thinking and acting in loops, ideal for dynamic problem-solving. Best for:
  • Research tasks
  • Debugging problems
  • Adaptive workflows
Example:
client.submit_task(
    query="Find the latest Rust async runtime benchmarks"
)
Shannon automatically applies ReAct (Reasoning + Acting) for research and iterative discovery tasks.
Execution:
Think: "I need to search for recent benchmarks"
Act: web_search("Rust async runtime benchmarks 2024")
Observe: Found 3 articles from 2024
Think: "Let me get detailed comparison data"
Act: web_search("tokio vs async-std vs smol benchmark comparison")
Observe: Found performance data
Think: "Now I can summarize findings"
Act: Synthesize results into report
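The think/act/observe cycle can be pictured as a loop that alternates between an LLM reasoning step and a tool call until the model decides it has enough information. The sketch below is illustrative, not Shannon's implementation; llm_think and web_search are hypothetical stand-ins for the model call and the search tool.

# Illustrative ReAct loop; llm_think() and web_search() are hypothetical stubs,
# not Shannon APIs. Shannon runs this cycle inside the workflow for you.
def llm_think(history):
    # Would call an LLM with the scratchpad; stubbed for illustration.
    return {"done": len(history) >= 2, "query": "Rust async runtime benchmarks 2024"}

def web_search(query):
    # Would call a search tool; stubbed for illustration.
    return f"results for: {query}"

history = []
while True:
    thought = llm_think(history)                 # Think: decide the next action
    if thought["done"]:
        break
    observation = web_search(thought["query"])   # Act: run the tool
    history.append(observation)                  # Observe: record the result

print("Synthesize findings from:", history)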

Debate Pattern

Multiple agents argue different perspectives, then synthesize consensus. Best for:
  • Decision making
  • Evaluating trade-offs
  • Balanced analysis
Example:
client.submit_task(
    query="Should we migrate our monolith to microservices? Provide arguments for and against, then recommend."
)
Shannon automatically applies the Debate pattern for decision-making tasks requiring multiple perspectives.
Execution:
Agent 1 (Pro-Microservices):
  - Better scalability
  - Independent deployment
  - Technology flexibility

Agent 2 (Pro-Monolith):
  - Simpler operations
  - Easier debugging
  - Lower latency

Agent 3 (Synthesizer):
  - Evaluates both arguments
  - Considers context
  - Provides recommendation

Task Decomposition

For complex tasks, Shannon automatically decomposes them into subtasks:

DAG (Directed Acyclic Graph) Execution

Parallel Execution: Subtasks without dependencies run in parallel, reducing latency.
Example:
# This query will be automatically decomposed into 3 parallel subtasks
client.submit_task(
    query="Research: 1) GPT-5 capabilities, 2) Claude 3 features, 3) Gemini updates"
)

Decomposition Strategy

Shannon analyzes tasks and creates execution plans internally. While the decomposition structure isn’t directly exposed in the SDK response, you can observe the workflow execution through events:
# Stream events to see workflow execution in action
for event in client.stream(workflow_id):
    if event.type == "PROGRESS":
        print(f"Progress: {event.message}")
    elif event.type == "AGENT_STARTED":
        print(f"Agent started: {event.agent_id} - {event.message}")
    elif event.type == "AGENT_COMPLETED":
        print(f"Agent completed: {event.agent_id}")
Internal Structure (for understanding, not directly accessible):
  • Subtasks run in parallel when no dependencies exist
  • Synthesis task waits for all subtasks to complete
  • Each subtask is assigned to specialized agents
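For intuition, a decomposition like the research example above can be pictured as a small DAG: independent subtasks with no dependencies plus a synthesis node that depends on all of them. The structure below is purely illustrative and is not an object returned by the SDK.

# Hypothetical illustration of the internal plan; not an SDK object.
plan = {
    "subtasks": [
        {"id": "t1", "query": "GPT-5 capabilities",  "depends_on": []},
        {"id": "t2", "query": "Claude 3 features",   "depends_on": []},
        {"id": "t3", "query": "Gemini updates",      "depends_on": []},
        {"id": "synth", "query": "Combine findings", "depends_on": ["t1", "t2", "t3"]},
    ]
}

# Subtasks with no dependencies can run in parallel; synthesis waits for all of them.
ready = [t["id"] for t in plan["subtasks"] if not t["depends_on"]]
print("Runs in parallel:", ready)   # ['t1', 't2', 't3']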

Workflow Activities

Temporal workflows are composed of activities - discrete units of work:
Activity            | Purpose
DecomposeTask       | Analyzes the task and creates subtasks
ExecuteAgent        | Runs a single agent task
SynthesizeResults   | Combines outputs from multiple agents
UpdateSessionResult | Persists session state and appends the assistant message to session history (fix 2025-11-05)
RecordQuery         | Stores the query in vector memory
FetchSessionMemory  | Retrieves relevant context from session memory

Monitoring Workflows

Via Python SDK

# Get task status
status = client.get_status(task_id)

print(f"Status: {status.status}")
print(f"Progress: {status.progress}")
if status.result:
    print(f"Result: {status.result}")
Task decomposition happens internally in Shannon. Use event streaming to observe workflow execution in real-time.
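If you only need the final answer rather than live events, a simple polling loop built on get_status works as well. Treating a populated result as completion mirrors the check above; the exact status strings reported by your deployment are not assumed here.

import time

# Poll until the workflow produces a result (sketch; adapt the terminal
# condition to the status values your Shannon deployment reports).
while True:
    status = client.get_status(task_id)
    print(f"Status: {status.status}, Progress: {status.progress}")
    if status.result:
        print(f"Result: {status.result}")
        break
    time.sleep(2)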

Via Temporal UI

Visit http://localhost:8088 to see:
  • Workflow execution timeline
  • Activity statuses
  • Input/output payloads
  • Error traces
  • Replay history

Deterministic Replay

Shannon workflows are deterministic - they produce the same result when replayed with the same inputs. Use cases:
  • Debugging: Replay failed workflows to find bugs
  • Testing: Validate code changes don’t break existing workflows
  • Auditing: Understand exactly what happened
Example:
# Export workflow history
make replay-export WORKFLOW_ID=task-123 OUT=history.json

# Replay against current code
make replay HISTORY=history.json

# If code changed in non-deterministic way, replay fails

Workflow Configuration

Workflow behavior is configured via environment variables (examples):
# In .env
MAX_AGENTS_PER_TASK=5          # Max parallel agents (if enabled)
MAX_TOKENS_PER_REQUEST=50000   # LLM token budget per request (LLM service)
MAX_COST_PER_REQUEST=5.0       # LLM cost limit per request (USD)
AGENT_TIMEOUT_SECONDS=600      # Agent execution timeout (orchestrator)
Monitor workflow execution via streaming events:
handle = client.submit_task(query="Complex analysis task")

# Stream workflow events
for event in client.stream(handle.workflow_id):
    print(f"[{event.type}] {event.message}")
See Configuration Guide for all environment variables.

Error Handling

Workflows automatically handle failures at several levels.
Retries: Activities retry automatically with exponential backoff:
Attempt 1: Immediate
Attempt 2: After 1s
Attempt 3: After 2s
Attempt 4: After 4s
...
Max: 5 attempts
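That schedule corresponds to a standard Temporal retry policy: 1s initial interval, backoff coefficient 2, at most 5 attempts. For reference, the equivalent policy expressed with the Temporal Python SDK; Shannon's orchestrator configures its own activity options, so treat this as a sketch rather than its actual code.

from datetime import timedelta
from temporalio.common import RetryPolicy

# Equivalent retry behavior expressed as a Temporal retry policy (sketch).
retry_policy = RetryPolicy(
    initial_interval=timedelta(seconds=1),   # attempt 2 fires after 1s
    backoff_coefficient=2.0,                 # 1s -> 2s -> 4s ...
    maximum_attempts=5,                      # give up after 5 attempts
)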
Provider fallback: If an LLM provider is failing, the circuit breaker opens and requests route to a healthy fallback:
Primary: OpenAI (failing)
Fallback: Anthropic (healthy)
Degradation: If complex-mode execution fails, Shannon automatically falls back to a simpler execution strategy.
Budget enforcement: Tasks halt immediately when token or cost budgets are reached, preventing cost overruns.

Best Practices

1. Choose the Right Strategy

Match the cognitive strategy to your task:
Task Type   | Recommended Strategy
Simple Q&A  | DIRECT (single agent)
Research    | REACT (web search + synthesis)
Analysis    | DECOMPOSE (break into parts)
Design      | EXPLORATORY (ToT evaluation)

2. Use Appropriate Mode

  • simple: Direct execution, no overhead
  • standard: Task decomposition, multi-agent
  • complex: Full cognitive patterns (CoT, ToT, etc.)

3. Monitor Progress

Stream events to track workflow execution:
for event in client.stream(workflow_id):
    if event.type == "LLM_PROMPT":
        print(f"Prompt: {event.message}")
    elif event.type == "AGENT_COMPLETED":
        print(f"Agent done: {event.agent_id}")

4. Set Timeouts

Prevent workflows from running indefinitely via environment variables:
# In .env file
AGENT_TIMEOUT_SECONDS=300  # 5-minute per-agent execution limit

Next Steps