## Endpoint
## Description
Returns all models currently configured in Shannon, organized by provider. This endpoint queries the Python LLM service directly and reflects the models defined in `config/models.yaml`.
Authentication
**Required**: No (internal service endpoint). For production deployments, access should be restricted to internal networks only.

## Request
### Query Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| `tier` | string | No | Filter by tier: `small`, `medium`, or `large` |
### Headers
None required for internal access.

## Response
### Success Response
**Status:** `200 OK`
**Body:**
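For illustration, a body with two configured providers. The model IDs come from the tier lists below; the numeric values (context windows, prices) are placeholders, not Shannon's actual configuration:

```json
{
  "openai": [
    {
      "id": "gpt-5-nano-2025-08-07",
      "name": "gpt-5-nano-2025-08-07",
      "tier": "small",
      "context_window": 128000,
      "cost_per_1k_prompt_tokens": 0.0001,
      "cost_per_1k_completion_tokens": 0.0004,
      "supports_tools": true,
      "supports_streaming": true,
      "available": true
    }
  ],
  "anthropic": [
    {
      "id": "claude-haiku-4-5-20251001",
      "name": "claude-haiku-4-5-20251001",
      "tier": "small",
      "context_window": 200000,
      "cost_per_1k_prompt_tokens": 0.0008,
      "cost_per_1k_completion_tokens": 0.004,
      "supports_tools": true,
      "supports_streaming": true,
      "available": true
    }
  ]
}
```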
### Response Structure
The response is organized by provider, with each provider returning an array of model objects:

| Field | Type | Description |
|---|---|---|
| `id` | string | Model identifier (canonical name) |
| `name` | string | Display name (same as `id`) |
| `tier` | string | Size tier: `small`, `medium`, or `large` |
| `context_window` | integer | Maximum context length in tokens |
| `cost_per_1k_prompt_tokens` | float | Cost per 1K input tokens (USD) |
| `cost_per_1k_completion_tokens` | float | Cost per 1K output tokens (USD) |
| `supports_tools` | boolean | Function calling support |
| `supports_streaming` | boolean | Real-time streaming support |
| `available` | boolean | Currently available for use |
## Examples
### List All Models
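A minimal sketch in Python; the `/models` path and the `http://localhost:8000` base URL are assumptions based on the LLM service port noted further down.

```python
import requests

# Query the LLM service directly (port 8000), not the Gateway API (port 8080).
resp = requests.get("http://localhost:8000/models")
resp.raise_for_status()

# One array of model objects per provider.
for provider, models in resp.json().items():
    print(provider, [m["id"] for m in models])
```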
### Filter by Tier
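The same call with the `tier` query parameter (again assuming the `/models` path):

```python
import requests

# "tier" accepts small, medium, or large.
resp = requests.get("http://localhost:8000/models", params={"tier": "small"})
resp.raise_for_status()
print(resp.json())
```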
### Python Example
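A fuller sketch that combines the tier filter with the response fields documented above, picking the cheapest available model per tier. The endpoint path and base URL remain assumptions:

```python
import requests

BASE_URL = "http://localhost:8000"  # LLM service, not the Gateway (8080)


def cheapest_available(tier: str) -> dict | None:
    """Return the cheapest available model in a tier, by prompt-token cost."""
    resp = requests.get(f"{BASE_URL}/models", params={"tier": tier})
    resp.raise_for_status()
    candidates = [
        model
        for models in resp.json().values()  # one list of models per provider
        for model in models
        if model["available"]
    ]
    return min(candidates, key=lambda m: m["cost_per_1k_prompt_tokens"], default=None)


if __name__ == "__main__":
    for tier in ("small", "medium", "large"):
        model = cheapest_available(tier)
        print(tier, "->", model["id"] if model else "no models available")
```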
## Model Tiers
Models are organized into three tiers based on capability and cost.

### Small Tier (Priority for 50% of workload)
Fast, cost-optimized models for basic tasks:

- OpenAI: `gpt-5-nano-2025-08-07`
- Anthropic: `claude-haiku-4-5-20251001`
- xAI: `grok-4-fast-non-reasoning`
- Google: `gemini-2.5-flash-lite`
- DeepSeek: `deepseek-chat`
### Medium Tier (Priority for 40% of workload)
Balanced capability/cost models:

- OpenAI: `gpt-5-2025-08-07`
- Anthropic: `claude-sonnet-4-5-20250929`
- xAI: `grok-4`
- Google: `gemini-2.5-flash`
- Meta: `llama-4-scout`
### Large Tier (Priority for 10% of workload)
Heavy reasoning models for complex tasks:

- OpenAI: `gpt-4.1-2025-04-14`, `gpt-5-pro-2025-10-06`
- Anthropic: `claude-opus-4-1-20250805`
- Google: `gemini-2.5-pro`
- DeepSeek: `deepseek-r1`
- xAI: `grok-4-fast-reasoning`
## Configuration Source
Models are defined in `config/models.yaml` under `model_catalog`; per-token costs are defined in the `pricing.models` section of the same file:
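A sketch of how the two sections might be laid out; the nesting and key names below are assumptions for illustration, not the authoritative schema:

```yaml
# Hypothetical layout -- the real schema is defined by Shannon's config.
model_catalog:
  openai:
    - id: gpt-5-nano-2025-08-07
      tier: small
      context_window: 128000        # illustrative value
      supports_tools: true
      supports_streaming: true

pricing:
  models:
    openai:
      gpt-5-nano-2025-08-07:
        prompt_per_1k: 0.0001       # illustrative, not real pricing
        completion_per_1k: 0.0004
```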
## Use Cases
### 1. Discover Available Models
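A sketch of this use case under the same assumed `/models` path: collect only the models whose `available` flag is true.

```python
import requests

resp = requests.get("http://localhost:8000/models")
resp.raise_for_status()

# Keep only usable model IDs, grouped by provider.
available = {
    provider: [m["id"] for m in models if m["available"]]
    for provider, models in resp.json().items()
}
print(available)
```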
## Notes

- **Static Configuration**: Models are loaded from `config/models.yaml`, not dynamically discovered from provider APIs
- **No Hot Reload**: Changes to `models.yaml` require a service restart to take effect
- **Empty Providers**: If a provider returns `[]`, check that its API key is set in `.env`
- **Pricing Centralization**: All costs come from the `pricing` section in the YAML, ensuring consistency across the Go/Rust/Python services
- **Internal Endpoint**: This endpoint is on the LLM service (port 8000), not the Gateway API (port 8080)
## Environment Variables
Model selections can be overridden with environment variables.

## Troubleshooting
### Empty provider arrays

- Verify the API key is set: `OPENAI_API_KEY`, `ANTHROPIC_API_KEY`, etc.
- Check that `config/models.yaml` has entries under `model_catalog.<provider>`

### Models missing or not loading

- Ensure `MODELS_CONFIG_PATH` points to the correct file
- Verify the YAML syntax is valid
- Check for typos in model IDs

### Incorrect pricing

- Pricing comes from the `pricing.models.<provider>` section
- Update `config/models.yaml` and restart services
- Verify the Go/Rust services also read the same config file