
Why Multi-Agent Teams?

Single Agent Limitations:
  • Complex prompts become unwieldy (2000+ lines)
  • Hard to test and maintain
  • Generic responses for diverse tasks

Multi-Agent Benefits:
  • Specialization: Each agent focuses on one domain
  • Coordination: Coordinator delegates to specialists
  • Maintainability: Small, focused prompts
  • Testability: Test specialists independently

Team Structure

Incident Coordinator (Orchestrator)
├── Kubernetes Expert
├── Log Analyzer
├── Metrics Analyzer
├── Database Troubleshooter
└── Remediation Executor

How It Works

1. User Reports Issue

"API latency spiked from 200ms to 5s starting 10 minutes ago."

2. Coordinator Analyzes & Delegates

Coordinator:
  - Analyzes: Performance degradation
  - Delegates to: Metrics, Logs, K8s specialists

3. Specialists Execute

Metrics Analyzer:
  - Queries Prometheus
  - Finds: DB query time increased 10x

Log Analyzer:
  - Searches application logs
  - Finds: "connection timeout" errors

Kubernetes Expert:
  - Checks pod status
  - Finds: App pods restarting

4. Coordinator Synthesizes

ROOT CAUSE: Database connection pool exhausted
RECOMMENDATION: Scale DB connection pool
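The four steps above can be sketched in plain Python. This is only an illustration of the delegate-then-synthesize pattern, not Station's API: the Finding type, the coordinate function, and the stub specialists are all hypothetical, and the list of specialists to call is passed in by hand, whereas a real coordinator would pick them itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    specialist: str
    summary: str


# Hypothetical stand-ins for specialist agents. Real specialists would
# query Prometheus, search logs, or inspect the cluster; these stubs
# return the findings from the walkthrough above.
def metrics_analyzer(report: str) -> Finding:
    return Finding("metrics_analyzer", "DB query time increased 10x")


def log_analyzer(report: str) -> Finding:
    return Finding("log_analyzer", '"connection timeout" errors')


def kubernetes_expert(report: str) -> Finding:
    return Finding("kubernetes_expert", "App pods restarting")


SPECIALISTS: Dict[str, Callable[[str], Finding]] = {
    "metrics_analyzer": metrics_analyzer,
    "log_analyzer": log_analyzer,
    "kubernetes_expert": kubernetes_expert,
}


def coordinate(report: str, delegate_to: List[str]) -> str:
    """Send the report to each chosen specialist and collect the findings."""
    findings = [SPECIALISTS[name](report) for name in delegate_to]
    lines = [f"- {f.specialist}: {f.summary}" for f in findings]
    return "Findings:\n" + "\n".join(lines)


if __name__ == "__main__":
    print(coordinate(
        "API latency spiked from 200ms to 5s starting 10 minutes ago.",
        ["metrics_analyzer", "log_analyzer", "kubernetes_expert"],
    ))
```

The sketch stops at collecting findings. Turning them into a root cause and a recommendation is the part the coordinator's model does.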

Creating Teams

Via MCP Tools

The easiest way to create multi-agent teams:
"Create an incident response team with a coordinator that delegates to 
kubernetes, logs, metrics, and database specialists"

Station uses these MCP tools:

Tool                   Purpose
create_agent           Create coordinator or specialist agents
add_agent_as_tool      Link specialist to coordinator
remove_agent_as_tool   Unlink specialist from coordinator

Example conversation:
You: Create a kubernetes-expert agent
Station: ✅ Created kubernetes-expert (ID: 12)

You: Create an incident-coordinator that uses kubernetes-expert
Station: ✅ Created incident-coordinator (ID: 13)
         ✅ Added kubernetes-expert as tool __agent_kubernetes_expert

Via .prompt File

The coordinator lists its specialists under the agents key in the frontmatter and calls them as tools:
---
metadata:
  name: "Incident Coordinator"
  description: "Orchestrates incident response"
model: gpt-5-mini
max_steps: 15
agents:
  - "kubernetes_expert"
  - "log_analyzer"
  - "metrics_analyzer"
  - "database_troubleshooter"
---

{{role "system"}}
You are an Incident Response Coordinator.

When investigating incidents:
1. Gather initial context from the report
2. Delegate to appropriate specialists
3. Synthesize findings into root cause
4. Recommend remediation steps

Available specialists:
- kubernetes_expert: Pod, deployment, service issues
- log_analyzer: Application and system logs
- metrics_analyzer: Prometheus/Grafana metrics
- database_troubleshooter: DB performance issues

{{role "user"}}
{{userInput}}

Creating Specialists

Specialists are regular agents focused on one domain:
---
metadata:
  name: "Kubernetes Expert"
  description: "Diagnoses Kubernetes issues"
model: gpt-5-mini
max_steps: 8
tools:
  - "__kubectl_get_pods"
  - "__kubectl_describe"
  - "__kubectl_logs"
---

{{role "system"}}
You are a Kubernetes expert. Diagnose pod, deployment, and service issues.

Focus on:
- Pod status and restarts
- Resource constraints
- Network policies
- Recent deployments

{{role "user"}}
{{userInput}}

Agent-as-Tool Naming

When an agent is used as a tool, its name becomes:
__agent_{agent_name_snake_case}
Example: “Kubernetes Expert” → __agent_kubernetes_expert
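A minimal sketch of that naming rule, for predicting a tool name before linking an agent. The agent_tool_name helper is hypothetical, and it assumes that snake_case here means lowercasing the name and collapsing each run of non-alphanumeric characters into one underscore, which matches both examples on this page but may not cover every case Station handles.

```python
import re


def agent_tool_name(agent_name: str) -> str:
    # Assumption: lowercase, then replace each run of characters outside
    # a-z and 0-9 with a single underscore, trimming stray underscores.
    snake = re.sub(r"[^a-z0-9]+", "_", agent_name.strip().lower()).strip("_")
    return f"__agent_{snake}"


if __name__ == "__main__":
    print(agent_tool_name("Kubernetes Expert"))
    print(agent_tool_name("kubernetes-expert"))
```

Both calls print __agent_kubernetes_expert, the tool name shown in the example conversation earlier.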

Best Practices

  • Each specialist should do one thing well. 5-8 tools max per specialist.
  • Coordinator prompts should clearly explain when to use each specialist.
  • Test each specialist before testing the full team.
  • Avoid coordinators calling other coordinators. Keep hierarchy flat.
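Two of these guidelines, the tool limit and the flat hierarchy, can be checked mechanically. The following is a rough sketch under stated assumptions: AgentSpec and check_team are hypothetical names, a team is modelled as a plain dictionary rather than read from Station, and the limit of 8 is taken from the upper end of the 5-8 guideline.

```python
from dataclasses import dataclass, field
from typing import Dict, List

MAX_TOOLS_PER_SPECIALIST = 8  # upper end of the "5-8 tools" guideline


@dataclass
class AgentSpec:
    name: str
    tools: List[str] = field(default_factory=list)
    agents: List[str] = field(default_factory=list)  # specialists used as tools


def check_team(team: Dict[str, AgentSpec]) -> List[str]:
    """Return a list of guideline violations; empty means none were found."""
    problems: List[str] = []
    for spec in team.values():
        is_specialist = not spec.agents
        if is_specialist and len(spec.tools) > MAX_TOOLS_PER_SPECIALIST:
            problems.append(
                f"{spec.name}: {len(spec.tools)} tools (max {MAX_TOOLS_PER_SPECIALIST})"
            )
        for child in spec.agents:
            # A child that has its own agents is itself a coordinator.
            if child in team and team[child].agents:
                problems.append(f"{spec.name}: delegates to coordinator {child}")
    return problems


if __name__ == "__main__":
    team = {
        "incident_coordinator": AgentSpec(
            "incident_coordinator", agents=["kubernetes_expert"]
        ),
        "kubernetes_expert": AgentSpec(
            "kubernetes_expert",
            tools=["__kubectl_get_pods", "__kubectl_describe", "__kubectl_logs"],
        ),
    }
    print(check_team(team))
```

The other two guidelines, clear coordinator prompts and testing specialists first, are about process and are not covered by a check like this.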

Next Steps