The Agentic Harness is an alternative execution engine for Station agents that provides advanced capabilities beyond the standard Genkit-based execution.
Overview
Add harness: agentic to any agent’s dotprompt file to enable:
Manual agentic loop - Step-by-step control over agent execution
Doom loop detection - Prevents agents from getting stuck in repetitive patterns
Context compaction - Automatically summarizes history when approaching context limits
Git integration - Auto-branch creation and commit management
Workspace isolation - Sandboxed file system access
Built-in tools - File and bash tools that work independently of MCP
When to Use
| Use Case | Harness |
| --- | --- |
| Simple queries, quick responses | Default (Genkit) |
| Long-running coding tasks | agentic |
| Multi-step file operations | agentic |
| Tasks requiring git branches | agentic |
| Complex debugging/investigation | agentic |
Agent Configuration
Enable the harness in your agent’s dotprompt frontmatter:
---
model: anthropic/claude-sonnet-4-20250514
harness: agentic
harness_config:
  max_steps: 50
  doom_loop_threshold: 3
  timeout: 10m
  sandbox:                    # Optional: isolated execution environment
    mode: docker              # host | docker | e2b (experimental)
    image: python:3.11-slim
tools:
  - read
  - write
  - edit
  - bash
  - glob
  - grep
---
You are a code analysis agent. Analyze the codebase and suggest improvements.
Agent-Level Options
| Option | Default | Description |
| --- | --- | --- |
| max_steps | 50 | Maximum tool call iterations |
| doom_loop_threshold | 3 | Consecutive identical calls to trigger loop detection |
| timeout | 10m | Maximum execution time |
| sandbox | null | Isolated execution environment (see below) |
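How the max_steps and timeout limits bound a run can be sketched as follows. This is a minimal illustration, not Station's implementation: `run_loop`, `step`, and the returned status strings are hypothetical names, and the timeout is passed in seconds rather than as a duration string such as 10m.

```python
import time
from typing import Callable, Optional


def run_loop(step: Callable[[int], Optional[str]],
             max_steps: int = 50,
             timeout_s: float = 600.0) -> str:
    """Call step() until it yields a final answer or a limit is hit (hypothetical sketch)."""
    deadline = time.monotonic() + timeout_s
    for i in range(max_steps):
        if time.monotonic() >= deadline:
            return "timeout"              # wall-clock limit reached
        result = step(i)                  # one model turn plus any tool call
        if result is not None:            # the agent produced a final answer
            return result
    return "max_steps_exceeded"           # iteration limit reached
```

Whichever limit is reached first ends the run, so a generous max_steps does not override a short timeout.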
Sandbox Configuration
The sandbox option under harness_config controls where tools execute:
harness_config:
  sandbox:
    mode: docker              # Execution mode
    image: python:3.11-slim   # Docker image (for docker mode)
    network: false            # Disable network access
    timeout: 5m               # Per-command timeout
    memory: 4g                # Memory limit
    cpu: 2                    # CPU limit
    environment:              # Environment variables
      PYTHONPATH: /workspace
Sandbox Modes:
| Mode | Description | File Persistence |
| --- | --- | --- |
| host | Tools execute directly on host machine | Yes |
| docker | Tools execute in Docker containers | Yes (volume mounted) |
| e2b | Tools execute in cloud VMs | No (experimental) |
Docker mode is recommended for production. Files persist across container restarts via volume mounting.
E2B mode is experimental: data does not persist once a sandbox is destroyed.
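To make the docker-mode options more concrete, the sketch below turns a sandbox config into a `docker run` argument list. The mapping is an assumption for illustration, not Station's actual behavior; `docker_args` is a hypothetical helper. The flags themselves (`--rm`, `-v`, `-w`, `--network none`, `--memory`, `--cpus`, `-e`) are standard Docker CLI options.

```python
from typing import Any, Dict, List


def docker_args(cfg: Dict[str, Any], workspace: str, command: str) -> List[str]:
    """Build a docker run command from sandbox options (hypothetical mapping)."""
    args = ["docker", "run", "--rm",
            "-v", f"{workspace}:/workspace",   # bind mount keeps files on the host
            "-w", "/workspace"]
    if cfg.get("network") is False:
        args += ["--network", "none"]          # no network inside the container
    if "memory" in cfg:
        args += ["--memory", str(cfg["memory"])]
    if "cpu" in cfg:
        args += ["--cpus", str(cfg["cpu"])]
    for key, value in cfg.get("environment", {}).items():
        args += ["-e", f"{key}={value}"]
    # The image default here is only for the sketch.
    args += [cfg.get("image", "python:3.11-slim"), "sh", "-c", command]
    return args
```

The per-command timeout has no `docker run` flag in this sketch; a caller could enforce it, for example with the `timeout` argument of `subprocess.run`. The workspace path should be absolute for the bind mount to work.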
Global Configuration
Configure harness defaults in config.yaml. Running stn init sets sensible defaults:
harness:
  workspace:
    path: ./workspace
    mode: host                # "host" or "sandbox"
  compaction:
    enabled: true
    threshold: 0.85           # Compact at 85% of context window
    protect_tokens: 40000     # Keep last N tokens from compaction
  git:
    auto_branch: true
    branch_prefix: agent/
    auto_commit: false
    require_approval: true
    workflow_branch_strategy: shared
  nats:
    enabled: true
    kv_bucket: harness-state
    object_bucket: harness-files
    max_file_size: 100MB
    ttl: 24h
  permissions:
    external_directory: deny
Key Features
Doom Loop Detection
Detects when an agent is stuck repeating the same action:
# In dotprompt frontmatter
harness_config:
  doom_loop_threshold: 3      # Trigger after 3 identical tool calls
When detected, the harness interrupts the loop and prompts the agent to try a different approach.
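The detection rule can be sketched as a counter over consecutive tool calls. This is a minimal illustration under one assumption: two calls count as identical when the tool name and the arguments match exactly. `DoomLoopDetector` and `record` are hypothetical names, not Station's API.

```python
import json
from typing import Any, Dict, Optional, Tuple


class DoomLoopDetector:
    """Flags a loop after `threshold` consecutive identical tool calls (sketch)."""

    def __init__(self, threshold: int = 3) -> None:
        self.threshold = threshold
        self._last: Optional[Tuple[str, str]] = None
        self._count = 0

    def record(self, tool: str, args: Dict[str, Any]) -> bool:
        # Serialize args with sorted keys so key order does not affect equality.
        key = (tool, json.dumps(args, sort_keys=True))
        if key == self._last:
            self._count += 1
        else:
            self._last, self._count = key, 1   # a different call resets the streak
        return self._count >= self.threshold
```

With the default threshold, the third identical call in a row returns True; any different call in between resets the count. The arguments must be JSON-serializable for this comparison to work.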
Context Compaction
Automatically summarizes conversation history when approaching context limits:
harness:
  compaction:
    enabled: true
    threshold: 0.85           # Start compacting at 85% of window
    protect_tokens: 40000     # Never compact the last 40k tokens
The compactor uses the same model to create a summary, preserving important context while freeing up space for new interactions.
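The interplay of threshold and protect_tokens can be sketched like this. It is an illustration only: `should_compact` and `split_history` are hypothetical helpers, per-message token counts are assumed to be known, and protection is applied at whole-message granularity.

```python
from typing import List, Tuple

Message = Tuple[str, int]  # (text, token_count); counts are assumed to be precomputed


def should_compact(used_tokens: int, context_window: int,
                   threshold: float = 0.85) -> bool:
    """True once usage reaches the configured fraction of the context window."""
    return used_tokens >= threshold * context_window


def split_history(messages: List[Message],
                  protect_tokens: int = 40000) -> Tuple[List[Message], List[Message]]:
    """Return (to_summarize, protected); protected covers at least the last protect_tokens."""
    protected = 0
    cut = len(messages)
    # Walk backwards from the newest message until the protected budget is covered.
    while cut > 0 and protected < protect_tokens:
        cut -= 1
        protected += messages[cut][1]
    return messages[:cut], messages[cut:]
```

Only the first list would be handed to the model for summarization; the second list stays in the conversation verbatim.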
Git Integration
Automatic branch creation for agent work:
harness:
  git:
    auto_branch: true         # Create branches automatically
    branch_prefix: agent/     # Branch naming: agent/task-name-timestamp-id
    auto_commit: false        # Require explicit commits
    require_approval: true    # Human approval before push
When enabled, the harness:
Creates a new branch when execution starts
Tracks all file changes
Can commit changes with generated messages
Supports push with approval workflow
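The branch naming pattern from the config comment (agent/task-name-timestamp-id) could be produced along these lines. The exact timestamp and id formats are not specified in this document, so the Unix timestamp and the 8-character hex id below are assumptions, and `branch_name` is a hypothetical helper.

```python
import re
import time
import uuid
from typing import Optional


def branch_name(task: str, prefix: str = "agent/",
                now: Optional[float] = None,
                run_id: Optional[str] = None) -> str:
    """Build '<prefix><task-slug>-<timestamp>-<id>' (formats are assumptions)."""
    # Lowercase the task and collapse anything that is not a-z or 0-9 into '-'.
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-") or "task"
    timestamp = int(time.time() if now is None else now)
    rid = run_id or uuid.uuid4().hex[:8]
    return f"{prefix}{slug}-{timestamp}-{rid}"
```

Passing `now` and `run_id` explicitly makes the result deterministic, which is convenient in tests.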
Workspace Isolation
Control where agents can read/write files:
harness:
  workspace:
    path: ./workspace         # Root directory for agent operations
    mode: host                # "host" for direct access, "sandbox" for isolation
  permissions:
    external_directory: deny  # Block access outside workspace
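A check in the spirit of external_directory: deny can be sketched as path resolution followed by a containment test. This is an illustration, not Station's code; `resolve_in_workspace` is a hypothetical helper.

```python
from pathlib import Path


def resolve_in_workspace(workspace: str, requested: str) -> Path:
    """Resolve a requested path and reject anything outside the workspace root."""
    root = Path(workspace).resolve()
    # resolve() normalizes '..' segments and follows symlinks before the check.
    target = (root / requested).resolve()
    if target != root and root not in target.parents:
        raise PermissionError(f"{requested!r} is outside the workspace")
    return target
```

Resolving before comparing matters: a plain string prefix check would accept a path such as `../workspace-other/file` or a symlink that points outside the root.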
The harness provides built-in tools that work independently of MCP servers:
| Tool | Description | Example |
| --- | --- | --- |
| read | Read file contents | read(path: "src/main.go") |
| write | Write file contents | write(path: "out.txt", content: "...") |
| edit | String replacement editing | edit(path: "file.go", old: "foo", new: "bar") |
| bash | Execute shell commands | bash(command: "ls -la") |
| glob | Find files by pattern | glob(pattern: "**/*.go") |
| grep | Search file contents | grep(pattern: "TODO", path: "src/") |
| git_status | Get git status | git_status() |
| git_diff | Get git diff | git_diff() |
| git_log | Get recent commits | git_log(count: 10) |
Fine-grained control over tool capabilities:
harness:
  permissions:
    bash:
      allow_write: false      # Read-only commands
      allowed_commands:       # Whitelist
        - ls
        - cat
        - grep
        - find
      blocked_commands:       # Blacklist
        - rm
        - sudo
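One way such allow and block lists might be evaluated is sketched below. How Station actually combines the two lists is not stated here, so the rules in this sketch (the blocklist wins, and the program must appear in the allowlist) are assumptions, and `is_allowed` is a hypothetical helper.

```python
import shlex
from typing import List


def is_allowed(command: str, allowed: List[str], blocked: List[str]) -> bool:
    """Check the first word of a shell command against block and allow lists (sketch)."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False                  # unbalanced quotes: refuse rather than guess
    if not tokens:
        return False                  # empty command
    program = tokens[0]
    if program in blocked:
        return False                  # assumption: the blocklist takes precedence
    return program in allowed
```

This only inspects the first word. Pipelines, `;` chains, and subshells can run other programs, so a real implementation would need to parse the full command line.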
Workflow Integration
Harness agents work seamlessly with Station workflows:
# workflow.yaml
id: code-review-pipeline
name: Code Review Pipeline
states:
  - name: analyze
    type: agent
    agent: code-analyzer      # Has harness: agentic
    transition: report
  - name: report
    type: agent
    agent: report-generator
    transition: end
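The order in which the states above run can be traced by following each transition until it reaches end. The sketch assumes the first listed state is the entry point and that end is a terminal marker; both are assumptions about the workflow format, and `execution_order` is a hypothetical helper.

```python
from typing import Dict, List


def execution_order(states: List[Dict[str, str]]) -> List[str]:
    """Follow 'transition' links from the first state until 'end' (sketch)."""
    by_name = {state["name"]: state for state in states}
    order: List[str] = []
    current = states[0]["name"]       # assumption: the first state is the start
    while current != "end":
        if current in order:
            raise ValueError(f"cycle at state {current!r}")
        order.append(current)
        current = by_name[current]["transition"]
    return order
```

For the pipeline above this yields analyze followed by report.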
Shared Git Branches in Workflows
When multiple agents collaborate on the same codebase:
harness:
  git:
    workflow_branch_strategy: shared   # All workflow steps share one branch
Example: Code Review Agent
---
model: anthropic/claude-sonnet-4-20250514
harness: agentic
harness_config:
  max_steps: 100
  timeout: 15m
tools:
  - read
  - glob
  - grep
  - bash
---
You are a senior code reviewer. Analyze the codebase for:
- Code quality issues
- Security vulnerabilities
- Performance problems
- Missing tests
Use glob to find relevant files, read to examine them, and grep to search for patterns.
Provide a detailed report with specific line numbers and suggested fixes.
Example: Refactoring Agent
---
model: anthropic/claude-sonnet-4-20250514
harness: agentic
harness_config:
  max_steps: 200
  doom_loop_threshold: 5
tools:
  - read
  - write
  - edit
  - glob
  - grep
  - git_status
  - git_diff
  - bash
---
You are a refactoring specialist. Your task is to:
1. Understand the current code structure
2. Make targeted improvements
3. Ensure tests still pass
4. Commit changes with clear messages
Always run tests after changes: `bash(command: "go test ./...")`
Check your changes: `git_diff()`
Observability
Harness executions are fully traced with OpenTelemetry:
# View traces in Jaeger
stn jaeger up
# Open http://localhost:16686
Traces include:
Each agentic loop iteration
Tool calls with inputs/outputs
Doom loop detection events
Compaction events
Git operations
Testing
Run harness tests:
# Unit tests
go test ./pkg/harness/... -v
# E2E tests with real LLM
HARNESS_E2E_TEST=1 go test ./pkg/harness/... -v -run "E2E" -timeout 5m
Next Steps
Workflows: Chain harness agents into multi-step workflows
Git Integration: Version control your agents and configurations
Sandbox: Isolated container execution for untrusted code
Observability: Monitor agent performance with Jaeger tracing