The Agentic Harness is an alternative execution engine for Station agents that provides advanced capabilities beyond the standard Genkit-based execution.

Overview

Add harness: agentic to any agent’s dotprompt file to enable:
  • Manual agentic loop - Step-by-step control over agent execution
  • Doom loop detection - Prevents agents from getting stuck in repetitive patterns
  • Context compaction - Automatically summarizes history when approaching context limits
  • Git integration - Auto-branch creation and commit management
  • Workspace isolation - Sandboxed file system access
  • Built-in tools - File and bash tools that work independently of MCP

When to Use

| Use Case | Harness |
|---|---|
| Simple queries, quick responses | Default (Genkit) |
| Long-running coding tasks | agentic |
| Multi-step file operations | agentic |
| Tasks requiring git branches | agentic |
| Complex debugging/investigation | agentic |

Agent Configuration

Enable the harness in your agent’s dotprompt frontmatter:
---
model: anthropic/claude-sonnet-4-20250514
harness: agentic
harness_config:
  max_steps: 50
  doom_loop_threshold: 3
  timeout: 10m
  sandbox:                    # Optional: isolated execution environment
    mode: docker              # host | docker | e2b (experimental)
    image: python:3.11-slim
tools:
  - read
  - write
  - edit
  - bash
  - glob
  - grep
---
You are a code analysis agent. Analyze the codebase and suggest improvements.

Agent-Level Options

| Option | Default | Description |
|---|---|---|
| max_steps | 50 | Maximum tool call iterations |
| doom_loop_threshold | 3 | Consecutive identical calls to trigger loop detection |
| timeout | 10m | Maximum execution time |
| sandbox | null | Isolated execution environment (see below) |
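
To make the interaction between max_steps and timeout concrete, here is a minimal sketch of a bounded agentic loop. It is an illustration under stated assumptions, not the harness's actual implementation: RunLoop, StepFunc, ErrMaxSteps, defaultTimeout, and finishAfter are all hypothetical names, and a "step" is assumed to be one model turn plus its tool calls.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// ErrMaxSteps is a hypothetical error returned when the step budget runs out.
var ErrMaxSteps = errors.New("max_steps reached")

// defaultTimeout mirrors the documented default of 10m.
const defaultTimeout = 10 * time.Minute

// StepFunc is a hypothetical single loop iteration. It returns done=true
// once the agent has produced a final answer.
type StepFunc func(ctx context.Context, step int) (done bool, err error)

// RunLoop runs step until it reports done, the step budget is exhausted,
// or the overall timeout expires. It returns the number of steps executed.
func RunLoop(maxSteps int, timeout time.Duration, step StepFunc) (int, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	for i := 1; i <= maxSteps; i++ {
		// Stop before starting a new step if the deadline has passed.
		if err := ctx.Err(); err != nil {
			return i - 1, err
		}
		done, err := step(ctx, i)
		if err != nil {
			return i, err
		}
		if done {
			return i, nil
		}
	}
	return maxSteps, ErrMaxSteps
}

// finishAfter builds a stand-in step that reports done on the n-th call.
func finishAfter(n int) StepFunc {
	return func(ctx context.Context, step int) (bool, error) {
		return step >= n, nil
	}
}

func main() {
	steps, err := RunLoop(50, defaultTimeout, finishAfter(3))
	fmt.Println(steps, err) // prints "3 <nil>"
}
```

In this sketch the two limits are independent: whichever is hit first ends the run, so a long timeout does not raise the step budget and vice versa.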

Sandbox Configuration

The sandbox option under harness_config controls where tools execute:
harness_config:
  sandbox:
    mode: docker              # Execution mode
    image: python:3.11-slim   # Docker image (for docker mode)
    network: false            # Disable network access
    timeout: 5m               # Per-command timeout
    memory: 4g                # Memory limit
    cpu: 2                    # CPU limit
    environment:              # Environment variables
      PYTHONPATH: /workspace
Sandbox Modes:
| Mode | Description | File Persistence |
|---|---|---|
| host | Tools execute directly on host machine | Yes |
| docker | Tools execute in Docker containers | Yes (volume mounted) |
| e2b | Tools execute in cloud VMs | No (experimental) |
Docker mode is recommended for production. Files persist across container restarts via volume mounting. E2B mode is experimental: data does not persist once a sandbox is destroyed.
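
As a rough illustration of what the docker-mode options could translate to, the sketch below builds a `docker run` argument list from a sandbox config. How Station actually invokes Docker is not documented here, so the mapping is an assumption; SandboxConfig and DockerRunArgs are hypothetical names. The Docker flags themselves (--network none, --memory, --cpus, -e, -v, -w) are standard `docker run` options.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// SandboxConfig is a hypothetical Go mirror of the sandbox YAML block.
type SandboxConfig struct {
	Image       string
	Network     bool
	Memory      string
	CPU         int
	Environment map[string]string
}

// DockerRunArgs builds arguments for "docker run". hostWorkspace should be an
// absolute path; it is bind-mounted at /workspace so files outlive the container.
func DockerRunArgs(cfg SandboxConfig, hostWorkspace, command string) []string {
	args := []string{"run", "--rm", "-v", hostWorkspace + ":/workspace", "-w", "/workspace"}
	if !cfg.Network {
		args = append(args, "--network", "none")
	}
	if cfg.Memory != "" {
		args = append(args, "--memory", cfg.Memory)
	}
	if cfg.CPU > 0 {
		args = append(args, "--cpus", strconv.Itoa(cfg.CPU))
	}
	// Sort keys so the generated argument list is deterministic.
	keys := make([]string, 0, len(cfg.Environment))
	for k := range cfg.Environment {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	for _, k := range keys {
		args = append(args, "-e", k+"="+cfg.Environment[k])
	}
	return append(args, cfg.Image, "sh", "-c", command)
}

func main() {
	cfg := SandboxConfig{
		Image:       "python:3.11-slim",
		Network:     false,
		Memory:      "4g",
		CPU:         2,
		Environment: map[string]string{"PYTHONPATH": "/workspace"},
	}
	args := DockerRunArgs(cfg, "/srv/workspace", "python --version")
	fmt.Println("docker " + strings.Join(args, " "))
}
```

The bind mount is what gives docker mode its file persistence in this sketch: the container is removed after each command, but the workspace directory lives on the host.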

Global Configuration

Configure harness defaults in config.yaml. Running stn init sets sensible defaults:
harness:
  workspace:
    path: ./workspace
    mode: host           # "host" or "sandbox"
  
  compaction:
    enabled: true
    threshold: 0.85      # Compact at 85% of context window
    protect_tokens: 40000 # Exclude the most recent N tokens from compaction
  
  git:
    auto_branch: true
    branch_prefix: agent/
    auto_commit: false
    require_approval: true
    workflow_branch_strategy: shared
  
  nats:
    enabled: true
    kv_bucket: harness-state
    object_bucket: harness-files
    max_file_size: 100MB
    ttl: 24h
  
  permissions:
    external_directory: deny

Key Features

Doom Loop Detection

Detects when an agent is stuck repeating the same action:
# In dotprompt frontmatter
harness_config:
  doom_loop_threshold: 3  # Trigger after 3 consecutive identical tool calls
When detected, the harness interrupts the loop and prompts the agent to try a different approach.
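
The detection rule can be sketched as a counter over consecutive identical calls. This is a minimal illustration, assuming two calls are "identical" when the tool name and serialized arguments match; ToolCall and LoopDetector are hypothetical names, not Station's actual types.

```go
package main

import "fmt"

// ToolCall is a hypothetical record of one tool invocation.
type ToolCall struct {
	Name string
	Args string // serialized arguments, e.g. JSON
}

// LoopDetector counts how many times in a row the same call has been seen.
type LoopDetector struct {
	threshold int
	last      ToolCall
	streak    int
}

// NewLoopDetector returns a detector that fires at threshold repeats.
func NewLoopDetector(threshold int) *LoopDetector {
	return &LoopDetector{threshold: threshold}
}

// Observe records a call and reports whether the run of identical
// consecutive calls has reached the threshold.
func (d *LoopDetector) Observe(c ToolCall) bool {
	if d.streak > 0 && c == d.last {
		d.streak++
	} else {
		// A different call resets the streak.
		d.last = c
		d.streak = 1
	}
	return d.streak >= d.threshold
}

func main() {
	d := NewLoopDetector(3)
	call := ToolCall{Name: "read", Args: `{"path":"src/main.go"}`}
	for i := 1; i <= 3; i++ {
		fmt.Printf("call %d: loop detected = %v\n", i, d.Observe(call))
	}
	// The third identical call reports true.
}
```

Because any differing call resets the counter, an agent that alternates between two calls would not trip this particular check; catching such patterns would need a wider window.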

Context Compaction

Automatically summarizes conversation history when approaching context limits:
harness:
  compaction:
    enabled: true
    threshold: 0.85       # Start compacting at 85% of window
    protect_tokens: 40000 # Never compact the last 40k tokens
The compactor uses the same model as the agent to summarize older history, preserving important context while freeing space for new interactions.
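
The two settings can be read as a trigger and a boundary. The sketch below shows one way they might combine, assuming each message carries a precomputed token count and that protection is applied at whole-message granularity. Message, ShouldCompact, and SplitProtected are hypothetical names, and the 200000-token window is only an example value.

```go
package main

import "fmt"

// Message is a hypothetical history entry with a precomputed token count.
type Message struct {
	Role   string
	Tokens int
}

// ShouldCompact reports whether usage has reached the given fraction
// of the context window.
func ShouldCompact(usedTokens, windowTokens int, threshold float64) bool {
	return float64(usedTokens) >= threshold*float64(windowTokens)
}

// SplitProtected walks backward from the newest message, keeping messages
// while their total stays within protectTokens. It returns the older
// messages (candidates for summarization) and the protected tail.
func SplitProtected(history []Message, protectTokens int) (older, protected []Message) {
	kept := 0
	cut := len(history)
	for i := len(history) - 1; i >= 0; i-- {
		if kept+history[i].Tokens > protectTokens {
			break
		}
		kept += history[i].Tokens
		cut = i
	}
	return history[:cut], history[cut:]
}

func main() {
	history := []Message{
		{Role: "user", Tokens: 30000},
		{Role: "assistant", Tokens: 50000},
		{Role: "tool", Tokens: 25000},
		{Role: "assistant", Tokens: 10000},
	}
	used := 0
	for _, m := range history {
		used += m.Tokens
	}
	if ShouldCompact(used, 128000, 0.85) {
		older, protected := SplitProtected(history, 40000)
		fmt.Printf("summarize %d messages, keep %d verbatim\n", len(older), len(protected))
	}
}
```

Only the older slice would be handed to the model for summarization; the protected tail stays verbatim so the agent keeps its most recent working context.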

Git Integration

Automatic branch creation for agent work:
harness:
  git:
    auto_branch: true         # Create branches automatically
    branch_prefix: agent/     # Branch naming: agent/task-name-timestamp-id
    auto_commit: false        # Require explicit commits
    require_approval: true    # Human approval before push
When enabled, the harness:
  1. Creates a new branch when execution starts
  2. Tracks all file changes
  3. Can commit changes with generated messages
  4. Supports push with approval workflow

Workspace Isolation

Control where agents can read/write files:
harness:
  workspace:
    path: ./workspace    # Root directory for agent operations
    mode: host           # "host" for direct access, "sandbox" for isolation
  
  permissions:
    external_directory: deny  # Block access outside workspace
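
A check in the spirit of external_directory: deny might look like the sketch below. It is a lexical check only (it does not resolve symlinks), and InsideWorkspace is a hypothetical helper rather than the harness's actual code.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// InsideWorkspace reports whether target, resolved against root when it is
// relative, stays within root. Symlinks are not resolved.
func InsideWorkspace(root, target string) (bool, error) {
	absRoot, err := filepath.Abs(root)
	if err != nil {
		return false, err
	}
	if !filepath.IsAbs(target) {
		target = filepath.Join(absRoot, target)
	}
	rel, err := filepath.Rel(absRoot, filepath.Clean(target))
	if err != nil {
		return false, err
	}
	// A relative path that climbs out of root starts with "..".
	outside := rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator))
	return !outside, nil
}

func main() {
	for _, p := range []string{"src/main.go", "../secrets.txt"} {
		ok, err := InsideWorkspace("./workspace", p)
		fmt.Println(p, ok, err)
	}
}
```

A production check would likely also resolve symlinks before comparing, since a link inside the workspace can point outside it.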

Built-in Tools

The harness provides built-in tools that work independently of MCP servers:
| Tool | Description | Example |
|---|---|---|
| read | Read file contents | read(path: "src/main.go") |
| write | Write file contents | write(path: "out.txt", content: "...") |
| edit | String replacement editing | edit(path: "file.go", old: "foo", new: "bar") |
| bash | Execute shell commands | bash(command: "ls -la") |
| glob | Find files by pattern | glob(pattern: "**/*.go") |
| grep | Search file contents | grep(pattern: "TODO", path: "src/") |
| git_status | Get git status | git_status() |
| git_diff | Get git diff | git_diff() |
| git_log | Get recent commits | git_log(count: 10) |

Tool Permissions

Fine-grained control over tool capabilities:
harness:
  permissions:
    bash:
      allow_write: false       # Read-only commands
      allowed_commands:        # Whitelist
        - ls
        - cat
        - grep
        - find
      blocked_commands:        # Blacklist
        - rm
        - sudo
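
How the two lists interact is not spelled out above, so the following sketch makes its assumptions explicit: blocked_commands takes precedence, an empty allowed_commands list means "anything not blocked", and only the first word of the command line is inspected. BashPolicy and Permits are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// BashPolicy is a hypothetical mirror of the permissions.bash config.
type BashPolicy struct {
	Allowed []string
	Blocked []string
}

// contains reports whether list includes s.
func contains(list []string, s string) bool {
	for _, v := range list {
		if v == s {
			return true
		}
	}
	return false
}

// Permits checks the command name (first word) against both lists.
// It does not parse pipes, ";" chains, or subshells.
func (p BashPolicy) Permits(command string) bool {
	fields := strings.Fields(command)
	if len(fields) == 0 {
		return false
	}
	name := fields[0]
	if contains(p.Blocked, name) {
		return false
	}
	return len(p.Allowed) == 0 || contains(p.Allowed, name)
}

func main() {
	policy := BashPolicy{
		Allowed: []string{"ls", "cat", "grep", "find"},
		Blocked: []string{"rm", "sudo"},
	}
	for _, cmd := range []string{"ls -la", "rm -rf build", "curl example.com"} {
		fmt.Printf("%-20q permitted = %v\n", cmd, policy.Permits(cmd))
	}
}
```

Inspecting only the first word is deliberately simplistic: a command such as `ls; rm -rf build` would pass this check, so a real enforcement layer needs proper shell parsing or a sandbox.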

Workflow Integration

Harness agents can run as steps in Station workflows alongside standard agents:
# workflow.yaml
id: code-review-pipeline
name: Code Review Pipeline
states:
  - name: analyze
    type: agent
    agent: code-analyzer  # Has harness: agentic
    transition: report
    
  - name: report
    type: agent
    agent: report-generator
    transition: end

Shared Git Branches in Workflows

When multiple agents collaborate on the same codebase:
harness:
  git:
    workflow_branch_strategy: shared  # All workflow steps share one branch

Example: Code Review Agent

---
model: anthropic/claude-sonnet-4-20250514
harness: agentic
harness_config:
  max_steps: 100
  timeout: 15m
tools:
  - read
  - glob
  - grep
  - bash
---
You are a senior code reviewer. Analyze the codebase for:
- Code quality issues
- Security vulnerabilities
- Performance problems
- Missing tests

Use glob to find relevant files, read to examine them, and grep to search for patterns.
Provide a detailed report with specific line numbers and suggested fixes.

Example: Refactoring Agent

---
model: anthropic/claude-sonnet-4-20250514
harness: agentic
harness_config:
  max_steps: 200
  doom_loop_threshold: 5
tools:
  - read
  - write
  - edit
  - glob
  - grep
  - git_status
  - git_diff
  - bash
---
You are a refactoring specialist. Your task is to:
1. Understand the current code structure
2. Make targeted improvements
3. Ensure tests still pass
4. Commit changes with clear messages

Always run tests after changes: `bash(command: "go test ./...")`
Check your changes: `git_diff()`

Observability

Harness executions are fully traced with OpenTelemetry:
# View traces in Jaeger
stn jaeger up
# Open http://localhost:16686
Traces include:
  • Each agentic loop iteration
  • Tool calls with inputs/outputs
  • Doom loop detection events
  • Compaction events
  • Git operations

Testing

Run harness tests:
# Unit tests
go test ./pkg/harness/... -v

# E2E tests with real LLM
HARNESS_E2E_TEST=1 go test ./pkg/harness/... -v -run "E2E" -timeout 5m
