Agentic Harness

The Agentic Harness is an alternative execution engine for Station agents. It adds loop control, context management, git integration, and sandboxed built-in tools on top of what the standard Genkit-based execution provides.

Overview

Add harness: agentic to any agent’s dotprompt file to enable:

Manual agentic loop - Step-by-step control over agent execution
Doom loop detection - Prevents agents from getting stuck in repetitive patterns
Context compaction - Automatically summarizes history when approaching context limits
Git integration - Auto-branch creation and commit management
Workspace isolation - Sandboxed file system access
Built-in tools - File and bash tools that work independently of MCP

When to Use

Use Case	Harness
Simple queries, quick responses	Default (Genkit)
Long-running coding tasks	`agentic`
Multi-step file operations	`agentic`
Tasks requiring git branches	`agentic`
Complex debugging/investigation	`agentic`

Agent Configuration

Enable the harness in your agent’s dotprompt frontmatter:

---
model: anthropic/claude-sonnet-4-20250514
harness: agentic
harness_config:
  max_steps: 50
  doom_loop_threshold: 3
  timeout: 10m
  sandbox:                    # Optional: isolated execution environment
    mode: docker              # host | docker | e2b (experimental)
    image: python:3.11-slim
tools:
  - read
  - write
  - edit
  - bash
  - glob
  - grep
---
You are a code analysis agent. Analyze the codebase and suggest improvements.

Agent-Level Options

Option	Default	Description
`max_steps`	50	Maximum tool call iterations
`doom_loop_threshold`	3	Consecutive identical calls to trigger loop detection
`timeout`	10m	Maximum execution time
`sandbox`	null	Isolated execution environment (see below)
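
The limits above can be pictured as guards around the agent loop. The Python sketch below is illustrative only: `run_loop`, the `step` callback, and the "stopped" result strings are hypothetical names, not Station's implementation. It assumes that `max_steps` counts loop iterations and that `timeout` is wall-clock time.

```python
import time
from typing import Callable, Optional


def run_loop(step: Callable[[int], Optional[str]],
             max_steps: int = 50,
             timeout_s: float = 600.0) -> str:
    """Call step(i) until it returns a final answer or a limit is hit.

    `step` is a hypothetical callback: it performs one model turn plus any
    tool call, and returns the final answer or None to keep going.
    """
    deadline = time.monotonic() + timeout_s
    for i in range(max_steps):
        # Check the wall-clock budget before starting another iteration.
        if time.monotonic() >= deadline:
            return "stopped: timeout"
        answer = step(i)
        if answer is not None:
            return answer
    return "stopped: max_steps reached"
```

With the defaults from the table, a run would end after 50 iterations or 10 minutes (600 seconds), whichever comes first.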

Sandbox Configuration

The sandbox option under harness_config controls where tools execute:

harness_config:
  sandbox:
    mode: docker              # Execution mode
    image: python:3.11-slim   # Docker image (for docker mode)
    network: false            # Disable network access
    timeout: 5m               # Per-command timeout
    memory: 4g                # Memory limit
    cpu: 2                    # CPU limit
    environment:              # Environment variables
      PYTHONPATH: /workspace

Sandbox Modes:

Mode	Description	File Persistence
`host`	Tools execute directly on host machine	Yes
`docker`	Tools execute in Docker containers	Yes (volume mounted)
`e2b`	Tools execute in cloud VMs	No (experimental)

Docker mode is recommended for production. Files persist across container restarts via volume mounting. E2B mode is experimental, and its data does not persist once a sandbox is destroyed.
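
One way to read the docker-mode options is as a mapping onto standard `docker run` flags. The sketch below is an assumption about how such a mapping could look, not a description of how Station launches containers: `docker_args` is a hypothetical helper, the `/workspace` mount point is inferred from the `PYTHONPATH` example, and the per-command `timeout` is left out because it would be enforced by the caller rather than by a flag.

```python
from typing import Dict, List


def docker_args(sandbox: Dict, workspace: str, command: str) -> List[str]:
    """Build a `docker run` argv from a sandbox config dict (hypothetical mapping)."""
    args = ["docker", "run", "--rm"]
    if sandbox.get("network") is False:
        args += ["--network", "none"]          # network: false
    if "memory" in sandbox:
        args += ["--memory", str(sandbox["memory"])]
    if "cpu" in sandbox:
        args += ["--cpus", str(sandbox["cpu"])]
    for key, value in sandbox.get("environment", {}).items():
        args += ["-e", f"{key}={value}"]
    # Mount the workspace as a volume so files outlive the container.
    args += ["-v", f"{workspace}:/workspace", "-w", "/workspace"]
    args += [sandbox["image"], "sh", "-c", command]
    return args
```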

Global Configuration

Configure harness defaults in config.yaml. Running stn init sets sensible defaults:

harness:
  workspace:
    path: ./workspace
    mode: host           # "host" or "sandbox"
  
  compaction:
    enabled: true
    threshold: 0.85      # Compact at 85% of context window
    protect_tokens: 40000 # Never compact the most recent N tokens
  
  git:
    auto_branch: true
    branch_prefix: agent/
    auto_commit: false
    require_approval: true
    workflow_branch_strategy: shared
  
  nats:
    enabled: true
    kv_bucket: harness-state
    object_bucket: harness-files
    max_file_size: 100MB
    ttl: 24h
  
  permissions:
    external_directory: deny

Key Features

Doom Loop Detection

Detects when an agent is stuck repeating the same action:

# In dotprompt frontmatter
harness_config:
  doom_loop_threshold: 3  # Trigger after 3 consecutive identical tool calls

When a loop is detected, the harness interrupts it and prompts the agent to try a different approach.
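
The check itself can be sketched in a few lines of Python. This is a minimal illustration, assuming that "identical" means the same tool name with the same arguments; `is_doom_loop` and the `(name, args)` tuple shape are hypothetical, not Station's internal types.

```python
import json
from typing import Any, Dict, List, Tuple


def is_doom_loop(calls: List[Tuple[str, Dict[str, Any]]], threshold: int = 3) -> bool:
    """Return True when the last `threshold` tool calls are identical."""
    if threshold < 1 or len(calls) < threshold:
        return False
    # Serialize arguments with sorted keys so key order does not matter.
    recent = [(name, json.dumps(args, sort_keys=True))
              for name, args in calls[-threshold:]]
    return all(call == recent[0] for call in recent)
```

Because only the trailing calls are compared, a single different call in between resets the count, which matches the "consecutive" wording in the options table.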

Context Compaction

Automatically summarizes conversation history when approaching context limits:

harness:
  compaction:
    enabled: true
    threshold: 0.85       # Start compacting at 85% of window
    protect_tokens: 40000 # Never compact the last 40k tokens

The compactor uses the same model as the agent to create the summary, preserving important context while freeing up space for new interactions.
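
To make the two settings concrete, the sketch below decides when to compact and which messages to summarize. It is an assumption-laden illustration: `plan_compaction` is a hypothetical function, token counts are taken as given per message, and protection is applied to whole messages, so slightly more than `protect_tokens` may be kept.

```python
from typing import List, Tuple


def plan_compaction(token_counts: List[int], context_window: int,
                    threshold: float = 0.85,
                    protect_tokens: int = 40000) -> Tuple[bool, int]:
    """Return (should_compact, split); messages[:split] would be summarized.

    `token_counts` holds one token count per message, oldest first.
    """
    if sum(token_counts) < threshold * context_window:
        return False, 0
    # Walk backwards, protecting whole messages until at least
    # `protect_tokens` recent tokens are covered.
    protected = 0
    split = len(token_counts)
    while split > 0 and protected < protect_tokens:
        split -= 1
        protected += token_counts[split]
    # If everything is protected there is nothing left to summarize.
    return split > 0, split
```

For a 200,000-token window and message sizes `[50000, 60000, 30000, 20000, 15000]`, the total of 175,000 exceeds the 170,000 threshold; the last three messages (65,000 tokens) are protected and the first two are candidates for summarization.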

Git Integration

Automatic branch creation for agent work:

harness:
  git:
    auto_branch: true         # Create branches automatically
    branch_prefix: agent/     # Branch naming: agent/task-name-timestamp-id
    auto_commit: false        # Require explicit commits
    require_approval: true    # Human approval before push

When enabled, the harness:

Creates a new branch when execution starts
Tracks all file changes
Can commit changes with generated messages
Supports push with approval workflow
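
The `agent/task-name-timestamp-id` naming pattern can be illustrated with a short Python sketch. Only the overall pattern comes from the configuration comment above; the slug rules, the `YYYYMMDDHHMMSS` timestamp format, the 8-character hex id, and the `branch_name` helper itself are assumptions made for illustration.

```python
import re
import secrets
from datetime import datetime, timezone
from typing import Optional


def branch_name(task: str, prefix: str = "agent/",
                now: Optional[datetime] = None,
                run_id: Optional[str] = None) -> str:
    """Build a branch name of the form <prefix><task-slug>-<timestamp>-<id>."""
    # Lowercase the task and collapse anything non-alphanumeric into "-".
    slug = re.sub(r"[^a-z0-9]+", "-", task.lower()).strip("-")
    now = now or datetime.now(timezone.utc)
    run_id = run_id or secrets.token_hex(4)  # assumed id format
    return f"{prefix}{slug}-{now.strftime('%Y%m%d%H%M%S')}-{run_id}"
```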

Workspace Isolation

Control where agents can read/write files:

harness:
  workspace:
    path: ./workspace    # Root directory for agent operations
    mode: host           # "host" for direct access, "sandbox" for isolation
  
  permissions:
    external_directory: deny  # Block access outside workspace
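
A containment check of the kind `external_directory: deny` implies might look like the following Python sketch. `is_allowed` is a hypothetical helper, and treating any value other than `deny` as permissive is an assumption, since only `deny` appears in this page.

```python
from pathlib import Path


def is_allowed(workspace: str, requested: str,
               external_directory: str = "deny") -> bool:
    """Return True if `requested` may be accessed under the given policy."""
    if external_directory != "deny":
        return True  # assumption: any other value permits outside access
    root = Path(workspace).resolve()
    # Relative paths are resolved against the workspace; resolve() also
    # collapses ".." segments and follows symlinks before comparing.
    target = (root / requested).resolve()
    return target == root or root in target.parents
```

Resolving before comparing is the important step: a plain string-prefix check would accept paths such as `../secrets.txt`.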

Built-in Tools

The harness provides built-in tools that work independently of MCP servers:

Tool	Description	Example
`read`	Read file contents	`read(path: "src/main.go")`
`write`	Write file contents	`write(path: "out.txt", content: "...")`
`edit`	String replacement editing	`edit(path: "file.go", old: "foo", new: "bar")`
`bash`	Execute shell commands	`bash(command: "ls -la")`
`glob`	Find files by pattern	`glob(pattern: "**/*.go")`
`grep`	Search file contents	`grep(pattern: "TODO", path: "src/")`
`git_status`	Get git status	`git_status()`
`git_diff`	Get git diff	`git_diff()`
`git_log`	Get recent commits	`git_log(count: 10)`

Tool Permissions

Fine-grained control over tool capabilities:

harness:
  permissions:
    bash:
      allow_write: false       # Read-only commands
      allowed_commands:        # Whitelist
        - ls
        - cat
        - grep
        - find
      blocked_commands:        # Blacklist
        - rm
        - sudo
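
How the two lists might combine is sketched below in Python. This is not Station's permission logic: `command_permitted` is hypothetical, and it assumes that the blocklist wins over the allowlist and that an empty allowlist means no allowlist restriction. It inspects only the first word of the command, so pipelines and chained commands (`;`, `&&`, `|`) would need real shell parsing.

```python
import shlex
from typing import Iterable


def command_permitted(command: str, allowed: Iterable[str],
                      blocked: Iterable[str]) -> bool:
    """Check the program name of `command` against allow and block lists."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: refuse rather than guess
    if not tokens:
        return False
    program = tokens[0].rsplit("/", 1)[-1]  # "/bin/ls" -> "ls"
    if program in set(blocked):
        return False
    allowed_set = set(allowed)
    return not allowed_set or program in allowed_set
```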

Workflow Integration

Harness agents can be used as steps in Station workflows, alongside agents that use the default harness:

# workflow.yaml
id: code-review-pipeline
name: Code Review Pipeline
states:
  - name: analyze
    type: agent
    agent: code-analyzer  # Has harness: agentic
    transition: report
    
  - name: report
    type: agent
    agent: report-generator
    transition: end

Shared Git Branches in Workflows

When multiple agents collaborate on the same codebase:

harness:
  git:
    workflow_branch_strategy: shared  # All workflow steps share one branch

Example: Code Review Agent

---
model: anthropic/claude-sonnet-4-20250514
harness: agentic
harness_config:
  max_steps: 100
  timeout: 15m
tools:
  - read
  - glob
  - grep
  - bash
---
You are a senior code reviewer. Analyze the codebase for:
- Code quality issues
- Security vulnerabilities
- Performance problems
- Missing tests

Use glob to find relevant files, read to examine them, and grep to search for patterns.
Provide a detailed report with specific line numbers and suggested fixes.

Example: Refactoring Agent

---
model: anthropic/claude-sonnet-4-20250514
harness: agentic
harness_config:
  max_steps: 200
  doom_loop_threshold: 5
tools:
  - read
  - write
  - edit
  - glob
  - grep
  - git_status
  - git_diff
  - bash
---
You are a refactoring specialist. Your task is to:
1. Understand the current code structure
2. Make targeted improvements
3. Ensure tests still pass
4. Commit changes with clear messages

Always run tests after changes: `bash(command: "go test ./...")`
Check your changes: `git_diff()`

Observability

Harness executions are fully traced with OpenTelemetry:

# View traces in Jaeger
stn jaeger up
# Open http://localhost:16686

Traces include:

Each agentic loop iteration
Tool calls with inputs/outputs
Doom loop detection events
Compaction events
Git operations

Testing

Run harness tests:

# Unit tests
go test ./pkg/harness/... -v

# E2E tests with real LLM
HARNESS_E2E_TEST=1 go test ./pkg/harness/... -v -run "E2E" -timeout 5m

Next Steps

Workflows - Chain harness agents into multi-step workflows
Git Integration - Version control your agents and configurations
Sandbox - Isolated container execution for untrusted code
Observability - Monitor agent performance with Jaeger tracing
