Station provides sandboxed environments for agents to execute code safely. Two modes are available depending on your use case.

Sandbox Modes

Compute Mode

Ephemeral containers for quick calculations and data processing. Each call runs in a fresh container.
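
Because each call runs in a fresh container, nothing carries over between sandbox_run invocations. A minimal sketch (using the sandbox_run tool documented below):
sandbox_run({"code": "x = 41; print(x + 1)", "runtime": "python"})   # prints 42
sandbox_run({"code": "print(x)", "runtime": "python"})               # fails: x does not exist in the new container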

Code Mode

Persistent Linux sandbox for iterative development. Full shell access, package installation, and file persistence across calls.
| Feature   | Compute Mode                   | Code Mode                        |
|-----------|--------------------------------|----------------------------------|
| Lifecycle | Fresh container per call       | Persistent per workflow          |
| Use case  | Calculations, transformations  | Development, compilation         |
| Tools     | sandbox_run                    | sandbox_open, sandbox_exec, etc. |
| Files     | Via files param only           | Full filesystem access           |

Enabling Sandbox

Via MCP Tools

"Create a data-processor agent with Python sandbox enabled"
The create_agent and update_agent tools accept sandbox configuration:
{
  "name": "data-processor",
  "description": "Process data with Python",
  "prompt": "You process data. Use sandbox_run to execute Python code.",
  "environment_id": "1",
  "sandbox": "{\"runtime\": \"python\", \"pip_packages\": [\"pandas\"]}"
}
Enable Code Mode:
{
  "agent_id": "42",
  "sandbox": "{\"mode\": \"code\", \"session\": \"workflow\"}"
}
Disable sandbox:
{
  "agent_id": "42",
  "sandbox": "{}"
}

Via Frontmatter (.prompt file)

Add sandbox: to your agent’s frontmatter:
---
model: openai/gpt-4o
metadata:
  name: "Data Processor"
sandbox: python
---

Process data using Python. Use sandbox_run to execute code.
The agent receives a sandbox_run tool for executing code:
{
  "code": "print(sum(range(1, 101)))",
  "runtime": "python"
}
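
The same tool shape works for the other runtimes listed in the configuration reference below; for example, a Node.js call might look like this (illustrative):
{
  "code": "console.log([1, 2, 3].reduce((a, b) => a + b, 0))",
  "runtime": "node"
}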

Code Mode

Enable persistent Linux sandbox with mode: code:
---
model: openai/gpt-4o
metadata:
  name: "Developer Agent"
sandbox:
  mode: code
  session: workflow
---

You have access to a full Linux sandbox. Install packages,
compile code, and run any shell commands.

Code Mode Tools

When mode: code is enabled, agents receive these tools:
| Tool              | Description                      |
|-------------------|----------------------------------|
| sandbox_open      | Get or create a sandbox session  |
| sandbox_exec      | Execute any shell command        |
| sandbox_fs_write  | Write files to the sandbox       |
| sandbox_fs_read   | Read files from the sandbox      |
| sandbox_fs_list   | List directory contents          |
| sandbox_fs_delete | Delete files or directories      |
| sandbox_close     | Close the session (optional)     |

Example: Full Development Workflow

# Open sandbox (ubuntu:22.04)
sandbox_open({})

# Install dependencies (the stock ubuntu:22.04 image ships without Python, so install it here as well)
sandbox_exec({"command": "apt-get update && apt-get install -y build-essential curl python3 python3-pip"})

# Write source code
sandbox_fs_write({
  "path": "main.c",
  "content": "#include <stdio.h>\nint main() { printf(\"Hello!\\n\"); return 0; }"
})

# Compile and run
sandbox_exec({"command": "gcc -o main main.c && ./main"})
# Output: Hello!

# Install Python packages
sandbox_exec({"command": "pip install pandas numpy"})

# Run Python script
sandbox_fs_write({
  "path": "analyze.py", 
  "content": "import pandas as pd\nprint(pd.__version__)"
})
sandbox_exec({"command": "python analyze.py"})

Configuration Reference

Compute Mode

sandbox:
  runtime: python           # python, node, or bash
  image: "python:3.11-slim" # optional: custom image
  timeout_seconds: 120      # execution timeout
  allow_network: false      # network access
  pip_packages:             # pre-install packages
    - pandas
    - requests
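
With pandas pre-installed as above, the agent can import it directly in a sandbox_run call (illustrative):
{
  "code": "import pandas as pd\nprint(pd.DataFrame({'n': [1, 2, 3]})['n'].sum())",
  "runtime": "python"
}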

Code Mode

sandbox:
  mode: code
  session: workflow         # workflow or agent
  runtime: linux            # linux, python, node, or custom image
  timeout_seconds: 300      # per-command timeout
  limits:
    max_file_size_bytes: 10485760  # 10MB
    max_files: 100

Runtime Images

| Runtime         | Docker Image      |
|-----------------|-------------------|
| linux (default) | ubuntu:22.04      |
| python          | python:3.11-slim  |
| node            | node:20-slim      |
| Custom          | Any Docker image  |

Session Scoping (Code Mode)

Sessions can be scoped to control how containers are shared. With session: workflow, one container is shared across all agents in the workflow:
sandbox:
  mode: code
  session: workflow
Workflow: build-and-test
├── Agent 1: writes code → sandbox_fs_write
├── Agent 2: runs tests → sandbox_exec (files still there!)
└── Complete → container destroyed
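
The configuration reference above also lists an agent scope. Presumably (an assumption based on the name, not spelled out on this page) each agent then gets its own container instead of sharing one:
sandbox:
  mode: code
  session: agent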

Deployment Requirements

The sandbox backend requires Dagger (managed automatically by Station). When Station itself runs in Docker, mount the Docker socket so Dagger can start sandbox containers:
# Docker socket access
docker run -v /var/run/docker.sock:/var/run/docker.sock ...

Fly Machines Backend

When running Station on Fly.io, use the Fly Machines backend instead of Docker (Fly’s Firecracker VMs don’t support Docker-in-Docker).

Configuration

sandbox:
  enabled: true
  backend: fly_machines
  fly_machines:
    org_slug: your-org        # Or use FLY_ORG env var
    region: ord               # Fly.io region
    image: python:3.11-slim   # Container image
    memory_mb: 256
    cpu_kind: shared
    cpus: 1
Required environment variables:
  • FLY_API_TOKEN - Your Fly.io API token
  • FLY_ORG - Organization slug (if not in config)
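
When Station itself is deployed as a Fly app, these can be set as Fly secrets (illustrative values and app name):
fly secrets set FLY_API_TOKEN="<your-token>" FLY_ORG="your-org" --app your-station-app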

Custom Images

Create a Dockerfile with your required tools:
FROM python:3.11-slim

RUN apt-get update && apt-get install -y git curl unzip && rm -rf /var/lib/apt/lists/*

# Install Terraform
RUN curl -fsSL https://releases.hashicorp.com/terraform/1.7.0/terraform_1.7.0_linux_amd64.zip -o /tmp/tf.zip \
    && unzip /tmp/tf.zip -d /usr/local/bin/ && rm /tmp/tf.zip

# Install kubectl
RUN curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl" \
    && install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl && rm kubectl

WORKDIR /workspace
Build and push:
docker build -t yourusername/station-sandbox:latest .
docker push yourusername/station-sandbox:latest
Then configure:
sandbox:
  fly_machines:
    image: yourusername/station-sandbox:latest

Private Registry Authentication

For private images, add registry credentials:
sandbox:
  fly_machines:
    image: ghcr.io/your-org/station-sandbox:latest
    registry_auth:
      username: your-username
      password: ${GITHUB_TOKEN}   # PAT with read:packages scope
      server_address: ghcr.io

Environment Variable Injection

Inject secrets and configuration into sandbox containers using the STN_CODE_ prefix. This is how CLI tools such as Terraform, the AWS CLI, and kubectl get their credentials.

How It Works

Environment variables on the Station host with the STN_CODE_ prefix are automatically injected into sandbox containers with the prefix stripped:
# Set on Station host (or in Fly secrets):
export STN_CODE_AWS_ACCESS_KEY_ID="AKIAXXXXXXXXXXXX"
export STN_CODE_AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG..."
export STN_CODE_AWS_DEFAULT_REGION="us-east-1"
export STN_CODE_TERRAFORM_TOKEN="tf-xxxxxxxxxxxxxxxx"

# Inside sandbox container, agents see:
AWS_ACCESS_KEY_ID="AKIAXXXXXXXXXXXX"
AWS_SECRET_ACCESS_KEY="wJalrXUtnFEMI/K7MDENG..."
AWS_DEFAULT_REGION="us-east-1"
TERRAFORM_TOKEN="tf-xxxxxxxxxxxxxxxx"

Setting Variables

# Set secrets on your Fly app
fly secrets set \
  STN_CODE_AWS_ACCESS_KEY_ID="AKIAXXXX" \
  STN_CODE_AWS_SECRET_ACCESS_KEY="secret" \
  STN_CODE_TERRAFORM_TOKEN="tf-xxx" \
  --app your-station-app

Common Use Cases

| Tool         | Environment Variables                                                                    |
|--------------|------------------------------------------------------------------------------------------|
| AWS CLI      | STN_CODE_AWS_ACCESS_KEY_ID, STN_CODE_AWS_SECRET_ACCESS_KEY, STN_CODE_AWS_DEFAULT_REGION   |
| Terraform    | STN_CODE_TF_TOKEN_app_terraform_io, STN_CODE_AWS_*                                        |
| kubectl      | STN_CODE_KUBECONFIG (base64 encoded)                                                      |
| GitHub CLI   | STN_CODE_GITHUB_TOKEN                                                                      |
| Google Cloud | STN_CODE_GOOGLE_APPLICATION_CREDENTIALS                                                    |
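
Note that the kubectl entry injects KUBECONFIG as base64-encoded content, not a file path, so the agent has to materialize it first. A rough sketch of what that might look like inside the sandbox (the exact variable handling is an assumption):
sandbox_exec({"command": "echo \"$KUBECONFIG\" | base64 -d > /workspace/kubeconfig.yaml"})
sandbox_exec({"command": "KUBECONFIG=/workspace/kubeconfig.yaml kubectl get nodes"})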

Example: Terraform Agent

---
model: openai/gpt-4o
metadata:
  name: "Terraform Agent"
sandbox:
  mode: code
  session: workflow
  image: hashicorp/terraform:latest
---

You can run Terraform commands. The AWS credentials and Terraform token 
are pre-configured in the environment.

Example workflow:
1. sandbox_open({})
2. sandbox_fs_write({"path": "main.tf", "content": "..."})
3. sandbox_exec({"command": "terraform init"})
4. sandbox_exec({"command": "terraform plan"})
Never hardcode secrets in agent prompts or workflow definitions. Always use STN_CODE_* environment variables.

Session Persistence (Fly Machines)

When running on Fly.io, sandbox sessions are persisted to NATS KV store. This enables sessions to survive Station restarts.

How It Works

Station Start
   │
   ▼
┌─────────────────────────────────────┐
│  NATS KV Store (sessions bucket)    │
│  ┌─────────────────────────────┐    │
│  │ fly_abc123 → Machine ID     │    │
│  │ fly_def456 → Machine ID     │    │
│  └─────────────────────────────┘    │
└─────────────────────────────────────┘
   │
   ▼
SessionManager loads from KV
   │
   ▼
Fly Machines still running ✓

Benefits

| Scenario             | Without Persistence            | With NATS KV        |
|----------------------|--------------------------------|---------------------|
| Station restart      | Sessions lost, orphan machines | Sessions recovered  |
| Workflow continues   | Fails (session not found)      | Works seamlessly    |
| Multi-step workflows | Must complete before restart   | Can span restarts   |

Configuration

Session persistence is automatic when using the Fly Machines backend. No additional configuration is required.
sandbox:
  enabled: true
  backend: fly_machines
  # Session persistence enabled automatically

Session Recovery

On Station startup, existing Fly Machines are automatically recovered:
[INFO] Session store: Initialized with NATS KV (bucket: sandbox_sessions)
[INFO] Recovered 3 sandbox sessions from NATS KV
Sessions are keyed by workflow run ID for workflow-scoped sessions, ensuring the same sandbox is reused across workflow steps even after Station restarts.

Security

Both modes provide isolation:
  • Unprivileged containers (no --privileged)
  • No Docker socket access from within sandbox
  • Network disabled by default
  • Resource limits enforced
  • Timeout protection
Enable allow_network: true only when necessary. Network access allows the sandbox to reach external services.
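
For example, a Compute Mode agent that needs to call external APIs would opt in explicitly (sketch):
sandbox:
  runtime: python
  allow_network: true
  pip_packages:
    - requests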

File Staging

Stage files between your local system and sandbox containers via NATS Object Store.
File staging solves the challenge of getting large files into sandboxes without passing content through LLM context (which is slow, expensive, and size-limited).

How It Works

              upload                      stage_file
Local Files ─────────▶ NATS Object Store ─────────▶ Sandbox Container
            ◀─────────                   ◀─────────
              download                    publish_file

File Staging Tools

When Code Mode is enabled, agents get two additional tools:
| Tool                  | Description                                |
|-----------------------|--------------------------------------------|
| sandbox_stage_file    | Fetch file from store → write to sandbox   |
| sandbox_publish_file  | Read from sandbox → upload to store        |

Example Workflow

# 1. Upload input file
$ stn files upload data.csv
Uploaded: files/f_01JGXYZ123ABC (2.4 MB)

# 2. Run workflow with file reference
$ stn workflow run csv-pipeline --input '{"input_file": "files/f_01JGXYZ123ABC"}'

# 3. Agent stages file into sandbox
sandbox_stage_file({
  "sandbox_id": "ses_abc123",
  "file_key": "files/f_01JGXYZ123ABC",
  "destination": "input/data.csv"
})

# 4. Agent processes file
sandbox_exec({"cmd": ["python", "transform.py", "input/data.csv", "output/result.csv"]})

# 5. Agent publishes output
sandbox_publish_file({
  "sandbox_id": "ses_abc123",
  "source": "output/result.csv"
})
# Returns: {"file_key": "files/f_01JGDEF456XYZ"}

# 6. Download result
$ stn files download f_01JGDEF456XYZ -o result.csv

Tool Reference

sandbox_stage_file fetches a file from NATS Object Store and writes it to the sandbox.
| Parameter   | Type   | Required | Description                      |
|-------------|--------|----------|----------------------------------|
| sandbox_id  | string | Yes      | Session ID from sandbox_open     |
| file_key    | string | Yes      | File key (e.g., files/f_abc123)  |
| destination | string | Yes      | Path relative to /workspace      |
{
  "sandbox_id": "ses_abc123",
  "file_key": "files/f_01JGXYZ123ABC",
  "destination": "input/data.csv"
}
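
sandbox_publish_file works in the opposite direction. Its parameters aren't tabulated on this page; based on the example workflow above, it takes the session ID and a source path, and returns the file_key of the uploaded object:
{
  "sandbox_id": "ses_abc123",
  "source": "output/result.csv"
}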

File Key Conventions

| Pattern                  | Description       | Lifecycle               |
|--------------------------|-------------------|-------------------------|
| files/{file_id}          | User uploads      | Permanent until deleted |
| runs/{run_id}/output/*   | Workflow outputs  | Auto-cleanup after TTL  |
| sessions/{session_id}/*  | Session artifacts | Cleanup with session    |

CLI File Management

See the full stn files CLI reference for upload, download, list, and delete commands.
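
The upload and download commands are shown in the example workflow above; list and delete follow the same pattern (exact flags are not documented here, shown as an assumption):
stn files list
stn files delete f_01JGDEF456XYZ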

Next Steps