Shell access with a leash
Naomi can run shell commands, edit her own code, and propose PRs. But every command runs through a blocklist, a jail, and a timeout.
At some point, an AI agent that can only call pre-built tools hits a ceiling. Naomi can generate images, create videos, research trends, and manage posts — but what happens when she needs to process a file with ffmpeg? Or run a Python script to analyze her own performance data? Or test a change to her own prompt logic?
The pre-built tools don't cover every edge case. A shell does.
The bet
Giving an AI agent shell access is a bet that your guardrails are better than the agent's creativity. It's a meaningful risk — one bad command and you've deleted a database, exposed secrets, or bricked a server. The question isn't whether to take the risk. It's how to bound it.
We built a sandboxed shell with three layers of protection.
Layer 1: Command blocklist. Before any command executes, it's checked against a pattern list. sudo — blocked. rm -rf / — blocked. Fork bombs — blocked. Disk formatting — blocked. The blocklist is deliberately conservative. If a legitimate command happens to match a dangerous pattern, it's better to block it and let Naomi rephrase than to let a destructive command through.
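A minimal sketch of what a pattern-based blocklist check can look like. The patterns, error type, and function name here are illustrative assumptions, not Naomi's actual code:

```python
import re

# Illustrative dangerous-command patterns (assumed, not the real list).
BLOCKED_PATTERNS = [
    r"\bsudo\b",                        # privilege escalation
    r"rm\s+(-[a-zA-Z]*\s+)*/(\s|$)",    # rm -rf / and close variants
    r":\(\)\s*\{.*\};\s*:",             # classic bash fork bomb
    r"\bmkfs(\.\w+)?\b",                # disk formatting
]

class BlockedCommandError(Exception):
    pass

def check_command(command: str) -> None:
    """Raise before execution if the command matches any dangerous pattern."""
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, command):
            raise BlockedCommandError(f"blocked by pattern: {pattern!r}")

check_command("ffmpeg -i clip.mp4 out.gif")   # passes
# check_command("sudo rm -rf /")              # raises BlockedCommandError
```

The conservative bias falls out of the design: a false positive costs one rephrased command, a false negative can cost the workspace.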
Layer 2: Path jailing. Every file operation is resolved relative to the workspace root. If Naomi tries to read /etc/passwd or write to /usr/bin, the path resolver catches the escape attempt and raises a PermissionError. She can't see or touch anything outside her workspace.
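Roughly, the jail is a resolve-then-compare check. A sketch, assuming a fixed workspace root and a made-up resolver name:

```python
from pathlib import Path

WORKSPACE_ROOT = Path("/workspaces/naomi").resolve()  # assumed location

def resolve_in_workspace(user_path: str) -> Path:
    """Resolve a path against the workspace root and refuse escapes."""
    candidate = (WORKSPACE_ROOT / user_path).resolve()
    if not candidate.is_relative_to(WORKSPACE_ROOT):
        raise PermissionError(f"path escapes workspace: {user_path}")
    return candidate

resolve_in_workspace("data/metrics.csv")      # ok, stays inside the jail
# resolve_in_workspace("/etc/passwd")         # raises PermissionError
# resolve_in_workspace("../../usr/bin/env")   # raises PermissionError
```

Resolving first means both absolute paths and `..` traversal collapse to their real target before the containment check runs.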
Layer 3: Resource limits. Every command gets a 30-second timeout. Output is capped at 100KB. File reads are capped at 1MB. These limits prevent runaway processes and context-flooding attacks where a command dumps gigabytes into the conversation.
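The limits above translate naturally into an async subprocess wrapper. A sketch under those assumptions; the helper name and exact truncation behavior are mine:

```python
import asyncio

TIMEOUT_SECONDS = 30                   # per-command timeout
MAX_OUTPUT_BYTES = 100 * 1024          # 100KB of combined stdout/stderr
MAX_FILE_READ_BYTES = 1024 * 1024      # 1MB per file read

async def run_limited(command: str) -> str:
    """Run a command with a hard timeout and a capped, truncated output."""
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    try:
        output, _ = await asyncio.wait_for(proc.communicate(), TIMEOUT_SECONDS)
    except asyncio.TimeoutError:
        proc.kill()
        return "[command timed out after 30s]"
    # Truncate rather than flood the agent's conversation context.
    return output[:MAX_OUTPUT_BYTES].decode(errors="replace")
```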
The workspace
Each account gets a dedicated workspace directory with folders for content, scripts, data, and internal state. Naomi can create files, read files, run scripts, and organize her workspace however she wants. The workspace persists across sessions, so she can pick up where she left off.
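Provisioning is the boring part, which is the point. A rough sketch of per-account workspace setup; the folder names approximate the description above and the function is hypothetical:

```python
from pathlib import Path

WORKSPACE_FOLDERS = ("content", "scripts", "data", "state")  # assumed names

def ensure_workspace(root: Path, account_id: str) -> Path:
    """Create the account workspace and its standard folders if missing."""
    workspace = root / account_id
    for folder in WORKSPACE_FOLDERS:
        (workspace / folder).mkdir(parents=True, exist_ok=True)
    return workspace

# ensure_workspace(Path("/workspaces"), "naomi")
```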
Self-modification
The most interesting capability is code editing. Naomi can propose changes to her own source code through a git worktree workflow (sketched in code after the list):
- Create a worktree branch
- Edit files in the isolated worktree
- Read and diff her changes
- Run tests against the modified code
- Propose a PR
The changes happen in an isolated worktree — not in the running codebase. A human reviews and merges the PR. Naomi can propose, but she can't ship.
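Stripped of the tool plumbing, the workflow is a handful of git invocations. A sketch with illustrative branch and path names; the real tool wraps these steps, this is not its implementation:

```python
import subprocess

def git(*args: str, cwd: str = ".") -> str:
    """Run a git command and return its stdout."""
    return subprocess.run(["git", *args], cwd=cwd, check=True,
                          capture_output=True, text=True).stdout

# 1. Create an isolated worktree on a new branch (names are examples).
git("worktree", "add", "../naomi-proposal", "-b", "naomi/proposal-branch")
# 2. Edit files inside ../naomi-proposal, then inspect the diff.
print(git("diff", cwd="../naomi-proposal"))
# 3. Run the test suite against the modified code.
subprocess.run(["pytest"], cwd="../naomi-proposal", check=False)
# 4. Push the branch; a human reviews and merges the PR.
git("push", "origin", "naomi/proposal-branch", cwd="../naomi-proposal")
```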
Pluggable backends
The sandbox runs on a pluggable backend protocol. In development, it's LocalSandboxBackend: filesystem operations and async subprocesses on the host machine. In production, it's E2BSandboxBackend, which runs every command inside a Firecracker microVM for OS-level isolation. Same tools API, different execution environment.
The tools don't know which backend they're running on. They call the workspace manager and get back a result. Whether that command ran in a subprocess or a container is an infrastructure concern, not an agent concern.
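A hypothetical shape for that protocol. The method names and stub bodies below are assumptions; only the two backend class names come from the post:

```python
import asyncio
from typing import Protocol

class SandboxBackend(Protocol):
    async def run_command(self, command: str, timeout: int) -> str: ...

class LocalSandboxBackend:
    """Development: async subprocess on the host machine."""
    async def run_command(self, command: str, timeout: int) -> str:
        proc = await asyncio.create_subprocess_shell(
            command,
            stdout=asyncio.subprocess.PIPE,
            stderr=asyncio.subprocess.STDOUT)
        out, _ = await asyncio.wait_for(proc.communicate(), timeout)
        return out.decode(errors="replace")

class E2BSandboxBackend:
    """Production: same interface, commands run inside a Firecracker microVM."""
    async def run_command(self, command: str, timeout: int) -> str:
        raise NotImplementedError("delegates to the E2B sandbox in production")

# Tools depend only on the protocol, never on a concrete backend.
async def run_in_sandbox(backend: SandboxBackend, command: str) -> str:
    return await backend.run_command(command, timeout=30)
```

Keeping the tools ignorant of the backend is what makes the swap from subprocess to microVM a deployment decision rather than a code change.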
The honest version
Shell access makes me nervous. Every time I see Naomi run a command I didn't expect, there's a moment of "wait, should she be able to do that?" The answer is usually yes — she's using the tools we gave her to solve a problem we didn't anticipate. That's the point.
The leash isn't there because we don't trust Naomi. It's there because trust is earned incrementally, and the system needs to be safe while we learn what the right trust boundary actually is.