Claude Code In a Box

I want to think of Claude Code itself as a deployable general agent.

Dot Claude

Claude Code, as the harness, can run any combination of commands, agents, skills, output-styles, and hooks, until it returns the desired output, in any format.

The instructions that drive this output, written mostly in natural language, reside in the .claude/ folder. This can be viewed as a complete, portable agent definition.

portable-agent-definition

.claude/
├── settings.json      # Model, permissions, hooks, output-style
├── commands/          # Slash commands
├── agents/            # Subagent definitions
├── hooks/             # Event triggers
├── output-styles/     # Persona overrides, generalization
└── skills/
    └── any-skill/       # Behaviors, workflows
        ├── SKILL.md       # Instructions, workflows
        ├── references/    # Context
        ├── templates/     # Response templates
        └── scripts/       # Scripts, tools

This means you can draft, iterate, and test locally on your machine. Once it is stable, you can structure a simple workflow like this:

process-or-behavior

workflow/
├── .claude/   # complex tasks to be performed and verified
├── inputs/    # request, data
├── outputs/   # response
└── CLAUDE.md  # guidance, memory
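For local drafting, the two layouts above can be scaffolded in a few lines. This is a minimal sketch that only creates empty placeholders: the directory names come from the trees above, while the stub contents (an empty settings.json, a SKILL.md with `name` and `description` frontmatter, a one-line CLAUDE.md) are assumptions to be replaced with your own instructions.

```python
from pathlib import Path
import tempfile

# Directory names are taken from the two layouts above.
CLAUDE_DIRS = [
    "commands",
    "agents",
    "hooks",
    "output-styles",
    "skills/any-skill/references",
    "skills/any-skill/templates",
    "skills/any-skill/scripts",
]
WORKFLOW_DIRS = ["inputs", "outputs"]

# Placeholder text only (an assumption); a real skill holds your own workflow.
SKILL_STUB = """---
name: any-skill
description: Placeholder skill; replace with a real description.
---

Describe the workflow steps here.
"""


def scaffold_workflow(root: Path) -> Path:
    """Create an empty workflow/ skeleton under root and return its path."""
    workflow = root / "workflow"
    for rel in CLAUDE_DIRS:
        (workflow / ".claude" / rel).mkdir(parents=True, exist_ok=True)
    for rel in WORKFLOW_DIRS:
        (workflow / rel).mkdir(parents=True, exist_ok=True)
    (workflow / ".claude" / "settings.json").write_text("{}\n")
    skill_md = workflow / ".claude" / "skills" / "any-skill" / "SKILL.md"
    skill_md.write_text(SKILL_STUB)
    (workflow / "CLAUDE.md").write_text("# Guidance and memory for this workflow\n")
    return workflow


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_workflow(Path(tmp))
        for path in sorted(created.rglob("*")):
            print(path.relative_to(created))
```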

You can run the agent with something like this:

claude -p "follow these instructions"

Those instructions are, of course, all contained in .claude/. Through prompt chaining, tool calling, progressive disclosure (sequenced instructions in, for example, a Skill), or asynchronous, multi-threaded processes (parallel agents), Claude Code completes the work and writes it to outputs/.
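If you want to trigger that run from a script rather than a terminal, a thin wrapper might look like the following. It is a sketch under a few assumptions: the `claude` CLI is installed and on your PATH, the function names and the ten-minute timeout are hypothetical, and the only flag used is the `-p` shown above.

```python
import subprocess
from pathlib import Path


def build_command(prompt: str) -> list[str]:
    """Return the argv for a single non-interactive Claude Code run."""
    # -p runs the prompt once and exits instead of opening a session.
    return ["claude", "-p", prompt]


def run_workflow(workflow_dir: Path, prompt: str, timeout_s: int = 600) -> str:
    """Run Claude Code inside workflow_dir and return whatever it printed."""
    result = subprocess.run(
        build_command(prompt),
        cwd=workflow_dir,      # run where .claude/ and CLAUDE.md live
        capture_output=True,
        text=True,
        timeout=timeout_s,     # hypothetical limit; tune to your workflow
        check=True,            # raise if the CLI exits non-zero
    )
    return result.stdout


def list_outputs(workflow_dir: Path) -> list[str]:
    """Names of the files currently in outputs/ (empty if it is missing)."""
    outputs = workflow_dir / "outputs"
    if not outputs.is_dir():
        return []
    return sorted(p.name for p in outputs.iterdir() if p.is_file())
```

Calling `run_workflow(Path("workflow"), "follow these instructions")` and then `list_outputs(Path("workflow"))` mirrors the manual loop. Because nobody is present to approve tool use in a non-interactive run, the permissions in settings.json probably need to allow whatever the workflow writes or executes.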

Think of Docker as an analogy (however imprecise and flawed): as your agent runs locally, so it will run in the cloud.

Following that analogy, once a workflow is stable and verified, you can deploy Claude Code + .claude/ in a wrapper on a virtual machine, a sandbox. A box.

Box

In the example that follows, the box runs on an e2b sandbox,¹ much like "Claude on the Web" runs, but with a custom .claude/.

All of the prompts, context, and tools reside in the .claude/ folder, so we can simply deploy Claude Code as the application, here with a React UI, a thin FastAPI layer, and a BOX – a remote sandbox running Claude Code with a copied .claude/ folder.

box-architecture

Example

Mortals is a character generator for the game Cupid.

To generate characters for this game, I combined sequenced prompts (progressive disclosure) and tools (uv inline scripts) into a workflow: Claude Code generates birth date information at random and uses it to produce an astrological natal chart as an SVG. It then writes a complete biography of the character based on the chart and generates a portrait with Nano Banana.

Like this:

This is a sped-up screen recording of a roughly two-minute process using this .claude/:

mortals
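To illustrate the kind of tool such a workflow calls, the random birth data step could be a uv inline script along these lines. This is a sketch, not the script Mortals actually uses: the function name, the year range, and the latitude and longitude fields are assumptions. The comment header is standard inline script metadata (PEP 723), which uv reads when the file is run with `uv run`.

```python
# /// script
# requires-python = ">=3.10"
# dependencies = []
# ///
"""Emit random birth data as JSON for a later chart-building step."""
import json
import random
from datetime import datetime, timedelta


def random_birth(rng: random.Random, start_year: int = 1960, end_year: int = 2005) -> dict:
    """Pick a random minute between Jan 1 of start_year and Dec 31 of end_year."""
    start = datetime(start_year, 1, 1)
    end = datetime(end_year + 1, 1, 1)
    minutes = int((end - start).total_seconds() // 60)
    moment = start + timedelta(minutes=rng.randrange(minutes))
    return {
        "date": moment.strftime("%Y-%m-%d"),
        "time": moment.strftime("%H:%M"),
        # Hypothetical birthplace coordinates; a real tool might pick a city.
        "latitude": round(rng.uniform(-60.0, 70.0), 4),
        "longitude": round(rng.uniform(-180.0, 180.0), 4),
    }


if __name__ == "__main__":
    # Print JSON to stdout so the next step in the skill can read it.
    print(json.dumps(random_birth(random.Random()), indent=2))
```

Keeping each tool as a single file with its dependencies declared inline means the script travels with the skill's scripts/ folder and needs no separate environment setup.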

When you initiate a session, the server uploads the .claude/ folder to the e2b (box) instance, manages its lifecycle, pipes the streams to the UI, and logs the session as files.² Webhooks trigger UI state changes when files are saved to the box.

This makes the application scalable, with on-demand cost.³

Change the .claude/ folder, and the application assumes a completely different role, function, and set of skills. Same infrastructure.
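The session lifecycle described above (upload, run, stream, log, tear down) can be sketched without committing to any particular sandbox SDK. Everything here is hypothetical: `Box` is an illustrative interface, not the e2b API, and `FakeBox` is an in-memory stand-in so the sketch runs on its own. A real implementation would wrap the sandbox provider's client behind the same three methods.

```python
from pathlib import Path
from typing import Iterator, Protocol
import tempfile


class Box(Protocol):
    """Hypothetical sandbox interface; NOT the e2b SDK."""

    def write_file(self, path: str, data: bytes) -> None: ...
    def run(self, command: list[str]) -> Iterator[str]: ...
    def close(self) -> None: ...


class FakeBox:
    """In-memory stand-in so the sketch runs without a real sandbox."""

    def __init__(self) -> None:
        self.files: dict[str, bytes] = {}
        self.closed = False

    def write_file(self, path: str, data: bytes) -> None:
        self.files[path] = data

    def run(self, command: list[str]) -> Iterator[str]:
        yield f"(fake) would run: {' '.join(command)}"

    def close(self) -> None:
        self.closed = True


def upload_dot_claude(box: Box, dot_claude: Path) -> int:
    """Copy every file under a local .claude/ folder into the box."""
    count = 0
    for path in sorted(dot_claude.rglob("*")):
        if path.is_file():
            rel = path.relative_to(dot_claude).as_posix()
            box.write_file(f".claude/{rel}", path.read_bytes())
            count += 1
    return count


def run_session(box: Box, dot_claude: Path, prompt: str, log_path: Path) -> Iterator[str]:
    """Upload, run, pass each output line to the caller, and log it."""
    try:
        upload_dot_claude(box, dot_claude)
        with log_path.open("a", encoding="utf-8") as log:
            for line in box.run(["claude", "-p", prompt]):
                log.write(line + "\n")
                yield line  # an API layer would forward this to the UI
    finally:
        box.close()  # release the sandbox even if the run fails


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / ".claude").mkdir()
        (root / ".claude" / "settings.json").write_text("{}\n")
        box = FakeBox()
        for line in run_session(box, root / ".claude", "follow these instructions", root / "session.log"):
            print(line)
        print(sorted(box.files), box.closed)
```

In this sketch, changing the application's role amounts to passing a different `dot_claude` path; `run_session` itself stays the same.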

What Makes This Distinct

(assessment by Opus 4.5)

Existing Work	This Approach
Claude Code as coding assistant	Claude Code as any agentic workflow
`.claude/` for dev workflow optimization	`.claude/` as complete portable agent definition
Configuration as implementation detail	Configuration as the application itself
Technical infrastructure focus	Agent-native philosophy (NL → outcomes)

Existing implementations are infrastructure-focused ("run Claude Code remotely"). This framing is architectural: applications expressed in natural language via .claude/ folders. It's a development paradigm, not just a deployment pattern.

Footnotes

¹ I first discovered e2b in this video by IndyDevDan. He claimed it was best in class; I looked no further. In his practice, he's scaling compute for development tasks, but here it's presented as a scalable deployment architecture for agent-native applications.
² This can be made completely private. All interactions could stay on the remote secure box without writing to the main server. A "ballot-box," for example. (We are currently logging for the observability of the agent's work.)
³ A note on cost: yes, this is a bit more expensive. But building out the system above as a DAG would require many more engineering resources, which could cost more than inference. The approach also emphasizes English over code, which makes it more accessible to non-coders. It is intended for small to medium-sized deployments. If, and only if, your agent becomes more broadly useful, the orchestration and, importantly, the prompts, context, and tools that make up the .claude/ folder can serve as a template for more "traditional" agentic systems like Datalumina's GenAI Launchpad.
