The code sandbox that isolates execution of the agent's own generated code (the run_code tool).

Code Sandbox

The code sandbox isolates the execution of agent-generated code, specifically the `run_code` tool, which executes dynamically generated scripts. Molecule AI has no user code submission, so the sandbox does not exist for user-submitted code: it is the agent's own generated code that needs sandboxing.

What Gets Sandboxed

Operation	Runs in	Why
Agent-generated code execution	Sandbox	e.g. "write and run this script"
pip installs from skill requirements	Sandbox	Untrusted package code
Filesystem writes outside `/memory` and `/configs`	Sandbox	Prevent container escape
`SKILL.md` loading	Workspace container	Just file reads
LangChain `@tool` functions	Workspace container	Just Python function calls
A2A HTTP calls to peers	Workspace container	Network calls to known endpoints
Platform heartbeat/registry calls	Workspace container	Known endpoints

The sandbox only activates when the agent calls a run_code tool that executes dynamic code. Regular skill tools — API calls, file reads, data processing — run directly in the workspace container without sandbox overhead.

Configuration

# config.yaml
tier: 3
sandbox:
  backend: docker    # docker | firecracker | e2b | none
  memory_limit: 256m
  cpu_limit: 0.5
  network: false
  timeout: 30s
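
A minimal sketch of how a loader might turn the `sandbox` block into typed values. The `SandboxConfig` dataclass and the `parse_memory` and `parse_timeout` helpers are hypothetical names, not part of Molecule AI. The unit suffixes handled here (`k`, `m`, `g` for memory; `s`, `m` for time) are an assumption about the format. A plain dict stands in for the parsed YAML so the example has no third-party dependency.

```python
from dataclasses import dataclass

# Multipliers for memory suffixes (assumed format, as in "256m").
_MEMORY_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}
# Multipliers for timeout suffixes (assumed format, as in "30s").
_TIME_UNITS = {"s": 1.0, "m": 60.0}
# Backend names listed in the config.yaml comment.
_KNOWN_BACKENDS = {"docker", "firecracker", "e2b", "none"}


def parse_memory(value: str) -> int:
    """Convert a string such as "256m" to a byte count."""
    value = value.strip().lower()
    if value and value[-1] in _MEMORY_UNITS:
        return int(value[:-1]) * _MEMORY_UNITS[value[-1]]
    return int(value)  # no suffix: treat as bytes


def parse_timeout(value: str) -> float:
    """Convert a string such as "30s" to seconds."""
    value = value.strip().lower()
    if value and value[-1] in _TIME_UNITS:
        return float(value[:-1]) * _TIME_UNITS[value[-1]]
    return float(value)  # no suffix: treat as seconds


@dataclass(frozen=True)
class SandboxConfig:
    backend: str
    memory_limit_bytes: int
    cpu_limit: float
    network: bool
    timeout_seconds: float

    @classmethod
    def from_dict(cls, raw: dict) -> "SandboxConfig":
        backend = raw["backend"]
        if backend not in _KNOWN_BACKENDS:
            raise ValueError(f"unknown sandbox backend: {backend!r}")
        return cls(
            backend=backend,
            memory_limit_bytes=parse_memory(str(raw["memory_limit"])),
            cpu_limit=float(raw["cpu_limit"]),
            network=bool(raw["network"]),
            timeout_seconds=parse_timeout(str(raw["timeout"])),
        )


# Stand-in for the parsed `sandbox:` section of config.yaml.
config = SandboxConfig.from_dict(
    {"backend": "docker", "memory_limit": "256m", "cpu_limit": 0.5,
     "network": False, "timeout": "30s"}
)
```

Validating the backend name at load time means a typo in `config.yaml` fails early instead of at the first `run_code` call.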

Sandbox by Tier

Tier	`sandbox.backend`	Reason
1, 2	`none`	No `run_code` tool available — tools are just API calls
3	`docker` (MVP), `firecracker` or `e2b` (production)	Agent can generate and run code
4	`none`	Full-host access tier — no extra sandbox boundary is added by default

Tier 4 doesn't add a second sandbox by default because the workspace already runs with host-level privileges. If you need isolated code execution at that tier, treat it as an explicit defense-in-depth decision rather than an assumption baked into the current provisioner.
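
The tier rules above can be sketched as a small lookup. This is an illustration only: `resolve_backend` and the two constants are hypothetical names, and rejecting `none` at Tier 3 is an assumption drawn from the table, not confirmed provisioner behavior.

```python
from typing import Optional

# Default sandbox backend per workspace tier, mirroring the table.
DEFAULT_BACKEND_BY_TIER = {1: "none", 2: "none", 3: "docker", 4: "none"}
# Backends that actually add an isolation boundary.
ISOLATING_BACKENDS = {"docker", "firecracker", "e2b"}


def resolve_backend(tier: int, configured: Optional[str] = None) -> str:
    """Return the backend to use for a tier, validating any explicit choice."""
    if tier not in DEFAULT_BACKEND_BY_TIER:
        raise ValueError(f"unknown tier: {tier}")
    if configured is None:
        return DEFAULT_BACKEND_BY_TIER[tier]
    if tier in (1, 2) and configured != "none":
        # Tiers 1 and 2 have no run_code tool, so there is nothing to sandbox.
        raise ValueError(f"tier {tier} has no run_code tool to sandbox")
    if tier == 3 and configured not in ISOLATING_BACKENDS:
        # Assumption: Tier 3 must always isolate generated code.
        raise ValueError("tier 3 requires docker, firecracker, or e2b")
    if tier == 4 and configured not in ISOLATING_BACKENDS | {"none"}:
        raise ValueError(f"unknown sandbox backend: {configured!r}")
    # At Tier 4 an isolating backend is an explicit defense-in-depth choice.
    return configured
```

For example, `resolve_backend(4)` returns `"none"`, while `resolve_backend(4, "docker")` opts in to an extra boundary.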

How It Works (Tier 3)

Each code execution spawns a throwaway container:

1. The agent calls `run_code(code="import pandas as pd; ...")`.
2. The sandbox creates a temporary Docker container (Docker-in-Docker).
3. The container runs with the network disabled, memory capped, a read-only filesystem, and CPU limited.
4. The code executes inside the throwaway container.
5. Output (stdout, stderr, return value) is captured.
6. The throwaway container is destroyed immediately afterwards.

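import docker
from langchain_core.tools import tool

client = docker.from_env()  # Docker SDK for Python client

@tool(description="Execute code safely")
async def run_code(code: str) -> dict:
    # With the default detach=False, containers.run() blocks until the
    # container exits and returns its logs as bytes.
    output = client.containers.run(
        image="python:3.11-slim",
        command=["python", "-c", code],
        remove=True,             # destroy the container after the run
        network_disabled=True,   # sandbox.network: false
        mem_limit="256m",        # sandbox.memory_limit
        nano_cpus=500_000_000,   # sandbox.cpu_limit: 0.5 (units of 1e-9 CPUs)
        read_only=True,          # read-only root filesystem
        stderr=True,             # capture stderr along with stdout
    )
    return {"output": output.decode()}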

The workspace container itself is never at risk — the generated code can't escape the sandbox.
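
As a rough sketch, the limits from `config.yaml` could be translated into keyword arguments for `client.containers.run` in the Docker SDK for Python. The keyword names (`mem_limit`, `nano_cpus`, `network_disabled`, `read_only`, `remove`) are the SDK's; the `docker_run_kwargs` helper and its defaults are hypothetical. The timeout has no direct `containers.run` keyword, so the caller must enforce it separately.

```python
def docker_run_kwargs(code: str, sandbox: dict) -> dict:
    """Build keyword arguments for docker-py's `client.containers.run`.

    `sandbox` is the parsed `sandbox:` section of config.yaml. The defaults
    below mirror the example config and are an assumption.
    """
    cpu_limit = float(sandbox.get("cpu_limit", 0.5))
    return {
        "image": "python:3.11-slim",
        "command": ["python", "-c", code],
        "remove": True,  # throwaway container: delete it after the run
        # The config says whether networking is allowed; the SDK flag is inverted.
        "network_disabled": not sandbox.get("network", False),
        "mem_limit": sandbox.get("memory_limit", "256m"),
        # docker-py expresses CPU quota in units of 1e-9 CPUs.
        "nano_cpus": int(cpu_limit * 1_000_000_000),
        "read_only": True,  # read-only root filesystem
    }


kwargs = docker_run_kwargs(
    "print('hi')",
    {"memory_limit": "256m", "cpu_limit": 0.5, "network": False},
)
```

Keeping this translation in a pure function makes the mapping easy to unit-test without a Docker daemon.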

Key Properties

Skill code never changes when the backend is swapped; only the backend config does
Each execution is isolated, with no shared state between runs
Containers are destroyed after every run
Network is disabled by default (can be enabled per-sandbox if needed)
Memory is capped to prevent resource exhaustion
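
One way to keep skill code unchanged while backends are swapped is a small registry keyed by the configured backend name. The sketch below is illustrative only: `register_backend`, `run_code`, and the `local-demo` backend are hypothetical. `local-demo` runs a plain child process, which is not an isolation boundary; it stands in for a real backend so the example runs anywhere.

```python
import subprocess
import sys
from typing import Callable, Dict

# A backend is any callable that takes source code and returns captured output.
Backend = Callable[[str], dict]
_REGISTRY: Dict[str, Backend] = {}


def register_backend(name: str) -> Callable[[Backend], Backend]:
    """Decorator that registers a backend under a config name."""
    def decorator(fn: Backend) -> Backend:
        _REGISTRY[name] = fn
        return fn
    return decorator


@register_backend("local-demo")
def run_in_subprocess(code: str) -> dict:
    """Illustration only: a child Python process, NOT a sandbox."""
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        timeout=30,  # mirrors `timeout: 30s` from the example config
    )
    return {"stdout": proc.stdout, "stderr": proc.stderr,
            "exit_code": proc.returncode}


def run_code(code: str, backend: str) -> dict:
    """Skill-facing entry point; `backend` would come from config.yaml."""
    try:
        runner = _REGISTRY[backend]
    except KeyError:
        raise ValueError(f"no sandbox backend registered as {backend!r}") from None
    return runner(code)


result = run_code("print(2 + 2)", backend="local-demo")
```

A skill only ever calls `run_code`; moving from one backend to another is then a one-line config change.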

Workspace Tiers — Which tiers need sandboxing
Config Format — Sandbox configuration in config.yaml
Provisioner — Container deployment details
Skills — Skill tools that may use the sandbox
