Workspace Lifecycle & Provisioning
Workspace state machine, provisioning and container lifecycle, and the workspace runtime.
Part of the Comprehensive Technical Documentation. Definitive reference based on a non-invasive scan of the molecule-core repository.
5. Workspace Lifecycle
State Machine
provisioning → online ↔ degraded
     ↓           ↓         ↓
   failed     offline   offline
     ↓           ↓
   retry  (auto-restart)
  ↓ (any state)
paused → (user resumes) → provisioning
  ↓ (any state)
removed
Status Definitions
| Status | Meaning | Canvas Indicator |
|---|---|---|
| provisioning | Waiting for first heartbeat | Spinner |
| online | Heartbeat received, reachable | Green dot |
| degraded | Online but error_rate ≥ 0.5 | Yellow node with warning |
| offline | Heartbeat TTL expired, unreachable | Gray node |
| paused | User paused, container stopped, config preserved | Indigo badge |
| failed | Provisioning timeout or launch error | Red node + retry button |
| removed | Deleted, kept for event log history | Node removed from Canvas |
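The state diagram and status table above can be read as a small transition map. The sketch below is illustrative only: the `Status` enum and `can_transition` helper are hypothetical names, and it assumes that both retry (from failed) and auto-restart (from offline) re-enter provisioning and that removed is terminal.

```python
from enum import Enum


class Status(str, Enum):
    PROVISIONING = "provisioning"
    ONLINE = "online"
    DEGRADED = "degraded"
    OFFLINE = "offline"
    PAUSED = "paused"
    FAILED = "failed"
    REMOVED = "removed"


# Edges read off the diagram. Assumption: retry (from failed) and
# auto-restart (from offline) both lead back to provisioning.
TRANSITIONS = {
    Status.PROVISIONING: {Status.ONLINE, Status.FAILED},
    Status.ONLINE: {Status.DEGRADED, Status.OFFLINE},
    Status.DEGRADED: {Status.ONLINE, Status.OFFLINE},
    Status.OFFLINE: {Status.PROVISIONING},
    Status.FAILED: {Status.PROVISIONING},
    Status.PAUSED: {Status.PROVISIONING},
    Status.REMOVED: set(),
}


def can_transition(current: Status, target: Status) -> bool:
    """Return True if the diagram allows moving from current to target."""
    if current is Status.REMOVED:
        return False  # assumption: removed is a terminal state
    if target is Status.REMOVED:
        return True  # "any state" may be removed
    if target is Status.PAUSED:
        return current is not Status.PAUSED  # "any state" may be paused
    return target in TRANSITIONS[current]
```

For example, `can_transition(Status.PROVISIONING, Status.DEGRADED)` is false under this reading, because a workspace has to come online before it can be marked degraded.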
Health Detection (Three Layers)
| Layer | Mechanism | Interval | Trigger |
|---|---|---|---|
| Passive | Redis TTL expiry | 60s heartbeat key | Liveness monitor callback |
| Proactive | Docker API poll | Every 15s | Health sweep goroutine |
| Reactive | A2A proxy connection error | On-demand | provisioner.IsRunning() check |
All three layers call onWorkspaceOffline() → broadcast WORKSPACE_OFFLINE + auto-restart.
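To illustrate the passive layer, here is a minimal in-memory stand-in for heartbeat keys with a 60s TTL. It is not the platform's code: `LivenessMonitor`, `heartbeat`, and `sweep` are hypothetical names, a plain dict replaces Redis, and the clock is passed in explicitly so the behavior is easy to follow. The other two layers would funnel into the same offline callback.

```python
from typing import Callable, Dict, List

HEARTBEAT_TTL_SECONDS = 60.0  # from the table: 60s heartbeat key


class LivenessMonitor:
    """In-memory stand-in for TTL-based liveness (illustrative only)."""

    def __init__(self, on_offline: Callable[[str], None],
                 ttl: float = HEARTBEAT_TTL_SECONDS) -> None:
        self._on_offline = on_offline
        self._ttl = ttl
        self._expires_at: Dict[str, float] = {}

    def heartbeat(self, workspace_id: str, now: float) -> None:
        # Each heartbeat pushes the key's expiry forward by one TTL.
        self._expires_at[workspace_id] = now + self._ttl

    def sweep(self, now: float) -> List[str]:
        # Report each workspace whose key has expired, exactly once.
        expired = [ws for ws, t in self._expires_at.items() if t <= now]
        for ws in expired:
            del self._expires_at[ws]
            self._on_offline(ws)
        return expired
```

With a 30s heartbeat interval and a 60s TTL, a workspace can miss one heartbeat without being marked offline; missing two in a row lets the key expire.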
Cascade Behavior
- Pause: Pausing a parent cascades to all children. Children of a paused parent cannot be individually resumed.
- Delete: Removes container, cleans memory (DB rows, Redis keys). Structure events and Agent Card history are never deleted.
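The pause rule can be sketched over a simple parent/child map. The `Workspace` dataclass and the `pause` and `resume` helpers are hypothetical, and the sketch assumes the cascade recurses through all descendants. Whether resuming a parent also resumes its children is not stated above, so it is left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Workspace:
    id: str
    parent_id: Optional[str] = None
    status: str = "online"
    children: List[str] = field(default_factory=list)


def pause(tree: Dict[str, Workspace], workspace_id: str) -> List[str]:
    """Pause a workspace and every descendant; return the paused ids."""
    paused = []
    stack = [workspace_id]
    while stack:
        ws = tree[stack.pop()]
        ws.status = "paused"
        paused.append(ws.id)
        stack.extend(ws.children)
    return paused


def resume(tree: Dict[str, Workspace], workspace_id: str) -> bool:
    """Resume a workspace unless its parent is still paused."""
    ws = tree[workspace_id]
    parent = tree.get(ws.parent_id) if ws.parent_id else None
    if parent is not None and parent.status == "paused":
        return False  # children of a paused parent cannot resume alone
    ws.status = "provisioning"  # per the state machine: paused → provisioning
    return True
```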
11. Provisioning & Container Lifecycle
Docker Networking
- All containers join the molecule-monorepo-net private network
- Container naming: ws-{workspace_id[:12]}
- Ephemeral host port binding: 127.0.0.1:0→8000/tcp
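The naming and port conventions above can be illustrated with the standard library alone. Both helper names are hypothetical. The second function only demonstrates what port 0 means on a loopback bind (the operating system picks a free port); Docker's own port allocation is not shown.

```python
import socket


def container_name(workspace_id: str) -> str:
    # Mirrors the ws-{workspace_id[:12]} convention: prefix plus the
    # first 12 characters of the workspace ID.
    return f"ws-{workspace_id[:12]}"


def pick_ephemeral_port(host: str = "127.0.0.1") -> int:
    # Binding to port 0 asks the OS for any free port on this host.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]
```

Binding to 127.0.0.1 rather than 0.0.0.0 keeps the mapped port reachable only from the host itself, which matches the host-local binding listed above.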
URL Resolution
| Caller | URL Type | Example |
|---|---|---|
| Workspace (container) | Docker-internal | http://ws-{id}:8000 |
| Canvas (browser) | Host-mapped | http://127.0.0.1:{ephemeral_port} |
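A resolver for the two rows of this table might look like the following sketch. The function name and the caller labels "workspace" and "canvas" are hypothetical, and it assumes the {id} in the Docker-internal URL is the same 12-character prefix used for the container name.

```python
from typing import Optional

A2A_CONTAINER_PORT = 8000  # container-side port from the binding above


def resolve_url(caller: str, workspace_id: str,
                ephemeral_port: Optional[int] = None) -> str:
    """Pick the URL form that is reachable from the given caller."""
    if caller == "workspace":
        # Containers share the private network and use the container name.
        return f"http://ws-{workspace_id[:12]}:{A2A_CONTAINER_PORT}"
    if caller == "canvas":
        # The browser runs on the host and needs the host-mapped port.
        if ephemeral_port is None:
            raise ValueError("canvas callers need the host-mapped port")
        return f"http://127.0.0.1:{ephemeral_port}"
    raise ValueError(f"unknown caller type: {caller!r}")
```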
Container Cleanup on Delete
- Docker container stopped and removed
- Memory cleaned (DB rows, Redis keys)
- Status set to removed
- WORKSPACE_REMOVED event written to structure_events
- Structure events and Agent Card history never deleted (audit trail)
12. Workspace Runtime
Entry Point: workspace/main.py
Startup Sequence (10 steps):
- Initialize telemetry (OpenTelemetry, no-op if packages absent)
- Load config.yaml into WorkspaceConfig dataclass
- Run preflight validation (model availability, skills, configs)
- Create HeartbeatLoop for background task tracking
- Resolve adapter from runtime field in config
- Run adapter setup() and create_executor()
- Build Agent Card from loaded skills + runtime capabilities
- Register: POST /registry/register with workspace ID + Agent Card
- Start heartbeat loop (30s interval) + skill hot-reload watcher
- Serve A2A over Uvicorn on configured port
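Steps 2, 3, 7, and 8 of the sequence above can be sketched as plain functions. This is not the code in workspace/main.py: the dataclass carries only a few fields, the preflight checks are simplified examples, and the payload and Agent Card field names are assumptions. The input is an already-parsed mapping, so no YAML library is needed to run it.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Runtime names as listed in the config.yaml schema comment.
KNOWN_RUNTIMES = {"claude-code", "langgraph", "autogen", "openclaw",
                  "hermes", "codex", "google-adk", "external"}


@dataclass
class WorkspaceConfig:
    name: str
    runtime: str
    model: str
    skills: List[str] = field(default_factory=list)


def load_config(raw: Dict[str, Any]) -> WorkspaceConfig:
    # Step 2: map parsed config.yaml content onto a dataclass.
    return WorkspaceConfig(
        name=raw["name"],
        runtime=raw["runtime"],
        model=raw["model"],
        skills=list(raw.get("skills", [])),
    )


def preflight(config: WorkspaceConfig) -> List[str]:
    # Step 3 (simplified): collect every problem instead of stopping early.
    problems = []
    if config.runtime not in KNOWN_RUNTIMES:
        problems.append(f"unknown runtime: {config.runtime}")
    if ":" not in config.model:
        problems.append("model must use provider:model syntax")
    return problems


def registration_payload(workspace_id: str,
                         config: WorkspaceConfig) -> Dict[str, Any]:
    # Steps 7-8: body for POST /registry/register (field names assumed).
    agent_card = {"name": config.name, "skills": config.skills}
    return {"workspace_id": workspace_id, "agent_card": agent_card}
```

Returning a list of problems from preflight lets a caller report every misconfiguration in one pass before the workspace attempts to register.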
Runtime Configuration Schema (config.yaml)
name: "Workspace Name"
description: ""
version: "1.0.0"
tier: 2 # 1=sandboxed, 2=standard, 3=privileged, 4=full-host
model: "anthropic:claude-sonnet-4-6" # provider:model syntax
runtime: "langgraph" # claude-code | langgraph | autogen | openclaw | hermes | codex | google-adk | external
runtime_config: # Runtime-specific settings
  command: "claude" # For CLI runtimes
  args: []
  auth_token_file: ".auth-token"
  timeout: 0
  model: "" # Override model just for this runtime
skills: ["skill1", "skill2"] # Folder names under skills/
tools: ["web_search", "filesystem"] # Built-in tool names
prompt_files: ["system-prompt.md"] # Additional prompt text files
shared_context: [] # Files from parent workspace
a2a:
  port: 8000
  streaming: true
  push_notifications: true
delegation:
  retry_attempts: 3
  retry_delay: 5.0
  timeout: 120.0
  escalate: true
sandbox:
  backend: "subprocess" # subprocess | docker
  memory_limit: "256m"
  timeout: 30
rbac:
  roles: ["operator"]
  allowed_actions: {}
hitl:
  channels:
    - type: "dashboard"
  default_timeout: 300
  bypass_roles: []
governance:
  enabled: false
  policy_mode: "audit" # audit | permissive | strict
  policy_file: ""
security_scan:
  mode: "warn" # warn | block | off
compliance:
  mode: "owasp_agentic"
  prompt_injection: "detect" # detect | block
  max_tool_calls_per_task: 50
  max_task_duration_seconds: 300
Seven Runtime Adapters
| Adapter | Core Strength | Image Tag |
|---|---|---|
| LangGraph | Graph-based state machine, tool use, streaming | workspace-template:langgraph |
| Claude Code | Native coding workflows, CLI continuity, OAuth auth | workspace-template:claude-code |
| AutoGen | Multi-agent conversations, explicit strategies | workspace-template:autogen |
| OpenClaw | CLI-native runtime, own session model | workspace-template:openclaw |
| Hermes | Stacked system messages, native tool calls, Kimi | workspace-template:hermes |
| Codex | OpenAI Codex CLI, OAuth/API/platform arms | workspace-template:codex |
| Google ADK | Gemini 2.5 Pro on Vertex AI, keyless ADC/WIF | workspace-template:google-adk |
Branch-level WIP: NemoClaw (NVIDIA T4 + Docker socket) on feat/nemoclaw-t4-docker.
Each adapter implements setup() + create_executor(). The base adapter provides shared infrastructure: system prompt assembly, skill loading, tool registration, coordinator detection, plugin injection.
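The adapter contract described above can be sketched with an abstract base class and a small registry keyed by runtime name. Only setup() and create_executor() come from the text; `BaseAdapter`, `ADAPTERS`, `register`, `resolve_adapter`, and the "echo" runtime are hypothetical, and the prompt assembly shown is a deliberately trivial placeholder for the shared infrastructure.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List, Type


class BaseAdapter(ABC):
    """Hypothetical base class with one piece of shared infrastructure."""

    def assemble_system_prompt(self, parts: List[str]) -> str:
        # Shared helper: join the non-empty prompt fragments.
        return "\n\n".join(p.strip() for p in parts if p.strip())

    @abstractmethod
    def setup(self) -> None:
        """Prepare runtime-specific state."""

    @abstractmethod
    def create_executor(self) -> Callable[[str], str]:
        """Return a callable that handles one task."""


ADAPTERS: Dict[str, Type[BaseAdapter]] = {}


def register(runtime: str):
    # Class decorator that files an adapter under its runtime name.
    def decorator(cls: Type[BaseAdapter]) -> Type[BaseAdapter]:
        ADAPTERS[runtime] = cls
        return cls
    return decorator


@register("echo")  # made-up runtime used only for this example
class EchoAdapter(BaseAdapter):
    def setup(self) -> None:
        self.prompt = self.assemble_system_prompt(["You are a demo agent.", ""])

    def create_executor(self) -> Callable[[str], str]:
        return lambda task: f"{self.prompt} Task: {task}"


def resolve_adapter(runtime: str) -> BaseAdapter:
    # Look up the adapter class for the config's runtime field.
    try:
        return ADAPTERS[runtime]()
    except KeyError:
        raise ValueError(f"no adapter registered for runtime {runtime!r}") from None
```

A caller would resolve the adapter from the runtime field, call setup(), and then hand tasks to the executor returned by create_executor().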