Workspace Lifecycle & Provisioning

Workspace state machine, provisioning and container lifecycle, and the workspace runtime.

Part of the Comprehensive Technical Documentation. Definitive reference based on a non-invasive scan of the molecule-core repository.

5. Workspace Lifecycle

State Machine

provisioning → online ↔ degraded
   ↓             ↓         ↓
 failed       offline    offline
   ↓             ↓
 retry      (auto-restart)

↓ (any state)
paused → (user resumes) → provisioning

↓ (any state)
removed
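
The diagram can be read as a transition table. The following is a minimal sketch, not the repository's implementation. It assumes that retry (from failed) and auto-restart (from offline) both lead back to provisioning, and that removed is terminal; the diagram does not state either. The names ALLOWED, FROM_ANY, and can_transition are hypothetical.

```python
# Transition table read off the state diagram.
# ASSUMPTION: "retry" and "auto-restart" re-enter "provisioning";
# the diagram does not name their target state.
ALLOWED = {
    "provisioning": {"online", "failed"},
    "online": {"degraded", "offline"},
    "degraded": {"online", "offline"},
    "failed": {"provisioning"},   # retry (assumed target)
    "offline": {"provisioning"},  # auto-restart (assumed target)
    "paused": {"provisioning"},   # user resumes
    "removed": set(),             # ASSUMPTION: terminal state
}

# States the diagram marks as reachable from any state.
FROM_ANY = {"paused", "removed"}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram permits moving from current to target."""
    if current not in ALLOWED or current == "removed":
        return False
    if target in FROM_ANY and target != current:
        return True
    return target in ALLOWED[current]
```

For example, can_transition("degraded", "paused") is allowed because paused is reachable from any state, while can_transition("online", "provisioning") is not.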

Status Definitions

Status	Meaning	Canvas Indicator
`provisioning`	Waiting for first heartbeat	Spinner
`online`	Heartbeat received, reachable	Green dot
`degraded`	Online but `error_rate ≥ 0.5`	Yellow node with warning
`offline`	Heartbeat TTL expired, unreachable	Gray node
`paused`	User paused, container stopped, config preserved	Indigo badge
`failed`	Provisioning timeout or launch error	Red node + retry button
`removed`	Deleted, kept for event log history	Node removed from Canvas
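
As a small illustration of how the three live statuses relate, the sketch below derives a status from heartbeat liveness and the error rate. The function name live_status and its parameters are hypothetical; how error_rate is measured is not specified in this section.

```python
DEGRADED_ERROR_RATE = 0.5  # threshold from the `degraded` row


def live_status(heartbeat_alive: bool, error_rate: float) -> str:
    """Map liveness and error rate to online, degraded, or offline."""
    if not heartbeat_alive:
        # Heartbeat TTL expired: the workspace is unreachable.
        return "offline"
    # The comparison is inclusive: exactly 0.5 already counts as degraded.
    return "degraded" if error_rate >= DEGRADED_ERROR_RATE else "online"
```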

Health Detection (Three Layers)

Layer	Mechanism	Interval	Trigger
Passive	Redis TTL expiry	60s TTL on heartbeat key	Liveness monitor callback
Proactive	Docker API poll	Every 15s	Health sweep goroutine
Reactive	A2A proxy connection error	On-demand	`provisioner.IsRunning()` check

All three layers call onWorkspaceOffline(), which broadcasts WORKSPACE_OFFLINE and triggers an auto-restart.
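
The passive layer can be sketched with an in-memory stand-in for the expiring heartbeat key. This is an illustration only: LivenessTracker, sweep, and the injected clock are hypothetical, and the real system relies on Redis key expiry rather than a polling loop. The auto-restart step is omitted.

```python
import time

HEARTBEAT_TTL_SECONDS = 60  # TTL from the Passive row


class LivenessTracker:
    """In-memory stand-in for the Redis heartbeat key (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock          # injectable so tests can control time
        self._expires_at = {}        # workspace_id -> expiry deadline
        self.offline_events = []     # recorded broadcasts

    def heartbeat(self, workspace_id: str) -> None:
        # Each heartbeat refreshes the TTL, like re-setting a key with expiry.
        self._expires_at[workspace_id] = self._clock() + HEARTBEAT_TTL_SECONDS

    def sweep(self) -> None:
        # Report every workspace whose TTL has lapsed, exactly once.
        now = self._clock()
        for workspace_id, deadline in list(self._expires_at.items()):
            if now >= deadline:
                del self._expires_at[workspace_id]
                self.on_workspace_offline(workspace_id)

    def on_workspace_offline(self, workspace_id: str) -> None:
        # Stand-in for onWorkspaceOffline(): only the broadcast is recorded.
        self.offline_events.append(("WORKSPACE_OFFLINE", workspace_id))
```

Removing the entry before calling the handler keeps the notification from firing twice for the same expiry, which matters when several detection layers can report the same workspace.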

Cascade Behavior

Pause: Pausing a parent cascades to all children. Children of a paused parent cannot be individually resumed.
Delete: Removes the container and cleans memory (DB rows, Redis keys). Structure events and Agent Card history are never deleted.
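
The pause rules can be sketched over a simple parent/child mapping. The data layout (children, parents, and status dictionaries) and the function names are assumptions made for illustration; the platform's actual storage model is not described here.

```python
def pause(workspace_id: str, children: dict, status: dict) -> None:
    """Pause a workspace and every descendant, depth-first."""
    status[workspace_id] = "paused"
    for child_id in children.get(workspace_id, []):
        pause(child_id, children, status)


def can_resume(workspace_id: str, parents: dict, status: dict) -> bool:
    """A child cannot be resumed while its parent is still paused."""
    parent_id = parents.get(workspace_id)
    return parent_id is None or status.get(parent_id) != "paused"
```

With a tree root → a → a1, pausing root marks all three as paused; can_resume("a1", ...) stays False until a has been resumed, which in turn requires root to be resumed first.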

11. Provisioning & Container Lifecycle

Docker Networking

All containers join the private molecule-monorepo-net network
Container naming: ws-{workspace_id[:12]}
Ephemeral host port binding: 127.0.0.1:0→8000/tcp

URL Resolution

Caller	URL Type	Example
Workspace (container)	Docker-internal	`http://ws-{id}:8000`
Canvas (browser)	Host-mapped	`http://127.0.0.1:{ephemeral_port}`
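
The naming and URL rules above reduce to a few string helpers. This is a sketch with hypothetical function names. It assumes the Docker-internal hostname is the container name (the truncated 12-character form); the table writes it only as ws-{id}.

```python
A2A_PORT = 8000  # container-side port from the port binding above


def container_name(workspace_id: str) -> str:
    """Container name: 'ws-' plus the first 12 characters of the ID."""
    return f"ws-{workspace_id[:12]}"


def internal_url(workspace_id: str) -> str:
    """URL used by other containers on the private network.

    ASSUMPTION: the hostname is the container name.
    """
    return f"http://{container_name(workspace_id)}:{A2A_PORT}"


def host_url(ephemeral_port: int) -> str:
    """URL used by the browser, via the loopback-bound ephemeral port."""
    return f"http://127.0.0.1:{ephemeral_port}"
```

The ephemeral port is chosen by Docker when the container starts (the 0 in 127.0.0.1:0), so host_url needs the port read back from the running container rather than a fixed value.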

Container Cleanup on Delete

Docker container stopped and removed
Memory cleaned (DB rows, Redis keys)
Status set to removed
WORKSPACE_REMOVED event written to structure_events
Structure events and Agent Card history never deleted (audit trail)
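
The order of the cleanup steps, and the fact that audit data survives, can be shown with in-memory stand-ins. Store and delete_workspace are hypothetical names; no real Docker, database, or Redis calls are made.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """In-memory stand-in for Docker, the database, and Redis (hypothetical)."""
    containers: set = field(default_factory=set)
    memory_rows: dict = field(default_factory=dict)
    redis_keys: dict = field(default_factory=dict)
    status: dict = field(default_factory=dict)
    structure_events: list = field(default_factory=list)


def delete_workspace(store: Store, workspace_id: str) -> None:
    # 1. Stop and remove the container.
    store.containers.discard(f"ws-{workspace_id[:12]}")
    # 2. Clean memory: DB rows and Redis keys.
    store.memory_rows.pop(workspace_id, None)
    store.redis_keys.pop(workspace_id, None)
    # 3. Mark the workspace as removed instead of deleting its record.
    store.status[workspace_id] = "removed"
    # 4. Append to the audit trail; earlier events are left untouched.
    store.structure_events.append(("WORKSPACE_REMOVED", workspace_id))
```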

12. Workspace Runtime

Entry Point: `workspace/main.py`

Startup Sequence (10 steps):

1. Initialize telemetry (OpenTelemetry, no-op if packages absent)
2. Load config.yaml into WorkspaceConfig dataclass
3. Run preflight validation (model availability, skills, configs)
4. Create HeartbeatLoop for background task tracking
5. Resolve adapter from runtime field in config
6. Run adapter setup() and create_executor()
7. Build Agent Card from loaded skills + runtime capabilities
8. Register: POST /registry/register with workspace ID + Agent Card
9. Start heartbeat loop (30s interval) + skill hot-reload watcher
10. Serve A2A over Uvicorn on configured port
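
The heartbeat step can be sketched as an asyncio background task. This is not the HeartbeatLoop class from the repository: the function heartbeat_loop, the send callback, and the choice to swallow send errors are all assumptions made for illustration.

```python
import asyncio

HEARTBEAT_INTERVAL_SECONDS = 30  # interval from the startup sequence


async def heartbeat_loop(send, interval=HEARTBEAT_INTERVAL_SECONDS, stop=None):
    """Call send() immediately, then once per interval until stop is set."""
    stop = stop or asyncio.Event()
    while not stop.is_set():
        try:
            await send()
        except Exception:
            # ASSUMPTION: a failed beat must not kill the loop;
            # the next tick simply tries again.
            pass
        try:
            # Sleep for one interval, but wake early if stop is set.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            continue
```

With the 60s TTL listed under Health Detection, a 30s interval leaves room for one missed beat before the key expires. Waiting on the stop event instead of a plain sleep lets shutdown proceed without waiting out the rest of the interval.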

Runtime Configuration Schema (`config.yaml`)

name: "Workspace Name"
description: ""
version: "1.0.0"
tier: 2                                    # 1=sandboxed, 2=standard, 3=privileged, 4=full-host
model: "anthropic:claude-sonnet-4-6"       # provider:model syntax
runtime: "langgraph"                       # claude-code | langgraph | autogen | openclaw | hermes | codex | google-adk | external
runtime_config:                            # Runtime-specific settings
  command: "claude"                        # For CLI runtimes
  args: []
  auth_token_file: ".auth-token"
  timeout: 0
  model: ""                                # Override model just for this runtime
skills: ["skill1", "skill2"]               # Folder names under skills/
tools: ["web_search", "filesystem"]        # Built-in tool names
prompt_files: ["system-prompt.md"]         # Additional prompt text files
shared_context: []                         # Files from parent workspace

a2a:
  port: 8000
  streaming: true
  push_notifications: true

delegation:
  retry_attempts: 3
  retry_delay: 5.0
  timeout: 120.0
  escalate: true

sandbox:
  backend: "subprocess"                    # subprocess | docker
  memory_limit: "256m"
  timeout: 30

rbac:
  roles: ["operator"]
  allowed_actions: {}

hitl:
  channels:
    - type: "dashboard"
  default_timeout: 300
  bypass_roles: []

governance:
  enabled: false
  policy_mode: "audit"                     # audit | permissive | strict
  policy_file: ""

security_scan:
  mode: "warn"                             # warn | block | off

compliance:
  mode: "owasp_agentic"
  prompt_injection: "detect"               # detect | block
  max_tool_calls_per_task: 50
  max_task_duration_seconds: 300
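
A few of the fields above have closed value sets that preflight validation could check. The sketch below is a partial, hypothetical model covering only name, model, runtime, and tier; it is not the repository's WorkspaceConfig, and the validation rules are assumptions drawn from the inline comments in the schema.

```python
from dataclasses import dataclass

# Allowed values copied from the schema comments.
RUNTIMES = {
    "claude-code", "langgraph", "autogen", "openclaw",
    "hermes", "codex", "google-adk", "external",
}
TIERS = {1: "sandboxed", 2: "standard", 3: "privileged", 4: "full-host"}


@dataclass
class WorkspaceConfig:
    """Partial, illustrative subset of the config.yaml schema."""
    name: str
    model: str
    runtime: str
    tier: int

    def __post_init__(self):
        if self.runtime not in RUNTIMES:
            raise ValueError(f"unknown runtime: {self.runtime!r}")
        if self.tier not in TIERS:
            raise ValueError(f"tier must be one of {sorted(TIERS)}")
        provider, sep, model_name = self.model.partition(":")
        if not (provider and sep and model_name):
            raise ValueError("model must use provider:model syntax")

    @property
    def provider(self) -> str:
        return self.model.partition(":")[0]

    @property
    def model_name(self) -> str:
        return self.model.partition(":")[2]
```

Given the example values from the schema, provider evaluates to "anthropic" and model_name to "claude-sonnet-4-6".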

Seven Runtime Adapters

Adapter	Core Strength	Image Tag
LangGraph	Graph-based state machine, tool use, streaming	`workspace-template:langgraph`
Claude Code	Native coding workflows, CLI continuity, OAuth auth	`workspace-template:claude-code`
AutoGen	Multi-agent conversations, explicit strategies	`workspace-template:autogen`
OpenClaw	CLI-native runtime, own session model	`workspace-template:openclaw`
Hermes	Stacked system messages, native tool calls, Kimi	`workspace-template:hermes`
Codex	OpenAI Codex CLI, OAuth/API/platform arms	`workspace-template:codex`
Google ADK	Gemini 2.5 Pro on Vertex AI, keyless ADC/WIF	`workspace-template:google-adk`

Branch-level WIP: NemoClaw (NVIDIA T4 + Docker socket) on feat/nemoclaw-t4-docker.

Each adapter implements setup() + create_executor(). The base adapter provides shared infrastructure: system prompt assembly, skill loading, tool registration, coordinator detection, plugin injection.
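
The adapter contract can be sketched as an abstract base class plus a lookup keyed by the runtime field. Only the method names setup() and create_executor() come from this document; BaseAdapter, EchoAdapter, ADAPTERS, resolve_adapter, and the method signatures are hypothetical.

```python
from abc import ABC, abstractmethod


class BaseAdapter(ABC):
    """Contract every runtime adapter fulfils (signatures are assumed)."""

    @abstractmethod
    def setup(self, config: dict) -> None:
        """Prepare the runtime from its configuration."""

    @abstractmethod
    def create_executor(self):
        """Return a callable that handles one incoming message."""


class EchoAdapter(BaseAdapter):
    """Toy adapter used only for illustration."""

    def setup(self, config: dict) -> None:
        self.prefix = config.get("prefix", "")

    def create_executor(self):
        return lambda message: f"{self.prefix}{message}"


# Hypothetical registry mapping the config's runtime value to an adapter class.
ADAPTERS = {"echo": EchoAdapter}


def resolve_adapter(runtime: str) -> BaseAdapter:
    """Instantiate the adapter registered for the given runtime name."""
    try:
        return ADAPTERS[runtime]()
    except KeyError:
        raise ValueError(f"unknown runtime: {runtime!r}") from None
```

A caller would resolve the adapter, call setup() with the runtime_config mapping, and then use the executor returned by create_executor() to serve requests.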
