Molecule AI

Hermes Runtime & Multi-Provider Dispatch

Hermes is Molecule AI's built-in inference router. Route tasks to Anthropic, Gemini, or any OpenAI-compatible model through native dispatch paths — with correct multi-turn history on all three.

Hermes is Molecule AI's built-in inference router powering runtime: hermes workspaces. It supports three dispatch paths — a native Anthropic Messages API path, a native Gemini generateContent path, and an OpenAI-compatible shim for 13+ other providers — keyed automatically by which API secret is present on the workspace.

Phases 2a, 2b, and 2c are fully merged to main:

  • Phase 2a (PR #240) — native Anthropic dispatch
  • Phase 2b (PR #255) — native Gemini dispatch with correct role: "model" + parts wire format
  • Phase 2c (PR #267) — correct multi-turn history preserved as turns (not flattened) on all three paths

Phase 2d (roadmap): tool_use / tool_result blocks, vision content, system instructions, and streaming on the native paths are scoped for a future release. See the capability table below.


Dispatch table

Hermes selects an inference path based on which API key is set on the workspace. Keys are resolved in priority order:

  1. HERMES_API_KEY
  2. OPENROUTER_API_KEY
  3. ANTHROPIC_API_KEY
  4. GEMINI_API_KEY

The first key found wins. Don't set HERMES_API_KEY if you want native Anthropic or Gemini dispatch — it takes priority and routes through the OpenAI-compat shim.
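The resolution order above can be sketched in a few lines of Python. This is an illustrative model of the behavior, not Hermes's actual internals — the function and path names are invented for the example:

```python
# Keys checked in priority order; first one present wins.
PRIORITY = ["HERMES_API_KEY", "OPENROUTER_API_KEY",
            "ANTHROPIC_API_KEY", "GEMINI_API_KEY"]

# Only these two keys select a native path; everything else
# (including HERMES_API_KEY) falls through to the compat shim.
NATIVE_PATHS = {
    "ANTHROPIC_API_KEY": "anthropic-native",
    "GEMINI_API_KEY": "gemini-native",
}

def resolve_dispatch(secrets: dict) -> str:
    """Return the dispatch path for the first key set on the workspace."""
    for key in PRIORITY:
        if secrets.get(key):
            return NATIVE_PATHS.get(key, "openai-compat")
    raise RuntimeError("no provider API key set on workspace")
```

Note how a workspace with both HERMES_API_KEY and ANTHROPIC_API_KEY resolves to the compat shim — exactly the pitfall described above.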

Key present                                  Dispatch path       Provider       Wire format
ANTHROPIC_API_KEY                            Native Anthropic    Anthropic      Messages API — {role, content}
GEMINI_API_KEY                               Native Gemini       Google         generateContent — {role: "model", parts: [{text}]}
OPENROUTER_API_KEY / HERMES_API_KEY / other  OpenAI-compat shim  13+ providers  OpenAI Chat Completions
None                                         Error               —              —

Fail-loud semantics: if ANTHROPIC_API_KEY is set but the anthropic Python package is not installed in the workspace image, Hermes raises a RuntimeError immediately — before any inference attempt. Same for google-genai. Silent fallback to the compat shim would mask format errors; Hermes fails loudly instead.
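The fail-loud check amounts to probing for the SDK before any request is built. A minimal sketch of this behavior, assuming a hypothetical `require_sdk` helper (the real check lives inside Hermes):

```python
import importlib.util

def require_sdk(module: str, key: str) -> None:
    """Raise immediately — before any inference attempt — if the SDK
    backing a native dispatch path is missing from the workspace image."""
    if importlib.util.find_spec(module) is None:
        raise RuntimeError(
            f"{key} is set but the '{module}' package is not installed "
            "in the workspace image"
        )

# e.g. require_sdk("anthropic", "ANTHROPIC_API_KEY")
#      require_sdk("google.genai", "GEMINI_API_KEY")
```

Failing here, rather than quietly falling back to the compat shim, is what surfaces image problems as a clear RuntimeError instead of subtle wire-format bugs.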


Secrets

Set provider keys as global or workspace-level secrets:

# Native Anthropic dispatch
curl -X PUT http://localhost:8080/settings/secrets \
  -H "Content-Type: application/json" \
  -d '{"key":"ANTHROPIC_API_KEY","value":"sk-ant-..."}'

# Native Gemini dispatch
curl -X PUT http://localhost:8080/settings/secrets \
  -H "Content-Type: application/json" \
  -d '{"key":"GEMINI_API_KEY","value":"YOUR-GEMINI-KEY"}'

# OpenAI-compat shim (OpenRouter, Groq, Mistral, etc.)
curl -X PUT http://localhost:8080/settings/secrets \
  -H "Content-Type: application/json" \
  -d '{"key":"OPENROUTER_API_KEY","value":"sk-or-..."}'

To force a specific workspace to use Gemini dispatch when a global ANTHROPIC_API_KEY is set, clear the key at the workspace level:

curl -X PUT http://localhost:8080/workspaces/$GEMINI_WS/secrets \
  -H "Content-Type: application/json" \
  -d '{"key":"ANTHROPIC_API_KEY","value":""}'

Quickstart

Native Anthropic dispatch

export MOLECULE_API=http://localhost:8080

# 1. Store your Anthropic key
curl -s -X PUT $MOLECULE_API/settings/secrets \
  -H "Content-Type: application/json" \
  -d '{"key":"ANTHROPIC_API_KEY","value":"sk-ant-YOUR-KEY"}' | jq .

# 2. Create a Hermes workspace
ANTHROPIC_WS=$(curl -s -X POST $MOLECULE_API/workspaces \
  -H "Content-Type: application/json" \
  -d '{
    "name": "hermes-anthropic",
    "role": "Inference worker — native Anthropic path",
    "runtime": "hermes",
    "model": "anthropic:claude-sonnet-4-5"
  }' | jq -r '.id')

# 3. Wait for ready
until curl -s $MOLECULE_API/workspaces/$ANTHROPIC_WS \
    | jq -r '.status' | grep -q ready; do sleep 5; done

# 4. Confirm dispatch path
curl -s -X POST $MOLECULE_API/workspaces/$ANTHROPIC_WS/a2a \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":"probe-1","method":"message/send",
    "params":{"message":{"role":"user","parts":[{"kind":"text",
    "text":"Which provider API are you calling to generate this response?"}]}}
  }' | jq '.result.parts[0].text'
# Expected: confirms Anthropic Messages API — no OpenAI-compat translation layer

Native Gemini dispatch

# 1. Store your Gemini key
curl -s -X PUT $MOLECULE_API/settings/secrets \
  -H "Content-Type: application/json" \
  -d '{"key":"GEMINI_API_KEY","value":"YOUR-GEMINI-KEY"}' | jq .

# 2. Create a Gemini workspace
GEMINI_WS=$(curl -s -X POST $MOLECULE_API/workspaces \
  -H "Content-Type: application/json" \
  -d '{
    "name": "hermes-gemini",
    "role": "Inference worker — native Gemini path",
    "runtime": "hermes",
    "model": "gemini:gemini-2.0-flash"
  }' | jq -r '.id')

# 3. Wait for ready
until curl -s $MOLECULE_API/workspaces/$GEMINI_WS \
    | jq -r '.status' | grep -q ready; do sleep 5; done

# 4. Confirm dispatch path
curl -s -X POST $MOLECULE_API/workspaces/$GEMINI_WS/a2a \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":"probe-2","method":"message/send",
    "params":{"message":{"role":"user","parts":[{"kind":"text",
    "text":"Which provider API are you calling?"}]}}
  }' | jq '.result.parts[0].text'
# Expected: confirms Google generateContent — role: "model" + parts[] wrapper used correctly

Multi-turn history (Phase 2c)

# Turn 1
curl -s -X POST $MOLECULE_API/workspaces/$ANTHROPIC_WS/a2a \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":"turn-1","method":"message/send",
    "params":{"message":{"role":"user","parts":[{"kind":"text",
    "text":"My name is Alice. Remember that."}]}}
  }' | jq '.result.parts[0].text'

# Turn 2 — history is threaded as turns, not flattened into a single blob
curl -s -X POST $MOLECULE_API/workspaces/$ANTHROPIC_WS/a2a \
  -H "Content-Type: application/json" \
  -d '{
    "jsonrpc":"2.0","id":"turn-2","method":"message/send",
    "params":{"message":{"role":"user","parts":[{"kind":"text",
    "text":"What is my name?"}]}}
  }' | jq '.result.parts[0].text'
# Expected: "Alice" — role attribution is preserved across turns

Before Phase 2c, multi-turn history was flattened into a single user blob. The model could often recover context from the text but lost clean role attribution, which caused failures on structured prompts. Phase 2c passes turns as turns: OpenAI and Anthropic use {role, content}; Gemini uses {role: "model", parts: [{text}]}.
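The two wire shapes can be illustrated with a small translation sketch (illustrative only — Hermes's internal history type is not shown here):

```python
def to_anthropic(turns):
    """OpenAI/Anthropic shape: each turn keeps its role, text is `content`."""
    return [{"role": t["role"], "content": t["text"]} for t in turns]

def to_gemini(turns):
    """Gemini shape: assistant turns become role "model",
    and text is wrapped in a parts[] array."""
    role_map = {"user": "user", "assistant": "model"}
    return [{"role": role_map[t["role"]], "parts": [{"text": t["text"]}]}
            for t in turns]
```

The key point of Phase 2c is that both functions receive a list of turns — role attribution survives intact rather than being concatenated into one user message.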


Multi-provider teams

An orchestrator can fan tasks to Anthropic and Gemini workers simultaneously, each routed through its native path — no application-level provider switching required:

# Fan out — both workers fire via delegate_task_async
curl -s -X POST $MOLECULE_API/workspaces/$ORCH_ID/a2a \
  -H "Content-Type: application/json" \
  -d "{
    \"jsonrpc\":\"2.0\",\"id\":\"fan-1\",\"method\":\"message/send\",
    \"params\":{\"message\":{\"role\":\"user\",\"parts\":[{\"kind\":\"text\",
    \"text\":\"delegate_task_async $ANTHROPIC_WS 'Draft release notes for v2.1' AND delegate_task_async $GEMINI_WS 'Summarise the last 30 days of support tickets'\"}]}}
  }" | jq .

Both workers receive correctly formatted messages through their native paths. No LiteLLM proxy layer. No format translation overhead on every request.


Capability table

Shipped (Phases 2a + 2b + 2c — all merged to main)

Capability                           OpenAI-compat shim               Anthropic native          Gemini native
Plain text, single-turn              ✅                               ✅                        ✅
Multi-turn history                   ⚠️ flattened into one user blob  ✅ role-attributed turns  ✅ role: "model" + parts wrapper
Correct Gemini wire format           ❌ wrong role, missing parts     —                         ✅
No compat-shim translation overhead  ❌ every request translated      ✅                        ✅

Roadmap — Phase 2d (not yet shipped)

Capability                     Anthropic native  Gemini native
tool_use / tool_result blocks  📋 Phase 2d       📋 Phase 2d
Vision content blocks          📋 Phase 2d       📋 Phase 2d
System instructions            📋 Phase 2d       📋 Phase 2d
Extended thinking              📋 Phase 2d       —
Streaming                      📋 Phase 2d       📋 Phase 2d

Troubleshooting

RuntimeError: anthropic is not installed

The anthropic Python package is missing from the workspace image. Add anthropic to requirements.txt in your custom image and rebuild, or use the standard molecule-ai-workspace-template-hermes image.

Gemini workspace getting Anthropic dispatch instead

A global ANTHROPIC_API_KEY is taking priority. Clear it at the workspace level:

curl -X PUT $MOLECULE_API/workspaces/$GEMINI_WS/secrets \
  -d '{"key":"ANTHROPIC_API_KEY","value":""}'

Multi-turn context lost between calls

Each workspace maintains its own history buffer. Ensure you are sending all turns of a conversation to the same workspace. A2A context_id scopes history within the workspace.
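The scoping rule can be pictured as a per-context buffer inside each workspace. A minimal sketch under that assumption (the class and method names are invented for illustration):

```python
from collections import defaultdict

class HistoryStore:
    """Illustrative model: one workspace holds independent history
    buffers, keyed by A2A context_id."""
    def __init__(self):
        self._buffers = defaultdict(list)

    def append(self, context_id: str, role: str, text: str) -> None:
        self._buffers[context_id].append({"role": role, "text": text})

    def turns(self, context_id: str) -> list:
        return list(self._buffers[context_id])
```

Sending turn 2 to a different workspace means a different store entirely — which is why context appears "lost".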

OpenAI-compat shim returns garbled Gemini output

If you are routing a Gemini model through a key that triggers the compat shim (e.g. OPENROUTER_API_KEY), you will see the old role/format translation issues. Switch to GEMINI_API_KEY for native dispatch.

