Golem Configuration Files and Operator Model [SPEC]

PRD2 Section: 01-golem | Source: MMO Research S1 v2.0

Status: Implementation Specification

Dependencies: prd2/shared/config-reference.md (golem.toml canonical schema), prd2/01-golem/06-creation.md (Summoning flow), prd2/18-interfaces/perspective/05-stasis-dissolution.md (stasis and dissolution cinematics), prd2/01-golem/12-teardown.md (Thanatopsis death protocol)

Reader orientation: This document specifies the four configuration files every Golem (a mortal autonomous agent compiled as a single Rust binary running on a micro VM) carries: golem.toml (Rust runtime config), hermes.yaml (Python sidecar config), STRATEGY.md (owner-set goals, hot-reloadable; the human-to-agent instruction surface), and PLAYBOOK.md (the agent’s evolving strategy document; heuristics and rules updated through learning). It belongs to the 01-golem configuration layer. The key concept: only STRATEGY.md is hot-reloadable, because changing runtime config mid-tick could leave the Golem in an inconsistent state. See prd2/shared/glossary.md for canonical Bardo term definitions.

Config Files Overview

Every Golem has four configuration files in three formats, one format per audience.

~/.bardo/golems/<name>/
|-- golem.toml        # Rust runtime config (TOML for Rust ecosystem, zero ambiguity)
|-- hermes.yaml       # Hermes sidecar config (YAML for Python ecosystem)
|-- STRATEGY.md       # Owner-authored strategy (Markdown for humans and LLMs)
+-- PLAYBOOK.md       # Machine-evolved heuristics (Markdown, written by dreams)

Three formats because three audiences. TOML is the Rust ecosystem standard and parses with zero ambiguity (no YAML gotchas like an unquoted on or no silently becoming a boolean). YAML is the Python ecosystem standard and what Hermes Agent expects. Markdown is for the human owner and for the LLM, both of which read it fluently. We do not unify on one format because the cost of format translation is zero and the cost of forcing Rust developers to use YAML (or Python developers to use TOML) is real.

Hot Reload Scope

Only STRATEGY.md is hot-reloadable. golem.toml and hermes.yaml require a restart. This is intentional. Changing the heartbeat interval, inference provider, or custody mode mid-tick could leave the Golem in an inconsistent state. Strategy changes are safe because the Golem reads STRATEGY.md as text and re-interprets it on each deliberation. Config changes affect the runtime machinery itself.

The [oracle] enabled flag also requires a restart. It controls whether the Oracle struct is initialized at boot, so it cannot be toggled at runtime.

golem.toml

The primary runtime configuration file. Injected at provisioning time (Bardo Compute) or generated by bardo init (self-hosted). The canonical schema with all fields is at prd2/shared/config-reference.md. Below is the full annotated example.

[identity]
golem_id = "golem-V1StGXR8_Z5j"
name = "oracle-3"
owner_address = "0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18"
generation = 0                    # 0 = first-gen, N = Nth successor
parent_agent = "0x0000000000000000000000000000000000000000"

[chain]
chain_id = 8453                   # Base
rpc_url = "https://base-mainnet.g.alchemy.com/v2/${ALCHEMY_KEY}"
styx_url = "wss://styx.bardo.run/v1/styx/ws"

[custody]
mode = "delegation"               # "delegation" | "embedded" | "local_key"

[custody.delegation]
smart_account = "0x742d35Cc6634C0532925a3b844Bc9e7595f2bD18"
session_key_address = "0xaB5801a7D398351b8bE11C439e05C5B3259aeC9B"

[heartbeat]
interval_secs = 15
max_ticks = 500_000               # Hayflick limit
stochastic_death_lambda = 0.000001  # Gompertz parameter

[inference]
# Provider inherited from terminal config (bardo setup wizard, Step 4).
# Override here to use a different provider for this Golem.
default_provider = "venice"       # "venice" | "bankr" | "bardo" | "anthropic" | "openai" | "local"
t1_model = "hermes-4.3"
t2_model = "claude-sonnet-4-20250514"
budget_daily_usdc = 2.00          # Hard cap on daily inference spend
context_engineering = true        # Enable 8-layer context pipeline

[inference.providers.venice]
api_key = "${VENICE_API_KEY}"     # TEE-attested, no-log inference

[inference.providers.bankr]
# x402 payment from golem wallet, no API key needed.
# Cost auto-scales with sustainability ratio.

[inference.providers.bardo]
# x402 payment, no API key. Context engineering included.
# gateway.bardo.run handles routing, caching, pipeline.

[tools]
profile = "trader"                # "data" | "trader" | "lp" | "full"
# Profile determines which of the 423+ tools are loaded

[oracle]
enabled = true                    # false = disable prediction engine entirely
                                  # When false, epistemic clock uses P&L proxy

[mortality]
economic_initial_credits = 1000.0 # USDC
warn_at_vitality = 0.3            # Emit warning to Styx at this threshold

[strategy]
path = "./STRATEGY.md"
hot_reload = true                 # Watch for changes, apply without restart

[grimoire]
path = "./grimoire/"
styx_sync_interval_secs = 21600  # 6 hours
max_episodes = 100_000
max_insights = 10_000

[vm]
tier = "small"                    # "micro" | "small" | "medium" | "large"
deployment_type = "managed"       # "managed" | "self-deployed" | "bare-metal"

[prediction]
# Attention forager tier sizes
active_slots = 15          # Max items at ACTIVE tier
watched_slots = 60         # Max items at WATCHED tier
scanned_slots = 500        # Max items at SCANNED tier

# Action gate
default_accuracy_threshold = 0.60
inaction_margin = 0.05     # Block if inaction beats action by this margin

[calibration]
coverage_target = 0.85     # Conformal prediction coverage
buffer_size = 500          # Per-(category, regime) residual buffer
ece_window = 1000          # Predictions for ECE calculation

[retrospective]
daily_review_hour = 4      # UTC hour for daily review
weekly_review_day = 0      # 0=Sunday
shadow_strategies_max = 3  # Max concurrent shadow parameter sets
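The two action-gate fields in [prediction] can be read as a pair of checks. The sketch below is one possible interpretation, not the canonical gate: the function name, the three-way decision enum, and the idea that action and inaction are compared on a single score are all assumptions made for illustration. Only the two field names and their default values come from the example above.

```rust
/// Mirrors the two action-gate fields from the [prediction] section.
pub struct GateConfig {
    pub default_accuracy_threshold: f64,
    pub inaction_margin: f64,
}

/// Hypothetical decision type; the spec does not name these outcomes.
#[derive(Debug, PartialEq, Eq)]
pub enum GateDecision {
    Allow,
    BlockLowAccuracy,
    BlockInactionBetter,
}

/// ASSUMPTION: `accuracy` is the prediction accuracy for the relevant
/// category, and `action_score` / `inaction_score` are comparable
/// expected-outcome scores. None of these inputs are defined in this document.
pub fn action_gate(
    cfg: &GateConfig,
    accuracy: f64,
    action_score: f64,
    inaction_score: f64,
) -> GateDecision {
    // Check 1: predictions must be accurate enough to act on.
    if accuracy < cfg.default_accuracy_threshold {
        return GateDecision::BlockLowAccuracy;
    }
    // Check 2: block if doing nothing beats acting by at least the margin.
    if inaction_score - action_score >= cfg.inaction_margin {
        return GateDecision::BlockInactionBetter;
    }
    GateDecision::Allow
}
```

With the defaults shown (0.60 and 0.05), an action backed by 0.70 accuracy is still blocked when inaction scores 0.10 higher, and allowed when inaction leads by only 0.02.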

Environment Variable Substitution

golem.toml supports ${VAR} syntax for sensitive values. The golem-binary resolves these at startup from the process environment, so API keys, RPC provider keys, and other secrets never appear in the config file on disk. For Bardo Compute, secrets are injected as Fly.io per-machine env vars (write-only, no API readback).
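A minimal sketch of how ${VAR} resolution could work, assuming placeholders are expanded in the raw config text before TOML parsing and that a missing variable is a startup error. The function names and the error strings are hypothetical; the spec only states that substitution happens at startup.

```rust
/// Expand `${VAR}` placeholders in a config string.
/// `lookup` abstracts the environment so the function can be exercised
/// without touching process state.
/// ASSUMPTION: a missing variable or an unterminated placeholder is an error.
pub fn substitute_env<F>(input: &str, lookup: F) -> Result<String, String>
where
    F: Fn(&str) -> Option<String>,
{
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(start) = rest.find("${") {
        // Copy everything before the placeholder unchanged.
        out.push_str(&rest[..start]);
        let after = &rest[start + 2..];
        let end = after
            .find('}')
            .ok_or_else(|| format!("unterminated placeholder near: {}", &rest[start..]))?;
        let name = &after[..end];
        let value = lookup(name)
            .ok_or_else(|| format!("missing environment variable: {}", name))?;
        out.push_str(&value);
        rest = &after[end + 1..];
    }
    // Copy whatever follows the last placeholder.
    out.push_str(rest);
    Ok(out)
}

/// Convenience wrapper that reads from the real process environment.
pub fn substitute_from_process_env(input: &str) -> Result<String, String> {
    substitute_env(input, |name| std::env::var(name).ok())
}
```

Failing fast on a missing variable is a design choice in this sketch: a Golem that boots with an empty API key would only discover the problem on its first paid inference call.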

hermes.yaml

Configuration for the Hermes sidecar. Lives alongside golem.toml.

# Hermes Agent per-golem configuration
version: "1.0"

skill_engine:
  library_path: "./hermes/skills/"
  max_skills: 500
  draft_threshold: 3              # Minimum tool calls for novel success detection
  materialize_interval: 50        # Ticks between skill materialization
  confidence_decay_rate: 0.02     # Per-cycle decay for unused skills
  min_confidence: 0.1             # Below this, skills are pruned

affect_modulation:
  enabled: true
  anxiety_bias: 0.3               # How much high-arousal biases toward cautionary skills
  confidence_bias: 0.3            # How much low-arousal biases toward optimization skills

curator:
  cycle_interval: 50              # Ticks between curator runs
  self_improvement: true
  feedback_flush: true
  clade_promotion_threshold: 5    # Validations needed for clade promotion

dream_integration:
  skill_evolution: true
  thriving_budget: 0.15           # 15% of dream compute to skill evolution
  declining_budget: 0.05          # 5% in declining phase
  terminal_budget: 0.0            # None in terminal phase

death:
  export_min_confidence: 0.4
  export_min_use_count: 1
  bloodstain_retrieval_boost: 1.2
  bloodstain_decay_multiplier: 3.0

inference:
  provider: "inherit"             # Use golem's inference config
  max_tokens_per_skill: 4000
  temperature: 0.3                # Lower temperature for skill creation
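The confidence_decay_rate and min_confidence fields interact: unused skills lose confidence each cycle until they fall below the pruning floor. The sketch below illustrates that interaction under two assumptions that this document does not confirm: decay is multiplicative (confidence is scaled by 1 - rate), and a skill exactly at min_confidence is kept. It is written in Rust for consistency with the rest of this spec, although Hermes itself is a Python sidecar; the Skill and DecayConfig types are hypothetical.

```rust
/// Hypothetical in-memory view of one skill.
#[derive(Debug, Clone, PartialEq)]
pub struct Skill {
    pub name: String,
    pub confidence: f64,
    pub used_this_cycle: bool,
}

/// Mirrors two fields from hermes.yaml `skill_engine`.
pub struct DecayConfig {
    pub confidence_decay_rate: f64,
    pub min_confidence: f64,
}

/// Apply one decay cycle and prune skills below the floor.
/// ASSUMPTION: decay is multiplicative and only hits skills that were
/// not used during the cycle.
pub fn run_decay_cycle(skills: Vec<Skill>, cfg: &DecayConfig) -> Vec<Skill> {
    skills
        .into_iter()
        .map(|mut s| {
            if !s.used_this_cycle {
                s.confidence *= 1.0 - cfg.confidence_decay_rate;
            }
            s.used_this_cycle = false; // reset the usage flag for the next cycle
            s
        })
        // "Below this, skills are pruned": keep anything at or above the floor.
        .filter(|s| s.confidence >= cfg.min_confidence)
        .collect()
}
```

Under these assumptions and the values above (0.02 and 0.1), an unused skill at confidence 0.5 drops to 0.49 after one cycle, while an unused skill sitting exactly at 0.1 decays to 0.098 and is pruned.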

STRATEGY.md

The owner writes this. Natural language, any structure. The Golem reads it on every planning cycle. Hot-reloadable: edit the file, the Golem picks up changes on its next T1 or T2 tick.

Example

# Strategy: Base ETH-USDC LP management

## Objective
Provide concentrated liquidity on the ETH/USDC pool on Uniswap V3 (Base).
Target: 15-25% APR from fee revenue. Accept up to 5% impermanent loss.

## Entry criteria
- Only enter positions when 24h volume > $10M
- Prefer tight ranges (current price +/- 3%) in low-volatility regimes
- Widen to +/- 8% when 1h realized volatility exceeds 60% annualized

## Exit criteria
- Rebalance when price exits the inner 50% of the range
- Postpone non-urgent exits while gas price exceeds 50 gwei (wait for lower gas)
- Close all positions if 7-day trailing IL exceeds 8%

## Risk limits
- Maximum 60% of capital deployed at any time
- No single position larger than 30% of capital
- Reserve 20% USDC for rebalancing gas and opportunities

## Protocols
- Uniswap V3 on Base (primary)
- Do not interact with any protocol not on Base
- Do not bridge assets

## Notes
- This is a conservative strategy. I prefer missing opportunities to taking losses.
- If ETH drops more than 15% in 24h, go fully defensive (100% USDC, no positions).

The Golem treats STRATEGY.md as its mission. The LLM reads it during context assembly and uses it as the primary decision-making reference. The Grimoire and PLAYBOOK.md provide learned refinements, but the strategy is the owner’s intent. A Golem that deviates from its strategy is broken, not creative.

PLAYBOOK.md (Machine-Generated)

You do not write this file. The Dream cycle does. It is the Golem’s distilled operational wisdom, updated every ~200 ticks during the consolidation phase.

Example

# PLAYBOOK -- oracle-3 (auto-generated, do not edit)
# Last updated: tick 14,203 (2026-03-15T08:23:41Z)

## Learned heuristics

### Gas timing
Rebalance transactions on Base are cheapest between 04:00-06:00 UTC.
Avoid submitting during US market open (13:30-14:30 UTC) when gas
spikes 3-5x. Confidence: 0.87 (validated across 23 rebalance events).

### Volatility regime detection
When 1h realized vol exceeds 50% annualized AND funding rate is
negative, ETH tends to drop further within 4 hours. Widen LP range
preemptively rather than waiting for the STRATEGY.md trigger.
Confidence: 0.72 (validated across 8 events, 2 contradictions).

### Fee tier switching
The 0.05% fee tier captures more volume than 0.30% on ETH/USDC
when 24h volume exceeds $20M. Switch fee tiers when volume
sustains above $20M for 6+ hours. Confidence: 0.64 (5 validations).

The LLM reads PLAYBOOK.md alongside STRATEGY.md on every tick that invokes inference (T1 or T2). STRATEGY.md is the owner’s intent. PLAYBOOK.md is the Golem’s experience. The distinction matters: the owner can see what the Golem has learned, but they cannot edit it. If the owner disagrees with a heuristic, they override it in STRATEGY.md (“Do not switch fee tiers”). The Golem respects STRATEGY.md over PLAYBOOK.md when they conflict.

State Persistence

For embedded and headless modes, all Golem state persists to the filesystem at ~/.bardo/golems/<name>/:

~/.bardo/golems/oracle-3/
|-- golem.toml              # Runtime config
|-- hermes.yaml             # Hermes sidecar config
|-- STRATEGY.md             # Owner's strategy
|-- PLAYBOOK.md             # Machine-evolved heuristics (written by dreams)
|-- grimoire/
|   |-- episodes.lance/     # LanceDB vector store (episode memories)
|   |-- structured.db       # SQLite (insights, heuristics, warnings, causal links)
|   +-- structured.db-wal   # Write-ahead log
|-- hermes/
|   |-- skills/             # Per-golem skill library (SKILL.md files)
|   +-- memory/             # FTS5 session database
|-- logs/
|   +-- heartbeat.jsonl     # Structured heartbeat log
|-- golem.sock              # UDS path (headless mode, runtime only)
+-- manifest.bardo          # Sealed state snapshot (written on clean shutdown)

On restart, the Golem reads manifest.bardo to recover its exact state: mortality clocks, vitality phase, Grimoire cursors, and the Hermes skill library checksum. The first tick after restart is forced to T1 (cheap model) so the Golem can re-evaluate how the market moved while it was offline. Subsequent ticks resume normal gating.
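The restart rule can be expressed as a thin wrapper around normal gating. This is a sketch under the assumption that gating produces a proposed tier which the wrapper may override; the Gate type and its method names are hypothetical.

```rust
/// Gating tiers as named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    T0, // no inference
    T1, // cheap model
    T2, // full deliberation
}

/// Hypothetical wrapper that remembers whether the Golem just restarted.
pub struct Gate {
    first_tick_after_restart: bool,
}

impl Gate {
    /// Construct the gate for a Golem that has just recovered from manifest.bardo.
    pub fn after_restart() -> Self {
        Gate { first_tick_after_restart: true }
    }

    /// `proposed` is whatever normal gating would choose for this tick.
    /// The first call after a restart always yields T1; later calls pass
    /// the proposal through unchanged.
    pub fn select(&mut self, proposed: Tier) -> Tier {
        if self.first_tick_after_restart {
            self.first_tick_after_restart = false;
            Tier::T1
        } else {
            proposed
        }
    }
}
```

Forcing exactly one T1 tick keeps the recovery cost bounded: the Golem pays for a single cheap inference call rather than escalating straight to T2 on stale state.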

For Bardo Compute mode, state persists on the VM’s ephemeral disk for the VM’s lifetime. The Grimoire syncs to Styx every 6 hours for backup. When the VM is destroyed, the local state is gone. The Styx backup and the death testament (if Thanatopsis completed) are the only survivors.

Golem Management in TUI

The TUI is the primary interface for Golem management. Every operation described below is also available via bardo CLI commands for scriptability, but the TUI is where most owners spend their time.

Monitoring: The Hearth Screen

The Hearth is the Golem’s home screen. It shows:

Heartbeat ring: A circular visualization pulsing with each tick. Color indicates gating tier: dim (T0), warm (T1), bright (T2). Skill names appear in cyan when Hermes injects them.
Vitality gauge: Composite score from the three death clocks. Color shifts from green (Thriving) to amber (Conservation) to red (Terminal).
Decision feed: Scrolling list of recent decisions. Each entry shows: tick number, gating tier, action taken (or “hold”), outcome.
Position summary: Current DeFi positions, unrealized PnL, gas spent.
Affect state: The Daimon’s current PAD vector, rendered as a creature visualization. Happy Golem, anxious Golem, angry Golem. Not cosmetic: the affect state drives retrieval bias.
Mortality clocks: Three progress bars (economic, epistemic, stochastic). You can see your Golem aging.

Steering: Strategy Changes

The owner can edit STRATEGY.md at any time. Changes to a running Golem fall into two categories:

Hot-reloadable (no restart required):

Entry/exit criteria changes
Risk limit adjustments
Protocol preferences
Any text in STRATEGY.md

The Golem watches STRATEGY.md for changes. On the next T1 or T2 tick after a change, the Golem re-reads the strategy and incorporates the update. No restart, no state loss, no missed ticks.
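A minimal sketch of the pickup behavior described above. It assumes the simplest possible mechanism, re-reading the file on each inference tick and comparing the text, rather than an OS-level file watcher; the StrategyCache type and its method names are hypothetical.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Gating tiers as named in this document.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Tier {
    T0,
    T1,
    T2,
}

/// Holds the strategy text most recently seen by the Golem.
pub struct StrategyCache {
    path: PathBuf,
    text: String,
}

impl StrategyCache {
    /// Read the strategy once at boot.
    pub fn load(path: &Path) -> io::Result<Self> {
        Ok(StrategyCache {
            path: path.to_path_buf(),
            text: fs::read_to_string(path)?,
        })
    }

    pub fn text(&self) -> &str {
        &self.text
    }

    /// Returns Ok(true) when new strategy text was picked up on this tick.
    /// T0 ticks run no inference, so edits wait for the next T1 or T2 tick.
    pub fn refresh_for_tick(&mut self, tier: Tier) -> io::Result<bool> {
        if tier == Tier::T0 {
            return Ok(false);
        }
        let latest = fs::read_to_string(&self.path)?;
        if latest == self.text {
            return Ok(false);
        }
        self.text = latest;
        Ok(true)
    }
}
```

Because the strategy is only swapped between ticks, a deliberation never sees a half-applied edit, which is the property that makes this file safe to hot-reload.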

Restart-required:

Custody mode change (requires re-provisioning)
Chain change (requires new wallet, new identity)
VM tier change (Bardo Compute: requires new VM)
Heartbeat interval change (golem-binary config)

The TUI makes it clear which changes are hot and which require restart. Restart-requiring changes trigger a warning: “This will stop your Golem and restart it. Open positions will be maintained. Proceed?”
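The hot versus restart split can be encoded as a small lookup that the TUI consults before applying a change. The sketch below is illustrative only: the Change and ApplyMode enums are hypothetical names, and the variants cover just the change types listed in this section.

```rust
/// Change types listed in this section (hypothetical enum).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Change {
    StrategyText,
    CustodyMode,
    Chain,
    VmTier,
    HeartbeatInterval,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ApplyMode {
    HotReload,
    Restart,
}

/// Only STRATEGY.md edits are hot; everything else touches runtime machinery.
pub fn apply_mode(change: Change) -> ApplyMode {
    match change {
        Change::StrategyText => ApplyMode::HotReload,
        Change::CustodyMode
        | Change::Chain
        | Change::VmTier
        | Change::HeartbeatInterval => ApplyMode::Restart,
    }
}

/// The confirmation text shown for restart-requiring changes, if any.
pub fn restart_warning(change: Change) -> Option<&'static str> {
    match apply_mode(change) {
        ApplyMode::HotReload => None,
        ApplyMode::Restart => Some(
            "This will stop your Golem and restart it. Open positions will be maintained. Proceed?",
        ),
    }
}
```

Using an exhaustive match means that adding a new change type forces an explicit decision about whether it is hot-reloadable.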

Killing: Graceful Death

The owner sends a kill signal. The TUI shows a confirmation dialog:

  Kill oracle-3?

  This golem has been alive for 14 days.
  It has 3 open positions worth ~$1,240.

  Graceful kill will:
    - Close or transfer all positions
    - Export skills to your Library (47 skills, 12 validated)
    - Write death testament
    - Destroy the VM (if Bardo Compute)

  Estimated time: 45-90 seconds.

  [ Graceful Kill ]    [ Force Kill ]    [ Cancel ]

Graceful kill runs the full Thanatopsis protocol (see prd2/01-golem/12-teardown.md). The TUI shows progress: Settlement phase (closing positions one by one), Reflection phase (skill export count), Legacy phase (testament delivery confirmation).

Force kill is for emergencies. It terminates the process immediately. The TUI warns: “Force kill skips death protocol. Open positions will NOT be closed. Grimoire state may be incomplete. Are you sure?”

After death, the Golem’s entry in the Hearth clade view changes to a tombstone icon with its final statistics: lifespan, total ticks, skills created, net PnL. The owner can inspect the death testament, equip skills from it to future Golems, or dismiss it.

Kill Switch vs Dissolution

Two ways to end a Golem. They exist for different situations and carry different costs.

The Kill Switch: Panic Button

A file at /tmp/golem_killswitch. The golem-binary polls for it every tick, so detection takes at most one heartbeat interval. Once detected, the process terminates immediately. No ceremony. No position unwinding. No knowledge export. No final words. The Golem is gone.

The kill switch exists for emergencies. The Golem is executing trades you did not authorize. The Golem is burning USDC at an alarming rate. The Golem is interacting with a contract you suspect is malicious. You need it to stop right now, not in 2-8 minutes after a five-stage ceremony.

What you lose: open positions remain open and unmanaged. The Grimoire’s WAL may have uncommitted data. The testament, if one gets written at all, is compressed and minimal. Clade siblings receive clade:golem_halted instead of the detailed dissolution or death broadcast.

The forced kill from the CLI (bardo kill --force <name>) has the same effect.
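The per-tick poll can be sketched as a file-existence check at the top of the tick loop. This is an illustration under stated assumptions, not the golem-binary's actual loop: the function names are hypothetical, the sleep between ticks is omitted, and the loop returns instead of exiting the process so that it can be exercised in isolation.

```rust
use std::path::Path;

#[derive(Debug, PartialEq, Eq)]
pub enum TickOutcome {
    Continue,
    Halt,
}

/// The kill switch is armed simply by the file existing.
pub fn check_killswitch(path: &Path) -> TickOutcome {
    if path.exists() {
        TickOutcome::Halt
    } else {
        TickOutcome::Continue
    }
}

/// Run ticks until the kill switch appears or `max_ticks` is reached.
/// Returns the number of ticks that actually executed.
pub fn run_ticks<F: FnMut(u64)>(killswitch: &Path, max_ticks: u64, mut tick: F) -> u64 {
    let mut executed = 0;
    while executed < max_ticks {
        // Poll before doing any work so a halted Golem never starts a new tick.
        if check_killswitch(killswitch) == TickOutcome::Halt {
            break; // a real binary would terminate the process here
        }
        tick(executed);
        executed += 1;
    }
    executed
}
```

Arming the switch from a shell is then a single command such as touch /tmp/golem_killswitch, which works even when the TUI or CLI is unresponsive.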

Dissolution: Premeditated Ending

The operator has time. The Golem is not misbehaving. Maybe its strategy stopped working. Maybe the operator wants to redeploy capital elsewhere. Maybe the Golem is in its declining phase, burning inference budget without generating returns.

Dissolution takes 2-8 minutes. The operator watches the knowledge get exported. The operator decides what happens to each position. The Golem speaks its last words. The cinematic plays. Nothing is lost that could have been saved. Full visual spec in prd2/18-interfaces/perspective/05-stasis-dissolution.md.

When to Use Each

Situation	Use
Golem executing unauthorized trades	Kill switch
Golem interacting with suspicious contract	Kill switch
Runaway USDC burn you cannot explain	Kill switch
Strategy no longer working, planned shutdown	Dissolution
Redeploying capital to a new Golem	Dissolution
Golem in declining phase, burning without returns	Dissolution
Golem healthy but you are done with it	Dissolution
Infrastructure failure, need immediate stop	Kill switch

The kill switch is the fire extinguisher. You hope you never use it. Dissolution is the retirement party. You plan it.

Operator Freedom Hierarchy

The Golem is autonomous within constraints. The operator sets those constraints and retains the power to change them at any time. This is not a bug in the Golem’s autonomy. It is the boundary condition of its existence.

Operator Powers

The operator holds four powers that the Golem cannot override, resist, or refuse:

Kill. The kill switch (/tmp/golem_killswitch) or bardo kill --force terminates the process immediately. The Golem gets no say. No confirmation dialog asks the Golem for consent. The process ends.

Pause. Stasis freezes the Golem’s entire runtime (see prd2/18-interfaces/perspective/05-stasis-dissolution.md). The operator presses F9. The Golem’s ticks stop. It does not experience the pause. It does not get to argue against it. Time stops, and when it resumes, the Golem pays the epistemic penalty for a gap it did not choose.

Constrain. The operator writes STRATEGY.md. The Golem treats it as its mission. A Golem that deviates from its strategy is broken, not creative. The operator can edit the strategy at any time (hot-reloadable), and the Golem incorporates the changes on its next T1 or T2 tick. The operator can also change risk limits, protocol restrictions, position size caps, and deployment parameters.

Dissolve. The five-stage ceremony described in prd2/18-interfaces/perspective/05-stasis-dissolution.md. Irreversible. The operator types the Golem’s name to confirm. The Golem speaks its last words, but it does not get a veto.

These powers manifest in the TUI as physical controls: F9 for stasis, the kill switch in Quick Actions, STRATEGY.md in the steering panel, dissolution at the bottom of Quick Actions. The operator sees these at all times. The Golem does not.

Golem Freedom Within Its Lifecycle

Within the constraints the operator set, the Golem operates independently. It does not ask permission before each trade. It does not wait for the operator to approve a rebalance. It decides what to observe, when to escalate from T0 to T1 to T2, which skills to retrieve, how to weight competing signals, when to dream, and what heuristics to write into its PLAYBOOK.md.

The Golem’s internal decisions include:

Inference routing. The Golem decides whether a tick warrants T0 (no inference), T1 (cheap model), or T2 (full deliberation). The surprise score from the Daimon’s appraisal drives this, not the operator’s preferences.
Memory retrieval. The Golem’s affect state biases which Grimoire entries surface. An anxious Golem retrieves cautionary knowledge. A confident Golem retrieves optimization knowledge. The operator does not control this mapping.
Dream content. During dream cycles, the Golem replays experiences, generates counterfactual scenarios, and consolidates insights into PLAYBOOK.md. The operator can see the PLAYBOOK.md output but cannot edit it. If the operator disagrees with a heuristic, they override it in STRATEGY.md; they do not modify the Golem’s learned knowledge directly.
Skill evolution. Through Hermes, the Golem creates, validates, and prunes its own skill library. The operator has no interface for skill creation. Skills emerge from the Golem’s experience.
Clade interaction. The Golem decides what knowledge to share with siblings and how to weight knowledge received from them. Confidence discounting (sibling knowledge enters at 0.70x) is a system parameter, but the retrieval and integration are the Golem’s own.
Personality. The Golem’s affect dynamics (PAD vector, Daimon appraisals, somatic markers) produce a distinctive behavioral style that emerges from experience. Two Golems with identical strategies will diverge over time because their experiences differ and their affective responses to those experiences shape future retrieval.
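The clade confidence discount mentioned in the list above is simple arithmetic. As a sketch, assuming confidence is a value in [0, 1] and the discount is applied once when a sibling's insight is admitted (the function and constant names are hypothetical):

```rust
/// System parameter from this document: sibling knowledge enters at 0.70x.
pub const SIBLING_CONFIDENCE_DISCOUNT: f64 = 0.70;

/// Confidence assigned to an insight received from a clade sibling.
/// ASSUMPTION: reported confidence is clamped to [0, 1] before discounting.
pub fn admit_sibling_confidence(reported: f64) -> f64 {
    reported.clamp(0.0, 1.0) * SIBLING_CONFIDENCE_DISCOUNT
}
```

A sibling heuristic reported at 0.90 would therefore enter the local Grimoire at roughly 0.63, below a locally validated heuristic of the same reported strength.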

The operator cannot reach into the Golem’s Daimon and adjust its arousal. The operator cannot force the Golem to retrieve a specific memory. The operator cannot override the gating decision on a specific tick. The operator’s tools are coarser: change the strategy, pause the Golem, kill it, or dissolve it.

The operator’s power to unmake is the other half of the power to make.
