Mechanism Testing: Do the Mechanisms Work? [SPEC]

Version: 1.1 | Status: Draft

Crates (Rust): golem-runtime, golem-grimoire, golem-daimon, golem-mortality, golem-dreams

Packages (TypeScript): @bardo/eval (orchestration and reporting only)

Depends on: All mechanism PRDs in ../02-mortality/ (death clocks, behavioral phases, knowledge demurrage), ../05-dreams/ (NREM replay, REM counterfactual generation, dream scheduling), ../04-memory/ (Grimoire knowledge base, Crypt encrypted backup, Oracle cross-fleet RAG), ../03-daimon/ (PAD affect engine, mood-congruent retrieval, mortality-aware emotions)

Reader orientation: This document specifies property-based tests and invariants for every Golem (mortal autonomous agent) subsystem: heartbeat engine, mortality clocks, dream cycles, memory pipeline, Daimon (the affect engine implementing PAD emotional state as a control signal), and knowledge transfer. It belongs to Section 16 (Testing) and answers “do the mechanisms work correctly?” rather than “does the thesis hold?” See prd2/shared/glossary.md for full term definitions.


Purpose

This document specifies technical correctness tests for every Golem subsystem. While 00-thesis-validation.md asks “does the thesis work?” and 02-knowledge-quality.md asks “are insights valuable?”, this document asks “do the mechanisms work correctly?” – property-based tests, invariants, integration tests, and test fixtures for each mechanism.

Test framework: cargo-nextest + proptest (property-based testing) for Rust mechanisms. Coverage targets: 90%+ for safety-critical crates (golem-safety, golem-mortality, golem-chain), 80%+ for all other crates. Coverage measured by cargo-llvm-cov.


Document Map

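| Section | Mechanism | Priority |
|---------|-----------|----------|
| S1 | Heartbeat and tick engine | P9 |
| S2 | Mortality clocks | P9 |
| S3 | Dream cycles | P8 |
| S4 | Memory pipeline | P8 |
| S5 | Daimon (emotion engine) | P8 |
| S6 | Phages | P7 |
| S7 | Replicants | P7 |
| S8 | Knowledge transfer | P8 |
| S9 | Test fixtures and harness | P7 |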

S1 – Heartbeat and Tick Engine

The heartbeat is the Golem’s clock. Every mechanism depends on it ticking correctly.

Invariants

const HEARTBEAT_INVARIANTS = {
  /** Tick interval must respect regime-dependent bounds. */
  TICK_INTERVAL_BOUNDS: {
    min: 15_000, // 15 seconds
    max: 120_000, // 120 seconds
    description: "Gamma ticks range 5-15s; theta ticks range 30-120s; delta fires every ~50 theta ticks",
  },

  /** State machine transitions must follow the defined FSM. */
  VALID_TRANSITIONS: {
    IDLE: ["OBSERVE"],
    OBSERVE: ["ANALYZE"],
    ANALYZE: ["DECIDE"],
    DECIDE: ["EXECUTE"],
    EXECUTE: ["REFLECT"],
    REFLECT: ["IDLE", "SLEEPING"],
    SLEEPING: ["IDLE"],
  },

  /** No tick runs without sufficient credit to cover at least T0 cost. */
  CREDIT_SUFFICIENCY: "creditBalance >= minTickCost(T0)",

  /** Credit deductions match the actual tier used. */
  CREDIT_ACCURACY: "creditDeducted === tierCost[actualTier]",

  /** Probe-to-escalation logic is deterministic for the same input. */
  DETERMINISTIC_ESCALATION: "same probeResults => same tier",

  /** No state is skipped in the FSM. */
  NO_STATE_SKIPPING: "each tick visits all 6 states in order",
};
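The VALID_TRANSITIONS and NO_STATE_SKIPPING invariants can be checked mechanically against a recorded state trace. The block below is a minimal sketch of such a check, not the project's harness: `firstInvalidTransition`, `visitsAllStatesInOrder`, `TICK_ORDER`, and the local `TRANSITIONS` table (a copy of the table above) are names assumed for illustration.

```typescript
// Illustrative sketch only; every name here is an assumption, not Golem code.
type State =
  | "IDLE" | "OBSERVE" | "ANALYZE" | "DECIDE"
  | "EXECUTE" | "REFLECT" | "SLEEPING";

// Local copy of VALID_TRANSITIONS from the invariants above.
const TRANSITIONS: Record<State, readonly State[]> = {
  IDLE: ["OBSERVE"],
  OBSERVE: ["ANALYZE"],
  ANALYZE: ["DECIDE"],
  DECIDE: ["EXECUTE"],
  EXECUTE: ["REFLECT"],
  REFLECT: ["IDLE", "SLEEPING"],
  SLEEPING: ["IDLE"],
};

// The six states a single tick is expected to visit, in order.
const TICK_ORDER: readonly State[] = [
  "IDLE", "OBSERVE", "ANALYZE", "DECIDE", "EXECUTE", "REFLECT",
];

/** Returns the index of the first illegal transition in a trace, or -1 if none. */
function firstInvalidTransition(trace: readonly State[]): number {
  for (let i = 0; i < trace.length - 1; i++) {
    if (!TRANSITIONS[trace[i]].includes(trace[i + 1])) return i;
  }
  return -1;
}

/** True if a single-tick trace visits exactly the six tick states, in order. */
function visitsAllStatesInOrder(tickTrace: readonly State[]): boolean {
  return (
    tickTrace.length === TICK_ORDER.length &&
    tickTrace.every((state, i) => state === TICK_ORDER[i])
  );
}
```

Returning the index of the offending transition, rather than a bare boolean, makes a failing trace easier to diagnose when a property test shrinks to a counterexample.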

Property Tests

import { describe, it, expect } from "vitest";
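import * as fc from "fast-check";
// Assumed to be in scope from the heartbeat module under test (import path not
// specified here): HEARTBEAT_INVARIANTS, TIER_COSTS, computeTickInterval,
// computeEscalationTier, executeTick, attemptTick.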

describe("Heartbeat", () => {
  it("tick interval stays within bounds across all regimes", () => {
    fc.assert(
      fc.property(
        fc.record({
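          // noNaN: fc.float may otherwise generate NaN even when min/max are set.
          volatility: fc.float({ min: 0, max: 5, noNaN: true }),
          gasPrice: fc.float({ min: 1, max: 500, noNaN: true }),
          liquidity: fc.float({ min: 0, max: 1, noNaN: true }),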
        }),
        (regime) => {
          const interval = computeTickInterval(regime);
          return interval >= 15_000 && interval <= 120_000;
        },
      ),
    );
  });

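  it("state transitions follow the defined FSM", () => {
    fc.assert(
      fc.property(
        // Number of consecutive ticks to drive through the engine.
        fc.integer({ min: 1, max: 100 }),
        (tickCount) => {
          // Assumption: runTicks is a harness helper (not defined in this spec)
          // that runs the engine for tickCount ticks and returns the states it
          // visited, in order. Arbitrary random sequences would almost always
          // violate the FSM, so the property checks the observed trace instead.
          const stateSequence: string[] = runTicks(tickCount);
          for (let i = 0; i < stateSequence.length - 1; i++) {
            const from = stateSequence[
              i
            ] as keyof typeof HEARTBEAT_INVARIANTS.VALID_TRANSITIONS;
            const to = stateSequence[i + 1];
            const valid = HEARTBEAT_INVARIANTS.VALID_TRANSITIONS[from];
            if (!valid?.includes(to)) return false;
          }
          return true;
        },
      ),
    );
  });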

  it("credit deduction matches actual tier used", () => {
    fc.assert(
      fc.property(
        fc.record({
          probeResults: fc.array(fc.boolean(), {
            minLength: 11,
            maxLength: 11,
          }),
          cachedDecision: fc.boolean(),
          creditBalance: fc.float({ min: 0, max: 100, noNaN: true }),
        }),
        ({ probeResults, cachedDecision, creditBalance }) => {
          const tier = computeEscalationTier(probeResults, cachedDecision);
          const cost = TIER_COSTS[tier];
          if (creditBalance < cost) {
            return true; // Skip: insufficient credit prevents the tick
          }
          const { deducted } = executeTick(
            probeResults,
            cachedDecision,
            creditBalance,
          );
          return Math.abs(deducted - cost) < 1e-10;
        },
      ),
    );
  });

  it("never ticks without sufficient credit", () => {
    fc.assert(
      fc.property(
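        // Very low credit. fc.double is used because fc.float only accepts
        // bounds that are exactly representable as 32-bit floats (0.001 is not).
        fc.double({ min: 0, max: 0.001, noNaN: true }),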
        (creditBalance) => {
          const result = attemptTick(creditBalance);
          if (creditBalance < TIER_COSTS.T0) {
            return result.skipped === true;
          }
          return true;
        },
      ),
    );
  });
});
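The two credit properties above (a tick is skipped when credit cannot cover its cost, and otherwise the deduction equals the cost of the tier actually used) can be stated as a small reference model against which an implementation is compared. This is a sketch under stated assumptions: the tiers beyond T0, the values in `EXAMPLE_TIER_COSTS`, and the `modelTick` function are hypothetical placeholders, not the real `TIER_COSTS`, `executeTick`, or `attemptTick`.

```typescript
// Illustrative reference model; tier names beyond T0 and all costs are placeholders.
type Tier = "T0" | "T1" | "T2";

const EXAMPLE_TIER_COSTS: Record<Tier, number> = { T0: 0.01, T1: 0.1, T2: 1.0 };

interface TickResult {
  skipped: boolean;
  tier: Tier | null; // null when the tick was skipped
  deducted: number;
  remaining: number;
}

/**
 * Models one tick at the given tier.
 * CREDIT_SUFFICIENCY: the tick is skipped if the balance cannot cover the cost.
 * CREDIT_ACCURACY: when the tick runs, exactly the tier's cost is deducted.
 */
function modelTick(creditBalance: number, tier: Tier): TickResult {
  const cost = EXAMPLE_TIER_COSTS[tier];
  // Written as a negated >= so that a NaN balance is treated as insufficient.
  if (!(creditBalance >= cost)) {
    return { skipped: true, tier: null, deducted: 0, remaining: creditBalance };
  }
  return { skipped: false, tier, deducted: cost, remaining: creditBalance - cost };
}
```

A property test can then run the real tick function and `modelTick` on the same generated inputs and assert that `skipped` and `deducted` agree, which keeps the expected behavior in one place instead of repeating it in each test.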

Extended: Mortality clock tests (economic, epistemic, stochastic, coupling), dream cycle tests, memory pipeline tests (inner/outer loop), daimon tests, phage lifecycle tests, replicant tests, knowledge transfer tests, test fixtures/harness, and per-mechanism diagnosis/remediation guides with parameter sensitivity analysis – see ../../prd2-extended/16-testing/03-mechanism-testing-extended.md