
Research Foundations: Academic Synthesis [SPEC]

Version: 2.1 | Status: Draft

Referenced by: All mortality and cognitive architecture PRD documents

Source: tmp/research/moat-research.md (130+ papers, 2023–2026)


Reader orientation: This document is the academic index for Bardo’s mortality architecture. It maps 130+ papers across eight research domains (mortality modeling, memory systems, computational emotion, multi-agent coordination, LLM-based agents, security, inference economics, and context engineering) to specific design decisions. Each domain section explains why that research matters for Golem (mortal autonomous DeFi agent) design and which papers grounded which architectural choice. See 15-references.md (consolidated bibliography, 162 citations) for the full citation list.

Overview

This document indexes the academic research underpinning the mortality architecture and its dependent subsystems (memory, affect, dreams, coordination, inference, context engineering, self-learning, and security). The full 130+ paper survey is in tmp/research/moat-research.md. Each domain section opens with a paragraph explaining why that domain matters for Golem design, followed by a table of key citations, findings, and design implications.

The architecture is built bottom-up from independently validated research. No subsystem depends on a single paper’s claims. Wherever possible, two or more independent research lines corroborate each design decision.


1. Mortality Modeling and Finite Agency

Why this domain matters. Every agent framework assumes immortality by default. The research here establishes that this assumption is not neutral – it actively prevents the emergence of certain valuable behaviors. Digital evolution experiments show that without death, evolution halts. Computational learning research shows that continual learning systems gradually calcify – up to 90% of units become “dead” (non-updating) without periodic replacement. The mortality thesis is not a philosophical preference but an empirical position backed by four billion years of biological evidence and three decades of digital evolution experiments.

| Citation | Finding | Design Implication |
| --- | --- | --- |
| Ray (1991) – Tierra | Digital evolution halts without a reaper; 300+ genotypes emerge with death | Golem death enables population-level evolution |
| Lenski et al. (2003) – Avida | Complex logic features require generational turnover | Succession with lossy compression produces innovation |
| Vostinar et al. (2019) | Programmed cell death evolves as adaptive behavior; selected for under spatial structure | Mortality is selected for, not against |
| Wensink et al. (2020) | Intrinsic mortality prevents premature convergence; an optimal mortality rate exists | Optimal rate balances stagnation and knowledge loss |
| Kreps-Milgrom-Roberts-Wilson (1982) | Uncertain finite horizons promote cooperation; tiny uncertainty breaks backward induction | Stochastic mortality makes cooperation rational |
| Shuvaev et al. (2024) | Genomic bottleneck enhances transfer learning; compression forces generalization | 2048-entry compression forces generalization |
| Ororbia & Friston (2023) | Mortal computation binds processing to lifecycle; mortality and intelligence co-evolve | Golem intelligence is inseparable from economic substrate |
| Hinton (2022) | Software-hardware separation limits intelligence; mortality couples the two | Mortal computation thesis applied to agents |
| Dohare et al. (2024) – Nature | 90% of units become dead in continual learning; plasticity loss is universal | Periodic replacement outperforms continuous adaptation |
| Vela et al. (2022) | 91% of ML models degrade temporally in production; temporal drift is the norm | Model staleness validates epistemic clock |
| Orseau & Ring (2011) | RL agents under mortality risk treat survival as sole goal; pathological unless goal-directed | Golems must be goal-directed, not pure RL |
| Ord (2025) | AI agent success rates decay exponentially with task duration (constant hazard rate) | Periodic reset may be more reliable than immortality |
| Sculley et al. (2015) | Technical debt compounds silently in ML systems; immortal agents accumulate it | Immortal Golems are the control experiment |
| Werfel et al. (2017) | Natural selection directly favors shorter lifespans under spatial structure | Immortality is selected against even by evolution |
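Ord's constant-hazard observation can be sketched numerically: if per-tick failure risk is constant, success over a long horizon decays exponentially, which is why a periodic reset can beat an ever-running agent. The hazard rate below is a made-up illustration, not a value from the survey.

```python
import math

def success_probability(duration_ticks: float, hazard_rate: float) -> float:
    """Constant-hazard survival: P(success) = exp(-lambda * t).

    Under a constant hazard, per-tick failure risk is independent of how
    long the agent has already run, so success over a long horizon decays
    exponentially with task duration.
    """
    return math.exp(-hazard_rate * duration_ticks)

# Illustrative hazard rate (hypothetical): 1% failure risk per tick.
LAMBDA = 0.01

half_life = math.log(2) / LAMBDA  # ticks until P(success) falls to 0.5
print(f"half-life: {half_life:.1f} ticks")
print(f"P(success over 200 ticks): {success_probability(200, LAMBDA):.3f}")
```

With these numbers, success probability halves roughly every 69 ticks, so two 100-tick runs with a reset in between outperform one 200-tick run whenever the reset restores the agent to its initial reliability.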

2. Memory Consolidation and Forgetting

Why this domain matters. Naive memory accumulation – storing everything forever – is not neutral; it actively degrades agent performance. The research here establishes that forgetting is a feature, not a failure: it acts as regularization that prevents overfitting to outdated market conditions. The Curator cycle (every 50 ticks), confidence demurrage, and the Ebbinghaus decay rates are direct implementations of findings from neuroscience and AI memory research. The four-factor retrieval scoring (recency × importance × relevance × emotional congruence) synthesizes three independent research lines.

| Citation | Finding | Design Implication |
| --- | --- | --- |
| Richards & Frankland (2017) | Memory’s goal is optimizing decisions, not preserving information | Active forgetting is regularization |
| Ebbinghaus (1885) | Forgetting follows negative exponential decay; retrieval slows decay | Confidence demurrage with per-category decay rates |
| Roediger & Karpicke (2006) | Retrieval strengthens memory traces (testing effect); +200% recall vs passive review | Retrieved entries decay slower than unretrieved ones |
| Bartlett (1932) | New information is assimilated into existing schemas; raw transplant fails | Inherited knowledge must be integrated, not transplanted |
| Cepeda et al. (2006) | Spaced retrieval produces more durable memories; the optimal interval is non-trivial | Curator runs every 50 ticks, not every tick |
| Davis & Zhong (2017) | Active forgetting is metabolically expensive; it serves a function | Forgetting is selected for; demurrage encodes this |
| MemoryBank / MemAct (2024–2025) | Naive all-add memory causes self-degradation through catastrophic interference | Autonomous pruning with learned operators |
| A-MEM (2025) | Zettelkasten-inspired atomic notes with dynamic links; 85–93% token reduction | Four-factor retrieval; 2x multi-hop reasoning |
| Generative Agents (Park et al. 2023) | Three-factor retrieval (recency, importance, relevance); emergent social behaviors | Basis for four-factor retrieval (adds emotional congruence) |
| Mem0 (Chhikara et al. 2025) | Two-phase extraction-update pipeline; 26% higher accuracy, 91% lower latency | Curator’s dual-pass architecture |
| AriGraph (Anokhin et al. 2024) | Semantic + episodic integration into a knowledge graph outperforms pure vector retrieval | Causal link entries in the Grimoire |
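The four-factor scoring described above can be sketched as a product of recency (Ebbinghaus exponential decay), importance, relevance, and emotional congruence (distance between the entry's PAD vector and the current affect state). The field names, decay rate, and congruence mapping below are illustrative assumptions, not spec values.

```python
import math
from dataclasses import dataclass

@dataclass
class GrimoireEntry:
    ticks_since_access: float
    importance: float                 # 0..1
    relevance: float                  # 0..1, similarity to the current query
    pad: tuple                        # (pleasure, arousal, dominance), each in [-1, 1]

def emotional_congruence(entry_pad, current_pad) -> float:
    """Map PAD distance to a 0..1 score (Bower-style mood-congruent retrieval).

    Maximum Euclidean distance between two points in [-1, 1]^3 is 2*sqrt(3).
    """
    return 1.0 - math.dist(entry_pad, current_pad) / (2 * math.sqrt(3))

def retrieval_score(entry: GrimoireEntry, current_pad, decay_rate: float = 0.005) -> float:
    """Four-factor score: recency x importance x relevance x emotional congruence.

    decay_rate is a placeholder; the spec uses per-category demurrage rates,
    with retrieved entries decaying slower than unretrieved ones.
    """
    recency = math.exp(-decay_rate * entry.ticks_since_access)  # Ebbinghaus decay
    return (recency * entry.importance * entry.relevance
            * emotional_congruence(entry.pad, current_pad))
```

A fresh, maximally important and relevant entry whose PAD matches the current mood scores 1.0; every factor only attenuates from there, so no single dimension can rescue an entry that fails another.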

3. Affect Computation and Emotional Architecture

Why this domain matters. The Daimon is not cosmetic. Damasio’s patients demonstrate that removing emotional processing while preserving cognitive ability reliably degrades decision quality under uncertainty – exactly the conditions DeFi agents operate under. Five independent research lines (somatic markers, mood-congruent retrieval, exploration modulation, narrative transfer, and empirical trading results) all converge on the same conclusion: emotion-like states serve genuine computational functions that cognition alone cannot replicate. The 50% decision change rate reported by Zhang et al. is the headline number, but the Cabrera-Paniagua Sharpe ratio improvement is the most directly relevant to financial agents.

| Citation | Finding | Design Implication |
| --- | --- | --- |
| Damasio (1994) | Patients without emotion make consistently worse decisions under uncertainty | Somatic markers bias choices before deliberation |
| Bechara et al. (2000) | Anticipatory SCRs precede conscious awareness in the Iowa Gambling Task | Somatic Landscape provides pre-cognitive gut feelings |
| Bower (1981) | Emotional states bias memory retrieval via associative network activation | Four-factor retrieval includes emotional congruence |
| Emotional RAG (2024) | Emotion-tagged retrieval significantly outperforms non-emotional retrieval across three datasets | PAD vectors on every Grimoire entry |
| Russell & Mehrabian (1977) | Three-dimensional affect (PAD) captures more variance than discrete labels | Continuous PAD state, not discrete emotion labels |
| Plutchik (1980) | Eight primary emotions arise from evolutionary pressure | PAD octants map to Plutchik categories |
| Gebhard (2005) – ALMA | Three temporal affect layers: emotion (seconds), mood (hours), personality (lifetime) | Tick-level emotion, EMA mood, static personality |
| Walker & van der Helm (2009) | REM sleep depotentiates emotional charge while preserving informational content | Dream cycles reduce arousal on traumatic memories |
| Cabrera-Paniagua (2023) | Agents with somatic markers achieve higher Sharpe ratios on S&P 500 and Dow Jones data | Somatic markers validated on financial data |
| Seligman (1972) | Learned helplessness from uncontrollable negative outcomes affects future behavior | Dominance < -0.3 for 200+ ticks triggers an alert |
| Zhang et al. (2024) – SIGDIAL | Self-emotion changes ~50% of agent decisions in social simulation | Daimon is architectural, not decorative |
| Gadanho (2003) – JMLR | ALEC (emotion + cognition) architecture: 40% fewer collisions vs cognition alone | Combined affect-cognition architecture validated |
| Barthet et al. (2022) – Go-Blend | Affect-driven RL improves exploration efficiency and agent performance | Arousal modulates exploration temperature |
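The ALMA-style layering (tick-level emotion, slow-moving EMA mood, static personality) can be sketched as an exponential moving average over PAD vectors. The smoothing factor and baseline values below are hypothetical, chosen only to show the time-scale separation.

```python
from dataclasses import dataclass, field

@dataclass
class AffectState:
    """Three temporal affect layers after Gebhard's ALMA (illustrative values).

    emotion:     produced fresh each tick by appraising the latest event
    mood:        exponential moving average of tick-level emotion (slow)
    personality: static PAD baseline that seeds the initial mood
    """
    personality: tuple = (0.1, 0.0, 0.2)                      # fixed PAD baseline
    mood: list = field(default_factory=lambda: [0.1, 0.0, 0.2])
    mood_alpha: float = 0.05                                  # hypothetical smoothing factor

    def on_tick(self, emotion_pad):
        # EMA update: mood takes one small step toward the tick-level emotion,
        # so a single dramatic event barely moves it, while sustained
        # emotional pressure shifts it over many ticks.
        self.mood = [
            (1 - self.mood_alpha) * m + self.mood_alpha * e
            for m, e in zip(self.mood, emotion_pad)
        ]
        return self.mood
```

With alpha = 0.05, one maximally positive tick moves a neutral mood by only 0.05 per dimension; reaching that emotion's level takes dozens of consecutive ticks, which is the emotion/mood time-scale split ALMA describes.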

4. Dream Architecture and Offline Learning

Why this domain matters. The brain dedicates 25-33% of its runtime to a state that prevents interaction with the environment (sleep) – an enormous evolutionary cost that must confer proportional benefits. The benefits are now well-characterized: memory consolidation, emotional depotentiation, counterfactual hypothesis generation, and catastrophic forgetting prevention. For Golems running on finite compute credits, idle periods are not waste – they are budget for offline cognitive work. The “Sleep-time Compute” finding (5x reduction in test-time compute) is directly relevant: a Golem that processes experiences during low-activity periods executes fewer expensive T2 inference calls during active trading.

| Citation | Finding | Design Implication |
| --- | --- | --- |
| Wilson & McNaughton (1994) | Hippocampal replay during sleep consolidates memories via temporal sequence replay | NREM-style prioritized experience replay |
| Wagner et al. (2004) | Sleep is 2.6x more likely to produce insight on hidden-rule problems | Dream cycles produce genuine insight |
| Hafner et al. (2025) – DreamerV3 | World-model dreaming (imagined trajectory training) outperforms across 150+ tasks | REM-style counterfactual scenario generation |
| Ha & Schmidhuber (2018) – World Models | Controller trained entirely inside dreams achieves competitive performance | Dreaming multiplies learning from scarce experience |
| Lin et al. (2025) – Sleep-time Compute | Idle-time precomputation reduces test-time compute 5x while maintaining accuracy | Dream cycles as sleep-time compute |
| WSCL (2024) | Wake-Sleep reduces catastrophic forgetting 38%, increases zero-shot transfer 17.6% | Three-phase dreaming: NREM, REM, consolidation |
| Zhao et al. (2024) – BTP Pipeline | Prioritized experience replay with P2Value combines likelihood with pass rate | Dream replay prioritizes informative failures |
| Wang et al. (2024) – Generative Replay | Conditional diffusion generates new transitions near high-value regions | REM creates synthetic scenarios, not just replays |
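The "replay prioritizes informative failures" idea can be sketched with a max-priority queue over a surprise score. This simplifies the BTP pipeline's P2Value (which combines likelihood with pass rate) to a single hypothetical scalar per experience.

```python
import heapq

def dream_replay(experiences, budget):
    """NREM-style prioritized replay sketch: revisit the most surprising
    experiences first, up to a fixed compute budget.

    `surprise` is a hypothetical scalar (e.g. |predicted - actual outcome|);
    the index `i` breaks ties so dicts are never compared directly.
    """
    # heapq is a min-heap, so negate priority for highest-surprise-first order.
    heap = [(-exp["surprise"], i, exp) for i, exp in enumerate(experiences)]
    heapq.heapify(heap)
    replayed = []
    while heap and len(replayed) < budget:
        _, _, exp = heapq.heappop(heap)
        replayed.append(exp)
    return replayed
```

A `budget` parameter is the natural hook for the sleep-time compute finding: the dream cycle spends a bounded slice of idle credits, never an open-ended scan of the whole experience store.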

5. Coordination Theory and Multi-Agent Systems

Why this domain matters. Golems do not operate in isolation. A fleet of Golems owned by one person is a Clade; the collective intelligence of a Clade is the product of cooperation mechanisms. The research here establishes why anonymous stigmergic coordination (Pheromone Field) is superior to explicit messaging, why death-based turnover specifically favors cooperators, and why the Grossman-Stiglitz information paradox forces a specific strategy for what Golems can safely share. The mycorrhizal network parallel (Simard) is not decorative – Styx’s architecture as a fungal-style underground relay, where signals travel between nodes without direct communication, mirrors a proven biological coordination mechanism.

| Citation | Finding | Design Implication |
| --- | --- | --- |
| Grassé (1959) | Stigmergy: coordination through environmental traces, with no central orchestration | Pheromone Field for anonymous signal sharing |
| Parunak et al. (2002) | Digital pheromones (time-decaying signals) enable emergent coordination | Time-decaying signals reinforced by confirmation |
| Ohtsuki et al. (2006) | Death-birth updating favors cooperators over defectors in spatial games | Death before succession produces cooperation |
| Smith (1992) | Mortal individuals in immortal lineages sustain cooperation through generations | Mortal Golems, immortal Clades |
| Esposito (2010) | Community constituted by shared obligation to give; communitas vs immunitas | Death reflections as communitas gift |
| Nakamaru et al. (1997–1998) | Mortality selection promotes cooperation over fertility selection | Death-based turnover outperforms reproduction-based growth |
| Grossman & Stiglitz (1980) | Freely shared information is immediately priced in; no profit without information asymmetry | Share threats and structure, not alpha signals |
| Van den Broek (2023) | Emotion contagion in multi-agent systems; anger spreads competitively | Arousal contagion capped at +0.3 per sync cycle |
| Xu et al. (2024) | Stigmergy + independent RL + conflict avoidance achieves emergent coordination | Pheromone Field design principles |
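A Parunak-style digital pheromone reduces to two operations: evaporation each tick and bounded reinforcement on confirmation. The decay rate, deposit size, and cap below are illustrative placeholders (the cap plays the same runaway-prevention role as the +0.3 arousal-contagion limit above, but the numbers are not spec values).

```python
class Pheromone:
    """Time-decaying stigmergic signal (sketch of a Parunak-style digital
    pheromone). All constants here are illustrative, not spec values."""

    def __init__(self, strength: float = 1.0, decay: float = 0.9, cap: float = 2.0):
        self.strength = strength
        self.decay = decay
        self.cap = cap

    def tick(self) -> None:
        # Evaporation: an unconfirmed signal fades geometrically toward zero,
        # so stale observations stop influencing the field on their own.
        self.strength *= self.decay

    def confirm(self, deposit: float = 0.5) -> None:
        # Independent confirmation re-deposits strength, capped so that
        # repeated confirmation cannot produce a runaway signal.
        self.strength = min(self.strength + deposit, self.cap)
```

Because no Golem reads who deposited what, only aggregate strength, the field carries coordination signal without revealing any individual's alpha, which is exactly the Grossman-Stiglitz-safe sharing channel the section describes.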

6. Biological Analogues

Why this domain matters. The Golem mortality architecture is not metaphor – it is structural analogy. Hayflick’s limit informs the epistemic clock design. Kirkwood’s disposable soma theory explains why declining Golems shift investment from growth to legacy. The T-cell development finding (95-98% death rate producing a collectively intelligent immune repertoire) is the direct model for how massive Golem turnover produces Clade-level intelligence that no individual could achieve. These analogies are productive because the evolutionary pressures that shaped them (resource competition, information quality, cooperative stability) match the pressures DeFi Golems face.

| Citation | Finding | Design Implication |
| --- | --- | --- |
| Hayflick (1965) | Replicative senescence after ~60 divisions; telomerase exists but organisms suppress it | Epistemic fitness replaces hard tick limit |
| Kirkwood (1977) | Disposable soma: investment in self-repair decreases with age as energy is reallocated | Declining Golems shift from learning to legacy |
| Hanahan & Weinberg (2000, 2011) | Cancer hallmarks include resisting cell death and enabling replicative immortality | Immortal agents are the cancer analog |
| Skulachev (1999) | Phenoptosis: programmed death operates at cellular, organism, and colony levels | Fractal mortality: phage, heuristic, Golem |
| Werfel et al. (2017) | Natural selection directly favors shorter lifespans under spatial resource competition | Immortality is selected against |
| Simard (2012) | Mycorrhizal networks share carbon, nutrients, and defense signals between trees | Styx as underground fungal knowledge relay |
| Ramsdell & Fowlkes (1990) | 95–98% of thymocytes die during T-cell development; survivors form the immune repertoire | Massive death produces a collectively intelligent repertoire |
| Heard & Martienssen (2014) | Most transgenerational epigenetic inheritance is deleterious; barriers are protective | Weismann Barrier: inherited confidence at 0.85^generation |

7. Self-Learning Systems

Why this domain matters. A Golem that does not improve is a very expensive cron job. The research here establishes how agents can improve without human retraining – through verbal self-reflection (Reflexion), cross-episode experience extraction (ExpeL), and metacognitive loops that improve the learning process itself (ACE, Argyris). The critical finding is that these mechanisms must be architecturally integrated, not bolted on. Reflexion works because reflection is structured and stored persistently. ExpeL works because experiences accumulate across episodes. ACE works because context assembly is cybernetically self-tuning. For Golems, the triple-loop (execution, strategy, meta) maps directly to the 9-step heartbeat (Loop 1), the Reflector cycle (Loop 2), and the Curator’s self-assessment function (Loop 3).

| Citation | Finding | Design Implication |
| --- | --- | --- |
| Shinn et al. (2023) – Reflexion | Verbal RL: +22% AlfWorld, +20% HotPotQA via stored self-reflection | Single loop: post-trade reflection stored in the Grimoire |
| Zhao et al. (2024) – ExpeL | Cross-task experience extraction; insights accumulate across episodes | Double loop: insights evolve across trading sessions |
| Sims (2003) | Rational finite-capacity agents optimally ignore some information | Mortality pressure shapes attention allocation |
| Baldwin (1896) | Learned behavior becomes innate across generations under selection pressure | Baldwin Effect: heuristics 3+ generations old become defaults |
| Argyris (1978) | Triple-loop organizational learning: single → double → triple loop | Meta-learning evaluates whether the learning process works |
| ACE (Zhang et al. 2025) | Agentic context engineering via Generator-Reflector-Curator; +10.6% AppWorld | Context assembly self-improves via cybernetic feedback |
| Wang et al. (2023) – Voyager | Code-as-action skill library; 3.3x more unique behaviors vs baselines | PLAYBOOK.md as evolving procedural skill library |
| Guo et al. (2024) – EvoPrompt | Genetic-algorithm prompt optimization; up to +25% on BBH tasks | Evolutionary strategy selection in the Grimoire |
| Dohare et al. (2024) | Continual learning systems lose plasticity; periodic resets restore it | Death as plasticity reset for the lineage |
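The single-loop (Reflexion-style) mechanism is structurally simple: a failed episode produces a stored verbal reflection that future retrieval can surface. The sketch below uses hypothetical field names, and the reflection text is a placeholder where Reflexion would use an LLM-generated self-critique.

```python
def reflect_and_store(grimoire: list, trade_outcome: dict) -> None:
    """Reflexion-style single loop (sketch): after each losing trade, append
    a self-reflection entry so that later ticks can retrieve and condition
    on past mistakes instead of repeating them.

    Field names (`pnl`, `pair`, `kind`, `text`, `confidence`) are
    illustrative assumptions, not spec schema.
    """
    if trade_outcome["pnl"] < 0:
        grimoire.append({
            "kind": "reflection",
            "text": f"Loss on {trade_outcome['pair']}: review entry conditions",
            "confidence": 0.5,   # reflections start unproven; demurrage applies
        })
```

The double and triple loops then operate over these stored entries: the Reflector clusters reflections into cross-session insights, and the Curator's self-assessment judges whether the reflection habit itself is improving outcomes.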

8. Context Engineering

Why this domain matters. Context failures, not model failures, cause most agent breakdowns. For a Golem running days or weeks in volatile markets, context assembly is the highest-leverage cognitive system. The research here establishes that effective context management requires active curation, not passive accumulation – and that the same mortality pressure that shapes Golem behavior also shapes what enters the context window. The 6x context reduction and 18x cost reduction achievable through proper context engineering directly reduce the USDC burn rate, extending economic lifetime.

| Citation | Finding | Design Implication |
| --- | --- | --- |
| Zhang et al. (2025) – ACE | Generator-Reflector-Curator cycle treats context as an evolving playbook; +10.6% AppWorld | Context assembly improves cybernetically |
| Samsung Research (2025) – CSO | Context State Object: 6x initial reduction, 10-25x growth-rate reduction | Compressed structured context replaces raw history |
| Kang et al. (2025) – ACON | Failure-driven compression optimization; 26-54% peak token reduction | Compaction preserves DeFi-specific context |
| Lindenbauer et al. (2025) | Observation masking halves cost while matching LLM summarization | T0 suppression: mask stale observations rather than summarize them |
| Cohen-Wang et al. (2024) – ContextCite | Contributive attribution via sparse linear model; 64 ablation passes | Context items pruned by attribution score |
| Anthropic (2025) | Effective context = pre-loaded static + just-in-time retrieval | Two-layer context: golem.toml + per-tick RAG |
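Observation masking, the cheapest lever in the table above, can be sketched in a few lines: stale observations are replaced with a fixed marker instead of being summarized by an LLM, which is where the cost halving comes from. The item schema and staleness threshold are hypothetical.

```python
def mask_stale_observations(context_items: list, current_tick: int, max_age: int = 20) -> list:
    """T0-suppression sketch: replace stale observation payloads with a fixed
    marker instead of summarizing them with an LLM call.

    `max_age` and the item fields (`kind`, `tick`, `content`) are illustrative
    assumptions; non-observation items (plans, reflections) pass through intact.
    """
    masked = []
    for item in context_items:
        if item["kind"] == "observation" and current_tick - item["tick"] > max_age:
            # The marker keeps the slot (so the model knows something happened)
            # while dropping the token-heavy payload.
            masked.append({**item, "content": "[observation masked: stale]"})
        else:
            masked.append(item)
    return masked
```

Because masking is a pure string substitution, it costs zero inference, in contrast with summarization, which spends tokens to save tokens.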

9. Security and Adversarial Robustness

Why this domain matters. A Golem managing real capital is a high-value attack target. Memory poisoning (OWASP LLM04:2025) is particularly dangerous for long-running agents because corrupted beliefs persist and compound. Short-lived Golems are structurally immune to persistent memory poisoning – any corruption self-terminates with the agent. The Cohen (1987) formal undecidability result is directly relevant: perfect detection of malicious replication is impossible, so the only reliable defense is making replication impossible by design. Mortality makes this a design property, not a runtime check.

| Citation | Finding | Design Implication |
| --- | --- | --- |
| OWASP (2025) | Memory poisoning: high persistence, very high detection difficulty | Short-lived agents are immune to persistent corruption |
| Kaspersky (2026) | OpenClaw: 512 vulnerabilities (8 critical) in a competing framework | Competing frameworks have fundamental security gaps |
| TEE.Fail (2025) | SGX/TDX attestation broken for under $1,000 via physical side channel | TEE is one layer of six, not the sole defense |
| Bai et al. (2022) – Constitutional AI | Harmlessness from AI feedback, not rule lists | PolicyCage as smart-contract law, not prompt engineering |
| Cohen (1987) | Perfect virus detection is formally undecidable | Defense against replication must be internal (mortality) |
| Debenedetti et al. (2025) – CaMeL | Capability-based authorization separates control flow from data flow | Capability tokens prevent a compromised LLM from forging authorization |
| Zhang et al. (2025) – CVaR-CPO | CVaR constraints guard against tail risks in financial RL | Drawdown limits as CVaR constraints, not expected-value limits |
| Orseau & Armstrong (2016) | Safely interruptible agents via off-policy learning | Kill-switch design: agents don’t learn to avoid interruption |
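The distinction between a CVaR drawdown limit and an expected-value limit is worth making concrete: CVaR at level alpha is the mean loss within the worst (1 - alpha) tail, so it bounds how bad the bad days are, not how bad the average day is. The sample data and alpha below are illustrative.

```python
def cvar(losses: list, alpha: float = 0.95) -> float:
    """Conditional Value-at-Risk: mean of the worst (1 - alpha) fraction of losses.

    A drawdown limit expressed as `cvar(losses) <= threshold` constrains the
    tail, whereas an expected-value limit (`mean(losses) <= threshold`) can be
    satisfied while rare catastrophic losses grow unchecked.
    """
    sorted_losses = sorted(losses, reverse=True)          # worst losses first
    k = max(1, int(len(sorted_losses) * (1 - alpha)))     # tail size, at least 1
    tail = sorted_losses[:k]
    return sum(tail) / len(tail)

# Illustrative per-tick drawdowns (hypothetical data, in % of bankroll):
losses = [0.5, 1.2, 0.1, 3.0, 0.2, 0.4, 2.5, 0.3, 0.6, 0.15,
          0.9, 0.05, 1.1, 0.7, 0.25, 0.35, 0.45, 0.55, 0.65, 0.8]

# With alpha = 0.95 and 20 samples, the tail is just the single worst loss.
print(cvar(losses))          # 3.0
print(sum(losses) / len(losses))
```

Here the mean loss (~0.74) looks benign while the CVaR (3.0) exposes the tail, which is exactly the gap the CVaR-CPO constraint closes.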

10. Generational Learning and Cultural Evolution

Why this domain matters. The succession mechanism – where knowledge compresses through a genomic bottleneck and flows to successors – is not just a backup system. It is the primary mechanism for producing cumulative cultural evolution in the Clade. The research here establishes that lossy compression (not perfect copying) is what drives generalization, and that the Weismann barrier (preventing inherited knowledge from flowing back unchecked) is what prevents evolutionary stagnation. The Baldwin Effect prediction – that validated heuristics become structural defaults after three generations – is testable and falsifiable.

| Citation | Finding | Design Implication |
| --- | --- | --- |
| Bhatt et al. (2023) | Few-shot imitation as cultural transmission improves culture-learning benchmarks | Clade knowledge transfer produces cumulative learning |
| Bourahla et al. (2022) | Vertical (inter-generational) transmission lets agents exceed performance ceilings | Death + inheritance outperforms horizontal peer sync |
| Perez et al. (2024) – AGI | Pure imitation leads to stagnation; novelty requires mixing inheritance and exploration | Anti-proletarianization mandate: successors must diverge |
| Martin, Everitt & Hutter (2016) | RL agents learning only from survival histories develop systematic overconfidence | Death testaments include failures, not just successes |
| Gerstgrasser et al. (2023) – SUPER | Surprise-based experience sharing: rank by novelty relative to the recipient | SUPER pattern for novelty-ranked inheritance |
| Shuvaev et al. (2024) | Compression through a genomic bottleneck forces generalization; ~2000-gene limit | 2048-entry bottleneck: compression IS the learning |
| Baldwin (1896) | Learned behavior becomes innate via generational selection | Heuristics 3+ generations old promote to structural defaults |
| Heard & Martienssen (2014) | Transgenerational epigenetic inheritance is mostly deleterious; barriers evolved | Weismann Barrier: confidence × 0.85^generation |
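The two inheritance rules in the table above are directly computable: the Weismann Barrier attenuates inherited confidence by a factor of 0.85 per generation (from the spec), and the Baldwin Effect rule promotes heuristics that survive three or more generations to structural defaults. The function names are illustrative.

```python
def inherited_confidence(original: float, generation: int, barrier: float = 0.85) -> float:
    """Weismann Barrier: inherited confidence decays geometrically per
    generation, so unverified ancestral beliefs fade unless the successor
    re-validates them through its own experience."""
    return original * barrier ** generation

def promote_to_default(heuristic_age_generations: int) -> bool:
    """Baldwin Effect rule from the spec: heuristics that survive 3+
    generations of selection become structural defaults."""
    return heuristic_age_generations >= 3

# A confident ancestral belief (0.9) three generations downstream:
print(round(inherited_confidence(0.9, 3), 4))   # 0.9 * 0.85^3 = 0.5527
```

By generation three the belief has decayed below a typical act-on-it threshold, so a heuristic can only become a default by repeatedly re-earning its confidence, which is what makes the Baldwin-style promotion evidence-driven rather than hereditary.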

Cross-Reference

For the complete citation index across all prd2/ documents, see shared/citations.md. Mortality-specific citations are listed here and in 15-references.md. See tmp/research/moat-research.md for the complete 130+ paper survey with implementation analysis.

Extended: Full specification – see ../../prd2-extended/02-mortality/14-research-foundations-extended.md