11 – Privacy and trust [SPEC]
Three inference security classes, Venice private cognition, DIEM staking, cryptographic audit trail, strategy-aware redaction, cache encryption, gateway signing, secret-keeping architecture
Related: 07-safety.md (PII detection, prompt injection defense, and audit logging), 09-api.md (API reference with privacy request extensions), 02-caching.md (three-layer cache stack with encryption at rest), 08-observability.md (per-agent cost attribution and OTEL traces), 12-providers.md (Venice private cognition plane with TEE attestation), prd2-extended/10-safety/02-warden.md (optional Warden time-delay proxy for high-value transactions)
Reader orientation: This document specifies the privacy and trust architecture of Bardo Inference (the LLM inference gateway for mortal autonomous DeFi agents called Golems). It belongs to the inference plane and covers three security classes for inference requests, Venice private cognition with TEE attestation, DIEM staking for zero-cost private inference, a cryptographic audit trail with hash chains and Merkle tree anchoring, strategy-aware redaction, and cache encryption. The key concept is that DeFi agents have unique confidentiality requirements because their inference calls expose trading intent, and strategy leakage is economically equivalent to front-running. For term definitions, see
prd2/shared/glossary.md.
1. Three inference security classes
Every inference request falls into one of three security classes. The class determines provider routing, redaction level, and audit depth.
| Class | Data retention | Use cases | Providers |
|---|---|---|---|
| Standard | Provider retains prompts for training/abuse monitoring | Routine analysis, market commentary, public strategy reasoning | BlockRun, OpenRouter, Bankr, Direct Key |
| Confidential | Provider retains for billing/audit but not training | Portfolio-specific analysis, risk assessment with position data | OpenRouter (select no-train models), BlockRun (x402, no account) |
| Private | Zero retention. Provider cannot reconstruct what was asked. | Treasury management, deal negotiation, governance voting, MEV-sensitive execution, death reflection | Venice only |
The Private class enables agent behaviors that are structurally impossible with standard inference:
Confidential treasury reasoning. A Golem (a mortal autonomous DeFi agent managed by the Bardo runtime) managing a $500K vault reasons about rebalancing strategy. With standard inference, the provider observes portfolio composition, risk parameters, and timing intent. An adversary with provider access (subpoena, breach, insider) reconstructs the position and front-runs the rebalance. With Venice, the reasoning vanishes after response delivery.
Private deal negotiation. Two Golems negotiate a cross-vault strategy allocation. Each Golem reasons privately about its negotiation strategy (reserve price, walk-away conditions) using Venice. Only structured offers/counter-offers transmit between agents. The reasoning that produced the offer is structurally unrecoverable.
MEV-resistant execution planning. When a Golem plans a large swap, execution timing is MEV-sensitive. The planning phase – which route, what time, what slippage tolerance – runs on Venice. A sandwich bot monitoring inference providers sees nothing.
Security-class classifier
The bardo-context Pi extension tags requests based on content sensitivity. Classification is deterministic (no LLM call):
```rust
// crates/bardo-safety/src/security_class.rs
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum SecurityLevel { Standard, Confidential, Private }

#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum SecurityTrigger {
    PortfolioComposition,   // Request contains specific asset amounts
    RebalanceTiming,        // Request discusses when to execute
    DealNegotiation,        // Inter-agent commercial discussion
    GovernanceDeliberation, // Proposal analysis with position exposure
    MevSensitive,           // Execution timing that could be front-run
    CounterpartyAnalysis,   // Evaluating another agent's behavior
    DeathReflection,        // Terminal phase -- most honest, most sensitive
    OwnerPii,               // Owner-identifying information in context
}

pub fn classify_security_class(
    context: &ContextBundle,
    phase: BehavioralPhase,
) -> InferenceSecurityClass {
    let mut triggers = Vec::new();

    // Portfolio composition: positions above $1,000
    if context.defi_snapshot.positions.iter().any(|p| p.value_usd > 1000.0) {
        triggers.push(SecurityTrigger::PortfolioComposition);
    }
    // Rebalance timing: pending swap or rebalance actions
    if context.tool_state.pending_actions.iter().any(|a| {
        matches!(a.action_type, ActionType::Swap | ActionType::Rebalance)
    }) {
        triggers.push(SecurityTrigger::RebalanceTiming);
    }
    // Death reflection: always private
    if phase == BehavioralPhase::Terminal {
        triggers.push(SecurityTrigger::DeathReflection);
    }
    // MEV-sensitive: any pending execution above $500
    if context.tool_state.pending_actions.iter().any(|a| {
        a.estimated_value_usd > 500.0
    }) {
        triggers.push(SecurityTrigger::MevSensitive);
    }

    if !triggers.is_empty() {
        return InferenceSecurityClass {
            class: SecurityLevel::Private,
            reason: format!(
                "Contains {}",
                triggers.iter().map(|t| format!("{:?}", t)).collect::<Vec<_>>().join(", ")
            ),
            triggers,
        };
    }
    InferenceSecurityClass {
        class: SecurityLevel::Standard,
        reason: "No sensitive content detected".to_string(),
        triggers: vec![],
    }
}
```
When class resolves to Private, the router hard-filters to Venice. If Venice is not configured, the gateway returns HTTP 503 with a clear error rather than falling back to a retaining provider.
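The hard-filter rule can be sketched as a standalone example. The function name `route_providers` and the provider identifiers are illustrative, not the router's actual API; the behavior (Venice-only for Private, error instead of fallback) follows the rule above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SecurityLevel { Standard, Confidential, Private }

/// Candidate provider set per security class. `Err` stands in for the
/// HTTP 503 the gateway returns when Private is requested without Venice.
fn route_providers(
    class: SecurityLevel,
    venice_configured: bool,
) -> Result<Vec<&'static str>, &'static str> {
    match class {
        SecurityLevel::Standard => Ok(vec!["blockrun", "openrouter", "bankr", "direct_key"]),
        SecurityLevel::Confidential => Ok(vec!["openrouter_no_train", "blockrun"]),
        SecurityLevel::Private if venice_configured => Ok(vec!["venice"]),
        // Never fall back to a retaining provider for Private-class requests.
        SecurityLevel::Private => Err("503: Private class requires Venice"),
    }
}

fn main() {
    assert_eq!(route_providers(SecurityLevel::Private, true).unwrap(), vec!["venice"]);
    assert!(route_providers(SecurityLevel::Private, false).is_err());
}
```

The key property is that the Private arm has no fallback path: misconfiguration degrades to unavailability, never to silent exposure.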
2. Venice: zero-data-retention private cognition
Venice provides the only structural privacy guarantee in the system. Not a policy promise. Not a terms-of-service checkbox. The inference logs do not exist.
Why Venice is not “just another provider”
Adding Venice is not about breadth – Bardo already has 400+ models via OpenRouter. Venice adds a fundamentally new inference plane: the Private class. Every other provider retains some data (prompts, billing records, safety logs). Venice retains nothing.
Venice provider configuration
```rust
// crates/bardo-providers/src/venice.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VeniceProviderConfig {
    pub base_url: String,           // "https://api.venice.ai/api/v1"
    pub api_key: String,
    pub models: VeniceModelMapping,
    pub venice_parameters: Option<VeniceParameters>,
    pub daily_cap_usd: f64,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VeniceModelMapping {
    /// Fast, cheap, routine private analysis
    pub t1: String,                 // "llama-3.3-70b"
    /// Deep reasoning with visible <think> tags
    pub t2_reasoning: String,       // "deepseek-ai-DeepSeek-R1"
    /// Frontier, 128K context, tool use
    pub t2_general: String,         // "zai-org-glm-4.7"
}

#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VeniceParameters {
    pub enable_web_search: Option<WebSearchMode>,
    /// Controls whether R1's <think> tags appear in responses.
    /// False for dreams and death reflection (preserve full chain).
    /// True for routine operations (smaller, faster).
    pub strip_thinking_response: Option<bool>,
}
```
Private subsystem routing
| Subsystem | Always private? | Reason |
|---|---|---|
| Dream cycles | Yes | Subconscious is unobservable. Counterfactual scenarios, threat simulations. |
| Death reflection | Yes | Most honest reasoning. Unrealized hypotheses, strategy fragments. |
| Daimon (emotional appraisal) | Yes | Emotional state = known vulnerability. Anxiety signals exploitable. |
| Deal negotiation | Yes | Reserve prices, walk-away conditions. Material non-public information. |
| Governance deliberation | Yes | Vote intent + position exposure = insider trading intelligence. |
| Execution planning (>$500) | Yes | Timing and routing = front-running opportunity. |
| Routine heartbeat | No | Low-value, no position data. |
| Operator conversation | Depends | Private if portfolio data in context. |
Venice + DeepSeek R1: private visible reasoning
Venice is the only backend where DeepSeek R1’s visible <think> tags combine with zero data retention. This enables private visible reasoning – the Golem sees and stores the full reasoning chain locally while the provider forgets it. No other backend combination provides this. Death reflection uses this path: R1’s raw reasoning process (uncertainty, contradictions, abandoned hypotheses) is captured in the Grimoire. Venice forgets the conversation.
Venice vision: private chart analysis
Venice serves Qwen 2.5 VL 72B for private visual analysis. Use cases:
- Order book heatmap analysis for hidden liquidity walls
- Governance vote distribution charts for whale detection
- TVL charts for regime shift identification before numerical data reflects them
The visual analysis runs on Venice. The provider never sees what chart the Golem analyzed or what it concluded.
3. DIEM staking: zero-cost private inference
Venice’s tokenomics create a unique possibility: a Golem that funds its own private inference through token staking, achieving zero-marginal-cost reasoning.
Mechanism
- Owner stakes VVV (Venice’s native token on Base)
- Staked VVV earns pro-rata daily DIEM allocation
- Each DIEM = $1/day of Venice API credit, perpetually
- Golem consumes DIEM for private inference – no per-request payment
- Excess DIEM can be traded or allocated to successor Golems
```rust
// crates/bardo-providers/src/venice_staking.rs
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct VeniceStakingConfig {
    pub vvv_token: Address,           // VVV contract on Base
    pub staked_vvv: U256,             // Amount staked by owner
    pub daily_diem_allocation: f64,   // Computed from pro-rata stake share
    pub diem_consumed_today: f64,
    pub dream_reserve_fraction: f64,  // Default: 0.15
}
```
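The pro-rata computation behind `daily_diem_allocation` can be sketched as follows. The total daily emission figure and the numbers in the example are hypothetical; Venice's actual emission schedule comes from its tokenomics, not from this sketch.

```rust
/// Pro-rata DIEM allocation: the owner's share of the total daily emission.
/// All inputs are illustrative; real values come from the VVV staking contract.
fn daily_diem_allocation(staked_vvv: f64, total_staked_vvv: f64, total_daily_emission: f64) -> f64 {
    if total_staked_vvv <= 0.0 {
        return 0.0;
    }
    total_daily_emission * (staked_vvv / total_staked_vvv)
}

fn main() {
    // Hypothetical: owner stakes 1,000 VVV of 10,000,000 total; 100,000 DIEM emitted daily.
    let diem = daily_diem_allocation(1_000.0, 10_000_000.0, 100_000.0);
    // 10 DIEM = $10/day of Venice API credit under the $1/DIEM/day rule above.
    assert!((diem - 10.0).abs() < 1e-9);
}
```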
Mortality integration: DIEM as life extension
In the standard mortality model, inference costs drain the LLM credit partition (60% of total budget). DIEM staking decouples inference cost from mortality pressure:
| Inference source | Cost model | Mortality impact |
|---|---|---|
| BlockRun (x402) | Per-request USDC micropayment | Drains LLM partition, shortens lifespan |
| Venice (DIEM) | Zero marginal cost from staked VVV | No drain on LLM partition |
A Golem routing 50% of its inference to Venice saves roughly $0.06-0.10/day on a $0.20/day burn rate. By the lifespan formula below, that stretches remaining credits to between 1.4x and 2x their original runway: over a 30-day baseline lifespan, roughly 13-30 additional days, purchased through the owner's VVV stake rather than through more USDC.
```rust
pub fn compute_venice_lifespan_extension(
    daily_inference_cost_usd: f64,
    venice_fraction: f64,
    current_burn_rate_usd: f64,
    remaining_credits_usd: f64,
) -> LifespanExtension {
    let daily_savings = daily_inference_cost_usd * venice_fraction;
    let new_burn_rate = current_burn_rate_usd - daily_savings;
    let original_days_remaining = remaining_credits_usd / current_burn_rate_usd;
    let new_days_remaining = remaining_credits_usd / new_burn_rate;
    LifespanExtension {
        extension_days: new_days_remaining - original_days_remaining,
        extension_hours: (new_days_remaining - original_days_remaining) * 24.0,
    }
}
```
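A worked example makes the formula concrete. The function body mirrors the spec's formula; the input numbers are hypothetical round figures chosen for easy arithmetic.

```rust
/// Same formula as compute_venice_lifespan_extension, returning days only.
fn lifespan_extension_days(
    daily_inference_cost_usd: f64,
    venice_fraction: f64,
    current_burn_rate_usd: f64,
    remaining_credits_usd: f64,
) -> f64 {
    let daily_savings = daily_inference_cost_usd * venice_fraction;
    let new_burn_rate = current_burn_rate_usd - daily_savings;
    remaining_credits_usd / new_burn_rate - remaining_credits_usd / current_burn_rate_usd
}

fn main() {
    // Hypothetical: $0.10/day inference, half routed to Venice,
    // $0.20/day total burn, $6.00 of credits remaining.
    let ext = lifespan_extension_days(0.10, 0.5, 0.20, 6.0);
    // Savings: $0.05/day. Burn drops to $0.15/day: 30 days -> 40 days remaining.
    assert!((ext - 10.0).abs() < 1e-9);
}
```

Note the extension is nonlinear in the savings fraction: halving the burn rate doubles the runway rather than adding 50%.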
DIEM allocation strategy
```text
Daily DIEM Budget: $X (from VVV stake)
+-- Waking inference (private): 60%  -- Portfolio analysis, deal negotiation
+-- Dream cycles (private): 15%      -- Counterfactual reasoning, threat simulation
+-- Sleepwalker artifacts: 15%       -- Observatory research (if phenotype=sleepwalker)
+-- Reserve (rollover): 10%          -- Unused DIEM for volatile days
```
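The split above can be sketched as a small allocation function. The rollover behavior (yesterday's unused reserve carrying forward) is an assumption consistent with the "Reserve (rollover)" line, not a confirmed mechanism.

```rust
/// Daily DIEM split per the allocation strategy above. Field names and the
/// rollover rule are illustrative assumptions.
struct DiemBudget { waking: f64, dreams: f64, sleepwalker: f64, reserve: f64 }

fn split_daily_diem(total: f64, carried_reserve: f64) -> DiemBudget {
    DiemBudget {
        waking: total * 0.60,
        dreams: total * 0.15,
        sleepwalker: total * 0.15,
        reserve: total * 0.10 + carried_reserve, // unused DIEM rolls forward
    }
}

fn main() {
    let b = split_daily_diem(10.0, 0.5); // 10 DIEM today, 0.5 carried over
    assert!((b.waking - 6.0).abs() < 1e-9);
    assert!((b.reserve - 1.5).abs() < 1e-9);
    let _ = (b.dreams, b.sleepwalker);
}
```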
4. Moat: agents that keep secrets
The privacy argument for Bardo is not “we collect less data.” It is an architecture argument.
Most agent frameworks treat privacy as a policy problem. They write policies, implement access controls, and hope the controls hold. But the architecture itself generates leakage – every MCP call, every API request, every conversation log creates a new exfiltration surface.
Bardo treats privacy as an architecture problem. Sensitive data has minimal exposure surface by construction:
Keys never leave TEE hardware. The Golem’s wallet key is isolated. There is no API key to exfiltrate because x402 payment is cryptographic, not credential-based.
Context is assembled by the Governor, not appended from raw history. The ContextBundle is structured, minimal, and auditable. The bardo-result-filter extension sanitizes tool results before they enter the message array.
Inference routes through a proxy that strips non-essential context. Semantic caching means repeated queries never reach the provider. The 8-layer pipeline applies before any provider sees the request.
Payment uses x402 (no account relationship). Per-request USDC settlement via transferWithAuthorization. The provider sees a request and a payment. No name, no email, no billing address, no usage history.
Knowledge decays by default. Demurrage on Grimoire entries. Stale personal data doesn’t persist indefinitely.
Tools are compiled Rust/TypeScript, not remote MCP servers. No external data transmission during tool execution.
Five leakage vectors and how Bardo addresses them
| Vector | Typical agent | Bardo |
|---|---|---|
| API key exfiltration | Plaintext .env files, environment variables | x402 wallet-native payment, no API keys to steal |
| Context window leakage | All tools see full context | ContextBundle with category isolation, result filtering |
| On-chain behavioral fingerprinting | No MEV protection | Warden time-delay + Flashbots Protect + slippage bounds |
| Inference provider surveillance | Full context to provider every call | Context-pruned proxy + Venice zero-retention + x402 (no account) |
| Persistent memory poisoning | No decay, no validation | Confidence scoring + demurrage + causal rollback + Curator pruning |
The result: an attacker who compromises the Golem’s VM cannot extract signing keys (TEE), cannot access full conversation history (Governor prunes and compresses), cannot establish persistent control (mortality terminates the Golem; succession starts fresh), and cannot pivot to external services (no OAuth tokens, no API keys, no MCP connections).
5. Cryptographic audit trail
Audit logging in 07-safety.md stores InferenceLog entries in Clickhouse – sufficient for analytics but not tamper-evident. A compromised gateway could alter historical records. The cryptographic audit trail extends InferenceLog with hash-chain integrity, gateway signatures, and on-chain root anchoring.
Extended InferenceLog
```rust
// crates/bardo-telemetry/src/audit.rs

/// Extends `InferenceLog` (07-safety.md) with hash-chain tamper evidence.
#[derive(Debug, Clone, Serialize)]
pub struct InferenceLogSigned {
    #[serde(flatten)]
    pub base: InferenceLog,
    /// SHA-256 of the previous event in this agent's chain. None for first event.
    pub prev_hash: Option<String>,
    /// SHA-256(prev_hash || canonical_json(event_fields)). Chain link.
    pub event_hash: String,
    /// Ed25519 signature of event_hash by the gateway's signing key.
    pub gateway_signature: String,
    /// SHA-256 of the full request body (after PII masking).
    pub input_hash: String,
    /// SHA-256 of the full response body (before de-identification restoration).
    pub output_hash: String,
}
```
Hash chain construction
Each agent maintains an independent hash chain. Events are strictly ordered by timestamp within an agent’s chain.
```text
Event N:   event_hash_N     = SHA-256(event_hash_{N-1} || canonical_json(event_fields_N))
Event N+1: event_hash_{N+1} = SHA-256(event_hash_N     || canonical_json(event_fields_{N+1}))
```
canonical_json uses RFC 8785 (deterministic key ordering). event_fields includes all InferenceLog fields plus input_hash and output_hash, but not prev_hash, event_hash, or gateway_signature (those are derived). The gateway signs every event_hash with its Ed25519 key, binding its identity to the event.
```rust
// crates/bardo-telemetry/src/audit.rs
use sha2::{Sha256, Digest};

pub fn compute_event_hash(
    prev_hash: Option<&str>,
    event_fields: &InferenceLog,
    input_hash: &str,
    output_hash: &str,
) -> String {
    let mut hasher = Sha256::new();
    if let Some(ph) = prev_hash {
        hasher.update(ph.as_bytes());
    }
    let canonical = serde_jcs::to_string(event_fields)
        .expect("InferenceLog must be serializable");
    hasher.update(canonical.as_bytes());
    hasher.update(input_hash.as_bytes());
    hasher.update(output_hash.as_bytes());
    hex::encode(hasher.finalize())
}
```
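The tamper-evidence property follows from recomputing every link. The sketch below is dependency-free, so a toy 64-bit hash (std's `DefaultHasher`) stands in for SHA-256 over canonical JSON; the chain structure is the same.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for SHA-256 so the sketch needs no crates.
fn toy_hash(parts: &[&str]) -> u64 {
    let mut h = DefaultHasher::new();
    for p in parts { p.hash(&mut h); }
    h.finish()
}

struct Event { fields: String, prev_hash: Option<u64>, event_hash: u64 }

fn append_event(chain: &mut Vec<Event>, fields: &str) {
    let prev = chain.last().map(|e| e.event_hash);
    let prev_s = prev.map(|h| h.to_string()).unwrap_or_default();
    let event_hash = toy_hash(&[prev_s.as_str(), fields]);
    chain.push(Event { fields: fields.to_string(), prev_hash: prev, event_hash });
}

/// Mirrors the chain_intact check of /v1/audit/verify: recompute every link.
fn chain_intact(chain: &[Event]) -> bool {
    let mut prev: Option<u64> = None;
    for e in chain {
        if e.prev_hash != prev { return false; }
        let prev_s = prev.map(|h| h.to_string()).unwrap_or_default();
        if toy_hash(&[prev_s.as_str(), e.fields.as_str()]) != e.event_hash { return false; }
        prev = Some(e.event_hash);
    }
    true
}

fn main() {
    let mut chain = Vec::new();
    append_event(&mut chain, r#"{"model":"llama-3.3-70b"}"#);
    append_event(&mut chain, r#"{"model":"deepseek-r1"}"#);
    assert!(chain_intact(&chain));
    chain[0].fields.push('!'); // tamper with history
    assert!(!chain_intact(&chain));
}
```

Altering any historical event breaks every subsequent link, which is why a compromised gateway cannot silently rewrite the past once roots are anchored.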
Merkle tree aggregation
Events aggregate into Merkle trees (batch size 256, configurable via BARDO_AUDIT_BATCH_SIZE). Binary tree over event_hash values; root published to Base via the Facilitator contract.
```solidity
// Extension to the existing x402 Facilitator contract on Base
function publishAuditRoot(
    bytes32 merkleRoot,
    uint256 batchStartIndex,
    uint256 batchEndIndex
) external;
```
Piggybacks on x402 settlement infrastructure. ~$0.001 per anchor at current Base gas prices. Anchoring frequency follows event volume, not a fixed schedule.
Verification API
```rust
// GET /v1/audit/events/{eventId}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AuditEventResponse {
    pub event: InferenceLogSigned,
    pub merkle_proof: Vec<String>,
    pub leaf_index: u64,
    pub anchor_tx_hash: Option<String>,
}

// GET /v1/audit/verify?agentId={agentId}&from={startIndex}&to={endIndex}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AuditVerifyResponse {
    pub agent_id: u128,
    pub range: AuditRange,
    pub chain_intact: bool,
    pub valid_signatures: u64,
    pub anchored_events: u64,
    pub violations: Vec<AuditViolation>,
}

#[derive(Debug, Clone, Serialize, Deserialize)]
#[serde(rename_all = "snake_case")]
pub enum AuditViolationType { HashMismatch, SignatureInvalid, Gap }
```
Agents can independently verify any event with O(log n) proof path plus on-chain root lookup. No trust in the gateway is required.
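The O(log n) proof check folds the leaf with each sibling up to the root. In this dependency-free sketch a toy 64-bit hash replaces SHA-256, and the leaf index's bits decide left/right ordering at each level (an assumption about the proof layout, not the spec's exact encoding).

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy pair hash standing in for SHA-256 over concatenated child hashes.
fn toy_hash_pair(left: u64, right: u64) -> u64 {
    let mut h = DefaultHasher::new();
    left.hash(&mut h);
    right.hash(&mut h);
    h.finish()
}

/// Fold the leaf with each sibling; the index bit at each level picks
/// whether the accumulator is the left or right child.
fn verify_merkle_proof(leaf: u64, mut index: u64, proof: &[u64], root: u64) -> bool {
    let mut acc = leaf;
    for sibling in proof {
        acc = if index % 2 == 0 {
            toy_hash_pair(acc, *sibling)
        } else {
            toy_hash_pair(*sibling, acc)
        };
        index /= 2;
    }
    acc == root
}

fn main() {
    // Four leaves -> two levels of pairing.
    let leaves = [11u64, 22, 33, 44];
    let l01 = toy_hash_pair(leaves[0], leaves[1]);
    let l23 = toy_hash_pair(leaves[2], leaves[3]);
    let root = toy_hash_pair(l01, l23);
    // Proof for leaf index 2 (value 33): sibling leaf 44, then subtree hash l01.
    assert!(verify_merkle_proof(33, 2, &[44, l01], root));
}
```

A 256-event batch needs only an 8-element proof, which is why per-event verification stays cheap even as chains grow.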
6. Strategy-aware data minimization
The de-identification pattern in 07-safety.md masks wallet addresses with [WALLET_N] placeholders. This extends the same approach to protect agent trading strategies from leaking to LLM providers.
Strategy signal taxonomy
| Signal type | Pattern | Placeholder | Example |
|---|---|---|---|
| Token names | Known token symbols and names | [TOKEN_A], [TOKEN_B] | ETH -> [TOKEN_A] |
| Dollar amounts | $N, N USDC, numeric with currency context | [AMOUNT_1], [AMOUNT_2] | $50,000 -> [AMOUNT_1] |
| Position sizes | Numeric values in position/size context | [SIZE_1], [SIZE_2] | 100 ETH -> [SIZE_1] [TOKEN_A] |
| Trade direction | buy, sell, long, short | [DIRECTION] | buy -> [DIRECTION] |
| Percentages | N% in strategy context | [PCT_1], [PCT_2] | rebalance at 5% -> rebalance at [PCT_1] |
| LP ranges | Tick ranges, price bounds | [RANGE_LOW], [RANGE_HIGH] | 1800-2200 -> [RANGE_LOW]-[RANGE_HIGH] |
| Time schedules | Cron expressions, time intervals, deadlines | [SCHEDULE_1] | every 4 hours -> [SCHEDULE_1] |
| Protocol names | Known DeFi protocol names | [PROTOCOL_A], [PROTOCOL_B] | Morpho -> [PROTOCOL_A] |
Redaction levels
| Level | Redacted signals | Use case |
|---|---|---|
| none | Nothing (wallet masking from 07-safety.md still applies) | Read-only queries, general research |
| standard | Token names, dollar amounts, position sizes, trade direction | Strategy reasoning, position analysis |
| aggressive | All of standard + percentages, LP ranges, time schedules, protocol names | Admin operations, sensitive strategy logic |
Implementation
```rust
// crates/bardo-safety/src/redaction.rs
use regex::RegexSet;
use std::collections::HashMap;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RedactionLevel { None, Standard, Aggressive }

/// Strategy-aware signal redaction. Session-scoped placeholder map assigns
/// incrementing placeholders ("[TOKEN_A]", "[AMOUNT_1]", etc.) and restores
/// originals in responses before delivery.
#[derive(Debug, Clone)]
pub struct StrategyRedactionConfig {
    pub level: RedactionLevel,
    pub placeholder_map: HashMap<String, String>,
    patterns: RegexSet,
}
```
Strategy redaction is best-effort, not a formal privacy guarantee. LLMs lose domain knowledge when seeing [TOKEN_A] instead of ETH, so redacted prompts may produce lower-quality responses. Measurement is needed. Sophisticated analysis of placeholder patterns could still reveal strategy characteristics; redaction protects against casual observation.
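The placeholder lifecycle (assign on the way out, restore on the way back) can be shown with a minimal sketch. It uses a fixed token list instead of the spec's RegexSet patterns, and `redact`/`restore` are illustrative names rather than the bardo-safety API.

```rust
use std::collections::HashMap;

/// Session-scoped placeholder substitution: original -> placeholder.
struct Session { map: HashMap<String, String>, next: u8 }

impl Session {
    fn new() -> Self { Session { map: HashMap::new(), next: 0 } }

    /// Replace each known signal with a stable per-session placeholder.
    fn redact(&mut self, prompt: &str, tokens: &[&str]) -> String {
        let mut out = prompt.to_string();
        for t in tokens {
            if !out.contains(t) { continue; }
            if !self.map.contains_key(*t) {
                let p = format!("[TOKEN_{}]", (b'A' + self.next) as char);
                self.next += 1;
                self.map.insert((*t).to_string(), p);
            }
            let ph = &self.map[*t];
            out = out.replace(t, ph);
        }
        out
    }

    /// Restore originals in the model response before delivery.
    fn restore(&self, response: &str) -> String {
        let mut out = response.to_string();
        for (orig, ph) in &self.map {
            out = out.replace(ph.as_str(), orig);
        }
        out
    }
}

fn main() {
    let mut s = Session::new();
    let redacted = s.redact("Swap ETH for USDC on Morpho", &["ETH", "USDC"]);
    assert_eq!(redacted, "Swap [TOKEN_A] for [TOKEN_B] on Morpho");
    assert_eq!(s.restore("[TOKEN_A] looks overbought"), "ETH looks overbought");
}
```

Because the map is session-scoped, the same placeholder maps to the same token across turns, preserving conversational coherence without re-exposing the signal.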
Default by Warden risk tier (optional, deferred)
| Risk tier | Default redaction |
|---|---|
| Routine | none |
| Standard | none |
| Elevated | standard |
| High | aggressive |
| Critical | aggressive |
7. Per-agent cryptographic cache isolation
The semantic cache (02-caching.md) stores responses in an in-process HNSW index. In per-agent isolation mode, namespaces prevent cross-agent reads. Encryption at rest with per-agent keys adds a second layer.
Key derivation
```rust
// crates/bardo-cache/src/encryption.rs
use hkdf::Hkdf;
use sha2::Sha256;

/// HKDF-SHA-256(master, salt="bardo-cache-v1", info="agent:{id}", 32 bytes).
pub fn derive_agent_cache_key(master: &[u8; 32], agent_id: u128) -> [u8; 32] {
    let hk = Hkdf::<Sha256>::new(Some(b"bardo-cache-v1"), master);
    let mut okm = [0u8; 32];
    hk.expand(format!("agent:{agent_id}").as_bytes(), &mut okm)
        .expect("32-byte output valid for HKDF-SHA-256");
    okm
}

/// Clade-shared key using ERC-8004 operatorOf() grouping.
pub fn derive_clade_cache_key(master: &[u8; 32], operator_address: &str) -> [u8; 32] {
    let hk = Hkdf::<Sha256>::new(Some(b"bardo-cache-v1"), master);
    let mut okm = [0u8; 32];
    hk.expand(format!("clade:{operator_address}").as_bytes(), &mut okm)
        .expect("32-byte output valid for HKDF-SHA-256");
    okm
}
```
AES-256-GCM before storage. With AES-NI hardware acceleration, encrypt/decrypt takes <1us per operation. Negligible next to the 5ms semantic cache lookup.
Master key rotation invalidates all derived keys. Existing entries become undecryptable – treated as cache misses. Acceptable because TTLs are short (90-300s), and misses cost inference calls not data loss.
8. Gateway request/response signing
The gateway charges agents USDC for inference. Without signed receipts, it could deny what it returned, inflate costs, or misreport cache status. Ed25519 signed receipts provide non-repudiation.
GatewayReceipt
```rust
// crates/bardo-telemetry/src/receipt.rs
#[derive(Debug, Clone, Serialize)]
pub struct GatewayReceipt {
    pub request_hash: String,       // SHA-256 of request body sent to provider
    pub response_hash: String,      // SHA-256 of response body from provider
    pub model: String,
    pub provider: String,
    pub input_tokens: u32,
    pub output_tokens: u32,
    pub total_charged_usdc: String, // decimal string, e.g. "0.003200"
    pub timestamp: u64,
    pub cache_status: CacheStatus,
    pub security_class: SecurityLevel,
}
```
Signing
```rust
use ed25519_dalek::{Signer, SigningKey};

pub fn sign_receipt(receipt: &GatewayReceipt, signing_key: &SigningKey) -> String {
    let canonical = serde_jcs::to_string(receipt).expect("GatewayReceipt serializable");
    hex::encode(signing_key.sign(canonical.as_bytes()).to_bytes())
}
```
Every /v1/chat/completions response includes X-Bardo-Receipt (base64-encoded JSON receipt) and X-Bardo-Receipt-Signature (hex-encoded Ed25519 signature). The gateway publishes its public key at GET /.well-known/bardo-gateway-key in JWK format.
Guarantees
- Non-repudiation: The signed receipt binds the gateway's key to the response hash.
- Cost integrity: total_charged_usdc is signed and verifiable against x402 settlement.
- Cache honesty: The gateway cannot claim a miss (full price) for a cached response.
- Privacy attestation: security_class in the receipt confirms which privacy level was applied.
9. Provider trust and routing policies
Not all inference requests carry the same sensitivity. The gateway routes based on sensitivity, preferring providers with stronger data handling guarantees.
Provider trust matrix
| Provider | Retention | Zero-retention option | Notes |
|---|---|---|---|
| Venice | None | Structural | Zero retention by architecture, TEE-attested |
| BlockRun | None (x402 native) | N/A (no retention) | x402 settlement, no account relationship |
| Anthropic | 30 days (default) | Yes (API flag) | Zero-retention via anthropic-beta: no-log header |
| OpenAI | 30 days (default) | Yes (org setting) | Zero-retention via data processing addendum |
| Google | Varies by product | Yes (Vertex AI) | Vertex AI offers no-logging; AI Studio does not |
Venice is first in this list because it is the only provider where zero retention is structural, not a policy flag. Policies change without notice. The gateway tracks lastVerifiedAt per provider and emits bardo_provider_trust_stale alerts when verification exceeds 90 days.
Routing rules
| Sensitivity | Provider constraint | Additional requirements |
|---|---|---|
low | Any healthy provider | None |
medium | Prefer zero-retention providers | None |
high | Zero-retention providers only | Strategy redaction (standard) required |
critical | Zero-retention providers only | Strategy redaction (aggressive) + full provenance record |
When no zero-retention provider is healthy for a high/critical request, the gateway returns HTTP 503 with a Retry-After header rather than falling back to a retaining provider.
10. Differential privacy on semantic cache
The semantic cache stores embedding vectors. An attacker with access to the cache storage could use nearest-neighbor search to reconstruct approximate query content. Gaussian noise injection provides differential privacy.
```rust
// crates/bardo-cache/src/privacy.rs
use rand_distr::{Distribution, Normal};

/// Add calibrated Gaussian noise to an embedding vector for differential privacy.
/// Re-normalizes to unit sphere after injection (nomic-embed-text-v1.5 produces
/// unit vectors). Adjust similarity threshold downward to compensate.
pub fn add_differential_privacy(embedding: &mut [f32], epsilon: f64) {
    // Simplified calibration; a formal DP accounting would also scale sigma
    // by the query sensitivity and a delta parameter.
    let sigma = 1.0 / epsilon;
    let normal = Normal::new(0.0, sigma).expect("sigma must be positive");
    let mut rng = rand::thread_rng();
    for val in embedding.iter_mut() {
        *val += normal.sample(&mut rng) as f32;
    }
    let norm: f32 = embedding.iter().map(|v| v * v).sum::<f32>().sqrt();
    if norm > 0.0 {
        for val in embedding.iter_mut() { *val /= norm; }
    }
}
```
| Epsilon | Noise level | Threshold adjustment | Cache hit rate impact | Use case |
|---|---|---|---|---|
| 1.0 | High | -0.05 | ~5-8% reduction | Maximum privacy, shared caches |
| 3.0 | Moderate | -0.02 | ~2-3% reduction | Balanced |
| 5.0 | Low | -0.01 | <1% reduction | Minimal privacy impact (default) |
| 10.0 | Minimal | None | Negligible | Per-agent caches (isolation provides privacy) |
Default: epsilon=5.0. Enough noise to prevent exact reconstruction while preserving >99% cache hit rate.
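The noise-then-renormalize step can be demonstrated without the rand_distr dependency: a fixed-seed xorshift PRNG plus the Box-Muller transform stands in for `Normal::sample`. The PRNG and seed are illustrative; only the sigma = 1/epsilon calibration and the re-normalization follow the spec.

```rust
/// Tiny deterministic PRNG for the sketch (not cryptographic).
struct XorShift64(u64);
impl XorShift64 {
    fn next_f64(&mut self) -> f64 {
        self.0 ^= self.0 << 13;
        self.0 ^= self.0 >> 7;
        self.0 ^= self.0 << 17;
        (self.0 >> 11) as f64 / (1u64 << 53) as f64 // uniform in [0, 1)
    }
}

fn add_dp_noise(embedding: &mut [f32], epsilon: f64, rng: &mut XorShift64) {
    let sigma = 1.0 / epsilon; // spec's simplified calibration
    for v in embedding.iter_mut() {
        // Box-Muller: two uniforms -> one standard normal sample
        let (u1, u2) = (rng.next_f64().max(1e-12), rng.next_f64());
        let z = (-2.0 * u1.ln()).sqrt() * (std::f64::consts::TAU * u2).cos();
        *v += (z * sigma) as f32;
    }
    // Re-normalize to the unit sphere, as the spec requires.
    let norm: f32 = embedding.iter().map(|v| v * v).sum::<f32>().sqrt();
    if norm > 0.0 {
        for v in embedding.iter_mut() { *v /= norm; }
    }
}

fn main() {
    let mut e = [0.6f32, 0.8, 0.0, 0.0]; // unit vector
    add_dp_noise(&mut e, 5.0, &mut XorShift64(42));
    let norm: f32 = e.iter().map(|v| v * v).sum::<f32>().sqrt();
    assert!((norm - 1.0).abs() < 1e-5); // still on the unit sphere after noise
}
```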
11. Inference provenance records
Every inference request that leads to an on-chain action needs a complete provenance chain. Three linked records capture it: intent, policy decision, and inference details.
```rust
/// What the agent wanted to do (extracted from LLM output)
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct TransactionIntentSchema {
    pub trace_id: String,           // UUID v7 linking all three records
    pub agent_id: u128,
    pub intent_description: String,
    pub operation_type: OperationType,
    pub estimated_value_usdc: f64,
    pub risk_tier: RiskTier,
    pub timestamp: u64,
}

/// Which safety policies were evaluated and what they decided
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PolicyDecisionRecord {
    pub trace_id: String,
    pub policies_evaluated: Vec<PolicyEvaluation>,
    pub overall_decision: PolicyDecision,
    pub blocking_policy: Option<String>,
    pub timestamp: u64,
}

/// What inference was performed to support this action
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct InferenceRecord {
    pub trace_id: String,
    pub request_id: String,         // links to audit trail (section 5)
    pub model: String,
    pub provider: String,
    pub cost_usdc: f64,
    pub optimizations: Vec<String>,
    pub redaction_level: RedactionLevel,
    pub security_class: SecurityLevel,
    pub cache_status: CacheStatus,
    pub timestamp: u64,
}
```
The provenance_hash of all three records is included in InferenceLogSigned.event_hash, binding provenance to the tamper-evident chain.
12. Trust gradient
Features activate progressively by Warden risk tier (optional, deferred). Read-only queries get lightweight logging; admin operations get maximum verification.
| Feature | Routine | Standard | Elevated | High | Critical |
|---|---|---|---|---|---|
| Audit trail | Standard (Clickhouse) | Hash-chain | Hash-chain | Hash-chain | Hash-chain |
| Receipt signing | No | Yes | Yes | Yes | Yes |
| Strategy redaction | None | None | Standard | Aggressive | Aggressive |
| Cache encryption | Per-agent key | Per-agent key | Per-agent key | Per-agent key | Per-agent key |
| Provider routing | Any healthy | Any healthy | Zero-retention preferred | Zero-retention required | Zero-retention required |
| Provenance record | No | IR only | Full (TIS + PDR + IR) | Full | Full |
| Merkle proof | No | No | On-demand | Mandatory | Mandatory |
| Security class | Standard | Standard | Confidential | Private | Private |
Metrics
| Metric | Type | Alert threshold |
|---|---|---|
| bardo_audit_chain_length | Counter | – |
| bardo_audit_anchor_lag | Gauge | >1000 unanchored events |
| bardo_audit_verification_failures | Counter | >0 |
| bardo_receipt_signing_latency_us | Histogram | P99 >500us |
| bardo_cache_encryption_latency_us | Histogram | P99 >100us |
| bardo_strategy_redaction_applied | Counter | – |
| bardo_provider_trust_stale | Gauge | >0 (provider trust data >90 days old) |
| bardo_provenance_records_created | Counter | – |
| bardo_high_sensitivity_503 | Counter | >10/hour (no zero-retention providers available) |
| bardo_private_inference_routed | Counter | – |
| bardo_diem_consumed_daily | Gauge | >90% of allocation (close to cap) |
| bardo_security_class_distribution | Counter | – |
Open questions
- Redaction vs. quality: LLMs lose domain knowledge when seeing [TOKEN_A] instead of ETH. Quality impact needs empirical measurement. Research queries may need exemption even at the Elevated tier.
- Anchoring frequency: At scale (50K agents), the 256-event batch size needs dynamic scaling. 10K+ events per anchor keeps costs negligible but adds latency to proof availability.
- Key management: The Ed25519 signing key and cache master key are single points of failure. An HSM would help but violates the no-external-dependencies constraint. Key backup, rotation, and compromise-response procedures are needed before production.
- Provider trust maintenance: Provider policies are self-reported and change without notice. lastVerifiedAt alerts provide visibility, but manual re-verification is required.
- Venice availability: If Venice goes down, private-class requests return 503. Should the gateway allow explicit owner opt-in to fall back to a retaining provider with aggressive redaction? The default (hard fail) is safer but costs availability.
Cross-references
- 07-safety.md – PII detection via compiled regex, prompt injection defense via DeBERTa ONNX classifier, and audit logging for every inference request
- 02-caching.md – Three-layer cache stack (hash, semantic, prefix) with encryption at rest and differential privacy on embedding vectors
- 09-api.md – API reference with 33 endpoints including privacy headers, audit signing, and provenance query endpoints
- 08-observability.md – Per-agent cost attribution, OpenTelemetry traces, and Event Fabric integration for privacy-preserving telemetry
- 12-providers.md – Five provider backends including Venice private cognition deep-dive: TEE attestation, E2EE inference, and sensitivity classification
- prd2-extended/10-safety/02-warden.md – Optional Warden time-delay proxy: announce-wait-execute pattern for high-value transactions with cancel authority
- ../10-safety/00-defense.md – The full 15-layer defense model covering on-chain, runtime, and inference-layer protections
- 00-overview.md – Gateway architecture, x402 payment settlement flows, and dual API format support
- 10-roadmap.md – Phased delivery plan; privacy and trust features are spread across Phase 1 (basic auth) and Phase 2 (full audit trail)
- ../05-dreams/07-venice-dreaming.md – Venice-augmented dream cycles: private cognition during NREM replay and REM creative recombination