SPECIFICATION · v2.0

The memory protocol
behind every AI Worker.

Audit-grade. Model-independent. Exportable on demand.

Memory Context & Intent Response

Protocol stages: 8

Memory layers: 4

Durability rails: 5

Training on customer memory: 0

TL;DR

MCIR is an 8-stage memory protocol (Capture, Classify, Index, Retrieve, Validate, Apply, Expire, Audit) that runs before generation and rewrites every prompt with intent-scoped memory drawn from four layers: episodic, semantic, procedural, and the MCIR 2.0 Memory Chain. MCIR 2.0 adds conversation-level synthesis, open-loop tracking, agent meta-memories, and five durability rails (Bayesian shrinkage, per-model attribution, hard floors on user-declared facts, versioned replayable scoring, decay on everything) so the system stays correct under adversarial feedback, model swaps, and time.

THE BASE PIPELINE

Four stages run before every response.

MCIR-base is the protocol path every Theo turn travels before a response model is invoked. Each stage produces a typed, inspectable artifact that the next stage consumes.

01
Intent resolution
Every turn is parsed into a structured TaskIntent (verb, object, modifiers, temporal scope, target artifact, and a confidence score). Canonical verbs (redo, resume, expand, edit, send, schedule, recall, forget) hit a sub-1 ms regex fast path; everything else falls back to a structured-JSON classifier with ~300 ms latency. Temporal cues like “yesterday” or “earlier today” are resolved to concrete ISO windows so retrieval is bounded.
src/lib/server/mcir/intent-resolver.ts
02
Memory context assembly
Retrieval is intent-scoped, not keyword-scoped. The assembler walks four layers (episodic conversation turns, semantic facts, procedural preferences, and the new MCIR 2.0 chain layer), pulling only the fragments the resolved verb actually needs. Fragments are provenance-tagged with memoryId / messageId / artifactId / conversationId / createdAt, ranked by salience, and capped at 16 per turn.
src/lib/server/mcir/memory-context-assembler.ts
03
Context fusion
The fuser produces a structured MCIREnvelope: originalPrompt, intent, ranked memoryFragments[], a reconstructedPrompt that adds minimal [ref: …] / [when: …] anchors, and a natural-language systemBlock for models that haven't been trained on the protocol natively. The envelope is the unit of truth that downstream generation reads.
src/lib/server/mcir/context-fuser.ts
04
Response generation
The orchestrator swaps the original prompt for the reconstructedPrompt, appends the systemBlock to skill prompt extensions, and dispatches to the appropriate engine. The protocol is model-independent: same envelope, any qualified routing target, with full failover and a per-provider circuit breaker.
src/lib/server/chat/orchestrator.ts
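
The four stages above can be sketched in a few dozen lines. This is an illustrative sketch, not the contents of the files named above: the type fields follow the prose, but the exact shapes, the regex, the helper names (resolveTemporal, resolveIntentFast, fuse), the confidence value, and the handling of only the "yesterday" cue are all assumptions made for the example.

```typescript
// Sketch only: field names follow the prose above; exact types are assumptions.
type CanonicalVerb =
  | "redo" | "resume" | "expand" | "edit"
  | "send" | "schedule" | "recall" | "forget";

interface IsoWindow { from: string; to: string }

interface TaskIntent {
  verb: CanonicalVerb;
  object: string;
  temporalScope: IsoWindow | null;
  confidence: number; // 0..1
}

interface MemoryFragment {
  memoryId: string;
  conversationId: string;
  createdAt: string; // ISO timestamp
  salience: number;
  text: string;
}

interface MCIREnvelope {
  originalPrompt: string;
  intent: TaskIntent;
  memoryFragments: MemoryFragment[];
  reconstructedPrompt: string;
  systemBlock: string;
}

const FAST_PATH = /^(redo|resume|expand|edit|send|schedule|recall|forget)\b\s*([\s\S]*)$/i;
const MAX_FRAGMENTS = 16; // per-turn cap stated in the spec
const DAY_MS = 24 * 60 * 60 * 1000;

// Resolves only "yesterday" (as a UTC day); other temporal cues are omitted.
function resolveTemporal(prompt: string, now: Date): IsoWindow | null {
  if (!/\byesterday\b/i.test(prompt)) return null;
  const today = Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate());
  return { from: new Date(today - DAY_MS).toISOString(), to: new Date(today).toISOString() };
}

// Returns null on a fast-path miss; the classifier fallback is not shown.
function resolveIntentFast(prompt: string, now: Date): TaskIntent | null {
  const match = FAST_PATH.exec(prompt.trim());
  if (!match) return null;
  const [, verb = "", rest = ""] = match;
  return {
    verb: verb.toLowerCase() as CanonicalVerb,
    object: rest.trim(),
    temporalScope: resolveTemporal(prompt, now),
    confidence: 1, // assumed value for a regex hit
  };
}

// Ranks fragments by salience, applies the cap, and adds minimal anchors.
function fuse(originalPrompt: string, intent: TaskIntent, fragments: MemoryFragment[]): MCIREnvelope {
  const ranked = fragments.slice().sort((a, b) => b.salience - a.salience).slice(0, MAX_FRAGMENTS);
  const anchors = ranked.map((f) => `[ref: ${f.memoryId}]`);
  if (intent.temporalScope) {
    anchors.push(`[when: ${intent.temporalScope.from}/${intent.temporalScope.to}]`);
  }
  return {
    originalPrompt,
    intent,
    memoryFragments: ranked,
    reconstructedPrompt: [originalPrompt].concat(anchors).join(" "),
    systemBlock: ranked.map((f) => `- (${f.createdAt}) ${f.text}`).join("\n"),
  };
}
```

In this sketch a prompt such as "Send the PDF from yesterday" hits the fast path, while a prompt that does not start with a canonical verb returns null and would be handed to the classifier.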

THE 8-STAGE PROTOCOL

Capture → Classify → Index → Retrieve → Validate → Apply → Expire → Audit.

The same eight stages govern every memory the system touches, from the first message a customer sends to the last audit event written when a memory expires.

01 Capture
Every interaction surfaces facts, preferences, decisions, and unresolved questions. Captured the moment they happen, without manual tagging.
02 Classify
Each memory lands on the right shelf: user preference, business rule, task, relationship, risk signal, or compliance note. Never one-bucket-fits-all.
03 Index
Memory is scoped to the right owner (user, account, organization, workflow) with strict access controls and tenant isolation enforced at the row level.
04 Retrieve
Only memories relevant to the current task surface. Intent-scoped retrieval beats keyword search: pull what the AI needs for this turn, not everything we know.
05 Validate
Is this memory still fresh, still allowed, still trusted? Stale or contradicted entries are flagged before they touch a single response.
06 Apply
Memories are applied with provenance: as facts, preferences, warnings, or instructions. Never blended into a soup. Every use is explainable.
07 Expire
Bayesian decay, revalidation windows, hard floors on user-declared facts. Memory ages out on schedule, not because it leaked.
08 Audit
Every retrieve, use, ignore, create, and update is logged with hash-anchored provenance. A full receipt for every AI decision.

FOUR MEMORY LAYERS

Not a vector blob. A typed graph.

Each layer answers a different question. Each has its own shape, retention rules, and retrieval semantics. All governed by the same protocol.

Episodic

What happened, and when?

Conversation turns and the artifacts they produced. Timeline-indexed so phrases like “the PDF from last week” resolve to a real ISO window before retrieval begins.

e.g. User uploaded a policy PDF on 2026-05-09 at 14:22 UTC.

Semantic

What do we know?

Distilled facts about customers, accounts, and the business. Source-tagged, confidence-scored, freshness-aware. Promoted automatically when the same entity appears in two or more chains.

e.g. Household of four · prefers SMS · active ICHRA policy.

Procedural

How do we work?

Learned workflows and business rules. Repeated patterns get promoted into durable procedural memory so AI Workers learn how things are done here, not how they're done in general.

e.g. Always offer two carriers before recommending a plan.

Memory Chain

What threads are still open?

Conversation-level synthesis. Shape: { kind, title, summary, entities, outcome }. Open loops and agent meta-memories ride here so a fresh thread feels like Theo already knows the user, the context, and what's unresolved.

e.g. Open loop: broker promised follow-up before open enrollment.

MCIR 2.0 / MEMORY CHAIN

Conversation-level synthesis with open loops.

MCIR 2.0 compresses every conversation into a compact memory token, tracks unresolved decisions as open loops, and lets a fresh thread inherit context that a vector store can't represent. It rides alongside MCIR-base behind a single flag and depends on no self-hosted infrastructure.

MEMORY TOKEN

{
  kind:    "quest" | "task" | "qa" | "service" | "search",
  title:   "≤ 120 chars",
  summary: "≤ 800 chars",
  entities: [ { kind, value }, ... ],
  outcome: "resolved" | "partial" | "open" | "abandoned"
}
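
The token shape above maps directly onto a type plus a small validator. The sketch below assumes the limits are counted in JavaScript string length (UTF-16 code units) and that entity kind and value are plain strings; the names MemoryToken and validateToken are hypothetical.

```typescript
type ChainKind = "quest" | "task" | "qa" | "service" | "search";
type ChainOutcome = "resolved" | "partial" | "open" | "abandoned";

interface MemoryToken {
  kind: ChainKind;
  title: string;   // at most 120 chars
  summary: string; // at most 800 chars
  entities: { kind: string; value: string }[];
  outcome: ChainOutcome;
}

const TITLE_MAX = 120;
const SUMMARY_MAX = 800;

// Returns a list of problems; an empty list means the token fits the shape.
// Length is measured with String.length, which is an assumption of this sketch.
function validateToken(token: MemoryToken): string[] {
  const errors: string[] = [];
  if (token.title.length === 0) errors.push("title is empty");
  if (token.title.length > TITLE_MAX) errors.push("title exceeds 120 chars");
  if (token.summary.length > SUMMARY_MAX) errors.push("summary exceeds 800 chars");
  token.entities.forEach((entity, i) => {
    if (entity.kind.length === 0 || entity.value.length === 0) {
      errors.push(`entity ${i} has an empty kind or value`);
    }
  });
  return errors;
}

const example: MemoryToken = {
  kind: "service",
  title: "ICHRA plan comparison for household of four",
  summary: "Customer compared two carriers; broker promised a follow-up before open enrollment.",
  entities: [{ kind: "policy", value: "ICHRA" }],
  outcome: "open",
};
```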

memory_chains

One row per conversation. Holds kind, title, summary, entities, outcome, topic vector, and a Bayesian salience posterior keyed per model.

open_loops

Unresolved decisions, questions, promises, and pending actions extracted from chains. Auto-closes superseded loops on each synthesis pass.

agent_meta_memories

Theo-side learnings about how to behave with this user: style preferences, format preferences, interaction patterns, failure modes. Supports immutable rows that the scoring loop must skip.

memory_outcome_events

Append-only replay log of accept / reject / silence / regenerate / edit signals. The source of truth for salience. Chains, loops, and meta rows store derived posteriors only.

mcir_score_snapshots

Versioned, replayable derived salience for admin observability. Every recompute writes a fresh row; nothing is ever overwritten.

Recall scoring

score = entityOverlap × recencyMultiplier × effectiveSalience × kindBoost

  entityOverlap       ∈ [0, 1]      // normalized so 20-entity chains aren't unfairly favored
  recencyMultiplier   exp decay     // 14-day half-life
  effectiveSalience   0.7·base + 0.3·learned[currentModelId]   (clamped to [0, 1])
  kindBoost           1.2× when verb prefers chain.kind, else 1.0
  salience floor      drop chains scoring < 0.15

Open-loop recall always pulls the top 2 loops with status=open, ordered by due-by hint and then by salience, and surfaces them proactively even when the prompt didn't ask. Agent meta-memory recall pulls the top 5 rows with confidence ≥ 0.6 or immutable=true.
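
As a worked instance of the chain formula above, the sketch below multiplies the four factors and applies the 0.15 floor. Several details are assumptions, because the text does not pin them down: entity overlap is normalized as a Jaccard ratio, a missing per-model track falls back to the base salience, and the floor is applied to the final score. The names Chain, scoreChain, and recallChains are hypothetical.

```typescript
interface Chain {
  id: string;
  kind: string;
  entities: string[];
  lastActiveAt: Date;
  baseSalience: number;
  learned: Map<string, number>; // learned salience keyed by modelId
}

const HALF_LIFE_DAYS = 14;
const SCORE_FLOOR = 0.15;
const DAY_MS = 24 * 60 * 60 * 1000;

const clamp01 = (x: number): number => Math.min(1, Math.max(0, x));
const unique = (xs: string[]): string[] => xs.filter((x, i) => xs.indexOf(x) === i);

// Assumed normalization: shared / union (Jaccard), so large entity sets gain no edge.
function entityOverlap(query: string[], chain: string[]): number {
  const q = unique(query);
  const c = unique(chain);
  if (q.length === 0 || c.length === 0) return 0;
  const shared = q.filter((e) => c.indexOf(e) !== -1).length;
  return shared / (q.length + c.length - shared);
}

function scoreChain(
  chain: Chain,
  queryEntities: string[],
  currentModelId: string,
  preferredKind: string | null,
  now: Date,
): number {
  const ageDays = Math.max(0, (now.getTime() - chain.lastActiveAt.getTime()) / DAY_MS);
  const recency = Math.pow(0.5, ageDays / HALF_LIFE_DAYS);
  // Assumption: with no learned track for this model, reuse the base salience.
  const learned = chain.learned.get(currentModelId) ?? chain.baseSalience;
  const effective = clamp01(0.7 * chain.baseSalience + 0.3 * learned);
  const kindBoost = preferredKind === chain.kind ? 1.2 : 1.0;
  return entityOverlap(queryEntities, chain.entities) * recency * effective * kindBoost;
}

// Scores every chain, drops those under the floor, and returns the best first.
function recallChains(
  chains: Chain[],
  queryEntities: string[],
  currentModelId: string,
  preferredKind: string | null,
  now: Date,
): { chain: Chain; score: number }[] {
  return chains
    .map((chain) => ({ chain, score: scoreChain(chain, queryEntities, currentModelId, preferredKind, now) }))
    .filter((r) => r.score >= SCORE_FLOOR)
    .sort((a, b) => b.score - a.score);
}
```

For a 14-day-old chain with full entity overlap, base salience 0.8, learned salience 0.6 for the current model, and a matching kind, the factors are 1 × 0.5 × 0.74 × 1.2, which gives 0.444.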

FIVE DURABILITY RAILS

Why MCIR doesn't drift, poison, or lie.

Persistent memory is only valuable if it stays correct. Five rails keep adversarial feedback, model swaps, and time decay from corrupting the graph.

Bayesian shrinkage to a prior

A single event can never move a memory's salience by more than 1 / (N + 8), where N is the existing evidence count for that memory. Adversarial “fine, whatever” feedback cannot poison a chain.

Δsalience ≤ 1 / (N + 8)
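
One way to obtain this bound is a pseudo-count posterior mean with a prior weight of 8. The sketch below is an assumption about how such an update could look, not a description of the shipped scorer: signals are mapped to [0, 1], the prior mean is taken as 0.5, and the names Evidence, salience, and observe are hypothetical. Under these assumptions one new signal x moves the mean by (x − mean) / (N + 9), which is always within 1 / (N + 8).

```typescript
const PRIOR_STRENGTH = 8; // pseudo-count, matching the "+ 8" in the bound
const PRIOR_MEAN = 0.5;   // assumed neutral prior; the spec does not state it

interface Evidence {
  n: number;   // N: outcome events seen so far
  sum: number; // running total of signals, each in [0, 1]
}

// Posterior mean: prior pseudo-observations blended with real evidence.
function salience(e: Evidence): number {
  return (PRIOR_STRENGTH * PRIOR_MEAN + e.sum) / (PRIOR_STRENGTH + e.n);
}

// Records one outcome signal (0 = fully negative, 1 = fully positive).
function observe(e: Evidence, signal: number): Evidence {
  const x = Math.min(1, Math.max(0, signal));
  return { n: e.n + 1, sum: e.sum + x };
}

// A burst of five rejections on a fresh memory moves it from 0.5 to 4/13.
let state: Evidence = { n: 0, sum: 0 };
for (let i = 0; i < 5; i++) state = observe(state, 0);
const afterBurst = salience(state);
```

With more accumulated evidence the same burst moves the number even less, which is the property the rail relies on.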

Per-model attribution

Every outcome event is tagged with the model that produced the response. Salience is keyed per modelId; when routing swaps brains, recall reads the matching track. Posteriors don't bleed across models.

effectiveSalience = 0.7·base + 0.3·learned[model]

Hard floor on user-declared facts

Memories marked immutable by the user (or explicitly set through the Settings UI) are contractually exempt from the scoring loop. The system cannot drift away from what you told it.

Versioned, replayable scoring

The outcome event log is the source of truth. Scoring is versioned; when the function changes, the new version re-runs over the full event log and writes fresh snapshots. Audits can always reproduce a number.

MCIR_SCORING_VERSION → snapshots, never overwrites
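
The replay idea can be illustrated with an in-memory log. Everything below is a hypothetical sketch: the row shapes only borrow names from the tables described earlier, and the two scoring functions are invented to show that a new version adds snapshots without touching old ones.

```typescript
type Signal = "accept" | "reject" | "silence" | "regenerate" | "edit";

interface OutcomeEvent { memoryId: string; modelId: string; signal: Signal; at: string }

interface ScoreSnapshot {
  memoryId: string;
  modelId: string;
  scoringVersion: string;
  salience: number;
  computedAt: string;
}

type ScoringFn = (events: OutcomeEvent[]) => number;

// Re-runs one scoring version over the full log and returns a new snapshot
// list with fresh rows appended; existing rows are never mutated or dropped.
function replay(
  log: readonly OutcomeEvent[],
  scoringVersion: string,
  score: ScoringFn,
  existing: readonly ScoreSnapshot[],
  computedAt: string,
): ScoreSnapshot[] {
  const groups = new Map<string, { memoryId: string; modelId: string; events: OutcomeEvent[] }>();
  for (const ev of log) {
    const key = JSON.stringify([ev.memoryId, ev.modelId]);
    let group = groups.get(key);
    if (!group) {
      group = { memoryId: ev.memoryId, modelId: ev.modelId, events: [] };
      groups.set(key, group);
    }
    group.events.push(ev);
  }
  const fresh: ScoreSnapshot[] = [];
  groups.forEach((g) => {
    fresh.push({ memoryId: g.memoryId, modelId: g.modelId, scoringVersion, salience: score(g.events), computedAt });
  });
  return existing.concat(fresh);
}

// Hypothetical scoring functions, only to show two versions side by side.
const accepts = (events: OutcomeEvent[]): number => events.filter((e) => e.signal === "accept").length;
const scoreV1: ScoringFn = (events) => accepts(events) / events.length;
const scoreV2: ScoringFn = (events) => (4 + accepts(events)) / (8 + events.length);
```

Because each run is a pure function of the event log and the version, re-running an old version over the same log reproduces the same number.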

Decay on everything

Posteriors are multiplied by a decay factor (default 0.98) for each week of inactivity. The recency multiplier on recall uses a 14-day half-life. Memory that no one revisits naturally drifts toward the prior.

recall recency: half-life ≈ 14 days
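
The two decay mechanisms can be written as small pure functions. The recency multiplier follows directly from the 14-day half-life. The posterior decay is an interpretation: the sketch assumes the weekly factor shrinks the distance between the posterior and the prior, which is one way to make an unvisited memory drift toward the prior. The function names are hypothetical.

```typescript
const WEEKLY_DECAY = 0.98;  // default factor from the spec
const HALF_LIFE_DAYS = 14;  // recall recency half-life from the spec

// Assumed form: each inactive week keeps 98% of the gap to the prior.
function decayPosterior(posterior: number, prior: number, weeksInactive: number): number {
  const kept = Math.pow(WEEKLY_DECAY, Math.max(0, weeksInactive));
  return prior + (posterior - prior) * kept;
}

// Exponential decay: the multiplier halves every 14 days.
function recencyMultiplier(ageDays: number): number {
  return Math.pow(0.5, Math.max(0, ageDays) / HALF_LIFE_DAYS);
}

const freshWeight = recencyMultiplier(0);    // 1
const twoWeekWeight = recencyMultiplier(14); // 0.5
const fourWeekWeight = recencyMultiplier(28); // 0.25
```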

WHY MCIR IS DIFFERENT

Compared to four common memory shapes.

The market has converged on four memory patterns. None of them give you provenance, replayable scoring, open-loop tracking, and per-model attribution at once. MCIR 2.0 does.

Capability	MCIR 2.0 (Theo)	Closed chatbot memory (consumer assistant)	Project-scoped memory (team workspace)	Vector-only memory (chunk store)	Agent memory stores (per-agent notes)
Memory shape	Typed graph (episodic / semantic / procedural / chain) + open loops	Opaque per-user notes	Per-project transcript window	Undifferentiated chunk store	Key/value notes per agent
Retrieval	Intent-scoped (verb + temporal + artifact)	Keyword/semantic blend	Whole-project recall	Cosine-similarity over chunks	Tool call returns a note
Cross-conversation continuity	Memory chains + open loops proactively resurface	Single thread only	Single project only	Vector match per turn	Manual
Per-model attribution	Salience posteriors keyed by modelId	None	None	None	Optional, agent-defined
Adversarial-feedback robustness	Bayesian shrinkage 1 / (N+8) per event	Opaque	Opaque	None. All chunks equal	Agent-implementation specific
Replayable scoring	Versioned outcome log + snapshot table	No	No	No	Implementation dependent
Cross-channel reach	Web, API, playground, voice, embed, MCP, Telegram, WhatsApp	Channel-locked to vendor surface	Channel-locked	Whatever you wire	Whatever you wire
Portability	Open schema; one-call export of chains, loops, meta, outcomes	None	None	Yours, but unstructured	Yours, but unstructured
Training-on-memory	Never. Audit stores SHA-256 hashes only.	Often	Sometimes	Yours to decide	Yours to decide

Categories describe capability shapes, not specific products. Closed chatbot memory = the per-user notes most consumer assistants keep. Project-scoped memory = team-workspace memory bound to a single project. Vector-only memory = retrieval over an undifferentiated chunk store. Agent memory stores = key/value notes maintained by an individual agent.

CROSS-CHANNEL

One protocol. Every channel your customers use.

The MCIR envelope is the same whether the turn arrives from a REST call, a live voice session, a widget on a partner site, or a Telegram bot. Memory continuity survives the channel jump because the protocol, not the surface, owns the graph.

REST API

POST /api/v1/completions runs MCIR ahead of generation. Same envelope, branded model IDs, OpenAI-compatible streaming format available.

Dashboard playground

Authenticated playground hits the same orchestrator path. Memory Chain tool runs surface inline so devs can see what the protocol pulled.

Voice (real-time)

Voice sessions consume the same memory graph through a tool-bridged contract. Skills, agent meta, and open loops feed the voice persona.

Embeddable iframes

White-label widgets share the parent API key's memory graph subject to per-iframe scopes. Same provenance, different surface.

Open-source adapters

@hitheo/sdk, @hitheo/telegram, @hitheo/whatsapp, and @hitheo/mcp are open source. Memory continuity survives the channel jump.

ROADMAP

What's shipped, what's next, what's planned.

Phase 0.5a
Memory Chain schema + synthesizer
Five new tables, a debounced BullMQ synthesizer, recall scoring, fuser sections, orchestrator integration, and Memory Chain tool runs, all gated behind MCIR_CHAIN_ENABLED.
SHIPPED
Phase 0.5b
Outcome capture + score v1
Accept / reject / silence / regenerate / edit signals captured across every surface. score_v1 writes mcir_score_snapshots nightly. Salience reads in retrieval go live.
NEXT
Phase 0.5c
Admin training view
Conversation Health, Drill-in, Score A/B, Health Aggregates, and Magic Moments dashboards. User-side excludeFromTrainingView Settings toggle.
PLANNED
Phase 1
Vector recall + hierarchical compression
topic_vector promoted to a real vector column. pullRelevantChains switches from entity-overlap to cosine similarity. 30 / 180-day compression keeps the token budget bounded as memory grows.
PLANNED
Phase 2
Probabilistic belief-state memory
Beta posteriors per fact (not just per chain). Federated, consented cross-user memory (“your spouse's Theo told you they prefer SMS”). Paper write-up.
PLANNED

FREQUENTLY ASKED

The questions developers and compliance teams actually ask.

What is MCIR?

MCIR is Theo's Memory Context & Intent Response protocol. It runs in front of the response generator and rewrites every prompt with intent-driven memory context before the target model ever sees it. The protocol is model-independent. Any qualified response model can consume the envelope.

What is MCIR 2.0?

MCIR 2.0 is the Memory Chain layer that sits alongside MCIR-base. It compresses each conversation into a compact memory token (kind, title, summary, entities, outcome) and surfaces relevant prior tokens, open loops, and agent meta-memories proactively in new threads. Five new tables back it: memory_chains, open_loops, agent_meta_memories, memory_outcome_events, and mcir_score_snapshots.

How is MCIR different from ordinary chatbot memory?

Ordinary chatbot memory is an opaque per-user note. MCIR is a typed graph with four layers (episodic, semantic, procedural, chain), intent-scoped retrieval, open loops that surface proactively, per-model salience posteriors, replayable scoring, and an open export schema.

What is a memory chain?

A memory chain is a conversation-level synthesis row: { kind, title, summary, entities, outcome }. Chains are upserted on conversationId after a debounced trigger (3-turn burst or 5-minute idle), and recall scores them by entity overlap, a 14-day half-life recency multiplier, an effective salience, and a kind boost when the verb matches.
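
The debounced trigger can be read as a simple predicate. The sketch below is one interpretation, assuming "3-turn burst" means three turns have accumulated since the last synthesis and "5-minute idle" means five minutes have passed since the most recent turn; the names ConversationActivity and shouldSynthesize are hypothetical, and the queueing layer is not shown.

```typescript
const BURST_TURNS = 3;
const IDLE_MS = 5 * 60 * 1000;

interface ConversationActivity {
  turnsSinceSynthesis: number; // turns added since the chain was last upserted
  msSinceLastTurn: number;     // time elapsed since the most recent turn
}

// True when a synthesis pass should run for this conversation.
function shouldSynthesize(activity: ConversationActivity): boolean {
  if (activity.turnsSinceSynthesis <= 0) return false; // nothing new to synthesize
  return activity.turnsSinceSynthesis >= BURST_TURNS || activity.msSinceLastTurn >= IDLE_MS;
}
```

Because chains are upserted on conversationId, running the pass more than once for the same conversation updates the existing row instead of creating a duplicate.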

What is an open loop?

An open loop is an unresolved decision, question, promise, or pending action extracted from a chain. Open loops surface proactively on new turns (even if the user didn't ask) and auto-close as superseded when a new synthesis pass invalidates them.

How does MCIR handle adversarial feedback?

Through Bayesian shrinkage to a prior. Any single outcome event can move a memory's salience by at most 1 / (N + 8), where N is the existing evidence count. A burst of bad signals cannot poison a chain. Per-model attribution further isolates outcomes; outcomes from one model don't pollute the recall track of another.

Can memory be exported?

Yes. One endpoint returns every chain, open loop, agent meta-memory, and outcome event for your account in a documented schema. No vendor lock-in. The same surface supports destructive delete: primary stores and replicas, no shadow copies.

Does Theo train models on customer memory?

No. The training-on-data count is zero, by design. Theo monetizes orchestration, not training. The audit ledger stores SHA-256 hashes of prompts, not the prompts themselves. The MCIR envelope, knowledge files, and the memory graph remain customer-owned.
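
Storing a hash instead of the prompt can be sketched with Node's built-in crypto module. The entry shape and the names AuditEntry, hashPrompt, and auditEntry are assumptions for illustration; only the use of SHA-256 and the event kinds come from the text above.

```typescript
import { createHash } from "node:crypto";

type AuditEvent = "retrieve" | "use" | "ignore" | "create" | "update";

interface AuditEntry {
  event: AuditEvent;
  promptSha256: string; // hex digest; the prompt text itself is not stored
  at: string;           // ISO timestamp
}

// Hex-encoded SHA-256 digest of the UTF-8 bytes of the prompt.
function hashPrompt(prompt: string): string {
  return createHash("sha256").update(prompt, "utf8").digest("hex");
}

function auditEntry(event: AuditEvent, prompt: string, at: string): AuditEntry {
  return { event, promptSha256: hashPrompt(prompt), at };
}
```

A ledger built this way lets an auditor confirm that a given prompt matches a recorded entry by re-hashing it, without the ledger ever holding the prompt text.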

Is the protocol model-independent?

Yes. MCIR produces a structured envelope (originalPrompt, intent, ranked memoryFragments[], reconstructedPrompt, systemBlock) that any qualified routing target can consume. The systemBlock provides a natural-language form for models that haven't been trained on the protocol natively.

How is MCIR's scoring replayable?

memory_outcome_events is append-only and tagged with MCIR_SCORING_VERSION. mcir_score_snapshots stores derived posteriors per version, per entity. When the scoring function changes, the new version re-runs over the full event log and writes a fresh snapshot. Old versions remain queryable.

Ship AI Workers with a memory you can audit.

Get your API key in 30 seconds. MCIR ships on by default, governed from day one, exportable on demand.
