Audit-grade. Model-independent. Exportable on demand.
Memory Context & Intent Response
MCIR is an 8-stage memory protocol (Capture, Classify, Index, Retrieve, Validate, Apply, Expire, Audit) that runs before generation and rewrites every prompt with intent-scoped memory drawn from four layers: episodic, semantic, procedural, and the MCIR 2.0 Memory Chain. MCIR 2.0 adds conversation-level synthesis, open-loop tracking, agent meta-memories, and five durability rails (Bayesian shrinkage, per-model attribution, hard floors on user-declared facts, versioned replayable scoring, decay on everything) so the system stays correct under adversarial feedback, model swaps, and time.
THE BASE PIPELINE
MCIR-base is the protocol path every Theo turn travels before a response model is invoked. Each stage produces a typed, inspectable artifact that the next stage consumes.
Every turn is parsed into a structured TaskIntent (verb, object, modifiers, temporal scope, target artifact, and a confidence score). Canonical verbs (redo, resume, expand, edit, send, schedule, recall, forget) hit a sub-1 ms regex fast path; everything else falls back to a structured-JSON classifier with ~300 ms latency. Temporal cues like “yesterday” or “earlier today” are resolved to concrete ISO windows so retrieval is bounded.
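The fast path and temporal resolution described above can be sketched in a few lines. This is a minimal illustration, not Theo's implementation: the `TaskIntent` field names, the `parseIntent` and `resolveTemporal` helpers, the `path` marker, and the choice of UTC day boundaries are all assumptions, and the classifier fallback is only stubbed.

```typescript
// Hypothetical shapes: field names follow the prose above; the real schema may differ.
type Verb =
  | "redo" | "resume" | "expand" | "edit"
  | "send" | "schedule" | "recall" | "forget";

interface TimeWindow { startIso: string; endIso: string; }

interface TaskIntent {
  verb: Verb | "unknown";
  object: string;
  temporalScope: TimeWindow | null;
  confidence: number;
  path: "fast" | "classifier";
}

// Fast path: a canonical verb at the start of the prompt.
const FAST_PATH = /^\s*(redo|resume|expand|edit|send|schedule|recall|forget)\b\s*(.*)$/i;
const DAY_MS = 24 * 60 * 60 * 1000;

function startOfUtcDay(d: Date): Date {
  return new Date(Date.UTC(d.getUTCFullYear(), d.getUTCMonth(), d.getUTCDate()));
}

// Resolve a relative cue to a concrete ISO window (UTC boundaries are an assumption).
function resolveTemporal(text: string, now: Date): TimeWindow | null {
  const today = startOfUtcDay(now);
  if (/\byesterday\b/i.test(text)) {
    const start = new Date(today.getTime() - DAY_MS);
    return { startIso: start.toISOString(), endIso: today.toISOString() };
  }
  if (/\bearlier today\b/i.test(text)) {
    return { startIso: today.toISOString(), endIso: now.toISOString() };
  }
  return null;
}

function parseIntent(prompt: string, now: Date): TaskIntent {
  const temporalScope = resolveTemporal(prompt, now);
  const match = FAST_PATH.exec(prompt);
  if (match) {
    const verb = (match[1] ?? "").toLowerCase() as Verb;
    const object = (match[2] ?? "").trim();
    return { verb, object, temporalScope, confidence: 1, path: "fast" };
  }
  // A real system would call the structured-JSON classifier here; this stub only marks the fallback.
  return { verb: "unknown", object: prompt.trim(), temporalScope, confidence: 0, path: "classifier" };
}
```

Resolving the window before retrieval is what keeps the later lookup bounded: the retriever receives two timestamps rather than the word “yesterday”.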
Retrieval is intent-scoped, not keyword-scoped. The assembler walks four layers (episodic conversation turns, semantic facts, procedural preferences, and the new MCIR 2.0 chain layer), pulling only the fragments the resolved verb actually needs. Fragments are provenance-tagged with memoryId / messageId / artifactId / conversationId / createdAt, ranked by salience, and capped at 16 per turn.
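As a rough sketch of the assembly step, the function below filters candidate fragments by the layers a verb needs, ranks them by salience, and applies the cap of 16. The provenance field names come from the text; the `LAYERS_BY_VERB` mapping, the `assembleFragments` name, and the fallback to all four layers are illustrative assumptions.

```typescript
type Layer = "episodic" | "semantic" | "procedural" | "chain";

// Provenance tags named in the text; which ones are optional is an assumption.
interface Provenance {
  memoryId?: string;
  messageId?: string;
  artifactId?: string;
  conversationId: string;
  createdAt: string;
}

interface MemoryFragment {
  layer: Layer;
  text: string;
  salience: number;
  provenance: Provenance;
}

const MAX_FRAGMENTS_PER_TURN = 16;
const ALL_LAYERS: Layer[] = ["episodic", "semantic", "procedural", "chain"];

// Hypothetical verb-to-layer mapping, for illustration only.
const LAYERS_BY_VERB: Record<string, Layer[]> = {
  resume: ["episodic", "chain"],
  recall: ["episodic", "semantic", "chain"],
  send: ["semantic", "procedural"],
};

function assembleFragments(verb: string, candidates: MemoryFragment[]): MemoryFragment[] {
  const wanted = LAYERS_BY_VERB[verb] ?? ALL_LAYERS;
  return candidates
    .filter((f) => wanted.indexOf(f.layer) !== -1) // intent-scoped: only layers this verb needs
    .sort((a, b) => b.salience - a.salience)       // highest salience first
    .slice(0, MAX_FRAGMENTS_PER_TURN);             // hard cap per turn
}
```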
The fuser produces a structured MCIREnvelope: originalPrompt, intent, ranked memoryFragments[], a reconstructedPrompt that adds minimal [ref: …] / [when: …] anchors, and a natural-language systemBlock for models that haven't been trained on the protocol natively. The envelope is the unit of truth that downstream generation reads.
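To make the envelope concrete, here is a hedged sketch of its shape and a toy fuser. The five top-level field names come from the text; everything else (the `Fragment` and `Intent` shapes, the `fuse` function, the content of the `[ref: …]` and `[when: …]` anchors, and the wording of the `systemBlock`) is assumed for illustration.

```typescript
interface Fragment { id: string; text: string; salience: number; }

interface Intent {
  verb: string;
  window: { startIso: string; endIso: string } | null;
}

// Top-level fields as listed in the text; nested shapes are assumptions.
interface MCIREnvelope {
  originalPrompt: string;
  intent: Intent;
  memoryFragments: Fragment[];
  reconstructedPrompt: string;
  systemBlock: string;
}

function fuse(originalPrompt: string, intent: Intent, fragments: Fragment[]): MCIREnvelope {
  // Rank a copy so the caller's array is left untouched.
  const ranked = fragments.slice().sort((a, b) => b.salience - a.salience);

  // Minimal anchors: point at the top fragment and the resolved time window, if any.
  const anchors: string[] = [];
  const top = ranked[0];
  if (top) anchors.push(`[ref: ${top.id}]`);
  if (intent.window) anchors.push(`[when: ${intent.window.startIso}/${intent.window.endIso}]`);

  const reconstructedPrompt =
    anchors.length > 0 ? `${originalPrompt} ${anchors.join(" ")}` : originalPrompt;

  // Natural-language form for models that do not know the protocol.
  const systemBlock =
    ranked.length === 0
      ? "No relevant memory was found for this turn."
      : "Relevant memory, most salient first:\n" +
        ranked.map((f) => `- ${f.text} (source: ${f.id})`).join("\n");

  return { originalPrompt, intent, memoryFragments: ranked, reconstructedPrompt, systemBlock };
}
```

Keeping `originalPrompt` alongside `reconstructedPrompt` means the rewrite stays inspectable: anyone reading the envelope can see exactly what was added.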
The orchestrator swaps the original prompt for the reconstructedPrompt, appends the systemBlock to skill prompt extensions, and dispatches to the appropriate engine. The protocol is model-independent: same envelope, any qualified routing target, with full failover and a per-provider circuit breaker.
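A per-provider circuit breaker with failover can be sketched as follows. The text does not specify thresholds, cooldowns, or target names, so the `CircuitBreaker` class, its parameters, and `pickTarget` are hypothetical; the point is only that the same envelope can be sent to whichever target is currently healthy.

```typescript
// Hypothetical breaker: opens after `threshold` consecutive failures,
// then allows a retry once `cooldownMs` has elapsed.
class CircuitBreaker {
  private failures = 0;
  private openedAtMs: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;

  constructor(threshold: number, cooldownMs: number) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
  }

  isAvailable(nowMs: number): boolean {
    if (this.openedAtMs === null) return true;
    return nowMs - this.openedAtMs >= this.cooldownMs;
  }

  recordSuccess(): void {
    this.failures = 0;
    this.openedAtMs = null;
  }

  recordFailure(nowMs: number): void {
    this.failures += 1;
    if (this.failures >= this.threshold) this.openedAtMs = nowMs;
  }
}

// Walk the routing targets in preference order and return the first healthy one.
function pickTarget(
  targets: string[],
  breakers: Record<string, CircuitBreaker>,
  nowMs: number
): string | null {
  for (const target of targets) {
    const breaker = breakers[target];
    if (breaker === undefined || breaker.isAvailable(nowMs)) return target;
  }
  return null;
}
```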
THE 8-STAGE PROTOCOL
The same eight stages govern every memory the system touches, from the first message a customer sends to the last audit event written when a memory expires.
Capture: Every interaction surfaces facts, preferences, decisions, and unresolved questions. Captured the moment they happen, without manual tagging.
Classify: Each memory lands on the right shelf: user preference, business rule, task, relationship, risk signal, or compliance note. Never one-bucket-fits-all.
Index: Memory is scoped to the right owner (user, account, organization, workflow) with strict access controls and tenant isolation enforced at the row level.
Retrieve: Only memories relevant to the current task surface. Intent-scoped retrieval beats keyword search: pull what the AI needs for this turn, not everything we know.
Validate: Is this memory still fresh, still allowed, still trusted? Stale or contradicted entries are flagged before they touch a single response.
Apply: Memories are applied with provenance: as facts, preferences, warnings, or instructions. Never blended into a soup. Every use is explainable.
Expire: Bayesian decay, revalidation windows, hard floors on user-declared facts. Memory ages out on schedule, not because it leaked.
Audit: Every retrieve, use, ignore, create, and update is logged with hash-anchored provenance. A full receipt for every AI decision.
FOUR MEMORY LAYERS
Each layer answers a different question. Each has its own shape, retention rules, and retrieval semantics. All governed by the same protocol.
Episodic: Conversation turns and the artifacts they produced. Timeline-indexed so phrases like “the PDF from last week” resolve to a real ISO window before retrieval begins.
Semantic: Distilled facts about customers, accounts, and the business. Source-tagged, confidence-scored, freshness-aware. Promoted automatically when the same entity appears in two or more chains.
Procedural: Learned workflows and business rules. Repeated patterns get promoted into durable procedural memory so AI Workers learn how things are done here, not how they're done in general.
Memory Chain: Conversation-level synthesis. Shape: { kind, title, summary, entities, outcome }. Open loops and agent meta-memories ride here so a fresh thread feels like Theo already knows the user, the context, and what's unresolved.
MCIR 2.0 / MEMORY CHAIN
MCIR 2.0 compresses every conversation into a compact memory token, tracks unresolved decisions as open loops, and lets a fresh thread inherit context that a vector store can't represent. It rides alongside MCIR-base behind a single flag and depends on no self-hosted infrastructure.
{
  kind: "quest" | "task" | "qa" | "service" | "search",
  title: "≤ 120 chars",
  summary: "≤ 800 chars",
  entities: [ { kind, value }, ... ],
  outcome: "resolved" | "partial" | "open" | "abandoned"
}
memory_chains: One row per conversation. Holds kind, title, summary, entities, outcome, topic vector, and a Bayesian salience posterior keyed per model.
open_loops: Unresolved decisions, questions, promises, and pending actions extracted from chains. Auto-closes superseded loops on each synthesis pass.
agent_meta_memories: Theo-side learnings about how to behave with this user: style preferences, format preferences, interaction patterns, failure modes. Supports immutable rows that the scoring loop must skip.
memory_outcome_events: Append-only replay log of accept / reject / silence / regenerate / edit signals. The source of truth for salience. Chains, loops, and meta rows store derived posteriors only.
mcir_score_snapshots: Versioned, replayable derived salience for admin observability. Every recompute writes a fresh row; nothing is ever overwritten.
score = entityOverlap × recencyMultiplier × effectiveSalience × kindBoost
entityOverlap      ∈ [0, 1]  // normalized so 20-entity chains aren't unfairly favored
recencyMultiplier  exp decay  // 14-day half-life
effectiveSalience  0.7·base + 0.3·learned[currentModelId]  (clamped to [0, 1])
kindBoost          1.2× when verb prefers chain.kind, else 1.0
salience floor     drop chains scoring < 0.15
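The recall score can be written out as a short function. The sketch below follows the stated formula, with several assumptions labeled in the code: entity overlap is computed as a Jaccard ratio (the text only says it is normalized), entities are simplified to plain strings, a missing per-model posterior falls back to the base salience, and the 0.15 floor is applied to the final score.

```typescript
type ChainKind = "quest" | "task" | "qa" | "service" | "search";

interface Chain {
  id: string;
  kind: ChainKind;
  entities: string[];                                      // simplified from { kind, value } pairs
  ageDays: number;
  baseSalience: number;
  learnedSalience: { [modelId: string]: number | undefined };
}

const HALF_LIFE_DAYS = 14;
const SCORE_FLOOR = 0.15;

function clamp01(x: number): number {
  return Math.min(1, Math.max(0, x));
}

function unique(values: string[]): string[] {
  return values.filter((v, i) => values.indexOf(v) === i);
}

// Assumption: Jaccard overlap, so a chain with many entities gains nothing from sheer size.
function entityOverlap(queryEntities: string[], chainEntities: string[]): number {
  const q = unique(queryEntities);
  const c = unique(chainEntities);
  const shared = q.filter((e) => c.indexOf(e) !== -1).length;
  const union = q.length + c.length - shared;
  return union === 0 ? 0 : shared / union;
}

function scoreChain(
  chain: Chain,
  queryEntities: string[],
  currentModelId: string,
  preferredKind: ChainKind | null
): number {
  const recencyMultiplier = Math.pow(0.5, chain.ageDays / HALF_LIFE_DAYS);
  const learned = chain.learnedSalience[currentModelId];
  // Assumption: with no posterior for this model, use the base salience alone.
  const effectiveSalience = clamp01(
    learned === undefined ? chain.baseSalience : 0.7 * chain.baseSalience + 0.3 * learned
  );
  const kindBoost = preferredKind === chain.kind ? 1.2 : 1.0;
  return entityOverlap(queryEntities, chain.entities) * recencyMultiplier * effectiveSalience * kindBoost;
}

function recallChains(
  chains: Chain[],
  queryEntities: string[],
  currentModelId: string,
  preferredKind: ChainKind | null
): { chain: Chain; score: number }[] {
  return chains
    .map((chain) => ({ chain, score: scoreChain(chain, queryEntities, currentModelId, preferredKind) }))
    .filter((r) => r.score >= SCORE_FLOOR) // drop chains below the floor
    .sort((a, b) => b.score - a.score);
}
```

As a worked case: a fully overlapping chain that is 14 days old, with base and learned salience both at 1 and a matching kind, scores 1 × 0.5 × 1 × 1.2 = 0.6.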
Open loops always pull the top 2 with status=open, ordered by due-by hint then salience, surfaced proactively even when the prompt didn't ask. Agent meta-memories pull the top 5 with confidence ≥ 0.6 or immutable=true.
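The two selection rules above translate into small filter-and-sort functions. This is a sketch under assumptions: the row shapes, the non-open status values, placing loops without a due-by hint last, and ordering meta-memories with immutable rows first and then by confidence are all choices made here, since the text specifies only the filters and the limits.

```typescript
interface OpenLoop {
  id: string;
  status: "open" | "closed" | "superseded"; // only "open" is named in the text
  dueBy: string | null;                      // ISO timestamp hint, if any
  salience: number;
}

interface AgentMeta {
  id: string;
  confidence: number;
  immutable: boolean;
}

// Top 2 open loops: earliest due-by first, then highest salience.
function selectOpenLoops(loops: OpenLoop[]): OpenLoop[] {
  return loops
    .filter((l) => l.status === "open")
    .sort((a, b) => {
      const dueA = a.dueBy === null ? Infinity : Date.parse(a.dueBy);
      const dueB = b.dueBy === null ? Infinity : Date.parse(b.dueBy);
      if (dueA !== dueB) return dueA - dueB;
      return b.salience - a.salience;
    })
    .slice(0, 2);
}

// Top 5 meta-memories with confidence >= 0.6 or immutable = true.
function selectAgentMeta(rows: AgentMeta[]): AgentMeta[] {
  return rows
    .filter((m) => m.confidence >= 0.6 || m.immutable)
    .sort((a, b) => {
      if (a.immutable !== b.immutable) return a.immutable ? -1 : 1; // assumption: immutable rows first
      return b.confidence - a.confidence;
    })
    .slice(0, 5);
}
```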
FIVE DURABILITY RAILS
Persistent memory is only valuable if it stays correct. Five rails keep adversarial feedback, model swaps, and time decay from corrupting the graph.
Bayesian shrinkage: A single event can never move a memory's salience by more than 1 / (N + 8), where N is the existing evidence count. Adversarial “fine, whatever” feedback cannot poison a chain.
Per-model attribution: Every outcome event is tagged with the model that produced the response. Salience is keyed per modelId; when routing swaps brains, recall reads the matching track. Posteriors don't bleed across models.
Hard floors on user-declared facts: Memories marked immutable by the user (or explicitly set through the Settings UI) are contractually exempt from the scoring loop. The system cannot drift away from what you told it.
Versioned replayable scoring: The outcome event log is the source of truth. Scoring is versioned; when the function changes, the new version re-runs over the full event log and writes fresh snapshots. Audits can always reproduce a number.
Decay on everything: Posteriors are multiplied by a decay factor (default 0.98) for each week of inactivity. The recency multiplier on recall uses a 14-day half-life. Memory that no one revisits naturally drifts toward the prior.
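Two of these rails are easy to show numerically. The sketch below is one way to satisfy the stated bounds, not the production scoring function: `applyOutcome` is a running update whose step size is exactly 1 / (N + 8), and `decayTowardPrior` reads the weekly 0.98 factor as shrinking the gap between the posterior and the prior, which is an interpretation chosen so that inactive memory drifts toward the prior as described. Outcomes are assumed to be encoded in [0, 1].

```typescript
interface Posterior {
  mean: number;          // current salience estimate in [0, 1]
  evidenceCount: number; // N: outcome events already absorbed
}

const PRIOR_WEIGHT = 8;     // the "8" in 1 / (N + 8)
const WEEKLY_DECAY = 0.98;  // default decay factor from the text

// One outcome event moves the mean by at most 1 / (N + 8),
// because |outcome - mean| <= 1 when both lie in [0, 1].
function applyOutcome(p: Posterior, outcome: number): Posterior {
  const x = Math.min(1, Math.max(0, outcome));
  const step = 1 / (p.evidenceCount + PRIOR_WEIGHT);
  return { mean: p.mean + step * (x - p.mean), evidenceCount: p.evidenceCount + 1 };
}

// Assumption: decay shrinks the distance to the prior by 0.98 per inactive week.
function decayTowardPrior(mean: number, prior: number, weeksInactive: number): number {
  const keep = Math.pow(WEEKLY_DECAY, Math.max(0, weeksInactive));
  return prior + (mean - prior) * keep;
}
```

With no prior evidence, a single accept moves a 0.5 estimate to 0.5 + 0.5 / 8 = 0.5625. With 50 events already recorded, the same accept can move it by at most 1 / 58.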
WHY MCIR IS DIFFERENT
The market has converged on four memory patterns. None of them give you provenance, replayable scoring, open-loop tracking, and per-model attribution at once. MCIR 2.0 does.
| Capability | MCIR 2.0 (Theo) | Closed chatbot memory (consumer assistant) | Project-scoped memory (team workspace) | Vector-only memory (chunk store) | Agent memory stores (per-agent notes) |
|---|---|---|---|---|---|
| Memory shape | Typed graph (episodic / semantic / procedural / chain) + open loops | Opaque per-user notes | Per-project transcript window | Undifferentiated chunk store | Key/value notes per agent |
| Retrieval | Intent-scoped (verb + temporal + artifact) | Keyword/semantic blend | Whole-project recall | Cosine-similarity over chunks | Tool call returns a note |
| Cross-conversation continuity | Memory chains + open loops proactively resurface | Single thread only | Single project only | Vector match per turn | Manual |
| Per-model attribution | Salience posteriors keyed by modelId | None | None | None | Optional, agent-defined |
| Adversarial-feedback robustness | Bayesian shrinkage 1 / (N+8) per event | Opaque | Opaque | None. All chunks equal | Agent-implementation specific |
| Replayable scoring | Versioned outcome log + snapshot table | No | No | No | Implementation dependent |
| Cross-channel reach | Web, API, playground, voice, embed, MCP, Telegram, WhatsApp | Channel-locked to vendor surface | Channel-locked | Whatever you wire | Whatever you wire |
| Portability | Open schema; one-call export of chains, loops, meta, outcomes | None | None | Yours, but unstructured | Yours, but unstructured |
| Training-on-memory | Never. Audit stores SHA-256 hashes only. | Often | Sometimes | Yours to decide | Yours to decide |
Categories describe capability shapes, not specific products. Closed chatbot memory = the per-user notes most consumer assistants keep. Project-scoped memory = team-workspace memory bound to a single project. Vector-only memory = retrieval over an undifferentiated chunk store. Agent memory stores = key/value notes maintained by an individual agent.
CROSS-CHANNEL
The MCIR envelope is the same whether the turn arrives from a REST call, a live voice session, a widget on a partner site, or a Telegram bot. Memory continuity survives the channel jump because the protocol, not the surface, owns the graph.
POST /api/v1/completions runs MCIR ahead of generation. Same envelope, branded model IDs, OpenAI-compatible streaming format available.
Authenticated playground hits the same orchestrator path. Memory Chain tool runs surface inline so devs can see what the protocol pulled.
Voice sessions consume the same memory graph through a tool-bridged contract. Skills, agent meta, and open loops feed the voice persona.
White-label widgets share the parent API key's memory graph subject to per-iframe scopes. Same provenance, different surface.
@hitheo/sdk, @hitheo/telegram, @hitheo/whatsapp, and @hitheo/mcp are open source. Memory continuity survives the channel jump.
ROADMAP
Five new tables, a debounced BullMQ synthesizer, recall scoring, fuser sections, orchestrator integration, and the Memory Chain tool run, all gated behind MCIR_CHAIN_ENABLED.
Accept / reject / silence / regenerate / edit signals captured across every surface. score_v1 writes mcir_score_snapshots nightly. Salience reads in retrieval go live.
Conversation Health, Drill-in, Score A/B, Health Aggregates, and Magic Moments dashboards. User-side excludeFromTrainingView Settings toggle.
topic_vector promoted to a real vector column. pullRelevantChains switches from entity-overlap to cosine similarity. 30 / 180-day compression keeps the token budget bounded as memory grows.
Beta posteriors per fact (not just per chain). Federated, consented cross-user memory (“your spouse's Theo told you they prefer SMS”). Paper write-up.
FREQUENTLY ASKED
MCIR is Theo's Memory Context & Intent Response protocol. It runs in front of the response generator and rewrites every prompt with intent-driven memory context before the target model ever sees it. The protocol is model-independent. Any qualified response model can consume the envelope.
MCIR 2.0 is the Memory Chain layer that sits alongside MCIR-base. It compresses each conversation into a compact memory token (kind, title, summary, entities, outcome) and surfaces relevant prior tokens, open loops, and agent meta-memories proactively in new threads. Five new tables back it: memory_chains, open_loops, agent_meta_memories, memory_outcome_events, and mcir_score_snapshots.
Ordinary chatbot memory is an opaque per-user note. MCIR is a typed graph with four layers (episodic, semantic, procedural, chain), intent-scoped retrieval, open loops that surface proactively, per-model salience posteriors, replayable scoring, and an open export schema.
A memory chain is a conversation-level synthesis row: { kind, title, summary, entities, outcome }. Chains are upserted on conversationId after a debounced trigger (3-turn burst or 5-minute idle), and recall scores them by entity overlap, a 14-day half-life recency multiplier, an effective salience, and a kind boost when the verb matches.
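The debounced trigger and the upsert on conversationId can be sketched as pure functions. The production synthesizer is described elsewhere as a debounced queue job; nothing below uses that queue's API. The `SynthesisState` and `ChainRow` shapes, the function names, and the reading of “3-turn burst or 5-minute idle” as “three pending turns, or any pending turn followed by five idle minutes” are assumptions.

```typescript
const BURST_TURNS = 3;
const IDLE_MS = 5 * 60 * 1000;

interface SynthesisState {
  pendingTurns: number;  // turns since the last synthesis pass
  lastTurnAtMs: number;  // timestamp of the most recent turn
}

// Fire when a burst of turns has accumulated, or when the conversation has gone idle.
function shouldSynthesize(state: SynthesisState, nowMs: number): boolean {
  if (state.pendingTurns === 0) return false;           // nothing new to synthesize
  if (state.pendingTurns >= BURST_TURNS) return true;   // 3-turn burst
  return nowMs - state.lastTurnAtMs >= IDLE_MS;         // 5-minute idle
}

interface ChainRow {
  conversationId: string;
  kind: "quest" | "task" | "qa" | "service" | "search";
  title: string;
  summary: string;
  outcome: "resolved" | "partial" | "open" | "abandoned";
}

// Upsert keyed on conversationId: one row per conversation, replaced on each pass.
function upsertChain(store: Record<string, ChainRow>, row: ChainRow): void {
  store[row.conversationId] = row;
}
```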
An open loop is an unresolved decision, question, promise, or pending action extracted from a chain. Open loops surface proactively on new turns (even if the user didn't ask) and auto-close as superseded when a new synthesis pass invalidates them.
Through Bayesian shrinkage to a prior. Any single outcome event can move a memory's salience by at most 1 / (N + 8), where N is the existing evidence count. A burst of bad signals cannot poison a chain. Per-model attribution further isolates outcomes; outcomes from one model don't pollute the recall track of another.
Yes. One endpoint returns every chain, open loop, agent meta-memory, and outcome event for your account in a documented schema. No vendor lock-in. The same surface supports destructive delete: primary stores and replicas, no shadow copies.
No. The training-on-data count is zero, by design. Theo monetizes orchestration, not training. The audit ledger stores SHA-256 hashes of prompts, not the prompts themselves. The MCIR envelope, knowledge files, and the memory graph remain customer-owned.
Yes. MCIR produces a structured envelope (originalPrompt, intent, ranked memoryFragments[], reconstructedPrompt, systemBlock) that any qualified routing target can consume. The systemBlock provides a natural-language form for models that haven't been trained on the protocol natively.
memory_outcome_events is append-only and tagged with MCIR_SCORING_VERSION. mcir_score_snapshots stores derived posteriors per version, per entity. When the scoring function changes, the new version re-runs over the full event log and writes a fresh snapshot. Old versions remain queryable.
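The replay property can be illustrated with a deterministic function over an append-only log. This is a sketch, not the real schema: the `OutcomeEvent` and `ScoreSnapshot` fields, the `replay` function, and both toy scoring functions are hypothetical, chosen only to show that a new scoring version produces new snapshot rows while earlier ones stay in place.

```typescript
type Signal = "accept" | "reject" | "silence" | "regenerate" | "edit";

interface OutcomeEvent { entityId: string; modelId: string; signal: Signal; }

interface ScoreSnapshot {
  scoringVersion: string;
  entityId: string;
  modelId: string;
  posterior: number;
  evidenceCount: number;
}

type ScoringFn = (signals: Signal[]) => number;

// Re-run one scoring version over the full log; the log itself is never modified.
function replay(log: OutcomeEvent[], scoringVersion: string, score: ScoringFn): ScoreSnapshot[] {
  const groups: { entityId: string; modelId: string; signals: Signal[] }[] = [];
  for (const event of log) {
    let group = groups.filter(
      (g) => g.entityId === event.entityId && g.modelId === event.modelId
    )[0];
    if (!group) {
      group = { entityId: event.entityId, modelId: event.modelId, signals: [] };
      groups.push(group);
    }
    group.signals.push(event.signal);
  }
  return groups.map((g) => ({
    scoringVersion,
    entityId: g.entityId,
    modelId: g.modelId,
    posterior: score(g.signals),
    evidenceCount: g.signals.length,
  }));
}

// Two toy scoring versions (illustrative weights only).
const scoreV1: ScoringFn = (signals) =>
  signals.filter((s) => s === "accept").length / signals.length;
const scoreV2: ScoringFn = (signals) =>
  signals
    .map((s): number => (s === "accept" ? 1 : s === "edit" ? 0.5 : 0))
    .reduce((a, b) => a + b, 0) / signals.length;
```

Because `replay` depends only on the log and the scoring function, running the same version twice yields identical snapshots, which is what lets an audit reproduce a number. Appending the output of a new version to the snapshot store, rather than overwriting, keeps old versions queryable.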
Get your API key in 30 seconds. MCIR ships on by default, governed from day one, exportable on demand.