Introduction

Overview

Current AI memory solutions face significant scalability challenges. Most rely on "explicit modeling", requiring humans to continuously specify which information is important and which is not. This approach fundamentally limits an AI system's ability to understand what truly matters to each user, and makes it difficult for the system to retain the most critical, user-specific information.

In addition, existing solutions often adopt a "one-size-fits-all" strategy, applying the same memory mechanism across all scenarios. Conversation memory, workspace knowledge, and learned skills each process their source material differently and are each read differently. Forcing them through one pipeline and one store creates coupling, wasted work, and provenance gaps.

What is MemU?

MemU is an agentic memory framework designed for LLMs and AI agents. It organizes memory into three independent lines (chat, workspace, and skill), each fed by its own source, owning its own store, and rendering its own markdown output. All three lines share a single layered kernel: record stores, embedding, hybrid search (embedding + BM25), ranking, and markdown rendering.

Every line transforms its raw sources into human-readable markdown — MEMORY.md for conversations, INDEX.md for workspace files, SKILL.md for execution traces — while keeping fine-grained, searchable items in the store. Every retrieved result can be traced back through "item → document → raw source", so memory stays transparent and auditable.

Three Independent Memory Lines

Each line is end-to-end self-contained: source → ingest → store → read → output. Lines never trigger each other — a new conversation never rebuilds the workspace index or re-synthesizes skills.

| Line | Source | Output | What it captures |
|---|---|---|---|
| Chat | Conversation logs | MEMORY.md | User facts, preferences, and events, classified into memory categories |
| Workspace | Workspace files (multimodal) | INDEX.md | Documents, images, audio, and video, captioned and indexed |
| Skill | Execution / tool traces | SKILL.md | Reusable skills distilled from agent runs |

The L0 / L1 / L2 Layer Model

Every line runs the same three representation layers, each derived from the one below (L0 → L1 → L2):

| Layer | Role | Description |
|---|---|---|
| L0 | Resource | The raw source: chat corpus, multimodal files, agent-run logs |
| L1 | Document | A coarse, readable document derived from L0: memory category file, caption paragraph, or skill markdown |
| L2 | Item | Fine slices/extracts of the L1 document; the embed/search unit |

The layers maintain full traceability: every L2 item is a slice of an L1 document, and every L1 document derives from an L0 resource. Retrieval matches L2 items and rolls each hit up to its L1 document (and that document's L0 resource), so every result carries its own provenance and can be inspected and explained.
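
The roll-up described above can be pictured with a minimal sketch. The class names, field names, `slice_document`, and `roll_up` are hypothetical and chosen only for illustration; they are not MemU's actual types, and the sentence splitter is a stand-in for whatever slicing a real line performs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """L0: the raw source (hypothetical shape)."""
    resource_id: str
    uri: str


@dataclass(frozen=True)
class Document:
    """L1: a readable document derived from exactly one L0 resource."""
    document_id: str
    resource_id: str
    text: str


@dataclass(frozen=True)
class Item:
    """L2: a slice of exactly one L1 document; the embed/search unit."""
    item_id: str
    document_id: str
    text: str


def slice_document(document):
    """Derive L2 items from an L1 document (naive sentence split, illustration only)."""
    sentences = [s.strip() for s in document.text.split(".") if s.strip()]
    return [
        Item(f"{document.document_id}-{n}", document.document_id, sentence + ".")
        for n, sentence in enumerate(sentences)
    ]


def roll_up(item, documents, resources):
    """Follow item -> document -> raw source and return all three."""
    document = documents[item.document_id]
    resource = resources[document.resource_id]
    return item, document, resource


# Made-up example data.
resources = {"r1": Resource("r1", "chat/example.jsonl")}
documents = {
    "d1": Document("d1", "r1", "The user prefers dark mode. The user lives in Berlin.")
}
items = slice_document(documents["d1"])

hit, doc, src = roll_up(items[1], documents, resources)
print(f"{hit.item_id} -> {doc.document_id} -> {src.uri}")
# prints: d1-1 -> d1 -> chat/example.jsonl
```

Because each layer stores only the identifier of the layer below it, a search hit on an item is enough to recover both the readable document and the raw source.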

Core Processes

Each line runs the same core processes over its own store:

  • Memorization — Runs the line's ingest pipeline: preprocess (raw L0 → L1 document) → slice/extract (L1 → L2 items) → embed. Fully autonomous, no manual labeling.
  • Retrieval — A single hybrid pass over the L2 items: cosine embedding similarity and BM25 keyword scores are normalized and fused into one rank; top hits roll up to their L1 document.
  • Independent Triggers — Each line watches its own source manifest. A change under one line's source rebuilds only that line.
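
The independent-trigger behavior can be sketched as a per-line manifest diff. This is an assumption-laden illustration: the source does not say how a manifest is stored, so the sketch assumes a manifest is a mapping from source path to content hash, and `build_manifest`, `diff_manifest`, and `lines_to_rebuild` are hypothetical helper names.

```python
import hashlib


def build_manifest(files):
    """Map each source path to a SHA-256 content hash. `files` is {path: bytes}."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in files.items()}


def diff_manifest(old, new):
    """Return the paths added, changed, and removed between two manifests."""
    added = sorted(p for p in new if p not in old)
    changed = sorted(p for p in new if p in old and new[p] != old[p])
    removed = sorted(p for p in old if p not in new)
    return {"added": added, "changed": changed, "removed": removed}


def lines_to_rebuild(old_manifests, new_manifests):
    """Names of the lines whose own manifest changed; all other lines are left alone."""
    dirty = []
    for line, new in new_manifests.items():
        diff = diff_manifest(old_manifests.get(line, {}), new)
        if any(diff.values()):
            dirty.append(line)
    return dirty


# Made-up example: only the chat source gains a new file.
old = {
    "chat": build_manifest({"chat/a.jsonl": b"hello"}),
    "workspace": build_manifest({"docs/readme.md": b"v1"}),
    "skill": build_manifest({"runs/1.log": b"trace"}),
}
new = {
    "chat": build_manifest({"chat/a.jsonl": b"hello", "chat/b.jsonl": b"hi"}),
    "workspace": build_manifest({"docs/readme.md": b"v1"}),
    "skill": build_manifest({"runs/1.log": b"trace"}),
}

print(lines_to_rebuild(old, new))  # prints: ['chat']
```

Keeping one manifest per line is what makes the rebuild scope local: a new conversation file changes only the chat manifest, so the workspace and skill lines are never touched.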

MemU vs. Traditional Memory Systems

Different memory approaches serve different purposes. Here's how MemU compares:

| Feature | Traditional Systems | MemU |
|---|---|---|
| Structure | One store for everything | Three independent lines, each with its own store and markdown output |
| Memory Formation | Explicit modeling (manual) | Autonomous per-line ingest pipelines |
| Retrieval | Embedding search only | Hybrid: embedding + BM25 keyword, fused in one pass |
| Multimodal | Limited support | Full multimodal (text, image, audio, video) |
| Traceability | No | Full item → document → source roll-up |
| Change Handling | Global rebuilds | Per-line manifest diff; only the affected line rebuilds |
| Skill Memory | No | Dedicated skill line with its own provenance and deletion |

Hybrid Retrieval (Embedding + BM25)

All three lines retrieve the same way — one shared mechanism, no per-line forks:

  • Embedding score — Cosine similarity between the query vector and each L2 item's embedding captures semantic matches
  • BM25 score — Keyword ranking over the same items captures exact-term matches (identifiers, names, error codes)
  • Fusion — Both scores are min-max normalized across candidates and fused into a single rank; top items roll up to their L1 document

This is a standard, cheap, single-pass design: no graph traversal, no multi-hop loops, no per-query LLM ranking cost. It covers both semantic and exact-term queries while staying fast and deterministic.
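
The three steps above can be sketched in plain Python. Several details here are assumptions rather than MemU's documented behavior: the fusion is taken to be a weighted sum with a hypothetical `alpha` parameter (the source only says the scores are "fused into a single rank"), BM25 uses the common Okapi form with conventional `k1` and `b` defaults, and all function names are illustrative.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def bm25_scores(query_tokens, docs_tokens, k1=1.5, b=0.75):
    """Okapi BM25 score of each tokenized item against the query tokens."""
    n = len(docs_tokens)
    if n == 0:
        return []
    avgdl = (sum(len(d) for d in docs_tokens) / n) or 1.0
    df = Counter(t for d in docs_tokens for t in set(d))  # document frequency
    scores = []
    for d in docs_tokens:
        tf = Counter(d)
        score = 0.0
        for t in query_tokens:
            if tf[t] == 0:
                continue
            idf = math.log(1 + (n - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * tf[t] * (k1 + 1) / (tf[t] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores


def min_max(values):
    """Rescale scores to [0, 1] across the candidate set."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def hybrid_rank(query_vec, query_tokens, items, alpha=0.5):
    """items: list of (item_id, vector, tokens). Returns (id, fused score), best first."""
    emb = min_max([cosine(query_vec, vec) for _, vec, _ in items])
    kw = min_max(bm25_scores(query_tokens, [tok for _, _, tok in items]))
    fused = [alpha * e + (1 - alpha) * k for e, k in zip(emb, kw)]
    order = sorted(range(len(items)), key=lambda i: fused[i], reverse=True)
    return [(items[i][0], fused[i]) for i in order]


# Made-up L2 items with toy 2-dimensional embeddings.
items = [
    ("i1", [1.0, 0.0], ["user", "prefers", "dark", "mode"]),
    ("i2", [0.0, 1.0], ["error", "code", "e4012", "on", "login"]),
    ("i3", [0.7, 0.7], ["user", "lives", "in", "berlin"]),
]
ranking = hybrid_rank([0.1, 0.9], ["e4012"], items)
print([item_id for item_id, _ in ranking])  # prints: ['i2', 'i3', 'i1']
```

In this toy run, item `i2` wins on both signals. Item `i3` has no keyword match but still ranks above `i1` on embedding similarity alone, which is the behavior a hybrid pass is meant to provide.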

Next Steps

Read Memory Structure to understand the three lines and the L0/L1/L2 layer model in detail, or jump to Getting Started to begin using MemU.