NevaMind-AI/memU
Memory infrastructure for LLMs and AI agents
Deep Analysis
MemU is a memory framework designed for LLMs and AI agents. It manages multimodal data through a hierarchical file system and supports dual retrieval via RAG and LLM.
Core Features
Hierarchical File System
Three-layer architecture (Resource → Item → Category) for fully traceable data organization
Dual Retrieval Methods
Fast RAG retrieval combined with deep LLM semantic understanding
Multimodal Support
Handles conversations, documents, images, audio, video, and other data formats
Self-Evolving Memory
Automatically adjusts and optimizes the memory structure based on usage patterns
Technical Implementation
1. Input: conversations, documents, images, etc.
2. Preprocessing: format conversion and content parsing
3. Extraction: semantic information extracted with vision and language models
4. Organization: build the Resource → Item → Category tree structure
5. Retrieval: RAG vector search or LLM semantic matching
- Unified processing framework for five modalities (text, image, video, audio, conversation)
- Dual retrieval mechanism balancing speed (RAG) against depth of understanding (LLM)
- Self-evolving: the memory structure is dynamically optimized to fit user behavior
- A complete cloud offering (memu.so) lowers the deployment barrier
- Personal assistants: long-term memory of personal preferences, skills, habits, and relationship networks
- AI customer support: accumulate customer history to deliver personalized service
- Knowledge management: extract and organize enterprise knowledge bases from documents across channels
- Multi-turn dialogue systems: persistent context memory for AI agents
- Requires Python 3.13+, which limits ecosystem compatibility
- Multimodal processing depends on the performance of external models

MemU is an agentic memory framework for LLM and AI agent backends. It receives multimodal inputs (conversations, documents, images), extracts them into structured memory, and organizes them into a hierarchical file system that supports both embedding-based (RAG) and non-embedding (LLM) retrieval.
⭐️ Star the repository
If you find memU useful or interesting, a GitHub Star ⭐️ would be greatly appreciated.
MemU is collaborating with four open-source projects to launch the 2026 New Year Challenge. 🎉 Between January 8–18, contributors can submit PRs to memU and earn cash rewards, community recognition, and platform credits. 🎁 Learn more & get involved
✨ Core Features
| Feature | Description |
|---|---|
| 🗂️ Hierarchical File System | Three-layer architecture: Resource → Item → Category with full traceability |
| 🔍 Dual Retrieval Methods | RAG (embedding-based) for speed, LLM (non-embedding) for deep semantic understanding |
| 🎨 Multimodal Support | Process conversations, documents, images, audio, and video |
| 🔄 Self-Evolving Memory | Memory structure adapts and improves based on usage patterns |
🗂️ Hierarchical File System
MemU organizes memory using a three-layer architecture inspired by hierarchical storage systems:
| Layer | Description | Examples |
|---|---|---|
| Resource | Raw multimodal data warehouse | JSON conversations, text documents, images, videos |
| Item | Discrete extracted memory units | Individual preferences, skills, opinions, habits |
| Category | Aggregated textual memory with summaries | preferences.md, work_life.md, relationships.md |
Key Benefits:
- Full Traceability: Track from raw data → items → categories and back
- Progressive Summarization: Each layer provides increasingly abstracted views
- Flexible Organization: Categories evolve based on content patterns
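The three layers and their back-links can be pictured as a minimal data model. This is an illustrative sketch with assumed field names (`resource_id`, `item_ids`, etc.), not MemU's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    """Layer 1: raw multimodal data (e.g. a JSON conversation file)."""
    resource_id: str
    modality: str  # conversation | document | image | video | audio
    url: str

@dataclass
class Item:
    """Layer 2: a discrete memory unit extracted from a resource."""
    item_id: str
    text: str           # e.g. "prefers morning meetings"
    resource_id: str    # back-link to the raw resource for traceability

@dataclass
class Category:
    """Layer 3: aggregated textual memory with a summary (e.g. preferences.md)."""
    name: str
    summary: str
    item_ids: list = field(default_factory=list)

# Traceability runs both ways: category -> items -> resource and back.
res = Resource("r1", "conversation", "chats/2024-01.json")
item = Item("i1", "prefers morning meetings", res.resource_id)
cat = Category("preferences", "Scheduling and communication preferences", ["i1"])
```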
🎨 Multimodal Support
MemU processes diverse content types into unified memory:
| Modality | Input | Processing |
|---|---|---|
| conversation | JSON chat logs | Extract preferences, opinions, habits, relationships |
| document | Text files (.txt, .md) | Extract knowledge, skills, facts |
| image | PNG, JPG, etc. | Vision model extracts visual concepts and descriptions |
| video | Video files | Frame extraction + vision analysis |
| audio | Audio files | Transcription + text processing |
All modalities are unified into the same three-layer hierarchy, enabling cross-modal retrieval.
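The per-modality routing above can be sketched as a simple dispatch table. The processor names here are hypothetical placeholders for illustration, not MemU internals:

```python
# Hypothetical dispatch table mirroring the modality table above.
PROCESSORS = {
    "conversation": "extract_preferences_and_relations",
    "document": "extract_knowledge_and_facts",
    "image": "vision_model_description",
    "video": "frame_extraction_plus_vision",
    "audio": "transcription_plus_text",
}

def processor_for(modality: str) -> str:
    """Route an input to its modality-specific processing step."""
    if modality not in PROCESSORS:
        raise ValueError(f"unsupported modality: {modality}")
    return PROCESSORS[modality]
```

Whatever the processor, its output lands in the same Resource → Item → Category hierarchy, which is what makes cross-modal retrieval possible.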
🚀 Quick Start
Option 1: Cloud Version
Try MemU instantly without any setup:
👉 memu.so - Hosted cloud service with full API access
For enterprise deployment and custom solutions, contact info@nevamind.ai
Cloud API (v3)
| Base URL | `https://api.memu.so` |
|---|---|
| Auth | `Authorization: Bearer YOUR_API_KEY` |

| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/v3/memory/memorize` | Register a memorization task |
| GET | `/api/v3/memory/memorize/status/{task_id}` | Get task status |
| POST | `/api/v3/memory/categories` | List memory categories |
| POST | `/api/v3/memory/retrieve` | Retrieve memories (semantic search) |
Option 2: Self-Hosted
Installation
```shell
pip install -e .
```
Basic Example
Requirements: Python 3.13+ and an OpenAI API key
Test with In-Memory Storage (no database required):
```shell
export OPENAI_API_KEY=your_api_key
cd tests
python test_inmemory.py
```
Test with PostgreSQL Storage (requires pgvector):
```shell
# Start PostgreSQL with pgvector
docker run -d \
  --name memu-postgres \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=memu \
  -p 5432:5432 \
  pgvector/pgvector:pg16

# Run the test
export OPENAI_API_KEY=your_api_key
cd tests
python test_postgres.py
```
Both examples demonstrate the complete workflow:
- Memorize: Process a conversation file and extract structured memory
- Retrieve (RAG): Fast embedding-based search
- Retrieve (LLM): Deep semantic understanding search
See tests/test_inmemory.py and tests/test_postgres.py for the full source code.
Custom LLM and Embedding Providers
MemU supports custom LLM and embedding providers beyond OpenAI. Configure them via llm_profiles:
```python
from memu import MemUService

service = MemUService(
    llm_profiles={
        # Default profile for LLM operations
        "default": {
            "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "your_api_key",
            "chat_model": "qwen3-max",
            "client_backend": "sdk"  # "sdk" or "http"
        },
        # Separate profile for embeddings
        "embedding": {
            "base_url": "https://api.voyageai.com/v1",
            "api_key": "your_voyage_api_key",
            "embed_model": "voyage-3.5-lite"
        }
    },
    # ... other configuration
)
```
📖 Core APIs
memorize() - Extract and Store Memory
Processes input resources and extracts structured memory:
```python
result = await service.memorize(
    resource_url="path/to/file.json",  # File path or URL
    modality="conversation",           # conversation | document | image | video | audio
    user={"user_id": "123"}            # Optional: scope to a user
)

# Returns:
{
    "resource": {...},    # Stored resource metadata
    "items": [...],       # Extracted memory items
    "categories": [...]   # Updated category summaries
}
```
retrieve() - Query Memory
Retrieves relevant memory based on queries. MemU supports two retrieval strategies:
RAG-based Retrieval (method="rag")
Fast embedding vector search using cosine similarity:
- ✅ Fast: Pure vector computation
- ✅ Scalable: Efficient for large memory stores
- ✅ Returns scores: Each result includes similarity score
LLM-based Retrieval (method="llm")
Deep semantic understanding through direct LLM reasoning:
- ✅ Deep understanding: LLM comprehends context and nuance
- ✅ Query rewriting: Automatically refines query at each tier
- ✅ Adaptive: Stops early when sufficient information is found
Comparison
| Aspect | RAG | LLM |
|---|---|---|
| Speed | ⚡ Fast | 🐢 Slower |
| Cost | 💰 Low | 💰💰 Higher |
| Semantic depth | Medium | Deep |
| Tier 2 scope | All items | Only items in relevant categories |
| Output | With similarity scores | Ranked by LLM reasoning |
Both methods support:
- Context-aware rewriting: Resolves pronouns using conversation history
- Progressive search: Categories → Items → Resources
- Sufficiency checking: Stops when enough information is retrieved
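Assuming `retrieve()` accepts the `method` argument named in this section (`"rag"` or `"llm"`), switching strategies might look like the sketch below, shown against a stub service so it runs standalone:

```python
import asyncio

# Stub standing in for MemUService so the sketch is self-contained; the real
# call shape (queries / where / method) follows this section's description.
class StubService:
    async def retrieve(self, queries, where=None, method="rag"):
        return {"method": method, "categories": [], "items": [], "resources": []}

async def compare(service, queries, scope=None):
    fast = await service.retrieve(queries=queries, where=scope, method="rag")  # speed, scores
    deep = await service.retrieve(queries=queries, where=scope, method="llm")  # depth, reasoning
    return fast, deep

queries = [{"role": "user", "content": {"text": "What are their preferences?"}}]
fast, deep = asyncio.run(compare(StubService(), queries))
```

A common pattern is to try the cheap RAG pass first and fall back to the LLM pass only when the results look insufficient.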
Usage
```python
result = await service.retrieve(
    queries=[
        {"role": "user", "content": {"text": "What are their preferences?"}},
        {"role": "user", "content": {"text": "Tell me about work habits"}}
    ],
    where={"user_id": "123"}  # Optional: scope filter
)

# Returns:
{
    "categories": [...],       # Relevant categories (with scores for RAG)
    "items": [...],            # Relevant memory items
    "resources": [...],        # Related raw resources
    "next_step_query": "..."   # Rewritten query for follow-up (if applicable)
}
```
Scope Filtering: Use where to filter by user model fields:
- `where={"user_id": "123"}` - exact match
- `where={"agent_id__in": ["1", "2"]}` - match any in list
- Omit `where` to retrieve across all scopes
📚 For complete API documentation, see SERVICE_API.md - includes all methods, CRUD operations, pipeline configuration, and configuration types.
💡 Use Cases
Example 1: Conversation Memory
Extract and organize memory from multi-turn conversations:
```shell
export OPENAI_API_KEY=your_api_key
python examples/example_1_conversation_memory.py
```
What it does:
- Processes multiple conversation JSON files
- Extracts memory items (preferences, habits, opinions, relationships)
- Generates category markdown files (`preferences.md`, `work_life.md`, etc.)
Best for: Personal AI assistants, customer support bots, social chatbots
Example 2: Skill Extraction from Logs
Extract skills and lessons learned from agent execution logs:
```shell
export OPENAI_API_KEY=your_api_key
python examples/example_2_skill_extraction.py
```
What it does:
- Processes agent logs sequentially
- Extracts actions, outcomes, and lessons learned
- De
Related Skills
wshobson/agents
Intelligent automation and multi-agent orchestration for Claude Code
The most comprehensive Claude Code plugin ecosystem, covering full-stack development scenarios with a three-tier model strategy balancing performance and cost.
ComposioHQ/awesome-claude-skills
A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows
The most comprehensive Claude Skills resource list; connect-apps is a killer feature.
code-yeongyu/oh-my-opencode
The Best Agent Harness. Meet Sisyphus: The Batteries-Included Agent that codes like you.
Powerful multi-agent coding tool, but note OAuth limitations.
thedotmack/claude-mem
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.
A practical solution for Claude's memory issues.
OthmanAdi/planning-with-files
Claude Code skill implementing Manus-style persistent markdown planning, the workflow pattern behind the $2B acquisition.
Context engineering best practices; an open-source implementation of Manus mode.
yusufkaraaslan/Skill_Seekers
Convert documentation websites, GitHub repositories, and PDFs into Claude AI skills with automatic conflict detection
An automation powerhouse for skill creation, dramatically improving efficiency.

