NevaMind-AI/memU
Memory infrastructure for LLMs and AI agents
Deep Analysis
MemU is a memory framework designed for LLMs and AI agents. It manages multimodal data through a hierarchical file system and supports dual retrieval via RAG and LLM.
Core Features
Hierarchical File System
Three-layer architecture (Resource → Item → Category) for fully traceable data organization
Dual Retrieval Methods
Fast RAG retrieval combined with deep LLM semantic understanding
Multimodal Support
Handles conversations, documents, images, audio, video, and other data formats
Self-Evolving Memory
Automatically adjusts and optimizes the memory structure based on usage patterns
Technical Implementation
1. Input: conversations, documents, images, etc.
2. Preprocessing: format conversion and content parsing
3. Extraction: semantic information extracted with vision and language models
4. Organization: build the Resource → Item → Category tree structure
5. Retrieval: RAG vector search or LLM semantic matching
- Unified processing framework for five modalities (text, image, video, audio, conversation)
- Dual retrieval mechanism balancing speed (RAG) against depth of understanding (LLM)
- Self-evolving: the memory structure is dynamically optimized to fit user behavior
- A complete cloud offering (memu.so) lowers the deployment barrier
- Personal assistants: long-term memory of personal preferences, skills, habits, and relationship networks
- AI customer support: accumulate customer history to deliver personalized service
- Knowledge management: extract and organize enterprise knowledge bases from documents across channels
- Multi-turn dialogue systems: persistent context memory for AI agents
- Requires Python 3.13+, which limits ecosystem compatibility
- Multimodal processing depends on the performance of external models

MemU is an agentic memory framework for LLM and AI agent backends. It receives multimodal inputs (conversations, documents, images), extracts them into structured memory, and organizes them into a hierarchical file system that supports both embedding-based (RAG) and non-embedding (LLM) retrieval.
⭐️ Star the repository
If you find memU useful or interesting, a GitHub Star ⭐️ would be greatly appreciated.
MemU is collaborating with four open-source projects to launch the 2026 New Year Challenge. 🎉 Between January 8–18, contributors can submit PRs to memU and earn cash rewards, community recognition, and platform credits. 🎁 Learn more & get involved
✨ Core Features
| Feature | Description |
|---|---|
| 🗂️ Hierarchical File System | Three-layer architecture: Resource → Item → Category with full traceability |
| 🔍 Dual Retrieval Methods | RAG (embedding-based) for speed, LLM (non-embedding) for deep semantic understanding |
| 🎨 Multimodal Support | Process conversations, documents, images, audio, and video |
| 🔄 Self-Evolving Memory | Memory structure adapts and improves based on usage patterns |
🗂️ Hierarchical File System
MemU organizes memory using a three-layer architecture inspired by hierarchical storage systems:
| Layer | Description | Examples |
|---|---|---|
| Resource | Raw multimodal data warehouse | JSON conversations, text documents, images, videos |
| Item | Discrete extracted memory units | Individual preferences, skills, opinions, habits |
| Category | Aggregated textual memory with summaries | preferences.md, work_life.md, relationships.md |
Key Benefits:
- Full Traceability: Track from raw data → items → categories and back
- Progressive Summarization: Each layer provides increasingly abstracted views
- Flexible Organization: Categories evolve based on content patterns
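The three layers and their back-links can be pictured as a minimal data model. This is an illustrative sketch with assumed field names (`resource_id`, `item_ids`, etc.), not MemU's actual schema:

```python
from dataclasses import dataclass, field

@dataclass
class Resource:
    """Layer 1: raw multimodal data (e.g. a JSON conversation file)."""
    resource_id: str
    modality: str  # conversation | document | image | video | audio
    url: str

@dataclass
class Item:
    """Layer 2: a discrete memory unit extracted from a resource."""
    item_id: str
    text: str           # e.g. "prefers morning meetings"
    resource_id: str    # back-link to the raw resource for traceability

@dataclass
class Category:
    """Layer 3: aggregated textual memory with a summary (e.g. preferences.md)."""
    name: str
    summary: str
    item_ids: list = field(default_factory=list)

# Traceability runs both ways: category -> items -> resource and back.
res = Resource("r1", "conversation", "chats/2024-01.json")
item = Item("i1", "prefers morning meetings", res.resource_id)
cat = Category("preferences", "Scheduling and communication preferences", ["i1"])
```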
🎨 Multimodal Support
MemU processes diverse content types into unified memory:
| Modality | Input | Processing |
|---|---|---|
| conversation | JSON chat logs | Extract preferences, opinions, habits, relationships |
| document | Text files (.txt, .md) | Extract knowledge, skills, facts |
| image | PNG, JPG, etc. | Vision model extracts visual concepts and descriptions |
| video | Video files | Frame extraction + vision analysis |
| audio | Audio files | Transcription + text processing |
All modalities are unified into the same three-layer hierarchy, enabling cross-modal retrieval.
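The per-modality routing above can be sketched as a simple dispatch table. The processor names here are hypothetical placeholders for illustration, not MemU internals:

```python
# Hypothetical dispatch table mirroring the modality table above.
PROCESSORS = {
    "conversation": "extract_preferences_and_relations",
    "document": "extract_knowledge_and_facts",
    "image": "vision_model_description",
    "video": "frame_extraction_plus_vision",
    "audio": "transcription_plus_text",
}

def processor_for(modality: str) -> str:
    """Route an input to its modality-specific processing step."""
    if modality not in PROCESSORS:
        raise ValueError(f"unsupported modality: {modality}")
    return PROCESSORS[modality]
```

Whatever the processor, its output lands in the same Resource → Item → Category hierarchy, which is what makes cross-modal retrieval possible.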
🚀 Quick Start
Option 1: Cloud Version
Try MemU instantly without any setup:
👉 memu.so - Hosted cloud service with full API access
For enterprise deployment and custom solutions, contact info@nevamind.ai
Cloud API (v3)
| Base URL | `https://api.memu.so` |
|---|---|
| Auth | `Authorization: Bearer YOUR_API_KEY` |

| Method | Endpoint | Description |
|---|---|---|
| POST | `/api/v3/memory/memorize` | Register a memorization task |
| GET | `/api/v3/memory/memorize/status/{task_id}` | Get task status |
| POST | `/api/v3/memory/categories` | List memory categories |
| POST | `/api/v3/memory/retrieve` | Retrieve memories (semantic search) |
Option 2: Self-Hosted
Installation
```shell
pip install -e .
```
Basic Example
Requirements: Python 3.13+ and an OpenAI API key
Test with In-Memory Storage (no database required):
```shell
export OPENAI_API_KEY=your_api_key
cd tests
python test_inmemory.py
```
Test with PostgreSQL Storage (requires pgvector):
```shell
# Start PostgreSQL with pgvector
docker run -d \
  --name memu-postgres \
  -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=memu \
  -p 5432:5432 \
  pgvector/pgvector:pg16

# Run the test
export OPENAI_API_KEY=your_api_key
cd tests
python test_postgres.py
```
Both examples demonstrate the complete workflow:
- Memorize: Process a conversation file and extract structured memory
- Retrieve (RAG): Fast embedding-based search
- Retrieve (LLM): Deep semantic understanding search
See tests/test_inmemory.py and tests/test_postgres.py for the full source code.
Custom LLM and Embedding Providers
MemU supports custom LLM and embedding providers beyond OpenAI. Configure them via llm_profiles:
```python
from memu import MemUService

service = MemUService(
    llm_profiles={
        # Default profile for LLM operations
        "default": {
            "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1",
            "api_key": "your_api_key",
            "chat_model": "qwen3-max",
            "client_backend": "sdk"  # "sdk" or "http"
        },
        # Separate profile for embeddings
        "embedding": {
            "base_url": "https://api.voyageai.com/v1",
            "api_key": "your_voyage_api_key",
            "embed_model": "voyage-3.5-lite"
        }
    },
    # ... other configuration
)
```
📖 Core APIs
memorize() - Extract and Store Memory
Processes input resources and extracts structured memory:
```python
result = await service.memorize(
    resource_url="path/to/file.json",  # File path or URL
    modality="conversation",           # conversation | document | image | video | audio
    user={"user_id": "123"}            # Optional: scope to a user
)

# Returns:
{
    "resource": {...},    # Stored resource metadata
    "items": [...],       # Extracted memory items
    "categories": [...]   # Updated category summaries
}
```
retrieve() - Query Memory
Retrieves relevant memory based on queries. MemU supports two retrieval strategies:
RAG-based Retrieval (method="rag")
Fast embedding vector search using cosine similarity:
- ✅ Fast: Pure vector computation
- ✅ Scalable: Efficient for large memory stores
- ✅ Returns scores: Each result includes similarity score
LLM-based Retrieval (method="llm")
Deep semantic understanding through direct LLM reasoning:
- ✅ Deep understanding: LLM comprehends context and nuance
- ✅ Query rewriting: Automatically refines query at each tier
- ✅ Adaptive: Stops early when sufficient information is found
Comparison
| Aspect | RAG | LLM |
|---|---|---|
| Speed | ⚡ Fast | 🐢 Slower |
| Cost | 💰 Low | 💰💰 Higher |
| Semantic depth | Medium | Deep |
| Tier 2 scope | All items | Only items in relevant categories |
| Output | With similarity scores | Ranked by LLM reasoning |
Both methods support:
- Context-aware rewriting: Resolves pronouns using conversation history
- Progressive search: Categories → Items → Resources
- Sufficiency checking: Stops when enough information is retrieved
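Assuming `retrieve()` accepts the `method` argument named in this section (`"rag"` or `"llm"`), switching strategies might look like the sketch below, shown against a stub service so it runs standalone:

```python
import asyncio

# Stub standing in for MemUService so the sketch is self-contained; the real
# call shape (queries / where / method) follows this section's description.
class StubService:
    async def retrieve(self, queries, where=None, method="rag"):
        return {"method": method, "categories": [], "items": [], "resources": []}

async def compare(service, queries, scope=None):
    fast = await service.retrieve(queries=queries, where=scope, method="rag")  # speed, scores
    deep = await service.retrieve(queries=queries, where=scope, method="llm")  # depth, reasoning
    return fast, deep

queries = [{"role": "user", "content": {"text": "What are their preferences?"}}]
fast, deep = asyncio.run(compare(StubService(), queries))
```

A common pattern is to try the cheap RAG pass first and fall back to the LLM pass only when the results look insufficient.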
Usage
```python
result = await service.retrieve(
    queries=[
        {"role": "user", "content": {"text": "What are their preferences?"}},
        {"role": "user", "content": {"text": "Tell me about work habits"}}
    ],
    where={"user_id": "123"}  # Optional: scope filter
)

# Returns:
{
    "categories": [...],       # Relevant categories (with scores for RAG)
    "items": [...],            # Relevant memory items
    "resources": [...],        # Related raw resources
    "next_step_query": "..."   # Rewritten query for follow-up (if applicable)
}
```
Scope Filtering: Use where to filter by user model fields:
- `where={"user_id": "123"}` - exact match
- `where={"agent_id__in": ["1", "2"]}` - match any in list
- Omit `where` to retrieve across all scopes
📚 For complete API documentation, see SERVICE_API.md - includes all methods, CRUD operations, pipeline configuration, and configuration types.
💡 Use Cases
Example 1: Conversation Memory
Extract and organize memory from multi-turn conversations:
```shell
export OPENAI_API_KEY=your_api_key
python examples/example_1_conversation_memory.py
```
What it does:
- Processes multiple conversation JSON files
- Extracts memory items (preferences, habits, opinions, relationships)
- Generates category markdown files (`preferences.md`, `work_life.md`, etc.)
Best for: Personal AI assistants, customer support bots, social chatbots
Example 2: Skill Extraction from Logs
Extract skills and lessons learned from agent execution logs:
```shell
export OPENAI_API_KEY=your_api_key
python examples/example_2_skill_extraction.py
```
What it does:
- Processes agent logs sequentially
- Extracts actions, outcomes, and lessons learned
- De
Related Skills
wshobson/agents
Intelligent automation and multi-agent orchestration for Claude Code
The most comprehensive Claude Code plugin ecosystem, covering full-stack development scenarios with a three-tier model strategy balancing performance and cost.
ComposioHQ/awesome-claude-skills
A curated list of awesome Claude Skills, resources, and tools for customizing Claude AI workflows
The most comprehensive Claude Skills resource list; connect-apps is a killer feature.
code-yeongyu/oh-my-opencode
The Best Agent Harness. Meet Sisyphus: The Batteries-Included Agent that codes like you.
Powerful multi-agent coding tool, but note OAuth limitations.
thedotmack/claude-mem
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future sessions.
A practical solution for Claude's memory issues.
OthmanAdi/planning-with-files
Claude Code skill implementing Manus-style persistent markdown planning, the workflow pattern behind the $2B acquisition.
Context engineering best practices; an open-source implementation of Manus mode.
yusufkaraaslan/Skill_Seekers
Convert documentation websites, GitHub repositories, and PDFs into Claude AI skills with automatic conflict detection
An automation powerhouse for skill creation, dramatically improving efficiency.

